ViDoRe Benchmark Collection Benchmark for document retrieval using visual features, introduced in the ColPali paper. Datasets use the QA format. • 10 items • Updated 24 days ago • 13
Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth By mlabonne • Jul 29, 2024 • 277
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated 6 days ago • 74
Article Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models Mar 20, 2024 • 78
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 713
Saul-7B: A pioneering Large Language Model for Law Collection We introduce SaulLM-7B, an LLM tailored for the legal domain trained on 30 billion tokens of legal data. Released under MIT License. • 4 items • Updated Mar 7, 2024 • 18
read papers Collection This is a collection of some papers I've read in the past few months • 10 items • Updated Nov 21, 2023 • 47