Releasing the largest multilingual open pretraining dataset Article • By Pclanglais
SmolLM2 Collection • State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, and 135M parameters • 10 items
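As a quick illustration, a checkpoint from this collection can be loaded like any other Hub model with transformers. A minimal sketch follows; the repo id HuggingFaceTB/SmolLM2-135M is an assumption based on the collection name, not confirmed by this listing.

```python
# Minimal sketch: running a compact SmolLM2 checkpoint locally with transformers.
# The repo id below is an assumption; check the collection page for exact names.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-135M"  # assumed id; 360M and 1.7B variants also listed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Gravity is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```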
QTIP Quantized Models Collection • See https://github.com/Cornell-RelaxML/qtip • 27 items
VILA-U-7B Collection • VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation • 2 items • Updated Oct 22
VPTQ Mistral Large Instruct 2407 without finetune Collection • arxiv.org/abs/2409.17066 • 8 items • Updated Oct 18
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models Paper • 2410.02416 • Published Oct 3
Gemma-APS Release Collection • Gemma models for text-to-propositions segmentation, distilled from a fine-tuned Gemini Pro model applied to multi-domain synthetic data • 3 items • Updated Oct 15
Scalable and Domain-General Abstractive Proposition Segmentation Paper • 2406.19803 • Published Jun 28
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published Oct 7
ProLong Collection • ProLong is a family of long-context models, continually trained and supervised fine-tuned from Llama-3-8B, with a maximum context window of 512K tokens • 7 items