LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated 1 day ago • 97
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers Paper • 2406.10163 • Published 15 days ago • 26
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published 9 days ago • 74
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 14 items • Updated 14 days ago • 29
VideoLLaMA 2 Collection Optimized VideoLLaMA with improved spatial-temporal modeling and better audio understanding capability • 6 items • Updated 5 days ago • 5
Parakeet Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 6 items • Updated 15 days ago • 15
SteerLM Collection A collection of models and datasets relating to SteerLM and HelpSteer. • 7 items • Updated 11 days ago • 11
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 15 days ago • 145
Qwen2 Collection Qwen2 language models: pretrained and instruction-tuned models in 5 sizes (0.5B, 1.5B, 7B, 57B-A14B, and 72B). • 29 items • Updated 23 days ago • 217
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 141
Zephyr ORPO Collection Models and datasets to align LLMs with Odds Ratio Preference Optimisation (ORPO). Recipes here: https://github.com/huggingface/alignment-handbook • 3 items • Updated Apr 12 • 15
CommonCatalog Collection Common Catalog, a dataset with Creative Commons licensed images and machine-generated caption pairs • 8 items • Updated May 16 • 12
🕹️ AI Games Collection An ongoing collection of games you can play on HF Spaces • 14 items • Updated 9 days ago • 22
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 20 items • Updated about 7 hours ago • 145
Llama3-ChatQA-1.5 Collection Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated 15 days ago • 37
Arctic Collection A collection of pre-trained dense-MoE Hybrid transformer models • 2 items • Updated Apr 24 • 20
Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Article • Published Apr 15 • 144
Aurora-M models Collection Aurora-M models (base, biden-harris redteams and instruct) • 5 items • Updated May 6 • 17
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1 • 14
The SPRIGHT T2I collection Collection This collection contains the datasets, model, paper, and demo associated with the SPRIGHT (SPatially RIGHT) release. • 5 items • Updated Apr 2 • 4
The Case for Co-Designing Model Architectures with Hardware Paper • 2401.14489 • Published Jan 25 • 2
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 23 days ago • 198
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 89
StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control Paper • 2403.09055 • Published Mar 14 • 24
Wav2Vec 2.0 Collection A collection for the first release of Wav2Vec 2.0, a speech encoder that learns powerful representations from unlabelled audio data. • 8 items • Updated Jan 16 • 12
Load 4bit models 4x faster Collection Native bitsandbytes 4-bit pre-quantized models • 17 items • Updated 20 days ago • 29
WhisperKit Collection Models, datasets and evaluation results for WhisperKit: https://github.com/argmaxinc/WhisperKit • 3 items • Updated 12 days ago • 5
Long-Form Test Sets Collection A collection of long-form (samples > 30s) datasets used to evaluate the Distil-Whisper models. • 5 items • Updated Mar 21 • 5
Training Datasets Collection A collection of pseudo-labelled datasets used to train the Distil-Whisper model. • 9 items • Updated Mar 21 • 12
distil-large-v3 Collection This collection contains the model repositories for distil-large-v3, which provides support for the most popular Whisper libraries. • 4 items • Updated Mar 21 • 4
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 53
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis Paper • 2403.08764 • Published Mar 13 • 34
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 40
Beyond Language Models: Byte Models are Digital World Simulators Paper • 2402.19155 • Published Feb 29 • 46
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 184
Matryoshka Embedding Models Collection https://huggingface.co/blog/matryoshka • 14 items • Updated 25 days ago • 10
OpenMath Collection A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated 15 days ago • 32
InstructRetro Collection InstructRetro is an autoregressive decoder-only language model (LM) with retrieval-augmented pretraining and instruction tuning. • 4 items • Updated 15 days ago • 8