Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities Paper • 2406.14562 • Published 11 days ago • 26
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models 8 days ago • 112
synthetic-data-generation-demos Collection A collection of demos for various approaches to synthetic data generation • 4 items • Updated 6 days ago • 8
TabuLa-8B Collection Training, eval suite, and model from the paper "Large Scale Transfer Learning for Tabular Data via Language Modeling" https://arxiv.org/abs/2406.12031 • 4 items • Updated 12 days ago • 8
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning Paper • 2310.09478 • Published Oct 14, 2023 • 17
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Paper • 2311.06242 • Published Nov 10, 2023 • 66
Article Using 🤗 to Train a GPT-2 Model for Music Generation By juancopi81 • Oct 5, 2023 • 6
Article Introducing the Ultimate SEC LLM: Revolutionizing Financial Insights with Llama-3-70B By Crystalcareai • 13 days ago • 6
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 14 items • Updated 17 days ago • 29
Tulu V2.5 Suite Collection A suite of models trained using DPO and PPO across a wide variety (up to 14) of preference datasets. See https://arxiv.org/abs/2406.09279 for more! • 41 items • Updated 17 days ago • 8
Magpie-Pro Collection Dataset built with Meta Llama 3 70B. Models are fine-tuned from Llama 3 8B. • 8 items • Updated 15 days ago • 14
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 29 items • Updated 25 days ago • 220
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28 • 110
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Paper • 2405.14906 • Published May 23 • 21
Sparse Foundational Llama 2 Models Collection Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras • 27 items • Updated 17 days ago • 7
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated May 23 • 38
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated May 31 • 348
IndicGenBench Collection Datasets released in "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs" (https://arxiv.org/abs/2404.16816) • 4 items • Updated 4 days ago • 3
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 45
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29 • 70
LLaVA++ (LLaMA-3 and Phi-3-Mini) Collection Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated 20 days ago • 22
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 8 days ago • 48
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 144
Quantized-FT-Orca-Math Collection Models trained during quantization aware fine-tuning experiments using PyTorch's FSDP. • 8 items • Updated Apr 16 • 7
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8 • 57
Eurus Collection Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Apr 15 • 23
Aya Indic Suite Collection An Indic language filtered dataset from the Aya dataset collection. • 9 items • Updated Mar 31 • 1
StarChat2 15B Collection Model, datasets, and demo for StarChat2 15B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 10 items • Updated Apr 12 • 13
Unifying Vision, Text, and Layout for Universal Document Processing Paper • 2212.02623 • Published Dec 5, 2022 • 10
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5 • 92
OpenMath Collection A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated 17 days ago • 32
⛔️🔦 Provenance, Watermarking & Deepfake Detection Collection Technical tools for more control over non-consensual synthetic content • 14 items • Updated Apr 1 • 37
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Paper • 2401.03065 • Published Jan 5 • 10
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4 • 60
ControlLLM: Augment Language Models with Tools by Searching on Graphs Paper • 2310.17796 • Published Oct 26, 2023 • 15
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 77
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models Paper • 2308.01825 • Published Aug 3, 2023 • 19