A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models Paper • 2306.02254 • Published Jun 4, 2023 • 10
Article Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models Mar 20 • 31
Article Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality 5 days ago • 17
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated 1 day ago • 97
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. By KingNish • May 21 • 26
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models Paper • 2406.11230 • Published 12 days ago • 34
VideoLLM-online: Online Video Large Language Model for Streaming Video Paper • 2406.11816 • Published 12 days ago • 20
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published 5 days ago • 45
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published 7 days ago • 39
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published 5 days ago • 52
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published 4 days ago • 66
mDPO: Conditional Preference Optimization for Multimodal Large Language Models Paper • 2406.11839 • Published 12 days ago • 35
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published 12 days ago • 39
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs Paper • 2406.11833 • Published 12 days ago • 60
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models 5 days ago • 106
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation Paper • 2310.16656 • Published Oct 25, 2023 • 39
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published 22 days ago • 49
Article BM25 for Python: Achieving high performance while simplifying dependencies with *BM25S*⚡ By xhluca • 3 days ago • 22
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published 9 days ago • 74
Article Enhancing Image Model Dreambooth Training Through Effective Captioning: Key Observations By alvdansen • 10 days ago • 11
MosaicBERT Collection A collection of BERT-based models of different sequence lengths trained on the C4 dataset. Details: https://mosaicbert.github.io/ • 5 items • Updated Dec 27, 2023 • 4
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering Paper • 2406.10208 • Published 15 days ago • 21
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text Paper • 2406.08418 • Published 17 days ago • 28
Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks 11 days ago • 30
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Paper • 2311.06242 • Published Nov 10, 2023 • 65
How Do Large Language Models Acquire Factual Knowledge During Pretraining? Paper • 2406.11813 • Published 12 days ago • 28
Article From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate 16 days ago • 24
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published 18 days ago • 30
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 15 days ago • 145
Magpie-Pro Collection Dataset built with Meta Llama 3 70B. Models are fine-tuned from Llama 3 8B. • 8 items • Updated 13 days ago • 14
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B Paper • 2406.07394 • Published 18 days ago • 16
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers Paper • 2406.05370 • Published 21 days ago • 12
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 1 day ago • 317
BERT release Collection Groups the original BERT models released by the Google team. Unless marked otherwise, the checkpoints support English. • 8 items • Updated 1 day ago • 17
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model Paper • 2406.04333 • Published 23 days ago • 36
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning Paper • 2406.03344 • Published 24 days ago • 15
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark Paper • 2406.01574 • Published 26 days ago • 42
MAmmoTH2 Collection Scaling up instruction data from the web to build better LLMs • 11 items • Updated May 26 • 7
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published 26 days ago • 29