Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models 3 days ago • 89
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published 9 days ago • 37
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published 6 days ago • 64
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models Paper • 2406.12649 • Published 8 days ago • 14
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published 9 days ago • 53
From Pixels to Prose: A Large Dataset of Dense Image Captions Paper • 2406.10328 • Published 12 days ago • 16
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens Paper • 2406.11271 • Published 10 days ago • 10
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages Paper • 2406.10118 • Published 12 days ago • 25
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text Paper • 2406.08418 • Published 14 days ago • 28
Article From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate 14 days ago • 23
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels Paper • 2406.09415 • Published 13 days ago • 47
PowerInfer-2: Fast Large Language Model Inference on a Smartphone Paper • 2406.06282 • Published 16 days ago • 34
What If We Recaption Billions of Web Images with LLaMA-3? Paper • 2406.08478 • Published 14 days ago • 38
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models Paper • 2406.06563 • Published 24 days ago • 17
An Image is Worth 32 Tokens for Reconstruction and Generation Paper • 2406.07550 • Published 15 days ago • 52
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published 16 days ago • 60
GenAI Arena: An Open Evaluation Platform for Generative Models Paper • 2406.04485 • Published 20 days ago • 19
Article Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 By m-ric • 6 days ago • 25
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model Paper • 2406.04333 • Published 20 days ago • 36
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark Paper • 2406.01574 • Published 23 days ago • 42
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis Paper • 2405.21075 • Published 26 days ago • 15
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Paper • 2405.21060 • Published 26 days ago • 60
Jina CLIP: Your CLIP Model Is Also Your Text Retriever Paper • 2405.20204 • Published 27 days ago • 27
Contextual Position Encoding: Learning to Count What's Important Paper • 2405.18719 • Published 29 days ago • 3
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities Paper • 2405.18669 • Published 29 days ago • 11
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution Paper • 2405.19325 • Published 28 days ago • 13
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • 25 days ago • 24
Article Training and Finetuning Embedding Models with Sentence Transformers v3 30 days ago • 103
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published May 23 • 21
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Paper • 2405.15738 • Published May 24 • 43
FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information Paper • 2405.12807 • Published May 21 • 1
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 21 • 26
Imp: Highly Capable Large Multimodal Models for Mobile Devices Paper • 2405.12107 • Published May 20 • 23
FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published May 19 • 53
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 20 • 44
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. By KingNish • May 21 • 25