LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated 1 day ago • 97
Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models Paper • 2406.14848 • Published 8 days ago • 2
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models 5 days ago • 106
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published 8 days ago • 52
Toucan: Token-Aware Character Level Language Modeling Paper • 2311.08620 • Published Nov 15, 2023 • 3
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published 5 days ago • 45
Efficient Continual Pre-training by Mitigating the Stability Gap Paper • 2406.14833 • Published 8 days ago • 18
WARP: On the Benefits of Weight Averaged Rewarded Policies Paper • 2406.16768 • Published 5 days ago • 19
Article Recommendation to Revisit the Diffuser Default LoRA Parameters By alvdansen • 8 days ago • 9
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 14 items • Updated 14 days ago • 29
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published 10 days ago • 14
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities Paper • 2406.14562 • Published 9 days ago • 26
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices Paper • 2406.08451 • Published 17 days ago • 22
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text Paper • 2406.08418 • Published 17 days ago • 28
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning Paper • 2406.08973 • Published 16 days ago • 85
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion Paper • 2406.04338 • Published 23 days ago • 32
Article CryptGPT: Privacy-Preserving Language Models Using Vigenere Cipher (Part 1) By diwank • 13 days ago • 4
Discovering Preference Optimization Algorithms with and for Large Language Models Paper • 2406.08414 • Published 17 days ago • 12
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published 19 days ago • 22
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild Paper • 2406.04770 • Published 22 days ago • 23
Article An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct By leonardlin • 18 days ago • 40
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published 26 days ago • 29
Frustratingly Simple Memory Efficiency for Pre-trained Language Models via Dynamic Embedding Pruning Paper • 2309.08708 • Published Sep 15, 2023 • 3
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23 • 28
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound Paper • 2405.00233 • Published Apr 30 • 12
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published May 1 • 19
LLM-AD: Large Language Model based Audio Description System Paper • 2405.00983 • Published May 2 • 13
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published May 2 • 21
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 20 items • Updated about 8 hours ago • 145
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 38
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Paper • 2405.01434 • Published May 2 • 49
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 106
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation Paper • 2402.03216 • Published Feb 5 • 2
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization Paper • 2401.07793 • Published Jan 15 • 3
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning Paper • 2401.06532 • Published Jan 12 • 10
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30 • 65
Nomic Embed: Training a Reproducible Long Context Text Embedder Paper • 2402.01613 • Published Feb 2 • 13
🦢SWIM-IR Dataset Collection 29 million Synthetic Wikipedia-based Multilingual Retrieval Training Pairs. • 4 items • Updated Apr 28 • 7
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 25 days ago • 63
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 240
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 29 days ago • 346
Eurus Collection Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Apr 15 • 23