zzfive's Collections
Self-Rewarding Language Models
Paper
•
2401.10020
•
Published
•
145
Orion-14B: Open-source Multilingual Large Language Models
Paper
•
2401.12246
•
Published
•
12
MambaByte: Token-free Selective State Space Model
Paper
•
2401.13660
•
Published
•
52
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper
•
2401.13601
•
Published
•
45
OLMo: Accelerating the Science of Language Models
Paper
•
2402.00838
•
Published
•
82
Dolma: an Open Corpus of Three Trillion Tokens for Language Model
Pretraining Research
Paper
•
2402.00159
•
Published
•
61
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs
Miss
Paper
•
2402.10790
•
Published
•
41
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM
Workflows
Paper
•
2402.10379
•
Published
•
30
Paper
•
2402.13144
•
Published
•
95
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on
Deceptive Prompts
Paper
•
2402.13220
•
Published
•
13
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper
•
2402.13753
•
Published
•
113
OpenCodeInterpreter: Integrating Code Generation with Execution and
Refinement
Paper
•
2402.14658
•
Published
•
82
Linear Transformers are Versatile In-Context Learners
Paper
•
2402.14180
•
Published
•
6
Watermarking Makes Language Models Radioactive
Paper
•
2402.14904
•
Published
•
23
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and
Two-Phase Partition
Paper
•
2402.15220
•
Published
•
19
Genie: Generative Interactive Environments
Paper
•
2402.15391
•
Published
•
70
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper
•
2402.16153
•
Published
•
56
MegaScale: Scaling Large Language Model Training to More Than 10,000
GPUs
Paper
•
2402.15627
•
Published
•
34
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Paper
•
2402.16840
•
Published
•
23
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
604
Video as the New Language for Real-World Decision Making
Paper
•
2402.17139
•
Published
•
18
Beyond Language Models: Byte Models are Digital World Simulators
Paper
•
2402.19155
•
Published
•
49
Resonance RoPE: Improving Context Length Generalization of Large
Language Models
Paper
•
2403.00071
•
Published
•
22
DenseMamba: State Space Models with Dense Hidden Connection for
Efficient Large Language Models
Paper
•
2403.00818
•
Published
•
15
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
•
2403.03507
•
Published
•
183
Gemini 1.5: Unlocking multimodal understanding across millions of tokens
of context
Paper
•
2403.05530
•
Published
•
61
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Paper
•
2403.05525
•
Published
•
40
Synth^2: Boosting Visual-Language Models with Synthetic Captions and
Image Embeddings
Paper
•
2403.07750
•
Published
•
21
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Paper
•
2403.07508
•
Published
•
74
VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision
Understanding
Paper
•
2403.09530
•
Published
•
8
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling
and Visual-Language Co-Referring
Paper
•
2403.09333
•
Published
•
14
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for
Large Language Models
Paper
•
2403.12881
•
Published
•
16
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal
Large Language Models
Paper
•
2403.13447
•
Published
•
18
Mini-Gemini: Mining the Potential of Multi-modality Vision Language
Models
Paper
•
2403.18814
•
Published
•
45
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Paper
•
2403.18421
•
Published
•
22
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
•
2403.19887
•
Published
•
104
Direct Preference Optimization of Video Large Multimodal Models from
Language Model Reward
Paper
•
2404.01258
•
Published
•
10
WavLLM: Towards Robust and Adaptive Speech Large Language Model
Paper
•
2404.00656
•
Published
•
10
CodeEditorBench: Evaluating Code Editing Capability of Large Language
Models
Paper
•
2404.03543
•
Published
•
15
LVLM-Intrepret: An Interpretability Tool for Large Vision-Language
Models
Paper
•
2404.03118
•
Published
•
23
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with
Interleaved Visual-Textual Tokens
Paper
•
2404.03413
•
Published
•
25
ReFT: Representation Finetuning for Language Models
Paper
•
2404.03592
•
Published
•
91
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper
•
2404.04167
•
Published
•
12
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Paper
•
2404.05961
•
Published
•
64
Rho-1: Not All Tokens Are What You Need
Paper
•
2404.07965
•
Published
•
87
RecurrentGemma: Moving Past Transformers for Efficient Open Language
Models
Paper
•
2404.07839
•
Published
•
43
Applying Guidance in a Limited Interval Improves Sample and Distribution
Quality in Diffusion Models
Paper
•
2404.07724
•
Published
•
13
Megalodon: Efficient LLM Pretraining and Inference with Unlimited
Context Length
Paper
•
2404.08801
•
Published
•
65
TriForce: Lossless Acceleration of Long Sequence Generation with
Hierarchical Speculative Decoding
Paper
•
2404.11912
•
Published
•
16
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler
Generation
Paper
•
2404.12753
•
Published
•
41
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your
Phone
Paper
•
2404.14219
•
Published
•
253
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
Paper
•
2404.14047
•
Published
•
44
FlowMind: Automatic Workflow Generation with LLMs
Paper
•
2404.13050
•
Published
•
33
Multi-Head Mixture-of-Experts
Paper
•
2404.15045
•
Published
•
59
WildChat: 1M ChatGPT Interaction Logs in the Wild
Paper
•
2405.01470
•
Published
•
61
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Paper
•
2405.09215
•
Published
•
18
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Paper
•
2405.11143
•
Published
•
34
Imp: Highly Capable Large Multimodal Models for Mobile Devices
Paper
•
2405.12107
•
Published
•
25
AlignGPT: Multi-modal Large Language Models with Adaptive Alignment
Capability
Paper
•
2405.14129
•
Published
•
12
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal
Models
Paper
•
2405.15738
•
Published
•
43
Stacking Your Transformers: A Closer Look at Model Growth for Efficient
LLM Pre-Training
Paper
•
2405.15319
•
Published
•
25
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding
Models
Paper
•
2405.17428
•
Published
•
17
Value-Incentivized Preference Optimization: A Unified Approach to Online
and Offline RLHF
Paper
•
2405.19320
•
Published
•
10
Offline Regularised Reinforcement Learning for Large Language Models
Alignment
Paper
•
2405.19107
•
Published
•
14
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper
•
2406.00888
•
Published
•
30
Xmodel-LM Technical Report
Paper
•
2406.02856
•
Published
•
8
Mixture-of-Agents Enhances Large Language Model Capabilities
Paper
•
2406.04692
•
Published
•
55
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts
Language Models
Paper
•
2406.06563
•
Published
•
17
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated
Parameters
Paper
•
2406.05955
•
Published
•
22
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs
with Nothing
Paper
•
2406.08464
•
Published
•
65
Discovering Preference Optimization Algorithms with and for Large
Language Models
Paper
•
2406.08414
•
Published
•
14
HelpSteer2: Open-source dataset for training top-performing reward
models
Paper
•
2406.08673
•
Published
•
16
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context
Language Modeling
Paper
•
2406.07522
•
Published
•
37
Self-play with Execution Feedback: Improving Instruction-following
Capabilities of Large Language Models
Paper
•
2406.13542
•
Published
•
16
Iterative Length-Regularized Direct Preference Optimization: A Case
Study on Improving 7B Language Models to GPT-4 Level
Paper
•
2406.11817
•
Published
•
12
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Paper
•
2406.15319
•
Published
•
62
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls
and Complex Instructions
Paper
•
2406.15877
•
Published
•
45
Scaling Laws for Linear Complexity Language Models
Paper
•
2406.16690
•
Published
•
22
Sparser is Faster and Less is More: Efficient Sparse Attention for
Long-Range Transformers
Paper
•
2406.16747
•
Published
•
18
OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far?
Paper
•
2406.16772
•
Published
•
2
Unlocking Continual Learning Abilities in Language Models
Paper
•
2406.17245
•
Published
•
28
Direct Preference Knowledge Distillation for Large Language Models
Paper
•
2406.19774
•
Published
•
21
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for
Retrieval-Augmented Generation
Paper
•
2406.19251
•
Published
•
8
RegMix: Data Mixture as Regression for Language Model Pre-training
Paper
•
2407.01492
•
Published
•
35
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical
Reasoning
Paper
•
2407.00782
•
Published
•
23
DogeRM: Equipping Reward Models with Domain Knowledge through Model
Merging
Paper
•
2407.01470
•
Published
•
5
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Paper
•
2407.01370
•
Published
•
86
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via
Dynamic Sparse Attention
Paper
•
2407.02490
•
Published
•
23
To Forget or Not? Towards Practical Knowledge Unlearning for Large
Language Models
Paper
•
2407.01920
•
Published
•
13
Eliminating Position Bias of Language Models: A Mechanistic Approach
Paper
•
2407.01100
•
Published
•
6
DotaMath: Decomposition of Thought with Code Assistance and
Self-correction for Mathematical Reasoning
Paper
•
2407.04078
•
Published
•
17
LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation
Capabilities Beyond 100 Languages
Paper
•
2407.05975
•
Published
•
34
InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with
Inverse-Instruct
Paper
•
2407.05700
•
Published
•
10
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Paper
•
2407.06027
•
Published
•
8
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in
Large Language Models Using Only Attention Maps
Paper
•
2407.07071
•
Published
•
11
AgentInstruct: Toward Generative Teaching with Agentic Flows
Paper
•
2407.03502
•
Published
•
50
Inference Performance Optimization for Large Language Models on CPUs
Paper
•
2407.07304
•
Published
•
52
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Paper
•
2407.09025
•
Published
•
129
Human-like Episodic Memory for Infinite Context LLMs
Paper
•
2407.09450
•
Published
•
59
MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Paper
•
2407.09435
•
Published
•
20
Transformer Layers as Painters
Paper
•
2407.09298
•
Published
•
13
H2O-Danube3 Technical Report
Paper
•
2407.09276
•
Published
•
18
Understanding Retrieval Robustness for Retrieval-Augmented Image
Captioning
Paper
•
2406.02265
•
Published
•
6
Characterizing Prompt Compression Methods for Long Context Inference
Paper
•
2407.08892
•
Published
•
9
Paper
•
2407.10671
•
Published
•
160
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs
Paper
•
2407.10058
•
Published
•
29
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Paper
•
2407.10969
•
Published
•
20
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore
Non-Determinism
Paper
•
2407.10457
•
Published
•
22
Foundational Autoraters: Taming Large Language Models for Better
Automatic Evaluation
Paper
•
2407.10817
•
Published
•
13
MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with
Open-domain Information Extraction Large Language Models
Paper
•
2407.10953
•
Published
•
4
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language
Models
Paper
•
2407.12327
•
Published
•
77
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill
and Extreme KV-Cache Compression
Paper
•
2407.12077
•
Published
•
54
Patch-Level Training for Large Language Models
Paper
•
2407.12665
•
Published
•
16
The Art of Saying No: Contextual Noncompliance in Language Models
Paper
•
2407.12043
•
Published
•
4
Practical Unlearning for Large Language Models
Paper
•
2407.10223
•
Published
•
4
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Paper
•
2407.13623
•
Published
•
54
Understanding Reference Policies in Direct Preference Optimization
Paper
•
2407.13709
•
Published
•
16
Internal Consistency and Self-Feedback in Large Language Models: A
Survey
Paper
•
2407.14507
•
Published
•
46
SciCode: A Research Coding Benchmark Curated by Scientists
Paper
•
2407.13168
•
Published
•
13
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix"
Cycle
Paper
•
2407.13833
•
Published
•
11
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Paper
•
2407.15017
•
Published
•
33
Compact Language Models via Pruning and Knowledge Distillation
Paper
•
2407.14679
•
Published
•
38
BOND: Aligning LLMs with Best-of-N Distillation
Paper
•
2407.14622
•
Published
•
18
DDK: Distilling Domain Knowledge for Efficient Large Language Models
Paper
•
2407.16154
•
Published
•
21
Data Mixture Inference: What do BPE Tokenizers Reveal about their
Training Data?
Paper
•
2407.16607
•
Published
•
22
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models
for Southeast Asian Languages
Paper
•
2407.19672
•
Published
•
55
Self-Training with Direct Preference Optimization Improves
Chain-of-Thought Reasoning
Paper
•
2407.18248
•
Published
•
31
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Paper
•
2407.19985
•
Published
•
36
Visual Riddles: a Commonsense and World Knowledge Challenge for Large
Vision and Language Models
Paper
•
2407.19474
•
Published
•
23
ThinK: Thinner Key Cache by Query-Driven Pruning
Paper
•
2407.21018
•
Published
•
30
The Llama 3 Herd of Models
Paper
•
2407.21783
•
Published
•
110
ShieldGemma: Generative AI Content Moderation Based on Gemma
Paper
•
2407.21772
•
Published
•
14
Gemma 2: Improving Open Language Models at a Practical Size
Paper
•
2408.00118
•
Published
•
75
Improving Text Embeddings for Smaller Language Models Using Contrastive
Fine-tuning
Paper
•
2408.00690
•
Published
•
23
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data
Assessment and Selection for Instruction Tuning of Language Models
Paper
•
2408.02085
•
Published
•
17
Paper
•
2408.02666
•
Published
•
27
Scaling LLM Test-Time Compute Optimally can be More Effective than
Scaling Model Parameters
Paper
•
2408.03314
•
Published
•
52
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
•
2408.04619
•
Published
•
155
Better Alignment with Instruction Back-and-Forth Translation
Paper
•
2408.04614
•
Published
•
14
Learning to Predict Program Execution by Modeling Dynamic Dependency on
Code Graphs
Paper
•
2408.02816
•
Published
•
4
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Paper
•
2408.05147
•
Published
•
38
ToolSandbox: A Stateful, Conversational, Interactive Evaluation
Benchmark for LLM Tool Use Capabilities
Paper
•
2408.04682
•
Published
•
14
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Paper
•
2408.06195
•
Published
•
63
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Paper
•
2408.07055
•
Published
•
64
Layerwise Recurrent Router for Mixture-of-Experts
Paper
•
2408.06793
•
Published
•
31
Amuro & Char: Analyzing the Relationship between Pre-Training and
Fine-Tuning of Large Language Models
Paper
•
2408.06663
•
Published
•
15
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced
Data
Paper
•
2408.06273
•
Published
•
9
Paper
•
2408.07410
•
Published
•
13
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for
Reinforcement Learning and Monte-Carlo Tree Search
Paper
•
2408.08152
•
Published
•
52
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative
Self-Enhancement Paradigm
Paper
•
2408.08072
•
Published
•
32
Training Language Models on the Knowledge Graph: Insights on
Hallucinations and Their Detectability
Paper
•
2408.07852
•
Published
•
15
FuseChat: Knowledge Fusion of Chat Models
Paper
•
2408.07990
•
Published
•
10
BAM! Just Like That: Simple and Efficient Parameter Upcycling for
Mixture of Experts
Paper
•
2408.08274
•
Published
•
12
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risk
of Language Models
Paper
•
2408.08926
•
Published
•
5
TableBench: A Comprehensive and Complex Benchmark for Table Question
Answering
Paper
•
2408.09174
•
Published
•
51
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper
•
2408.10914
•
Published
•
41
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context
Generation with Speculative Decoding
Paper
•
2408.11049
•
Published
•
12
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper
•
2408.11796
•
Published
•
57
FocusLLM: Scaling LLM's Context by Parallel Decoding
Paper
•
2408.11745
•
Published
•
23
Hermes 3 Technical Report
Paper
•
2408.11857
•
Published
•
41
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge
Conflicts in LLM
Paper
•
2408.12076
•
Published
•
12
Memory-Efficient LLM Training with Online Subspace Descent
Paper
•
2408.12857
•
Published
•
12
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java
Paper
•
2408.14354
•
Published
•
40
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to
Small-Scale Local LLMs
Paper
•
2408.13467
•
Published
•
24
MobileQuant: Mobile-friendly Quantization for On-device Language Models
Paper
•
2408.13933
•
Published
•
13
Efficient Detection of Toxic Prompts in Large Language Models
Paper
•
2408.11727
•
Published
•
12
Writing in the Margins: Better Inference Pattern for Long Context
Retrieval
Paper
•
2408.14906
•
Published
•
138
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
Paper
•
2408.15237
•
Published
•
37
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and
Deduplication by Introducing a Competitive Large Language Model Baseline
Paper
•
2408.15079
•
Published
•
52
Leveraging Open Knowledge for Advancing Task Expertise in Large Language
Models
Paper
•
2408.15915
•
Published
•
19
Efficient LLM Scheduling by Learning to Rank
Paper
•
2408.15792
•
Published
•
19
Knowledge Navigator: LLM-guided Browsing Framework for Exploratory
Search in Scientific Literature
Paper
•
2408.15836
•
Published
•
12
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
Paper
•
2408.15496
•
Published
•
10
Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts
Paper
•
2408.15664
•
Published
•
11
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Paper
•
2408.15545
•
Published
•
34
CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting
Mitigation
Paper
•
2408.14572
•
Published
•
7
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
Paper
•
2408.15300
•
Published
•
3
OLMoE: Open Mixture-of-Experts Language Models
Paper
•
2409.02060
•
Published
•
77
LongRecipe: Recipe for Efficient Long Context Generalization in Large
Language Models
Paper
•
2409.00509
•
Published
•
37
ContextCite: Attributing Model Generation to Context
Paper
•
2409.00729
•
Published
•
13
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in
Action
Paper
•
2409.00138
•
Published
•
1
LongCite: Enabling LLMs to Generate Fine-grained Citations in
Long-context QA
Paper
•
2409.02897
•
Published
•
44
Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining
Paper
•
2409.02326
•
Published
•
18
Attention Heads of Large Language Models: A Survey
Paper
•
2409.03752
•
Published
•
88
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild
Paper
•
2409.03753
•
Published
•
18
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with
High-Quality Data
Paper
•
2409.03810
•
Published
•
30
Configurable Foundation Models: Building LLMs from a Modular Perspective
Paper
•
2409.02877
•
Published
•
27
Spinning the Golden Thread: Benchmarking Long-Form Generation in
Language Models
Paper
•
2409.02076
•
Published
•
10
Towards a Unified View of Preference Learning for Large Language Models:
A Survey
Paper
•
2409.02795
•
Published
•
71
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge
Discovery
Paper
•
2409.05591
•
Published
•
29
Benchmarking Chinese Knowledge Rectification in Large Language Models
Paper
•
2409.05806
•
Published
•
13
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question
Answering
Paper
•
2409.06595
•
Published
•
37
PingPong: A Benchmark for Role-Playing Language Models with User
Emulation and Multi-Model Evaluation
Paper
•
2409.06820
•
Published
•
63
Gated Slot Attention for Efficient Linear-Time Sequence Modeling
Paper
•
2409.07146
•
Published
•
19
Self-Harmonized Chain of Thought
Paper
•
2409.04057
•
Published
•
16
Source2Synth: Synthetic Data Generation and Curation Grounded in Real
Data Sources
Paper
•
2409.08239
•
Published
•
16
Ferret: Federated Full-Parameter Tuning at Scale for Large Language
Models
Paper
•
2409.06277
•
Published
•
14
On the Diagram of Thought
Paper
•
2409.10038
•
Published
•
12
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language
Models: An Experimental Analysis up to 405B
Paper
•
2409.11055
•
Published
•
16
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded
Attributions and Learning to Refuse
Paper
•
2409.11242
•
Published
•
5
Qwen2.5-Coder Technical Report
Paper
•
2409.12186
•
Published
•
138
LLMs + Persona-Plug = Personalized LLMs
Paper
•
2409.11901
•
Published
•
31
Preference Tuning with Human Feedback on Language, Speech, and Vision
Tasks: A Survey
Paper
•
2409.11564
•
Published
•
19
GRIN: GRadient-INformed MoE
Paper
•
2409.12136
•
Published
•
15
Training Language Models to Self-Correct via Reinforcement Learning
Paper
•
2409.12917
•
Published
•
135
MMSearch: Benchmarking the Potential of Large Models as Multi-modal
Search Engines
Paper
•
2409.12959
•
Published
•
36
Scaling Smart: Accelerating Large Language Model Pre-training with Small
Model Initialization
Paper
•
2409.12903
•
Published
•
21
Language Models Learn to Mislead Humans via RLHF
Paper
•
2409.12822
•
Published
•
9
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented
Generation
Paper
•
2409.12941
•
Published
•
23
Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments
Paper
•
2409.11276
•
Published
•
6
HelloBench: Evaluating Long Text Generation Capabilities of Large
Language Models
Paper
•
2409.16191
•
Published
•
41
Reward-Robust RLHF in LLMs
Paper
•
2409.15360
•
Published
•
5
Programming Every Example: Lifting Pre-training Data Quality like
Experts at Scale
Paper
•
2409.17115
•
Published
•
60
Boosting Healthcare LLMs Through Retrieved Context
Paper
•
2409.15127
•
Published
•
19
NoTeeline: Supporting Real-Time Notetaking from Keypoints with Large
Language Models
Paper
•
2409.16493
•
Published
•
9
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks
at Scale
Paper
•
2409.16299
•
Published
•
10
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
Paper
•
2409.17481
•
Published
•
46
The Imperative of Conversation Analysis in the Era of LLMs: A Survey of
Tasks, Techniques, and Trends
Paper
•
2409.14195
•
Published
•
11
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case
Study
Paper
•
2409.17580
•
Published
•
7
Modulated Intervention Preference Optimization (MIPO): Keep the Easy,
Refine the Difficult
Paper
•
2409.17545
•
Published
•
19
Erasing Conceptual Knowledge from Language Models
Paper
•
2410.02760
•
Published
•
13
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding
Capabilities of CodeLLMs
Paper
•
2410.01999
•
Published
•
10
Data Selection via Optimal Control for Language Models
Paper
•
2410.07064
•
Published
•
8
Falcon Mamba: The First Competitive Attention-free 7B Language Model
Paper
•
2410.05355
•
Published
•
31
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for
Text-to-Image Diffusion Model Unlearning
Paper
•
2410.05664
•
Published
•
7
MathCoder2: Better Math Reasoning from Continued Pretraining on
Model-translated Mathematical Code
Paper
•
2410.08196
•
Published
•
45
Benchmarking Agentic Workflow Generation
Paper
•
2410.07869
•
Published
•
25
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit
Positional Awareness
Paper
•
2410.07035
•
Published
•
16
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via
Inference-time Hybrid Information Structurization
Paper
•
2410.08815
•
Published
•
43
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
Paper
•
2410.08102
•
Published
•
19
KV Prediction for Improved Time to First Token
Paper
•
2410.08391
•
Published
•
11
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
Paper
•
2410.09037
•
Published
•
4
DA-Code: Agent Data Science Code Generation Benchmark for Large Language
Models
Paper
•
2410.07331
•
Published
•
4
Toward General Instruction-Following Alignment for Retrieval-Augmented
Generation
Paper
•
2410.09584
•
Published
•
47
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large
Language Models
Paper
•
2410.07985
•
Published
•
28
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content
Paper
•
2410.10783
•
Published
•
26
Rethinking Data Selection at Scale: Random Selection is Almost All You
Need
Paper
•
2410.09335
•
Published
•
16
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive
Memory
Paper
•
2410.10813
•
Published
•
9
Tree of Problems: Improving structured problem solving with
compositionality
Paper
•
2410.06634
•
Published
•
8
Thinking LLMs: General Instruction Following with Thought Generation
Paper
•
2410.10630
•
Published
•
18
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Paper
•
2410.10814
•
Published
•
48
What Matters in Transformers? Not All Attention is Needed
Paper
•
2406.15786
•
Published
•
29
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large
Language Models
Paper
•
2410.09342
•
Published
•
37
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI
Paper
•
2410.11096
•
Published
•
12
Agent-as-a-Judge: Evaluate Agents with Agents
Paper
•
2410.10934
•
Published
•
18
HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of
Large Multimodal Models Through Coding Tasks
Paper
•
2410.12381
•
Published
•
42
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Paper
•
2410.12784
•
Published
•
42
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive
Alignment
Paper
•
2410.13785
•
Published
•
18
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Paper
•
2410.13639
•
Published
•
16
FlatQuant: Flatness Matters for LLM Quantization
Paper
•
2410.09426
•
Published
•
12
Retrospective Learning from Interactions
Paper
•
2410.13852
•
Published
•
8
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and
Evolution
Paper
•
2410.16256
•
Published
•
58
Baichuan Alignment Technical Report
Paper
•
2410.14940
•
Published
•
49
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety
and Style
Paper
•
2410.16184
•
Published
•
23
Pre-training Distillation for Large Language Models: A Design Space
Exploration
Paper
•
2410.16215
•
Published
•
15
Aligning Large Language Models via Self-Steering Optimization
Paper
•
2410.17131
•
Published
•
21
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Paper
•
2410.17215
•
Published
•
14
Scaling Diffusion Language Models via Adaptation from Autoregressive
Models
Paper
•
2410.17891
•
Published
•
15
LOGO -- Long cOntext aliGnment via efficient preference Optimization
Paper
•
2410.18533
•
Published
•
42
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis
from Scratch
Paper
•
2410.18693
•
Published
•
40
Why Does the Effective Context Length of LLMs Fall Short?
Paper
•
2410.18745
•
Published
•
17
Taipan: Efficient and Expressive State Space Language Models with
Selective Attention
Paper
•
2410.18572
•
Published
•
16
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Paper
•
2410.18451
•
Published
•
15
CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for
pre-training large language models
Paper
•
2410.18505
•
Published
•
10
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language
Models
Paper
•
2410.18252
•
Published
•
5
Should We Really Edit Language Models? On the Evaluation of Edited
Language Models
Paper
•
2410.18785
•
Published
•
5
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with
System Co-Design
Paper
•
2410.19123
•
Published
•
15
A Survey of Small Language Models
Paper
•
2410.20011
•
Published
•
40
LongReward: Improving Long-context Large Language Models with AI
Feedback
Paper
•
2410.21252
•
Published
•
17
Fast Best-of-N Decoding via Speculative Rejection
Paper
•
2410.20290
•
Published
•
10
Relaxed Recursive Transformers: Effective Parameter Sharing with
Layer-wise LoRA
Paper
•
2410.20672
•
Published
•
6
CLEAR: Character Unlearning in Textual and Visual Modalities
Paper
•
2410.18057
•
Published
•
200
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy
Segment Optimization
Paper
•
2410.21411
•
Published
•
19
Flow-DPO: Improving LLM Mathematical Reasoning through Online
Multi-Agent Learning
Paper
•
2410.22304
•
Published
•
16
Accelerating Direct Preference Optimization with Prefix Sharing
Paper
•
2410.20305
•
Published
•
6
RARe: Retrieval Augmented Retrieval with In-Context Examples
Paper
•
2410.20088
•
Published
•
5
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation
Generation
Paper
•
2410.23090
•
Published
•
54
Stealing User Prompts from Mixture of Experts
Paper
•
2410.22884
•
Published
•
14
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A
Gradient Perspective
Paper
•
2410.23743
•
Published
•
59
SelfCodeAlign: Self-Alignment for Code Generation
Paper
•
2410.24198
•
Published
•
22
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for
Minority Languages
Paper
•
2410.23825
•
Published
•
3
Personalization of Large Language Models: A Survey
Paper
•
2411.00027
•
Published
•
31
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated
Parameters by Tencent
Paper
•
2411.02265
•
Published
•
24
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in
Large Language Models
Paper
•
2411.00918
•
Published
•
8
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting
Rare Concepts in Foundation Models
Paper
•
2411.00743
•
Published
•
6
SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF
Paper
•
2411.01798
•
Published
•
8
LoRA-Contextualizing Adaptation of Large Multimodal Models for Long
Document Understanding
Paper
•
2411.01106
•
Published
•
4
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge
in RAG Systems
Paper
•
2411.02959
•
Published
•
64
Mixture-of-Transformers: A Sparse and Scalable Architecture for
Multi-Modal Foundation Models
Paper
•
2411.04996
•
Published
•
49
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page
Multi-document Understanding
Paper
•
2411.04952
•
Published
•
28
Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test
Generation: An Empirical Study
Paper
•
2411.02462
•
Published
•
9
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Paper
•
2411.04905
•
Published
•
111
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language
Models
Paper
•
2411.07140
•
Published
•
33
IOPO: Empowering LLMs with Complex Instruction Following via
Input-Output Preference Optimization
Paper
•
2411.06208
•
Published
•
19
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Paper
•
2411.07133
•
Published
•
34
Large Language Models Can Self-Improve in Long-context Reasoning
Paper
•
2411.08147
•
Published
•
62
Direct Preference Optimization Using Sparse Feature-Level Constraints
Paper
•
2411.07618
•
Published
•
15
Top-nσ: Not All Logits Are You Need
Paper
•
2411.07641
•
Published
•
18
SlimLM: An Efficient Small Language Model for On-Device Document
Assistance
Paper
•
2411.09944
•
Published
•
12
Adaptive Decoding via Latent Preference Optimization
Paper
•
2411.09661
•
Published
•
10
Building Trust: Foundations of Security, Safety and Transparency in AI
Paper
•
2411.12275
•
Published
•
10
SymDPO: Boosting In-Context Learning of Large Multimodal Models with
Symbol Demonstration Direct Preference Optimization
Paper
•
2411.11909
•
Published
•
20
Loss-to-Loss Prediction: Scaling Laws for All Datasets
Paper
•
2411.12925
•
Published
•
5
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Paper
•
2411.14405
•
Published
•
58
Hymba: A Hybrid-head Architecture for Small Language Models
Paper
•
2411.13676
•
Published
•
39
Do I Know This Entity? Knowledge Awareness and Hallucinations in
Language Models
Paper
•
2411.14257
•
Published
•
9
UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs
on Low-Resource Languages
Paper
•
2411.14343
•
Published
•
7
Patience Is The Key to Large Language Model Reasoning
Paper
•
2411.13082
•
Published
•
7
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training
Paper
•
2411.15124
•
Published
•
56
A Flexible Large Language Models Guardrail Development Methodology
Applied to Off-Topic Prompt Detection
Paper
•
2411.12946
•
Published
•
20
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple
Distillation, Big Progress or Bitter Lesson?
Paper
•
2411.16489
•
Published
•
40
From Generation to Judgment: Opportunities and Challenges of
LLM-as-a-judge
Paper
•
2411.16594
•
Published
•
36
MH-MoE: Multi-Head Mixture-of-Experts
Paper
•
2411.16205
•
Published
•
23
VisualLens: Personalization through Visual History
Paper
•
2411.16034
•
Published
•
16
LLMs Do Not Think Step-by-step In Implicit Reasoning
Paper
•
2411.15862
•
Published
•
8
All Languages Matter: Evaluating LMMs on Culturally Diverse 100
Languages
Paper
•
2411.16508
•
Published
•
8
Training and Evaluating Language Models with Template-based Data
Generation
Paper
•
2411.18104
•
Published
•
3
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context
Learning via MCTS
Paper
•
2411.18478
•
Published
•
32
LLM Teacher-Student Framework for Text Classification With No Manually
Annotated Data: A Case Study in IPTC News Topic Classification
Paper
•
2411.19638
•
Published
•
6
o1-Coder: an o1 Replication for Coding
Paper
•
2412.00154
•
Published
•
41
TinyFusion: Diffusion Transformers Learned Shallow
Paper
•
2412.01199
•
Published
•
14
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision
Language Models
Paper
•
2412.01822
•
Published
•
14
Free Process Rewards without Process Labels
Paper
•
2412.01981
•
Published
•
28
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS
Paper
•
2411.19655
•
Published
•
20
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on
Retrieval-Augmented Generation
Paper
•
2412.02592
•
Published
•
20
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic
Data From Large Language Models
Paper
•
2412.02980
•
Published
•
12
Weighted-Reward Preference Optimization for Implicit Model Fusion
Paper
•
2412.03187
•
Published
•
9
Evaluating Language Models as Synthetic Data Generators
Paper
•
2412.03679
•
Published
•
43
Monet: Mixture of Monosemantic Experts for Transformers
Paper
•
2412.04139
•
Published
•
10
Marco-LLM: Bridging Languages via Massive Multilingual Training for
Cross-Lingual Enhancement
Paper
•
2412.04003
•
Published
•
9
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
Paper
•
2412.04862
•
Published
•
48
Training Large Language Models to Reason in a Continuous Latent Space
Paper
•
2412.06769
•
Published
•
63
Evaluating and Aligning CodeLLMs on Human Preference
Paper
•
2412.05210
•
Published
•
47
Paper
•
2412.07724
•
Published
•
18
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Paper
•
2412.06071
•
Published
•
7
Paper
•
2412.08905
•
Published
•
93
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better
Reasoning in SLMs
Paper
•
2412.08347
•
Published
•
4
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong
Prompt Optimizers
Paper
•
2412.09722
•
Published
•
5
Smaller Language Models Are Better Instruction Evolvers
Paper
•
2412.11231
•
Published
•
25
SPaR: Self-Play with Tree-Search Refinement to Improve
Instruction-Following in Large Language Models
Paper
•
2412.11605
•
Published
•
15
The Open Source Advantage in Large Language Models (LLMs)
Paper
•
2412.12004
•
Published
•
9
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and
Post-LN
Paper
•
2412.13795
•
Published
•
18
How to Synthesize Text Data without Model Collapse?
Paper
•
2412.14689
•
Published
•
47
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper
•
2412.16145
•
Published
•
33
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation
Paper
•
2412.13649
•
Published
•
18
MixLLM: LLM Quantization with Global Mixed-precision between
Output-features and Highly-efficient System Design
Paper
•
2412.14590
•
Published
•
12
RobustFT: Robust Supervised Fine-tuning for Large Language Models under
Noisy Response
Paper
•
2412.14922
•
Published
•
78
Outcome-Refining Process Supervision for Code Generation
Paper
•
2412.15118
•
Published
•
16
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
Paper
•
2412.17498
•
Published
•
17
NILE: Internal Consistency Alignment in Large Language Models
Paper
•
2412.16686
•
Published
•
6
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Paper
•
2412.14711
•
Published
•
12
Ensembling Large Language Models with Process Reward-Guided Tree Search
for Better Complex Reasoning
Paper
•
2412.15797
•
Published
•
9
Token-Budget-Aware LLM Reasoning
Paper
•
2412.18547
•
Published
•
35