GeonmoGu's Collections
readings
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper
•
2408.11796
•
Published
•
58
TableBench: A Comprehensive and Complex Benchmark for Table Question
Answering
Paper
•
2408.09174
•
Published
•
52
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper
•
2408.10914
•
Published
•
42
Open-FinLLMs: Open Multimodal Large Language Models for Financial
Applications
Paper
•
2408.11878
•
Published
•
56
Law of Vision Representation in MLLMs
Paper
•
2408.16357
•
Published
•
93
CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting
Mitigation
Paper
•
2408.14572
•
Published
•
8
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Paper
•
2408.15545
•
Published
•
35
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via
Hybrid Architecture
Paper
•
2409.02889
•
Published
•
55
LongCite: Enabling LLMs to Generate Fine-grained Citations in
Long-context QA
Paper
•
2409.02897
•
Published
•
47
Attention Heads of Large Language Models: A Survey
Paper
•
2409.03752
•
Published
•
89
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free
Real Image Editing
Paper
•
2409.01322
•
Published
•
95
Towards a Unified View of Preference Learning for Large Language Models:
A Survey
Paper
•
2409.02795
•
Published
•
72
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized
Academic Assistance
Paper
•
2409.04593
•
Published
•
26
ProteinBench: A Holistic Evaluation of Protein Foundation Models
Paper
•
2409.06744
•
Published
•
9
Qwen2.5-Coder Technical Report
Paper
•
2409.12186
•
Published
•
140
Training Language Models to Self-Correct via Reinforcement Learning
Paper
•
2409.12917
•
Published
•
136
HelloBench: Evaluating Long Text Generation Capabilities of Large
Language Models
Paper
•
2409.16191
•
Published
•
42
Making Text Embedders Few-Shot Learners
Paper
•
2409.15700
•
Published
•
30
Instruction Following without Instruction Tuning
Paper
•
2409.14254
•
Published
•
29
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
Paper
•
2410.00531
•
Published
•
31
From Code to Correctness: Closing the Last Mile of Code Generation with
Hierarchical Debugging
Paper
•
2410.01215
•
Published
•
31
Not All LLM Reasoners Are Created Equal
Paper
•
2410.01748
•
Published
•
28
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning
Paper
•
2410.01044
•
Published
•
36
Training Language Models on Synthetic Edit Sequences Improves Code
Synthesis
Paper
•
2410.02749
•
Published
•
12
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference
Acceleration
Paper
•
2410.02367
•
Published
•
47
Addition is All You Need for Energy-efficient Language Models
Paper
•
2410.00907
•
Published
•
145
Selective Attention Improves Transformer
Paper
•
2410.02703
•
Published
•
24
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Paper
•
2410.08164
•
Published
•
24
Toward General Instruction-Following Alignment for Retrieval-Augmented
Generation
Paper
•
2410.09584
•
Published
•
48
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale
Models
Paper
•
2410.13841
•
Published
•
17
HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of
Large Multimodal Models Through Coding Tasks
Paper
•
2410.12381
•
Published
•
44
Revealing the Barriers of Language Agents in Planning
Paper
•
2410.12409
•
Published
•
26
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for
Contrastive Loss
Paper
•
2410.17243
•
Published
•
89
Why Does the Effective Context Length of LLMs Fall Short?
Paper
•
2410.18745
•
Published
•
17
Robots Pre-train Robots: Manipulation-Centric Robotic Representation
from Large-Scale Robot Dataset
Paper
•
2410.22325
•
Published
•
10
A Large Recurrent Action Model: xLSTM enables Fast Inference for
Robotics Tasks
Paper
•
2410.22391
•
Published
•
22
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM
Data Contamination
Paper
•
2411.03823
•
Published
•
45
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle
Grandmaster Level
Paper
•
2411.03562
•
Published
•
65
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge
in RAG Systems
Paper
•
2411.02959
•
Published
•
68
Let the Flows Tell: Solving Graph Combinatorial Optimization Problems
with GFlowNets
Paper
•
2305.17010
•
Published
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Paper
•
2411.04905
•
Published
•
115
Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test
Generation: An Empirical Study
Paper
•
2411.02462
•
Published
•
10
Large Language Models Can Self-Improve in Long-context Reasoning
Paper
•
2411.08147
•
Published
•
64
Cut Your Losses in Large-Vocabulary Language Models
Paper
•
2411.09009
•
Published
•
46
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical
Prediction?
Paper
•
2411.06469
•
Published
•
17
SlimLM: An Efficient Small Language Model for On-Device Document
Assistance
Paper
•
2411.09944
•
Published
•
12
SageAttention2 Technical Report: Accurate 4 Bit Attention for
Plug-and-play Inference Acceleration
Paper
•
2411.10958
•
Published
•
52
Enhancing the Reasoning Ability of Multimodal Large Language Models via
Mixed Preference Optimization
Paper
•
2411.10442
•
Published
•
73
Hymba: A Hybrid-head Architecture for Small Language Models
Paper
•
2411.13676
•
Published
•
41
Natural Language Reinforcement Learning
Paper
•
2411.14251
•
Published
•
29
Cautious Optimizers: Improving Training with One Line of Code
Paper
•
2411.16085
•
Published
•
19
Predicting Emergent Capabilities by Finetuning
Paper
•
2411.16035
•
Published
•
9
Star Attention: Efficient LLM Inference over Long Sequences
Paper
•
2411.17116
•
Published
•
52
o1-Coder: an o1 Replication for Coding
Paper
•
2412.00154
•
Published
•
43
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's
Reasoning Capability
Paper
•
2411.19943
•
Published
•
58
VisionZip: Longer is Better but Not Necessary in Vision Language Models
Paper
•
2412.04467
•
Published
•
107
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and
Proactive Robotic Failure Detection
Paper
•
2412.04455
•
Published
•
38
Personalized Multimodal Large Language Models: A Survey
Paper
•
2412.02142
•
Published
•
14
Evaluating Language Models as Synthetic Data Generators
Paper
•
2412.03679
•
Published
•
48
Expanding Performance Boundaries of Open-Source Multimodal Models with
Model, Data, and Test-Time Scaling
Paper
•
2412.05271
•
Published
•
132
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at
Scale
Paper
•
2412.05237
•
Published
•
47
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
Paper
•
2412.04862
•
Published
•
50
Moto: Latent Motion Token as the Bridging Language for Robot
Manipulation
Paper
•
2412.04445
•
Published
•
22
Evaluating and Aligning CodeLLMs on Human Preference
Paper
•
2412.05210
•
Published
•
47
POINTS1.5: Building a Vision-Language Model towards Real World
Applications
Paper
•
2412.08443
•
Published
•
38
Paper
•
2412.08905
•
Published
•
106
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for
Long-term Streaming Video and Audio Interactions
Paper
•
2412.09596
•
Published
•
94
GenEx: Generating an Explorable World
Paper
•
2412.09624
•
Published
•
90
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World
Tasks
Paper
•
2412.14161
•
Published
•
51
Paper
•
2412.15115
•
Published
•
345
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic
Long-context Multitasks
Paper
•
2412.15204
•
Published
•
33
How to Synthesize Text Data without Model Collapse?
Paper
•
2412.14689
•
Published
•
50
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper
•
2412.16145
•
Published
•
38
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation
Paper
•
2412.13649
•
Published
•
20
B-STaR: Monitoring and Balancing Exploration and Exploitation in
Self-Taught Reasoners
Paper
•
2412.17256
•
Published
•
46
RobustFT: Robust Supervised Fine-tuning for Large Language Models under
Noisy Response
Paper
•
2412.14922
•
Published
•
86
Diving into Self-Evolving Training for Multimodal Reasoning
Paper
•
2412.17451
•
Published
•
43
Revisiting In-Context Learning with Long Context Language Models
Paper
•
2412.16926
•
Published
•
30
Outcome-Refining Process Supervision for Code Generation
Paper
•
2412.15118
•
Published
•
19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
Paper
•
2412.17498
•
Published
•
22
NILE: Internal Consistency Alignment in Large Language Models
Paper
•
2412.16686
•
Published
•
8
LearnLM: Improving Gemini for Learning
Paper
•
2412.16429
•
Published
•
22
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital
World
Paper
•
2412.17589
•
Published
•
12
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D
Scene Understanding
Paper
•
2412.18450
•
Published
•
33
Fourier Position Embedding: Enhancing Attention's Periodic Extension for
Length Generalization
Paper
•
2412.17739
•
Published
•
41
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Paper
•
2412.14711
•
Published
•
16
Ensembling Large Language Models with Process Reward-Guided Tree Search
for Better Complex Reasoning
Paper
•
2412.15797
•
Published
•
18
YuLan-Mini: An Open Data-efficient Language Model
Paper
•
2412.17743
•
Published
•
65
Molar: Multimodal LLMs with Collaborative Filtering Alignment for
Enhanced Sequential Recommendation
Paper
•
2412.18176
•
Published
•
15
MMFactory: A Universal Solution Search Engine for Vision-Language Tasks
Paper
•
2412.18072
•
Published
•
18
Explanatory Instructions: Towards Unified Vision Tasks Understanding and
Zero-shot Generalization
Paper
•
2412.18525
•
Published
•
75
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper
•
2412.20993
•
Published
•
36
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on
Self-invoking Code Generation
Paper
•
2412.21199
•
Published
•
14
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse
Task Synthesis
Paper
•
2412.19723
•
Published
•
82
2.5 Years in Class: A Multimodal Textbook for Vision-Language
Pretraining
Paper
•
2501.00958
•
Published
•
99
CodeElo: Benchmarking Competition-level Code Generation of LLMs with
Human-comparable Elo Ratings
Paper
•
2501.01257
•
Published
•
49
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent
Diffusion Models
Paper
•
2501.01423
•
Published
•
36
ProgCo: Program Helps Self-Correction of Large Language Models
Paper
•
2501.01264
•
Published
•
25
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for
Real-World Video Super-Resolution
Paper
•
2501.02976
•
Published
•
54
BoostStep: Boosting mathematical capability of Large Language Models via
improved single-step reasoning
Paper
•
2501.03226
•
Published
•
38
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper
•
2501.02497
•
Published
•
41
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language
Models
Paper
•
2501.03262
•
Published
•
90
MotionBench: Benchmarking and Improving Fine-grained Video Motion
Understanding for Vision Language Models
Paper
•
2501.02955
•
Published
•
40
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One
Vision Token
Paper
•
2501.03895
•
Published
•
49
Cosmos World Foundation Model Platform for Physical AI
Paper
•
2501.03575
•
Published
•
68
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
Paper
•
2501.03936
•
Published
•
19
An Empirical Study of Autoregressive Pre-training from Videos
Paper
•
2501.05453
•
Published
•
37
Enhancing Human-Like Responses in Large Language Models
Paper
•
2501.05032
•
Published
•
49
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub
Issue Resolution
Paper
•
2501.05040
•
Published
•
15
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
Paper
•
2501.06186
•
Published
•
61
VideoRAG: Retrieval-Augmented Generation over Video Corpus
Paper
•
2501.05874
•
Published
•
67
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video
Understanding?
Paper
•
2501.05510
•
Published
•
39
The Lessons of Developing Process Reward Models in Mathematical
Reasoning
Paper
•
2501.07301
•
Published
•
90
Tensor Product Attention Is All You Need
Paper
•
2501.06425
•
Published
•
83
Transformer^2: Self-adaptive LLMs
Paper
•
2501.06252
•
Published
•
53
WebWalker: Benchmarking LLMs in Web Traversal
Paper
•
2501.07572
•
Published
•
19
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical
Reasoning
Paper
•
2501.06458
•
Published
•
29
Towards Best Practices for Open Datasets for LLM Training
Paper
•
2501.08365
•
Published
•
55
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents
Paper
•
2501.08828
•
Published
•
30
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising
Steps
Paper
•
2501.09732
•
Published
•
67
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with
Large Language Models
Paper
•
2501.09686
•
Published
•
36
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper
•
2501.09747
•
Published
•
23
Evolving Deeper LLM Thinking
Paper
•
2501.09891
•
Published
•
106
PaSa: An LLM Agent for Comprehensive Academic Paper Search
Paper
•
2501.10120
•
Published
•
43
Agent-R: Training Language Model Agents to Reflect via Iterative
Self-Training
Paper
•
2501.11425
•
Published
•
91
Demons in the Detail: On Implementing Load Balancing Loss for Training
Specialized Mixture-of-Expert Models
Paper
•
2501.11873
•
Published
•
63
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via
Reinforcement Learning
Paper
•
2501.12948
•
Published
•
319
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative
Textual Feedback
Paper
•
2501.12895
•
Published
•
56
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video
Understanding
Paper
•
2501.13106
•
Published
•
83
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper
•
2501.12599
•
Published
•
93
Autonomy-of-Experts Models
Paper
•
2501.13074
•
Published
•
41
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
Paper
•
2501.13200
•
Published
•
63
Sigma: Differential Rescaling of Query, Key and Value for Efficient
Language Models
Paper
•
2501.13629
•
Published
•
44
Baichuan-Omni-1.5 Technical Report
Paper
•
2501.15368
•
Published
•
56
Qwen2.5-1M Technical Report
Paper
•
2501.15383
•
Published
•
57
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
Paper
•
2501.16975
•
Published
•
26
Optimizing Large Language Model Training Using FP4 Quantization
Paper
•
2501.17116
•
Published
•
34
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model
Post-training
Paper
•
2501.17161
•
Published
•
105
Atla Selene Mini: A General Purpose Evaluation Model
Paper
•
2501.17195
•
Published
•
33
Critique Fine-Tuning: Learning to Critique is More Effective than
Learning to Imitate
Paper
•
2501.17703
•
Published
•
54
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Paper
•
2501.18585
•
Published
•
53
s1: Simple test-time scaling
Paper
•
2501.19393
•
Published
•
100
Reward-Guided Speculative Decoding for Efficient LLM Reasoning
Paper
•
2501.19324
•
Published
•
35
GuardReasoner: Towards Reasoning-based LLM Safeguards
Paper
•
2501.18492
•
Published
•
81
The Differences Between Direct Alignment Algorithms are a Blur
Paper
•
2502.01237
•
Published
•
112
Process Reinforcement through Implicit Rewards
Paper
•
2502.01456
•
Published
•
53
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning
Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
Paper
•
2502.01081
•
Published
•
13
Scaling Embedding Layers in Language Models
Paper
•
2502.01637
•
Published
•
21
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Paper
•
2502.02339
•
Published
•
21
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion
Transformer
Paper
•
2502.01105
•
Published
•
18
Large Language Model Guided Self-Debugging Code Generation
Paper
•
2502.02928
•
Published
•
10
TwinMarket: A Scalable Behavioral and Social Simulation for Financial
Markets
Paper
•
2502.01506
•
Published
•
31
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language
Model
Paper
•
2502.02737
•
Published
•
168
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper
•
2502.03373
•
Published
•
51
LIMO: Less is More for Reasoning
Paper
•
2502.03387
•
Published
•
52
ConceptAttention: Diffusion Transformers Learn Highly Interpretable
Features
Paper
•
2502.04320
•
Published
•
32
Enhancing Code Generation for Low-Resource Languages: No Silver Bullet
Paper
•
2501.19085
•
Published
•
5
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time
Scaling
Paper
•
2502.06703
•
Published
•
123
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data
Annotators
Paper
•
2502.06394
•
Published
•
84
Exploring the Limit of Outcome Reward for Learning Mathematical
Reasoning
Paper
•
2502.06781
•
Published
•
49
Lossless Acceleration of Large Language Models with Hierarchical
Drafting based on Temporal Locality in Speculative Decoding
Paper
•
2502.05609
•
Published
•
14
Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and
Generation
Paper
•
2502.05415
•
Published
•
20
Paper
•
2502.06049
•
Published
•
26
The Hidden Life of Tokens: Reducing Hallucination of Large
Vision-Language Models via Visual Information Steering
Paper
•
2502.03628
•
Published
•
11
Paper
•
2502.06786
•
Published
•
22
History-Guided Video Diffusion
Paper
•
2502.06764
•
Published
•
10
CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for
Zero-Shot Customized Video Diffusion Transformers
Paper
•
2502.06527
•
Published
•
9
The Curse of Depth in Large Language Models
Paper
•
2502.05795
•
Published
•
26
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents
Paper
•
2502.05957
•
Published
•
15
Competitive Programming with Large Reasoning Models
Paper
•
2502.06807
•
Published
•
55
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
Paper
•
2502.07316
•
Published
•
32
Teaching Language Models to Critique via Reinforcement Learning
Paper
•
2502.03492
•
Published
•
22
Expect the Unexpected: FailSafe Long Context QA for Finance
Paper
•
2502.06329
•
Published
•
118
Scaling Pre-training to One Hundred Billion Data for Vision Language
Models
Paper
•
2502.07617
•
Published
•
24
LLMs Can Easily Learn to Reason from Demonstrations Structure, not
content, is what matters!
Paper
•
2502.07374
•
Published
•
29
Retrieval-augmented Large Language Models for Financial Time Series
Forecasting
Paper
•
2502.05878
•
Published
•
38
Hephaestus: Improving Fundamental Agent Capabilities of Large Language
Models through Continual Pre-Training
Paper
•
2502.06589
•
Published
•
16
Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon
Paper
•
2502.07445
•
Published
•
8
TransMLA: Multi-head Latent Attention Is All You Need
Paper
•
2502.07864
•
Published
•
37
Distillation Scaling Laws
Paper
•
2502.08606
•
Published
•
32
LLM Pretraining with Continuous Concepts
Paper
•
2502.08524
•
Published
•
20
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on
a Single GPU
Paper
•
2502.08910
•
Published
•
117
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient
Text-to-Image Generation
Paper
•
2502.08690
•
Published
•
32
SelfCite: Self-Supervised Alignment for Context Attribution in Large
Language Models
Paper
•
2502.09604
•
Published
•
26
Exploring the Potential of Encoder-free Architectures in 3D LMMs
Paper
•
2502.09620
•
Published
•
23
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in
One Day via Model Merging
Paper
•
2502.09056
•
Published
•
26
Logical Reasoning in Large Language Models: A Survey
Paper
•
2502.09100
•
Published
•
18
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous
Manipulation from Human References
Paper
•
2502.09614
•
Published
•
9
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights
Paper
•
2502.09619
•
Published
•
28
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language
Models for Vision-Driven Embodied Agents
Paper
•
2502.09560
•
Published
•
27
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of
Physical Concept Understanding
Paper
•
2502.08946
•
Published
•
156