readings - a GeonmoGu Collection

updated 2 days ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 58
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 42
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 56
Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published Aug 29, 2024 • 93
CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting Mitigation

Paper • 2408.14572 • Published Aug 26, 2024 • 8
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Paper • 2408.15545 • Published Aug 28, 2024 • 35
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture

Paper • 2409.02889 • Published Sep 4, 2024 • 55
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Paper • 2409.02897 • Published Sep 4, 2024 • 47
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 89
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published Sep 2, 2024 • 95
Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4, 2024 • 72
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

Paper • 2409.04593 • Published Sep 6, 2024 • 26
ProteinBench: A Holistic Evaluation of Protein Foundation Models

Paper • 2409.06744 • Published Sep 10, 2024 • 9
Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 140
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 136
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 42
Making Text Embedders Few-Shot Learners

Paper • 2409.15700 • Published Sep 24, 2024 • 30
Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21, 2024 • 29
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices

Paper • 2410.00531 • Published Oct 1, 2024 • 31
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2, 2024 • 31
Not All LLM Reasoners Are Created Equal

Paper • 2410.01748 • Published Oct 2, 2024 • 28
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning

Paper • 2410.01044 • Published Oct 1, 2024 • 36
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Paper • 2410.02749 • Published Oct 3, 2024 • 12
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Paper • 2410.02367 • Published Oct 3, 2024 • 47
Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 145
Selective Attention Improves Transformer

Paper • 2410.02703 • Published Oct 3, 2024 • 24
Agent S: An Open Agentic Framework that Uses Computers Like a Human

Paper • 2410.08164 • Published Oct 10, 2024 • 24
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation

Paper • 2410.09584 • Published Oct 12, 2024 • 48
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models

Paper • 2410.13841 • Published Oct 17, 2024 • 17
HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

Paper • 2410.12381 • Published Oct 16, 2024 • 44
Revealing the Barriers of Language Agents in Planning

Paper • 2410.12409 • Published Oct 16, 2024 • 26
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 89
Why Does the Effective Context Length of LLMs Fall Short?

Paper • 2410.18745 • Published Oct 24, 2024 • 17
Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset

Paper • 2410.22325 • Published Oct 29, 2024 • 10
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Paper • 2410.22391 • Published Oct 29, 2024 • 22
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

Paper • 2411.03823 • Published Nov 6, 2024 • 45
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 65
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published Nov 5, 2024 • 68
Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets

Paper • 2305.17010 • Published May 26, 2023
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 115
Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study

Paper • 2411.02462 • Published Nov 4, 2024 • 10
Large Language Models Can Self-Improve in Long-context Reasoning

Paper • 2411.08147 • Published Nov 12, 2024 • 64
Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 46
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Paper • 2411.06469 • Published Nov 10, 2024 • 17
SlimLM: An Efficient Small Language Model for On-Device Document Assistance

Paper • 2411.09944 • Published Nov 15, 2024 • 12
SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Paper • 2411.10958 • Published Nov 17, 2024 • 52
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published Nov 15, 2024 • 73
Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 41
Natural Language Reinforcement Learning

Paper • 2411.14251 • Published Nov 21, 2024 • 29
Cautious Optimizers: Improving Training with One Line of Code

Paper • 2411.16085 • Published Nov 25, 2024 • 19
Predicting Emergent Capabilities by Finetuning

Paper • 2411.16035 • Published Nov 25, 2024 • 9
Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 52
o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published Nov 29, 2024 • 43
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

Paper • 2411.19943 • Published Nov 29, 2024 • 58
VisionZip: Longer is Better but Not Necessary in Vision Language Models

Paper • 2412.04467 • Published Dec 5, 2024 • 107
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection

Paper • 2412.04455 • Published Dec 5, 2024 • 38
Personalized Multimodal Large Language Models: A Survey

Paper • 2412.02142 • Published Dec 3, 2024 • 14
Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 48
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 132
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 47
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

Paper • 2412.04862 • Published Dec 6, 2024 • 50
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation

Paper • 2412.04445 • Published Dec 5, 2024 • 22
Evaluating and Aligning CodeLLMs on Human Preference

Paper • 2412.05210 • Published Dec 6, 2024 • 47
POINTS1.5: Building a Vision-Language Model towards Real World Applications

Paper • 2412.08443 • Published Dec 11, 2024 • 38
Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 106
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 94
GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 90
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published Dec 18, 2024 • 51
Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 345
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Paper • 2412.15204 • Published Dec 19, 2024 • 33
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 50
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation

Paper • 2412.13649 • Published Dec 18, 2024 • 20
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 46
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Paper • 2412.14922 • Published Dec 19, 2024 • 86
Diving into Self-Evolving Training for Multimodal Reasoning

Paper • 2412.17451 • Published Dec 23, 2024 • 43
Revisiting In-Context Learning with Long Context Language Models

Paper • 2412.16926 • Published Dec 22, 2024 • 30
Outcome-Refining Process Supervision for Code Generation

Paper • 2412.15118 • Published Dec 19, 2024 • 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought

Paper • 2412.17498 • Published Dec 23, 2024 • 22
NILE: Internal Consistency Alignment in Large Language Models

Paper • 2412.16686 • Published Dec 21, 2024 • 8
LearnLM: Improving Gemini for Learning

Paper • 2412.16429 • Published Dec 21, 2024 • 22
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World

Paper • 2412.17589 • Published Dec 23, 2024 • 12
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding

Paper • 2412.18450 • Published Dec 24, 2024 • 33
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

Paper • 2412.14711 • Published Dec 19, 2024 • 16
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning

Paper • 2412.15797 • Published Dec 20, 2024 • 18
YuLan-Mini: An Open Data-efficient Language Model

Paper • 2412.17743 • Published Dec 23, 2024 • 65
Molar: Multimodal LLMs with Collaborative Filtering Alignment for Enhanced Sequential Recommendation

Paper • 2412.18176 • Published Dec 24, 2024 • 15
MMFactory: A Universal Solution Search Engine for Vision-Language Tasks

Paper • 2412.18072 • Published Dec 24, 2024 • 18
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Paper • 2412.18525 • Published Dec 24, 2024 • 75
Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published Dec 30, 2024 • 36
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation

Paper • 2412.21199 • Published Dec 30, 2024 • 14
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 82
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 99
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 49
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2 • 36
ProgCo: Program Helps Self-Correction of Large Language Models

Paper • 2501.01264 • Published Jan 2 • 25
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Paper • 2501.02976 • Published Jan 6 • 54
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Paper • 2501.03226 • Published Jan 6 • 38
Test-time Computing: from System-1 Thinking to System-2 Thinking

Paper • 2501.02497 • Published Jan 5 • 41
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 90
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Paper • 2501.02955 • Published Jan 6 • 40
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published Jan 7 • 49
Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published Jan 7 • 68
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides

Paper • 2501.03936 • Published Jan 7 • 19
An Empirical Study of Autoregressive Pre-training from Videos

Paper • 2501.05453 • Published Jan 9 • 37
Enhancing Human-Like Responses in Large Language Models

Paper • 2501.05032 • Published Jan 9 • 49
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution

Paper • 2501.05040 • Published Jan 9 • 15
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 61
VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10 • 67
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published Jan 9 • 39
The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 90
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 83
Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53
WebWalker: Benchmarking LLMs in Web Traversal

Paper • 2501.07572 • Published Jan 13 • 19
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning

Paper • 2501.06458 • Published Jan 11 • 29
Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 55
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents

Paper • 2501.08828 • Published Jan 15 • 30
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published Jan 16 • 67
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 36
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16 • 23
Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published about 1 month ago • 106
PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published about 1 month ago • 43
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 27 days ago • 91
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published 27 days ago • 63
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 25 days ago • 319
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Paper • 2501.12895 • Published 25 days ago • 56
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published 25 days ago • 83
Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 26 days ago • 93
Autonomy-of-Experts Models

Paper • 2501.13074 • Published 25 days ago • 41
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Paper • 2501.13200 • Published 25 days ago • 63
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published 24 days ago • 44
Baichuan-Omni-1.5 Technical Report

Paper • 2501.15368 • Published 22 days ago • 56
Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 22 days ago • 57
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Paper • 2501.16975 • Published 19 days ago • 26
Optimizing Large Language Model Training Using FP4 Quantization

Paper • 2501.17116 • Published 19 days ago • 34
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 19 days ago • 105
Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published 20 days ago • 33
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published 18 days ago • 54
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 17 days ago • 53
s1: Simple test-time scaling

Paper • 2501.19393 • Published 16 days ago • 100
Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published 16 days ago • 35
GuardReasoner: Towards Reasoning-based LLM Safeguards

Paper • 2501.18492 • Published 17 days ago • 81
The Differences Between Direct Alignment Algorithms are a Blur

Paper • 2502.01237 • Published 13 days ago • 112
Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published 13 days ago • 53
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

Paper • 2502.01081 • Published 14 days ago • 13
Scaling Embedding Layers in Language Models

Paper • 2502.01637 • Published 13 days ago • 21
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking

Paper • 2502.02339 • Published 12 days ago • 21
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer

Paper • 2502.01105 • Published 14 days ago • 18
Large Language Model Guided Self-Debugging Code Generation

Paper • 2502.02928 • Published 12 days ago • 10
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets

Paper • 2502.01506 • Published 13 days ago • 31
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 12 days ago • 168
Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published 11 days ago • 51
LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published 11 days ago • 52
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features

Paper • 2502.04320 • Published 10 days ago • 32
Enhancing Code Generation for Low-Resource Languages: No Silver Bullet

Paper • 2501.19085 • Published 16 days ago • 5
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 6 days ago • 123
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Paper • 2502.06394 • Published 6 days ago • 84
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published 6 days ago • 49
Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding

Paper • 2502.05609 • Published 8 days ago • 14
Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation

Paper • 2502.05415 • Published 9 days ago • 20
LM2: Large Memory Models

Paper • 2502.06049 • Published 7 days ago • 26
The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering

Paper • 2502.03628 • Published 11 days ago • 11
Matryoshka Quantization

Paper • 2502.06786 • Published 6 days ago • 22
History-Guided Video Diffusion

Paper • 2502.06764 • Published 6 days ago • 10
CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers

Paper • 2502.06527 • Published 6 days ago • 9
The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published 8 days ago • 26
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents

Paper • 2502.05957 • Published 7 days ago • 15
Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published 13 days ago • 55
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published 6 days ago • 32
Teaching Language Models to Critique via Reinforcement Learning

Paper • 2502.03492 • Published 12 days ago • 22
Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published 6 days ago • 118
Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published 5 days ago • 24
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!

Paper • 2502.07374 • Published 5 days ago • 29
Retrieval-augmented Large Language Models for Financial Time Series Forecasting

Paper • 2502.05878 • Published 7 days ago • 38
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

Paper • 2502.06589 • Published 6 days ago • 16
Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon

Paper • 2502.07445 • Published 5 days ago • 8
TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published 5 days ago • 37
Distillation Scaling Laws

Paper • 2502.08606 • Published 4 days ago • 32
LLM Pretraining with Continuous Concepts

Paper • 2502.08524 • Published 4 days ago • 20
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 4 days ago • 117
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

Paper • 2502.08690 • Published 4 days ago • 32
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published 3 days ago • 26
Exploring the Potential of Encoder-free Architectures in 3D LMMs

Paper • 2502.09620 • Published 3 days ago • 23
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

Paper • 2502.09056 • Published 3 days ago • 26
Logical Reasoning in Large Language Models: A Survey

Paper • 2502.09100 • Published 3 days ago • 18
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References

Paper • 2502.09614 • Published 3 days ago • 9
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights

Paper • 2502.09619 • Published 3 days ago • 28
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Paper • 2502.09560 • Published 3 days ago • 27
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published 4 days ago • 156
