Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2501.06252

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53

Ai cutting edge papers

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53

Fine-Tuning and Adapter

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53

Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback

Paper • 2501.03916 • Published Jan 7 • 15
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

Paper • 2501.04682 • Published Jan 8 • 91
Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published Jan 8 • 86
Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 95

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53
MiniMaxAI/MiniMax-Text-01

Text Generation • Updated 18 days ago • 1.5k • 547
Running on CPU Upgrade

2.03k

2.03k

Anychat

🏢

Display code snippets for different chat providers

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 263
Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot

Paper • 2501.09012 • Published Jan 15 • 10
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16 • 23

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 100
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 50
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2 • 37
REDUCIO! Generating 1024times1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Paper • 2411.13552 • Published Nov 20, 2024

2025 LLM Papers on Hugging Face with Japanese Memos

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Paper • 2501.02955 • Published Jan 6 • 40
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 100
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21 • 84
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Paper • 2501.09781 • Published Jan 16 • 27

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53
s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 142
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Paper • 2501.12370 • Published Jan 21 • 11

Previous
1
2
3
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs