RoFormer: Enhanced Transformer with Rotary Position Embedding Paper • 2104.09864 • Published Apr 20, 2021 • 12
Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective Paper • 2503.01933 • Published 7 days ago • 10
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published 22 days ago • 52
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking Paper • 2501.00244 • Published Dec 31, 2024 • 1
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 17
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4, 2025 • 199
ReLearn: Unlearning via Learning for Large Language Models Paper • 2502.11190 • Published 22 days ago • 29
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Paper • 2310.06770 • Published Oct 10, 2023 • 5
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published 24 days ago • 38
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published 26 days ago • 54
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Paper • 2502.06608 • Published 28 days ago • 32
ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning Paper • 2501.15316 • Published Jan 25, 2025 • 1
DarwinLM: Evolutionary Structured Pruning of Large Language Models Paper • 2502.07780 • Published 26 days ago • 17
Shortened LLaMA: A Simple Depth Pruning for Large Language Models Paper • 2402.02834 • Published Feb 5, 2024 • 16