mp1704's Collections
maymo
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
•
2403.19887
•
Published
•
106
sDPO: Don't Use Your Data All at Once
Paper
•
2403.19270
•
Published
•
41
ViTAR: Vision Transformer with Any Resolution
Paper
•
2403.18361
•
Published
•
53
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Paper
•
2403.18814
•
Published
•
46
The Unreasonable Ineffectiveness of the Deeper Layers
Paper
•
2403.17887
•
Published
•
79
LLM Agent Operating System
Paper
•
2403.16971
•
Published
•
65
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Paper
•
2403.14624
•
Published
•
52
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
•
2403.13372
•
Published
•
62
RAFT: Adapting Language Model to Domain Specific RAG
Paper
•
2403.10131
•
Published
•
68
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper
•
2403.09611
•
Published
•
126
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
•
2403.03507
•
Published
•
184
SaulLM-7B: A pioneering Large Language Model for Law
Paper
•
2403.03883
•
Published
•
78
Design2Code: How Far Are We From Automating Front-End Engineering?
Paper
•
2403.03163
•
Published
•
94
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
607
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper
•
2402.13753
•
Published
•
115
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper
•
2402.13064
•
Published
•
48
Chain-of-Thought Reasoning Without Prompting
Paper
•
2402.10200
•
Published
•
105
OLMo: Accelerating the Science of Language Models
Paper
•
2402.00838
•
Published
•
82
ReFT: Representation Finetuning for Language Models
Paper
•
2404.03592
•
Published
•
92
Rho-1: Not All Tokens Are What You Need
Paper
•
2404.07965
•
Published
•
89
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper
•
2404.07143
•
Published
•
106
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper
•
2404.14619
•
Published
•
127
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper
•
2405.00732
•
Published
•
120
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
•
2405.01535
•
Published
•
121