LHPKAI's Collections
Multilingual Instruction Tuning With Just a Pinch of Multilinguality
Paper • 2401.01854 • Published • 10
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper • 2401.01055 • Published • 54
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper • 2401.01325 • Published • 27
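The mechanism behind Self-Extend's title is grouped attention: keys within a neighbor window keep their exact relative positions, while farther keys share coarser positions obtained by floor division, so the pretrained position range stretches over a much longer context without any tuning. A minimal sketch of the position remapping; the window and group sizes are illustrative, not the paper's tuned values:

```python
def self_extend_rel_pos(q_pos: int, k_pos: int,
                        window: int = 512, group: int = 8) -> int:
    """Relative position used by Self-Extend's two-level attention.

    Keys within `window` of the query keep exact relative positions;
    farther keys are mapped onto shared, coarser positions via floor
    division, shifted so the two ranges meet at the window boundary.
    """
    rel = q_pos - k_pos
    if rel <= window:
        return rel                                   # neighbor attention
    return window + rel // group - window // group   # grouped attention

# positions seen by a query at index 4096 for a few key indices
for k in [4095, 3584, 1024, 0]:
    print(k, "->", self_extend_rel_pos(4096, k))
```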
Improving Text Embeddings with Large Language Models
Paper • 2401.00368 • Published • 79
Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math
Paper • 2312.17120 • Published • 25
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper • 2312.15166 • Published • 56
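Depth up-scaling (DUS) is a layer-stacking recipe: duplicate the base network, drop m layers from the tail of one copy and the head of the other, and concatenate before continued pretraining. A sketch of the index arithmetic with the 32-to-48-layer configuration reported for SOLAR 10.7B:

```python
def depth_upscale_indices(n_layers: int = 32, drop: int = 8) -> list[int]:
    """Layer indices for depth up-scaling (DUS).

    Copy A keeps layers [0, n - drop); copy B keeps layers [drop, n).
    Stacking yields 2 * (n - drop) layers: 32 -> 48 for drop=8, as in
    the SOLAR 10.7B setup, which then continues pretraining.
    """
    copy_a = list(range(0, n_layers - drop))   # layers 0..23
    copy_b = list(range(drop, n_layers))       # layers 8..31
    return copy_a + copy_b

idx = depth_upscale_indices()
print(len(idx), idx[:6], "...", idx[-6:])      # 48 layers in total
```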
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
Paper • 2312.16171 • Published • 34
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
Paper • 2312.14187 • Published • 49
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Paper • 2401.01335 • Published • 64
LLaMA Pro: Progressive LLaMA with Block Expansion
Paper • 2401.02415 • Published • 53
Mixtral of Experts
Paper • 2401.04088 • Published • 158
SeaLLMs -- Large Language Models for Southeast Asia
Paper • 2312.00738 • Published • 23
System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 39
Contrastive Chain-of-Thought Prompting
Paper • 2311.09277 • Published • 34
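The prompting idea in Contrastive Chain-of-Thought is to show the model both a valid and an explicitly invalid reasoning demonstration for the same question before the actual query, so it sees what to do and what to avoid. A sketch of such a prompt; the wording is illustrative, not the paper's exact template:

```python
# A contrastive chain-of-thought prompt: one correct and one wrong
# demonstration for the same question, followed by the actual query.
prompt = """\
Question: A shop sells pens at 3 dollars each. How much do 4 pens cost?
Correct explanation: Each pen costs 3 dollars, so 4 pens cost 4 * 3 = 12
dollars. The answer is 12.
Wrong explanation: Each pen costs 3 dollars, so 4 pens cost 4 + 3 = 7
dollars. The answer is 7.

Question: A box holds 6 eggs. How many eggs are in 5 boxes?
Correct explanation:"""

print(prompt)  # send this to any chat/completion model
```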
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
Paper • 2401.02994 • Published • 49
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
Paper • 2401.03462 • Published • 27
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 145
Tuning Language Models by Proxy
Paper • 2401.08565 • Published • 21
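Proxy tuning steers a large untuned model at decode time using the logit offset between a small tuned expert and its small untuned counterpart; the large model's weights are never touched. A minimal numpy sketch of the per-step arithmetic:

```python
import numpy as np

def proxy_tuned_logits(base_large, tuned_small, base_small):
    """Proxy tuning's decode-time combination (all arrays: [vocab] logits).

    The small tuned/untuned pair supplies a logit offset that is applied
    to the large base model, approximating a tuned large model.
    """
    return base_large + (tuned_small - base_small)

rng = np.random.default_rng(0)
vocab = 8
logits = proxy_tuned_logits(rng.normal(size=vocab),
                            rng.normal(size=vocab),
                            rng.normal(size=vocab))
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.round(3))  # next-token distribution to sample from
```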
ReFT: Reasoning with Reinforced Fine-Tuning
Paper • 2401.08967 • Published • 29
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 603
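The 1.58 bits of the title is log2(3): every weight takes a value in {-1, 0, +1}. The quantizer described for BitNet b1.58 is absmean rounding; a short numpy sketch:

```python
import numpy as np

def absmean_ternary(w: np.ndarray, eps: float = 1e-5):
    """Absmean quantization as described for BitNet b1.58.

    Scale by the mean absolute weight, round, clip to {-1, 0, +1};
    log2(3) ~= 1.58 bits of information per weight.
    """
    gamma = np.abs(w).mean() + eps
    w_q = np.clip(np.round(w / gamma), -1, 1)
    return w_q, gamma  # dequantize as w_q * gamma

w = np.random.default_rng(0).normal(scale=0.02, size=(4, 4))
w_q, gamma = absmean_ternary(w)
print(w_q)
print("scale:", round(float(gamma), 5))
```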
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 96
Orca-Math: Unlocking the potential of SLMs in Grade School Math
Paper • 2402.14830 • Published • 24
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 183
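GaLore runs the optimizer on a low-rank projection of the gradient rather than factorizing the weights themselves, so the full-rank weight still gets updated while optimizer state shrinks. A sketch of one update with plain SGD standing in for Adam (the paper keeps Adam states in the projected space and refreshes the projector every few hundred steps):

```python
import numpy as np

def galore_step(w, grad, rank=4, lr=1e-2):
    """One GaLore-style update: optimizer work happens in rank-r space.

    Project the gradient onto its top-r left singular vectors, take the
    step in the compact space, then project back to update the full
    weight. Sketch only: real GaLore uses Adam in the projected space.
    """
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    p = u[:, :rank]                 # projector, shape [m, r]
    g_low = p.T @ grad              # compact gradient, shape [r, n]
    return w - lr * (p @ g_low)     # project back and apply

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8))
g = rng.normal(size=(16, 8))
w_new = galore_step(w, g)
print(np.linalg.norm(w - w_new).round(4))
```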
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Paper • 2403.07816 • Published • 39
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 67
ORPO: Monolithic Preference Optimization without Reference Model
Paper • 2403.07691 • Published • 63
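ORPO removes DPO's frozen reference model: a log-odds-ratio penalty between the chosen and rejected answers is added to the ordinary SFT loss, with odds(y) = p/(1-p) and p the length-normalized sequence likelihood. A sketch of the loss for one preference pair; the lambda value is illustrative:

```python
import math

def orpo_loss(chosen_logps, rejected_logps, lam=0.1):
    """ORPO objective from per-token log-probs of one preference pair.

    p(y|x) is the length-normalized likelihood; the odds-ratio term
    -log sigmoid(log odds(chosen) - log odds(rejected)) is added to the
    NLL of the chosen answer, with no reference model involved.
    """
    def log_odds(token_logps):
        logp = sum(token_logps) / len(token_logps)   # length-normalized
        return logp - math.log1p(-math.exp(logp))    # log(p / (1 - p))

    nll = -sum(chosen_logps) / len(chosen_logps)     # SFT term on chosen
    ratio = log_odds(chosen_logps) - log_odds(rejected_logps)
    or_term = -math.log(1.0 / (1.0 + math.exp(-ratio)))  # -log sigmoid
    return nll + lam * or_term

print(round(orpo_loss([-0.2, -0.4, -0.1], [-1.0, -1.6, -0.9]), 4))
```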
Evolutionary Optimization of Model Merging Recipes
Paper • 2403.13187 • Published • 50
RakutenAI-7B: Extending Large Language Models for Japanese
Paper • 2403.15484 • Published • 12
sDPO: Don't Use Your Data All at Once
Paper • 2403.19270 • Published • 40
Jamba: A Hybrid Transformer-Mamba Language Model
Paper • 2403.19887 • Published • 104
Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
Paper • 2403.14403 • Published • 6
Long-context LLMs Struggle with Long In-context Learning
Paper • 2404.02060 • Published • 35
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Paper • 2404.05961 • Published • 64
JetMoE: Reaching Llama2 Performance with 0.1M Dollars
Paper • 2404.07413 • Published • 36
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper • 2404.14219 • Published • 253
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper • 2404.14619 • Published • 126
Mixture-of-Agents Enhances Large Language Model Capabilities
Paper • 2406.04692 • Published • 55
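Mixture-of-Agents is an inference-time ensemble: several proposer models answer in parallel, each subsequent layer re-answers with the previous layer's responses pasted into its prompt, and a final aggregator synthesizes the result. A control-flow sketch; call_llm is a hypothetical stand-in for any chat-completion client:

```python
def call_llm(model: str, prompt: str) -> str:
    """Placeholder for a chat-completion call; swap in a real client."""
    return f"[{model}] answer to: {prompt[:40]}..."

def mixture_of_agents(question, proposers, aggregator, layers=2):
    """Layered MoA: each layer re-answers with the prior layer's outputs
    as auxiliary context; the aggregator synthesizes the final reply."""
    responses = [call_llm(m, question) for m in proposers]
    for _ in range(layers - 1):
        ctx = "\n".join(f"- {r}" for r in responses)
        prompt = (f"Previous answers:\n{ctx}\n\n"
                  f"Using them as reference, answer: {question}")
        responses = [call_llm(m, prompt) for m in proposers]
    ctx = "\n".join(f"- {r}" for r in responses)
    return call_llm(aggregator, f"Synthesize a final answer to "
                                f"'{question}' from:\n{ctx}")

print(mixture_of_agents("Why is the sky blue?",
                        ["model-a", "model-b"], "model-agg"))
```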