tmarechaux's Collections
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • arXiv:2309.09400 • 82 upvotes
PDFTriage: Question Answering over Long, Structured Documents
Paper • arXiv:2309.08872 • 53 upvotes
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • arXiv:2309.11495 • 38 upvotes
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • arXiv:2309.12307 • 87 upvotes
SCREWS: A Modular Framework for Reasoning with Revisions
Paper • arXiv:2309.13075 • 15 upvotes
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Paper • arXiv:2309.14509 • 17 upvotes
Qwen Technical Report
Paper • arXiv:2309.16609 • 34 upvotes
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Paper • arXiv:2309.04269 • 32 upvotes
In-Context Pretraining: Language Modeling Beyond Document Boundaries
Paper • arXiv:2310.10638 • 28 upvotes
Can LLMs Follow Simple Rules?
Paper • arXiv:2311.04235 • 10 upvotes
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
Paper • arXiv:2311.05657 • 27 upvotes
Lost in the Middle: How Language Models Use Long Contexts
Paper • arXiv:2307.03172 • 36 upvotes
Challenges and Applications of Large Language Models
Paper • arXiv:2307.10169 • 47 upvotes
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • arXiv:2305.18290 • 48 upvotes
GAIA: a benchmark for General AI Assistants
Paper • arXiv:2311.12983 • 183 upvotes
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
Paper • arXiv:2312.07987 • 40 upvotes
Mixtral of Experts
Paper • arXiv:2401.04088 • 159 upvotes
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper • arXiv:2401.16380 • 48 upvotes
Generative Representational Instruction Tuning
Paper • arXiv:2402.09906 • 51 upvotes
RAFT: Adapting Language Model to Domain Specific RAG
Paper • arXiv:2403.10131 • 67 upvotes
Can large language models explore in-context?
Paper • arXiv:2403.15371 • 32 upvotes
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • arXiv:2404.07143 • 103 upvotes
Rho-1: Not All Tokens Are What You Need
Paper • arXiv:2404.07965 • 84 upvotes
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper • arXiv:2409.02795 • 72 upvotes
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Paper • arXiv:2411.00412 • 9 upvotes