tmarechaux's Collections
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • arXiv:2309.09400 • 82 upvotes
PDFTriage: Question Answering over Long, Structured Documents
Paper • arXiv:2309.08872 • 53 upvotes
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • arXiv:2309.11495 • 38 upvotes
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • arXiv:2309.12307 • 87 upvotes
SCREWS: A Modular Framework for Reasoning with Revisions
Paper • arXiv:2309.13075 • 15 upvotes
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Paper • arXiv:2309.14509 • 17 upvotes
Qwen Technical Report
Paper • arXiv:2309.16609 • 34 upvotes
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Paper • arXiv:2309.04269 • 32 upvotes
In-Context Pretraining: Language Modeling Beyond Document Boundaries
Paper • arXiv:2310.10638 • 28 upvotes
Can LLMs Follow Simple Rules?
Paper • arXiv:2311.04235 • 10 upvotes
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
Paper • arXiv:2311.05657 • 27 upvotes
Lost in the Middle: How Language Models Use Long Contexts
Paper • arXiv:2307.03172 • 36 upvotes
Challenges and Applications of Large Language Models
Paper • arXiv:2307.10169 • 47 upvotes
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • arXiv:2305.18290 • 48 upvotes
GAIA: a benchmark for General AI Assistants
Paper • arXiv:2311.12983 • 183 upvotes
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
Paper • arXiv:2312.07987 • 40 upvotes
Mixtral of Experts
Paper • arXiv:2401.04088 • 159 upvotes
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper • arXiv:2401.16380 • 48 upvotes
Generative Representational Instruction Tuning
Paper • arXiv:2402.09906 • 51 upvotes
RAFT: Adapting Language Model to Domain Specific RAG
Paper • arXiv:2403.10131 • 67 upvotes
Can large language models explore in-context?
Paper • arXiv:2403.15371 • 32 upvotes
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • arXiv:2404.07143 • 103 upvotes
Rho-1: Not All Tokens Are What You Need
Paper • arXiv:2404.07965 • 84 upvotes
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper • arXiv:2409.02795 • 72 upvotes
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Paper • arXiv:2411.00412 • 9 upvotes