Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.18058

TRAMS: Training-free Memory Selection for Long-range Language Modeling

Paper • 2310.15494 • Published Oct 24, 2023 • 1
A Long Way to Go: Investigating Length Correlations in RLHF

Paper • 2310.03716 • Published Oct 5, 2023 • 9
YaRN: Efficient Context Window Extension of Large Language Models

Paper • 2309.00071 • Published Aug 31, 2023 • 65
Giraffe: Adventures in Expanding Context Lengths in LLMs

Paper • 2308.10882 • Published Aug 21, 2023 • 1

Alignment - Contents Regulation - Unlearning

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15 • 82
Aligning Teacher with Student Preferences for Tailored Training Data Generation

Paper • 2406.19227 • Published Jun 27 • 24
Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1 • 22
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

Paper • 2404.03820 • Published Apr 4 • 23

LLoCO: Learning Long Contexts Offline

Paper • 2404.07979 • Published Apr 11 • 19
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21 • 111
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Paper • 2402.11550 • Published Feb 18 • 15
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31 • 21

Interesting things.

AtP*: An efficient and scalable method for localizing LLM behaviour to components

Paper • 2403.00745 • Published Mar 1 • 11
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 592
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

Paper • 2402.16840 • Published Feb 26 • 23
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21 • 111

Data-efficient LLMs

dataset pruning for advancing the capabilities of LLMs

Effective pruning of web-scale datasets based on complexity of concept clusters

Paper • 2401.04578 • Published Jan 9
How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38
A Survey on Data Selection for LLM Instruction Tuning

Paper • 2402.05123 • Published Feb 4 • 3
LESS: Selecting Influential Data for Targeted Instruction Tuning

Paper • 2402.04333 • Published Feb 6 • 3

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

Paper • 2401.06951 • Published Jan 13 • 24
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Paper • 2401.01325 • Published Jan 2 • 26
Extending LLMs' Context Window with 100 Samples

Paper • 2401.07004 • Published Jan 13 • 14
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31 • 21

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Paper • 2402.00159 • Published Jan 31 • 59
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31 • 21

LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31 • 21

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Paper • 2402.00769 • Published Feb 1 • 20
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Paper • 2311.05556 • Published Nov 9, 2023 • 79
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31 • 21
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30 • 16

My (Denis Gordeev) collection of mostly NLP papers. You can message me at t.me/nlp_party

LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31 • 21
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30 • 16
Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31 • 15
Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30 • 21

Previous
1
2
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs