TRL

https://github.com/huggingface/trl

Recent Activity

qgallouedec updated a dataset 3 days ago

trl-lib/documentation-images

qgallouedec updated a dataset 9 days ago

trl-lib/ultrafeedback-prompt

qgallouedec updated a model 13 days ago

trl-lib/Qwen2-0.5B-Reward-Math-Sheperd

Organization Card

This is the organization grouping all the models and datasets used in the TRL library.

Collections 2

spaces 2

⚒️ TextEnvironments • Sleeping

🦙 StackLLaMa • Runtime error • 213

models 81

datasets 19

trl-lib/documentation-images

Viewer • Updated 3 days ago • 1 • 1.46k

trl-lib/ultrafeedback-prompt

Viewer • Updated 9 days ago • 39.8k • 1.11k • 3

trl-lib/math_shepherd

Viewer • Updated 24 days ago • 445k • 196 • 1

trl-lib/alpaca-cleaned

Viewer • Updated 24 days ago • 51.8k • 60

trl-lib/hh-rlhf-helpful-base

Viewer • Updated about 1 month ago • 46.2k • 105

trl-lib/prm800k

Viewer • Updated Nov 20 • 41.2k • 79 • 1

trl-lib/rlaif-v

Viewer • Updated Sep 27 • 83.1k • 117 • 1

trl-lib/Capybara-Preferences

Viewer • Updated Sep 19 • 15.4k • 106

trl-lib/Capybara

Viewer • Updated Sep 19 • 16k • 1.02k • 1

trl-lib/tldr

Viewer • Updated Sep 12 • 130k • 559

Collections 2

teknium/OpenHermes-2.5-Mistral-7B

Intel/orca_dpo_pairs

trl-lib/OpenHermes-2-Mistral-7B-ipo-beta-0.1-steps-200

trl-lib/OpenHermes-2-Mistral-7B-ipo-beta-0.2-steps-200

trl-lib/pythia-1b-deduped-tldr-online-dpo

trl-lib/pythia-1b-deduped-tldr-sft

trl-lib/pythia-6.9b-deduped-tldr-online-dpo

trl-lib/pythia-2.8b-deduped-tldr-sft

models 81

trl-lib/Qwen2-0.5B-Reward-Math-Sheperd

trl-lib/Qwen2-0.5B-XPO

trl-lib/Qwen2-0.5B-OnlineDPO

trl-lib/Qwen2-0.5B-KTO

trl-lib/Qwen2-0.5B-ORPO

trl-lib/Qwen2-0.5B-DPO

trl-lib/Qwen2-0.5B-Reward

trl-lib/pythia-1b-deduped-tldr-rm

trl-lib/pythia-2.8b-deduped-tldr-online-dpo

trl-lib/pythia-6.9b-deduped-tldr-offline-dpo

Team members 8