Capybara Preferences

distilabel-internal-testing's Collections

updated Apr 17, 2024

This collection gathers the datasets produced while extending `LDJnr/Capybara` into a preference dataset, using 7B LLMs to generate the alternative responses.

LDJnr/Capybara

Viewer • Updated Jun 7, 2024 • 16k • 251 • 230

Note: The original Capybara dataset for SFT; each conversation ends with an assistant response.

argilla/distilabel-capybara-dpo-7k-binarized

Viewer • Updated Jul 16, 2024 • 7.56k • 3.65k • 178

Note: Argilla's first iteration. Responses were generated with `argilla/notus-7b-v1`, `teknium/OpenHermes-2.5-Mistral-7B`, and `mlabonne/NeuralBeagle14-7B`, then judged by GPT-4 via UltraFeedback, using `distilabel`.

distilabel-internal-testing/Capybara-Deduped

Viewer • Updated Apr 15, 2024 • 16k • 32

Note: A deduplicated subset of `LDJnr/Capybara`; the Dove subset apparently contains some duplicate entries.

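The note does not say how the duplicates were detected. As a minimal sketch, assuming exact-match deduplication on the full conversation and assuming each row carries a `conversation` list of `input`/`output` turns (both are assumptions, as are the helper names), the step could look like this:

```python
import hashlib
import json


def conversation_key(row):
    """Return a stable fingerprint of a row's conversation turns."""
    # Assumption: each row has a "conversation" list of {"input", "output"} dicts.
    payload = json.dumps(row["conversation"], sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def drop_duplicates(rows):
    """Keep the first occurrence of each distinct conversation, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = conversation_key(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Toy rows for illustration only; they are not taken from the real dataset.
rows = [
    {"source": "Dove", "conversation": [{"input": "Hi", "output": "Hello!"}]},
    {"source": "Dove", "conversation": [{"input": "Hi", "output": "Hello!"}]},
    {"source": "Airoboros", "conversation": [{"input": "2+2?", "output": "4"}]},
]
deduped = drop_duplicates(rows)
print(len(rows), "->", len(deduped))
```

Hashing the serialized conversation keeps the `seen` set small even when conversations are long; near-duplicates (for example, differing only in whitespace) would need a normalization step that this sketch does not include.
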
distilabel-internal-testing/Capybara-Deduped-Remaining

Viewer • Updated Apr 15, 2024 • 8.38k • 33

Note: A subset of `distilabel-internal-testing/Capybara-Deduped` that excludes the rows already generated and judged in `argilla/distilabel-capybara-dpo-7k-binarized`.

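Removing the already-processed rows amounts to an anti-join between the two datasets. The sketch below assumes rows are matched on the first user message of each conversation; the matching key actually used is not stated, and the function and field names here are hypothetical.

```python
def first_user_turn(row):
    """Return the text used to match rows across the two datasets."""
    # Assumption: the first "input" of the conversation identifies a row.
    return row["conversation"][0]["input"]


def remaining_rows(deduped_rows, judged_prompts):
    """Return the rows whose matching key is not among the judged prompts."""
    done = set(judged_prompts)
    return [row for row in deduped_rows if first_user_turn(row) not in done]


# Toy data for illustration only.
deduped_rows = [
    {"conversation": [{"input": "Explain DPO.", "output": "..."}]},
    {"conversation": [{"input": "Summarize this text.", "output": "..."}]},
    {"conversation": [{"input": "Write a haiku.", "output": "..."}]},
]
judged_prompts = ["Explain DPO."]

remaining = remaining_rows(deduped_rows, judged_prompts)
print([first_user_turn(row) for row in remaining])
```

Building a set of the judged prompts first makes each membership check constant time, which matters once the inputs reach thousands of rows.
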
distilabel-internal-testing/Capybara-Preferences-Remaining

Viewer • Updated Apr 17, 2024 • 7.84k • 34

Note: The generations and preferences for the samples in `distilabel-internal-testing/Capybara-Deduped-Remaining`; this subset should be merged into `argilla/distilabel-capybara-dpo-7k-binarized`.

argilla/Capybara-Preferences

Viewer • Updated May 9, 2024 • 15.4k • 303 • 40

Note: The final dataset, an iteration on top of `LDJnr/Capybara`. Alternative completions were generated with `argilla/notus-7b-v1`, `teknium/OpenHermes-2.5-Mistral-7B`, and `mlabonne/NeuralBeagle14-7B`, then judged by GPT-4 via UltraFeedback, using `distilabel`.
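
Once a judge has rated several completions for the same prompt, a preference pair can be derived from the ratings. The following is a sketch of one common binarization rule (highest-rated as chosen, lowest-rated as rejected, ties skipped); it is not necessarily the rule used for these datasets, and the function and field names are hypothetical.

```python
def binarize(prompt, generations, ratings):
    """Turn rated completions into one chosen/rejected pair, or None."""
    if len(generations) != len(ratings) or len(generations) < 2:
        return None
    # Sort by rating, best first; ties keep their original order.
    ranked = sorted(zip(ratings, generations), key=lambda pair: pair[0], reverse=True)
    best_rating, chosen = ranked[0]
    worst_rating, rejected = ranked[-1]
    if best_rating == worst_rating:
        # Assumption: a pair with no rating gap carries no preference signal.
        return None
    return {
        "prompt": prompt,
        "chosen": chosen,
        "rejected": rejected,
        "rating_chosen": best_rating,
        "rating_rejected": worst_rating,
    }


# Toy completions and ratings for illustration only.
pair = binarize(
    "Explain DPO.",
    ["answer from model A", "answer from model B", "answer from model C"],
    [3, 5, 1],
)
print(pair["chosen"], "|", pair["rejected"])
```

Skipping ties trades some data volume for cleaner preference signal; a different rule, such as sampling the rejected completion at random from the lower-rated ones, would fit the same interface.
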
