dataset-sft - a xwz-xmu Collection

dataset-sft

updated Dec 5, 2024

CohereForAI/aya_dataset

Viewer • Updated Jun 28, 2024 • 206k • 2.58k • 299

Note Multilingual, human-curated
CohereForAI/aya_collection

Viewer • Updated Jun 28, 2024 • 514M • 29.3k • 220

Note FLAN-like collection built from multilingual multitask NLP datasets
HuggingFaceTB/cosmopedia

Viewer • Updated Aug 12, 2024 • 31.1M • 26.3k • 593

Note Encyclopedia-style data generated by Mixtral-8x7B-Instruct-v0.1
HuggingFaceTB/ultrachat_questions_about_world

Viewer • Updated Feb 20, 2024 • 578k • 266 • 6
m-a-p/Code-Feedback

Viewer • Updated Feb 26, 2024 • 66.4k • 315 • 206
bigcode/commitpackft

Viewer • Updated Aug 20, 2023 • 702k • 8.99k • 66
databricks/databricks-dolly-15k

Viewer • Updated Jun 30, 2023 • 15k • 14.7k • 798
arbml/CIDAR

Viewer • Updated Feb 12, 2024 • 10k • 364 • 40

Note Arabic sft
m-a-p/COIG-CQIA

Viewer • Updated Apr 18, 2024 • 44.7k • 5.51k • 611

Note Chinese sft
nvidia/OpenMathInstruct-1

Viewer • Updated Feb 16, 2024 • 6.08M • 630 • 224
fka/awesome-chatgpt-prompts

Viewer • Updated Jan 6 • 203 • 12.3k • 7.62k
nvidia/HelpSteer

Viewer • Updated Dec 18, 2024 • 37.1k • 2.03k • 234

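Note
1. Train an attribute prediction model (APM) using the quality, toxicity, violence, helpfulness, creativity, humor and inappropriateness labels annotated in Open Assistant (OASST) as response attributes.
2. Use the APM to annotate an existing SFT dataset D, obtaining D'(x, y, v), where v denotes the attributes.
3. Run SFT on the LLM with D' to obtain llm'.
4. Use llm', conditioned on attributes, to sample a large number of responses on the SFT prompts; re-predict their attributes with the APM to get v'; then train the LLM once more on the new v' together with the generated responses to obtain llm'.
This approach learns the feedback signal through plain language modelling, without needing RLHF.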
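Steps 2 and 3 of the note above can be sketched in a few lines. This is a minimal illustration only: `toy_apm` is a hypothetical stand-in for a trained attribute prediction model, and the `<attributes>`/`<prompt>`/`<response>` tag layout is an assumed serialization, not the format used by the dataset authors. Actual fine-tuning and the resampling in step 4 are omitted.

```python
from typing import Callable, Dict, List, Tuple

Attributes = Dict[str, int]
Example = Tuple[str, str]  # (prompt x, response y)


def format_attributes(v: Attributes) -> str:
    """Serialize attributes in a fixed (sorted) order so the format is stable."""
    return ",".join(f"{name}:{score}" for name, score in sorted(v.items()))


def annotate(
    dataset: List[Example], apm: Callable[[str, str], Attributes]
) -> List[Tuple[str, str, Attributes]]:
    """Step 2: label each (x, y) pair with predicted attributes v, giving D'(x, y, v)."""
    return [(x, y, apm(x, y)) for x, y in dataset]


def to_sft_text(x: str, y: str, v: Attributes) -> str:
    """Step 3 input: one attribute-conditioned training string (assumed layout)."""
    return (
        f"<attributes>{format_attributes(v)}</attributes>\n"
        f"<prompt>{x}</prompt>\n"
        f"<response>{y}</response>"
    )


def toy_apm(x: str, y: str) -> Attributes:
    """Hypothetical APM: scores helpfulness by response length, toxicity always 0."""
    return {"helpfulness": min(4, len(y.split()) // 3), "toxicity": 0}


D = [("What is 2+2?", "The answer is 4 because 2 plus 2 equals 4.")]
D_prime = annotate(D, toy_apm)
sft_texts = [to_sft_text(x, y, v) for x, y, v in D_prime]
```

At inference time the same attribute prefix can be set to the desired values (for example high helpfulness, zero toxicity), which is what lets the attributes act as a steering signal.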
NobodyExistsOnTheInternet/ToxicDPOqa

Viewer • Updated Apr 26, 2024 • 6.87k • 21 • 17

Note dpo; system prompt; toxic
Naomibas/llm-system-prompts-benchmark

Viewer • Updated Jul 11, 2024 • 100 • 293 • 10

Note Supports offline evaluation, similar to IFEval
NobodyExistsOnTheInternet/Fixed-FilteredTruthyDPO

Viewer • Updated Jan 31, 2024 • 477 • 39

Note dpo; system prompt; roleplay
shidowake/slimorca-with-system-prompt-5k

Viewer • Updated Jan 26, 2024 • 5k • 48 • 1

Note sft; system prompt; the system prompts are only moderately diverse
NobodyExistsOnTheInternet/SystemMessageContradictionsSharegpt

Viewer • Updated Jan 1, 2024 • 90.3k • 83 • 2

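Note system prompt; contradicted system prompt
1. First construct a system prompt + input and generate the corresponding response.
2. Then generate a contradicting system prompt, pair it with the same input, and generate the corresponding response.
This helps avoid system prompts that all follow one sentence pattern (all affirmative), so that the model learns to follow negative instructions in the system prompt as well.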
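The two-step construction in the note above can be sketched as follows. This is an illustrative sketch under stated assumptions: `toy_generate` is a hypothetical stand-in for an LLM call, and the record field names (`system`, `input`, `response`) are chosen for the example rather than taken from the dataset.

```python
from typing import Callable, Dict, List


def build_contradiction_pair(
    system_prompt: str,
    contradicting_prompt: str,
    user_input: str,
    generate: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Create two records that share one input but have opposing system prompts."""
    records = []
    for sp in (system_prompt, contradicting_prompt):
        records.append(
            {"system": sp, "input": user_input, "response": generate(sp, user_input)}
        )
    return records


def toy_generate(system: str, user_input: str) -> str:
    """Hypothetical generator: obeys an uppercase rule or its negation."""
    if "do not" in system.lower():
        return user_input.lower()
    return user_input.upper()


pair = build_contradiction_pair(
    "Always answer in uppercase letters.",
    "Do not use uppercase letters.",
    "Hello",
    toy_generate,
)
```

Because both records share the same input, the only thing that explains the difference between the two responses is the system prompt, including its negated form.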
ZenMoore/RoleBench

Preview • Updated Nov 23, 2023 • 779 • 76

Note roleplay; train/test; benchmark
bai-roleplay/evol-character-entire

Viewer • Updated Feb 1, 2024 • 3.76k • 116 • 62

Note roleplay; Chinese
NobodyExistsOnTheInternet/system-message-DPO

Viewer • Updated Feb 21, 2024 • 90.3k • 94 • 9
abacusai/SystemChat

Viewer • Updated Mar 4, 2024 • 7.02k • 96 • 132
BAAI/COIG-PC

Viewer • Updated Jun 14, 2024 • 540M • 151 • 267
BAAI/COIG-PC-Lite

Viewer • Updated Jun 14, 2024 • 1.08M • 583 • 36
m-a-p/COIG-Kun

Viewer • Updated Apr 8, 2024 • 368k • 522 • 30
google/Synthetic-Persona-Chat

Viewer • Updated Mar 1, 2024 • 10.9k • 1.69k • 101
hbx/IN3-interaction

Viewer • Updated Feb 20, 2024 • 2.53k • 92 • 3
sorry-bench/sorry-bench-202406

Viewer • Updated Jul 2, 2024 • 9.45k • 354 • 19