Active filters: dpo
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-logic-dpo • Text Generation • 148
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-dpo • Text Generation • 148
NicholasCorrado/uf-rlced-conifer_tulu-2-7b-group-dpo-no-clip • Text Generation • 6
mradermacher/uf-tulu-2-7b-dpo-GGUF
mradermacher/zephyr-7b-hh-dpo-GGUF
sfulay/zephyr-7b-dpo-full-gpt-reward-scale-05
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-2
mradermacher/uf-tulu-2-7b-dpo-i1-GGUF
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-05
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-05
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-group-dpo • Text Generation • 11
mradermacher/zephyr-7b-hh-dpo-i1-GGUF • 121
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-dpo-2 • Text Generation • 8
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-coding-dpo-2 • Text Generation • 13
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-dpo-2 • Text Generation • 7
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-logic-dpo-2 • Text Generation • 94
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-dpo-2 • Text Generation • 90
tsavage68/Na_L3_1000steps_1e6rate_03beta_cSFTDPO • Text Generation • 6
NicholasCorrado/tinyllama-1.1b-chat-v1.0-rlced-conifer-3-1-dpo • Text Generation • 93
NicholasCorrado/tulu-2-7b-rlced-conifer-dpo • Text Generation • 6
tsavage68/Na_L3_1000steps_1e6rate_01beta_cSFTDPO • Text Generation • 8
NanQiangHF/llama3.1_8b_dpo_bwgenerator
CultriX/Lama-DPOlphin-8B • Text Generation • 12 • 1
tsavage68/Na_L3_150steps_1e6rate_01beta_cSFTDPO • Text Generation • 5
tsavage68/Na_L3_100steps_1e6rate_03beta_cSFTDPO • Text Generation • 5
NicholasCorrado/zephyr-7b-uf-rlced-conifer-dpo-2e • Text Generation • 10
tsavage68/Na_L3_1000steps_1e6rate_05beta_cSFTDPO • Text Generation • 6
tsavage68/Na_L3_100steps_1e6rate_05beta_cSFTDPO • Text Generation • 5
CultriX/Lama-DPOlphin-8B-Q3_K_S-GGUF • Text Generation • 5 • 1
CultriX/Lama-DPOlphin-8B-Q3_K_M-GGUF • Text Generation • 10 • 1