arxiv:2311.08290
Corrado
NicholasCorrado
AI & ML interests
Reinforcement learning
Organizations
None yet
Papers
3
Models
60
NicholasCorrado/mistral-7b-ift • Text Generation • 12
NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e-alr-0.1 • Text Generation • 10
NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e-alr-0.01 • Text Generation • 9
NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e-alr-0.01-1e • Text Generation • 5
NicholasCorrado/zephyr-7b-uf-rc-small-dpo • Text Generation • 10
NicholasCorrado/test
NicholasCorrado/zephyr-7b-uf-dpo-2e • Text Generation • 8
NicholasCorrado/rlced-conifer-zephyr-7b-dpo-2e • Text Generation • 8
NicholasCorrado/zephyr-7b-uf-rlced-conifer-1e2e-group-dpo-2e • Text Generation • 10
NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e • Text Generation • 8
Datasets
None public yet