arxiv:2407.04842
Chaoqi Wang
alecwangcq
AI & ML interests
RL \cap LLMs
Organizations
Papers: 2
Models: 8
alecwangcq/Meta-Llama-3-8B-Instruct-sft • Text Generation • 14
alecwangcq/sft_openassistant-guanaco • 2
alecwangcq/zephyr-7b-sft-full • Text Generation • 24
alecwangcq/zephyr-7b-dpo-full-10-epochs-debug • Text Generation • 12
alecwangcq/zephyr-7b-dpo-full-10-epochs • Text Generation • 9
alecwangcq/zephyr-7b-dpo-full
alecwangcq/sdxl-test
alecwangcq/ghibli-small-v0
Datasets
None public yet