arxiv:2405.07863
Wei Xiong
weqweasdas
AI & ML interests
Machine learning, RLHF
Organizations
models
24
weqweasdas/clip_600_zephyr_1epoch • Text Generation • 1
weqweasdas/zephyr-7b-dpo-full • Text Generation • 2
weqweasdas/zephyr-7b-gemma-dpo
weqweasdas/zephyr-7b-sft-full
weqweasdas/zephyr-7b-dpo-qlora
weqweasdas/gpt2-cpt-dutch • Text Generation • 1
weqweasdas/zephyr-7b-gemma-sft
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085 • Text Generation • 2
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6 • Text Generation • 2
weqweasdas/raft_baseline_zephyr_packing_model6 • Text Generation • 2
datasets
26
weqweasdas/math_idx • 12.5k
weqweasdas/SHP-standard-tmp • 93.3k
weqweasdas/preference_dataset_mixture2_and_safe_pku • 555k • 488 • 8
weqweasdas/preference_mix2_and_cornfield_ultrainteract • 689k
weqweasdas/zephyr_pi0_gen_ultra_n2 • 58.3k
weqweasdas/preference_dataset_mixture2_and_safe_pku30k_and_argilla_math_and_ultra_code_for_preference_model • 606k
weqweasdas/preference_dataset_mixture2_and_safe_pku30k_for_preference_model • 554k
weqweasdas/ultra_feedback_binarized_for_preference_no_chat_all • 60.9k
weqweasdas/ultra_feedback_binarized_for_preference_no_chat_40k • 40k
weqweasdas/gemma_ultra_feedback_binarized_for_preference_15k • 15k