Di Zhang

qq8933

AI & ML interests

AI4Chem, LLM, Green LLM

Organizations

AI4Chem, SimpleBerry Research Lab

Posts 21

LLaMA-O1-PRM and LLaMA-O1-Reinforcement will be released this weekend.
We have implemented a novel reinforcement fine-tuning (RFT) pipeline that teaches models to reason and to label rewards without human annotation.