zkshan2002/PPO-uf-llama3-8B-OpenRLHF


README.md exists but content is empty.

Downloads last month: 1

Safetensors

Model size: 8.03B params

Tensor type: BF16
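The two figures above are enough for a back-of-the-envelope estimate of how much space the weights occupy. The sketch below is only that arithmetic: it assumes every parameter is stored in BF16 (2 bytes each) and ignores file headers, activations, and any runtime overhead. The names `PARAMS`, `BYTES_PER_PARAM`, and `weight_bytes` are illustrative and do not come from the model page.

```python
# Rough size of the weights alone, using the figures listed on this page.
# Assumption: all parameters are stored in BF16; overhead is ignored.
PARAMS = 8.03e9        # "8.03B params" as listed
BYTES_PER_PARAM = 2    # BF16 uses 16 bits (2 bytes) per value


def weight_bytes(params: float, bytes_per_param: int) -> float:
    """Return the approximate number of bytes needed to hold the weights."""
    return params * bytes_per_param


total = weight_bytes(PARAMS, BYTES_PER_PARAM)
print(f"{total / 1e9:.2f} GB (decimal)")
print(f"{total / 2**30:.2f} GiB (binary)")
```

Memory needed to actually run the model will be higher than this figure, since it depends on batch size, context length, and the runtime used.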


Model tree for zkshan2002/PPO-uf-llama3-8B-OpenRLHF

Base model: OpenRLHF/Llama-3-8b-sft-mixture

Finetuned (3): this model

Dataset used to train zkshan2002/PPO-uf-llama3-8B-OpenRLHF