ubermenchh/llama3.1-8B-gsm8k-grpo



Trained with Unsloth

b42217f verified 14 days ago



metadata

license: mit
tags:
  - unsloth
  - trl
  - grpo