ubermenchh/llama3.1-8B-gsm8k-grpo
PyTorch
Safetensors
GGUF
llama
unsloth
trl
grpo
Inference Endpoints
conversational
License: mit
ubermenchh: Trained with Unsloth (b42217f, verified, 14 days ago)
metadata
license: mit
tags:
- unsloth
- trl
- grpo