
This is the official checkpoint of the feedback model trained in COFFEE-GYM with the PPO strategy.

Given a piece of erroneous code, the model generates natural language feedback on it.
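As a rough illustration of that input/output contract, the sketch below loads the checkpoint with the Hugging Face `transformers` library and asks it for feedback on a buggy snippet. Several things here are assumptions rather than facts from this card: the prompt template in `build_prompt` is hypothetical (the paper may use a different format), the checkpoint is assumed to load through `AutoModelForCausalLM`, half precision is chosen only to reduce memory, and `device_map="auto"` assumes the `accelerate` package is installed.

```python
MODEL_ID = "Team-Coffee-Gym/DS-Coder-7B-PPO-CoffeeEval"


def build_prompt(problem: str, wrong_code: str) -> str:
    """Assemble a plain-text prompt.

    The template is a hypothetical placeholder, not the format used in training.
    """
    return (
        "Problem description:\n"
        f"{problem.strip()}\n\n"
        "Incorrect code:\n"
        f"{wrong_code.strip()}\n\n"
        "Feedback:"
    )


def generate_feedback(problem: str, wrong_code: str, max_new_tokens: int = 256) -> str:
    """Return feedback text for wrong_code (downloads the model on first call)."""
    # Heavy dependencies are imported lazily so build_prompt works without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: half precision to save memory
        device_map="auto",          # assumption: requires the accelerate package
    )

    inputs = tokenizer(build_prompt(problem, wrong_code), return_tensors="pt")
    inputs = inputs.to(model.device)

    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A call such as `generate_feedback("Print the sum of two integers.", "a, b = map(int, input().split())\nprint(a - b)")` would return the model's feedback as a string. Greedy decoding (`do_sample=False`) is used here only to make runs repeatable; sampling settings are a free choice.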

For further details, please see our paper.

https://huggingface.co/spaces/Coffee-Gym/Project-Coffee-Gym

Model size: 6.74B params (Safetensors, tensor type F32)

Model: Team-Coffee-Gym/DS-Coder-7B-PPO-CoffeeEval (1 quantized variant; used by 2 Spaces)