---
language:
- code
license: bigcode-openrail-m
datasets:
- bigcode/the-stack-dedup
pipeline_tag: text-generation
tags:
- code
- shader
base_model: bigcode/santacoder
widget:
- text: void mainImage( out vec4 fragColor, in vec2 fragCoord )
  example_title: mainImage
  group: Shadertoy
model-index:
- name: santacoder-finetuned-the-stack-glsl
  results:
  - task:
      type: text-generation
      name: ShaderEval
    dataset:
      type: Vipitis/Shadertoys-fine
      name: Shadertoys-fine
      config: return_completion
      revision: 0.0.2
    metrics:
    - type: exact_match
      value: 0.380
      name: 300 samples, greedy decoding
      verified: false
---

This model is [Santacoder](https://huggingface.co/bigcode/santacoder) finetuned on [The-Stack-dedup (GLSL subset)](https://huggingface.co/datasets/bigcode/the-stack-dedup/tree/main/data/glsl) for 1000 steps with a batch size of 2 and a full sequence length of 2048.

The adapted finetuning script is available [here](./train.py).

### Finetuning parameters

```sh
python3 train.py --model_path "bigcode/santacoder" \
    --dataset_name "bigcode/the-stack-dedup" \
    --subset "data/glsl" \
    --data_column "content" \
    --split "train" \
    --seq_length 2048 \
    --max_steps 1000 \
    --batch_size 2 \
    --gradient_accumulation_steps 4 \
    --learning_rate 5e-5 \
    --num_warmup_steps 100 \
    --eval_freq 100 \
    --save_freq 100 \
    --log_freq 1 \
    --output_dir "checkpoint_dir" \
    --no_fp16
```

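To put these flags in perspective, the following is a rough sketch of the arithmetic behind them. It assumes a single device, that `--max_steps` counts optimizer updates, and that every sequence is packed to the full 2048 tokens; none of this is confirmed here, so read the result as an upper bound rather than a measured figure. The helper names are illustrative only and are not taken from `train.py`.

```python
# Values copied from the training command above.
SEQ_LENGTH = 2048
MAX_STEPS = 1000
BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4

# Assumption: a single device. More GPUs would scale these numbers up.
NUM_DEVICES = 1


def effective_batch_size(batch_size, grad_accum_steps, num_devices=1):
    """Number of sequences that contribute to one optimizer update."""
    return batch_size * grad_accum_steps * num_devices


def token_budget(seq_length, max_steps, sequences_per_update):
    """Upper bound on tokens seen, assuming fully packed sequences."""
    return seq_length * max_steps * sequences_per_update


sequences_per_update = effective_batch_size(
    BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
tokens_total = token_budget(SEQ_LENGTH, MAX_STEPS, sequences_per_update)

print(f"sequences per optimizer update: {sequences_per_update}")
print(f"tokens seen (upper bound): {tokens_total:,}")
```

Under these assumptions the run uses 8 sequences per update and sees at most about 16.4 million tokens, which is a short finetune.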

The main purpose of this model is to explore whether finetuning improves performance on [ShaderEval](https://huggingface.co/spaces/Vipitis/ShaderEval). This model reached an exact-match score of 0.380 on 300 samples with greedy decoding.

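For readers unfamiliar with the metric, here is a minimal sketch of an exact-match score. It assumes a plain string comparison between each completion and its reference; the actual ShaderEval implementation may normalize or truncate text differently. The toy strings are hypothetical and are not drawn from the Shadertoys-fine dataset.

```python
def exact_match(predictions, references):
    """Fraction of predictions that are identical to their reference string."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one sample is required")
    hits = sum(pred == ref for pred, ref in zip(predictions, references))
    return hits / len(references)


# Hypothetical toy data: two of the three completions match exactly.
references = ["return vec3(1.0);", "return a + b;", "return 0.5 * x;"]
predictions = ["return vec3(1.0);", "return a - b;", "return 0.5 * x;"]
toy_score = exact_match(predictions, references)

# If the reported score is hits / samples, 0.380 on 300 samples
# corresponds to this many exactly matched completions.
matched_samples = round(0.380 * 300)

print(f"toy exact match: {toy_score:.3f}")
print(f"matched samples implied by the reported score: {matched_samples}")
```

Read this way, the reported 0.380 would correspond to 114 of the 300 samples being reproduced exactly.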

The license is carried over from the base model, and the finetuning dataset holds the same license.