
This is a GPT-2 model trained with llm.c for 100K steps (at a batch size of 1M tokens) on FineWeb-EDU.
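As a rough check on the training budget, the step count and batch size can be multiplied out. The sketch below assumes that the 1M batch size means 2**20 = 1,048,576 tokens per optimizer step; that exact figure is an assumption and is not stated in this card. The function name is hypothetical and chosen only for illustration.

```python
# Back-of-the-envelope training budget for this run.
# ASSUMPTION: the 1M batch size is 2**20 tokens per optimizer step.
TOKENS_PER_STEP = 2**20   # 1,048,576 tokens (assumed)
NUM_STEPS = 100_000       # 100K steps, as stated in the card
NUM_PARAMS = 1.56e9       # model size reported for this checkpoint


def total_training_tokens(steps: int = NUM_STEPS,
                          tokens_per_step: int = TOKENS_PER_STEP) -> int:
    """Total tokens seen during training: steps times tokens per step."""
    return steps * tokens_per_step


if __name__ == "__main__":
    total = total_training_tokens()
    print(f"total tokens: {total:,}")
    print(f"tokens per parameter: {total / NUM_PARAMS:.1f}")
```

Under that assumption the run sees about 105B tokens in total.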

This model was trained exactly as described in the post above, except that -x 100000 was used so that training runs for 100K steps. The model achieves a HellaSwag score of 57.7.

Safetensors: model size 1.56B params, tensor type BF16.


Model repository: karpathy/gpt2_1558M_final3_hf
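A minimal sketch of loading this checkpoint with the Hugging Face transformers library follows. It assumes that the repository karpathy/gpt2_1558M_final3_hf ships tokenizer files alongside the weights and loads through the standard Auto classes; neither is confirmed by this card. The generate helper is a hypothetical name used only for illustration.

```python
# Sketch: load the checkpoint and generate a short greedy continuation.
# ASSUMPTIONS: the repo contains tokenizer files and is compatible with
# AutoModelForCausalLM; `generate` is an illustrative helper, not an API
# provided by the model author.
MODEL_ID = "karpathy/gpt2_1558M_final3_hf"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    # Heavy imports are kept local so that importing this module stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Load in bfloat16 to match the tensor type listed for the weights.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,                      # greedy decoding
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("The capital of France is"))
```

The first call downloads the weights, roughly 3 GB at BF16 for 1.56B parameters, so it needs network access and enough memory to hold the model.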