
llm.c checkpoint: GPT-2 774M

This is a HF/safetensors conversion of the llm.c checkpoint mdouglas/llmc-gpt2-774M-150B: a 774M-parameter GPT-2 model trained on 150B tokens of FineWeb.

Training was conducted on a single 8xA100 80GB SXM node for ~6 days.

See the discussion in the llm.c GitHub repository for more information.

Model size: 774M params · Tensor type: BF16 · Format: Safetensors