
llm.c checkpoint: GPT-2 774M

This is a HF/safetensors conversion of the llm.c checkpoint mdouglas/llmc-gpt2-774M-150B: a 774M-parameter GPT-2 model trained on 150B tokens of FineWeb.

Training was conducted on a single 8xA100 80GB SXM node for ~6 days.

See the discussion in the llm.c GitHub repository for more information.

Model size: 774M params · Tensor type: BF16 · Format: Safetensors