---
license: mit
base_model: rhysjones/gpt2-774M-fineweb-150B
datasets:
- HuggingFaceFW/fineweb
widget:
- text: >-
    <|user|><|content|>Hello! Who are you?<|end_turn|>
    <|assistant|><|content|>Hello! I am ChatBot, a large language model. I am
    here to help you with information, answer questions, provide
    recommendations, and assist with a variety of tasks. How can I assist you
    today?<|end_turn|>
    <|user|><|content|>Write a message asking a friend to come to my wedding
    next month. I want it to be very short and casual, but with an option to
    opt out.<|end_turn|> <|assistant|><|content|>
---

This is karpathy's model from the llm.c project, converted to Hugging Face format to investigate bfloat16 performance.
The training run covered 150B tokens: 1.5 epochs over the 100B-token FineWeb sample dataset.
There's active work underway at https://github.com/karpathy/llm.c, so I'd suggest following the developments there!
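
As a quick-start, here is a minimal loading sketch using the transformers library. It's an illustration under assumptions, not part of the original conversion: the repo id is taken from the `base_model` field above (substitute this model's own id if it differs), and the prompt mirrors the chat format shown in the widget example.

```python
# Minimal sketch: load the converted checkpoint in bfloat16 with transformers.
# Assumption: the repo id comes from the base_model field in the metadata above;
# swap in this model's own repo id if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rhysjones/gpt2-774M-fineweb-150B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # load weights in bfloat16 to exercise the bf16 path
)

# Chat-style prompt mirroring the widget example above.
prompt = "<|user|><|content|>Hello! Who are you?<|end_turn|><|assistant|><|content|>"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that bfloat16 mainly pays off on recent GPUs with native bf16 support; on CPU, the default float32 load may run faster.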