# Training a Nano GPT from scratch

This repo contains code for training a nano GPT from scratch on any dataset. The implementation is adapted from Andrej Karpathy's [nanoGPT](https://github.com/karpathy/nanoGPT/tree/master). The GitHub repo with the notebooks used for model training can be found [here](https://github.com/mkthoma/nanoGPT).
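
To train on an arbitrary dataset, the raw text first has to be tokenized and split into batches. Below is a minimal sketch of nanoGPT's character-level style of preprocessing, where every distinct character in the corpus becomes a token. The file name `input.txt`, the 90/10 train/validation split, and the `block_size`/`batch_size` defaults are illustrative assumptions, not values taken from the notebooks.

```python
import torch

# Read any plain-text corpus; "input.txt" is an illustrative placeholder.
with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Character-level vocabulary: every distinct character becomes a token.
chars = sorted(set(text))
vocab_size = len(chars)
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> character

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

# Encode the full corpus and hold out the last 10% for validation
# (the split ratio is an assumption, not taken from the notebooks).
data = torch.tensor(encode(text), dtype=torch.long)
n = int(0.9 * len(data))
train_data, val_data = data[:n], data[n:]

def get_batch(split, block_size=256, batch_size=64):
    """Sample a random batch of (input, target) sequences.
    Targets are the inputs shifted one position to the right."""
    d = train_data if split == "train" else val_data
    ix = torch.randint(len(d) - block_size, (batch_size,))
    x = torch.stack([d[i : i + block_size] for i in ix])
    y = torch.stack([d[i + 1 : i + 1 + block_size] for i in ix])
    return x, y
```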
## Model Architecture
The Bigram Language Model is based on the Transformer architecture, which has been widely adopted in natural language processing because of its ability to capture long-range dependencies in sequential data. Each component of the model is explained in detail below, starting with a sketch of how the pieces fit together:
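
The following is a minimal PyTorch sketch of how those components (token and position embeddings, a stack of attention + MLP blocks, a final layer norm, and a language-model head) compose into the full model. It uses PyTorch's built-in `nn.MultiheadAttention` for brevity where nanoGPT implements its own causal attention module, and the hyperparameter defaults are illustrative assumptions, not values taken from the notebooks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    """One Transformer block: self-attention followed by an MLP,
    each wrapped in a residual connection with pre-layer norm."""
    def __init__(self, n_embd, n_head, block_size, dropout=0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, dropout=dropout,
                                          batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(),
            nn.Linear(4 * n_embd, n_embd), nn.Dropout(dropout),
        )
        # Causal mask: True above the diagonal blocks attention to the future.
        mask = torch.triu(torch.ones(block_size, block_size, dtype=torch.bool),
                          diagonal=1)
        self.register_buffer("mask", mask)

    def forward(self, x):
        T = x.size(1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=self.mask[:T, :T])
        x = x + attn_out                 # residual around attention
        x = x + self.mlp(self.ln2(x))    # residual around the MLP
        return x

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size, n_embd=384, n_head=6, n_layer=6,
                 block_size=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)   # token identity
        self.pos_emb = nn.Embedding(block_size, n_embd)   # token position
        self.blocks = nn.Sequential(
            *[Block(n_embd, n_head, block_size) for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)                  # final layer norm
        self.head = nn.Linear(n_embd, vocab_size)         # next-token logits

    def forward(self, idx, targets=None):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        x = self.ln_f(self.blocks(x))
        logits = self.head(x)
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.reshape(B * T, -1),
                                   targets.reshape(B * T))
        return logits, loss
```

Calling the model on a batch from `get_batch` returns logits of shape `(batch, block_size, vocab_size)` together with the cross-entropy loss that the training loop minimizes.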