Commit f9cf95c by parambharat
1 parent: 79cf24f

chore: change heading level

Files changed: rag/rag.py (+2 -2)
rag/rag.py CHANGED

```diff
@@ -55,7 +55,7 @@ Here's an example of a question and an answer. You must use this as a template t
 What was the main mix of the training data ? How much data was used to train the model ?
 </question>
 
-
+## Answer
 The main mix of the training data for the Llama 3 405 billion parameter model is as follows:
 
 - **General knowledge**: 50%
@@ -65,7 +65,7 @@ The main mix of the training data for the Llama 3 405 billion parameter model is
 
 Regarding the amount of data used to train the model, the snippets do not provide a specific total volume of data in terms of tokens or bytes. However, they do mention that the model was pre-trained on a large dataset containing knowledge until the end of 2023[^2^]. Additionally, the training process involved pre-training on 2.87 trillion tokens before further adjustments[^3^].
 
-
+## Footnotes
 
 [^1^]: "Scaling Laws for Data Mix," page 6.
 [^2^]: "Pre-Training Data," page 4.
```
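For context, the hunks above edit a few-shot answer template embedded in rag/rag.py. Below is a minimal, hypothetical sketch of how such a template might be stored as a string and assembled into a RAG prompt; the names `FEW_SHOT_EXAMPLE` and `build_prompt`, and the surrounding wording, are assumptions for illustration and are not taken from the actual file.

```python
# Hypothetical sketch only: FEW_SHOT_EXAMPLE and build_prompt are illustrative
# names, not taken from rag/rag.py. The example text mirrors the template
# visible in the diff above.

FEW_SHOT_EXAMPLE = """\
<question>
What was the main mix of the training data? How much data was used to train the model?
</question>

## Answer
The main mix of the training data for the Llama 3 405 billion parameter model is as follows:

- **General knowledge**: 50%

Regarding the amount of data used to train the model, the snippets do not provide a specific
total volume of data in terms of tokens or bytes[^1^].

## Footnotes

[^1^]: "Scaling Laws for Data Mix," page 6.
"""


def build_prompt(question: str, snippets: list[str]) -> str:
    """Assemble a RAG prompt from the few-shot template, retrieved snippets, and the user question."""
    context = "\n\n".join(snippets)
    return (
        "Here's an example of a question and an answer. "
        "You must use this as a template to answer the question.\n\n"
        f"{FEW_SHOT_EXAMPLE}\n"
        f"<snippets>\n{context}\n</snippets>\n\n"
        f"<question>\n{question}\n</question>\n"
    )


if __name__ == "__main__":
    # Usage example with placeholder snippets.
    print(build_prompt("How many tokens was Llama 3 trained on?", ["snippet one", "snippet two"]))
```

With the template laid out this way, a commit like this one reduces to editing the heading lines inside the few-shot string, which matches the "+2 -2" change shown in the diff.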