parambharat committed on
Commit
f9cf95c
1 Parent(s): 79cf24f

chore: change heading level

Files changed (1)
  1. rag/rag.py +2 -2
rag/rag.py CHANGED
@@ -55,7 +55,7 @@ Here's an example of a question and an answer. You must use this as a template t
 What was the main mix of the training data ? How much data was used to train the model ?
 </question>
 
-### Answer
+## Answer
 The main mix of the training data for the Llama 3 405 billion parameter model is as follows:
 
 - **General knowledge**: 50%
@@ -65,7 +65,7 @@ The main mix of the training data for the Llama 3 405 billion parameter model is
 
 Regarding the amount of data used to train the model, the snippets do not provide a specific total volume of data in terms of tokens or bytes. However, they do mention that the model was pre-trained on a large dataset containing knowledge until the end of 2023[^2^]. Additionally, the training process involved pre-training on 2.87 trillion tokens before further adjustments[^3^].
 
-### Footnotes
+## Footnotes
 
 [^1^]: "Scaling Laws for Data Mix," page 6.
 [^2^]: "Pre-Training Data," page 4.