parambharat committed on Jul 25

Commit 5f83f5f • 1 Parent(s): 85bfd70

fix: change citations to footnotes

Files changed (1)

rag/rag.py

@@ -45,8 +45,8 @@ Guidelines for your answer:
 5. Use appropriate technical language and terminology as used in the snippets.
 6. Cite the relevant sentences from the snippets and their page numbers to support your answer.
 7. Answer in MFAQ format (Minimal Facts Answerable Question), providing the most concise and accurate response possible.
-8. Use Markdown to format your response and include citations to indicate the snippets and the page number used to derive your answer.
-9. Your answer must only have two headings: 'Answer' and 'Citations'.
+8. Use Markdown to format your response and include citation footnotes to indicate the snippets and the page number used to derive your answer.
+9. Your answer must only have two headings: 'Answer' and 'Footnotes'.
 Here's an example of a question and an answer. You must use this as a template to format your response:
@@ -65,11 +65,11 @@ The main mix of the training data for the Llama 3 405 billion parameter model is
 Regarding the amount of data used to train the model, the snippets do not provide a specific total volume of data in terms of tokens or bytes. However, they do mention that the model was pre-trained on a large dataset containing knowledge until the end of 2023[^2^]. Additionally, the training process involved pre-training on 2.87 trillion tokens before further adjustments[^3^].
-### Citations
-- [^1^]: "Scaling Laws for Data Mix," page 6.
-- [^2^]: "Pre-Training Data," page 4.
-- [^3^]: "Initial Pre-Training," page 14.
+### Footnotes
+[^1^]: "Scaling Laws for Data Mix," page 6.
+[^2^]: "Pre-Training Data," page 4.
+[^3^]: "Initial Pre-Training," page 14.
 </example>
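
Review note: since the prompt now requires exactly the headings 'Answer' and 'Footnotes' and un-bulleted `[^n^]:` definitions, a small check on model output could catch regressions to the old 'Citations' format. The following is a minimal sketch only; `check_footnotes` and the regex names are hypothetical and are not part of `rag/rag.py`, and it assumes headings are written as Markdown `#` headings.

```python
import re

# Hypothetical helper (not in rag/rag.py): validates the footnote format
# that the updated prompt asks the model to produce.
FOOTNOTE_REF = re.compile(r"\[\^(\d+)\^\](?!:)")            # in-text marker, e.g. [^2^]
FOOTNOTE_DEF = re.compile(r"^\[\^(\d+)\^\]:", re.MULTILINE)  # definition at line start
HEADING = re.compile(r"^#{1,6}\s+(.+?)\s*$", re.MULTILINE)   # any Markdown heading


def check_footnotes(response: str) -> list[str]:
    """Return a list of format problems; an empty list means none were found."""
    problems = []

    # Guideline 9: only the headings 'Answer' and 'Footnotes' are allowed.
    headings = HEADING.findall(response)
    if headings != ["Answer", "Footnotes"]:
        problems.append(f"unexpected headings: {headings}")

    # Every in-text marker should have a definition, and vice versa.
    refs = set(FOOTNOTE_REF.findall(response))
    defs = set(FOOTNOTE_DEF.findall(response))
    if refs - defs:
        problems.append(f"markers without a footnote: {sorted(refs - defs, key=int)}")
    if defs - refs:
        problems.append(f"footnotes never referenced: {sorted(defs - refs, key=int)}")

    return problems
```

A response in the old format (a 'Citations' heading with `- [^1^]:` bullets) would be reported twice here: once for the heading and once because bulleted definitions do not start the line with `[^n^]:`.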