LiteLLMs
/

Llama3-ChatQA-1.5-8B-GGUF

@@ -211,30 +211,30 @@ Here are guides on using llama-cpp-python and ctransformers with LangChain:
 ## Model Details
-We introduce Llama3-ChatQA-1.5, which excels at conversational question answering (QA) and retrieval-augmented generation (RAG). Llama3-ChatQA-1.5 is developed using an improved training recipe from [ChatQA (1.0)](https://arxiv.org/abs/2401.10225), and it is built on top of [Llama-3 base model](https://huggingface.co/meta-llama/Meta-Llama-3-8B). Specifically, we incorporate more conversational QA data to enhance its tabular and arithmetic calculation capability. Llama3-ChatQA-1.5 has two variants: Llama3-ChatQA-1.5-8B and Llama3-ChatQA-1.5-70B. Both models were originally trained using [Megatron-LM](https://github.com/NVIDIA/Megatron-LM), we converted the checkpoints to Hugging Face format.
 ## Other Resources
-[Llama3-ChatQA-1.5-70B](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-70B)   [Evaluation Data](https://huggingface.co/datasets/nvidia/ChatRAG-Bench)   [Training Data](https://huggingface.co/datasets/nvidia/ChatQA-Training-Data)   [Retriever](https://huggingface.co/nvidia/dragon-multiturn-query-encoder)   [Paper](https://arxiv.org/abs/2401.10225)
 ## Benchmark Results
 Results in [ChatRAG Bench](https://huggingface.co/datasets/nvidia/ChatRAG-Bench) are as follows:
-|                             | ChatQA-1.0-7B | Command-R-Plus | Llama-3-instruct-70b | GPT-4-0613 | ChatQA-1.0-70B | ChatQA-1.5-8B | ChatQA-1.5-70B |
-| --: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
-| Doc2Dial                    |     37.88     |     33.51      |        37.88         |   34.16    |      38.9      |     39.33     |     41.26      |
-| QuAC                        |     29.69     |     34.16      |        36.96         |   40.29    |     41.82      |     39.73     |     38.82      |
-| QReCC                       |     46.97     |     49.77      |        51.34         |   52.01    |     48.05      |     49.03     |     51.40      |
-| CoQA                        |     76.61     |     69.71      |        76.98         |   77.42    |     78.57      |     76.46     |     78.44      |
-| DoQA                        |     41.57     |     40.67      |        41.24         |   43.39    |     51.94      |     49.6      |     50.67      |
-| ConvFinQA                   |     51.61     |     71.21      |         76.6         |   81.28    |     73.69      |     78.46     |     81.88      |
-| SQA                         |     61.87     |     74.07      |        69.61         |   79.21    |     69.14      |     73.28     |     83.82      |
-| TopioCQA                    |     45.45     |     53.77      |        49.72         |   45.09    |     50.98      |     49.96     |     55.63      |
-| HybriDial*                  |     54.51     |      46.7      |        48.59         |   49.81    |     56.44      |     65.76     |     68.27      |
-| INSCIT                      |     30.96     |     35.76      |        36.23         |   36.34    |      31.9      |     30.1      |     32.31      |
-| Average (all)               |     47.71     |     50.93      |        52.52         |   53.90    |     54.14      |     55.17     |     58.25      |
-| Average (exclude HybriDial) |     46.96     |     51.40      |        52.95         |   54.35    |     53.89      |     53.99     |     57.14      |
-Note that ChatQA-1.5 is built based on Llama-3 base model, and ChatQA-1.0 is built based on Llama-2 base model. ChatQA-1.5 used some samples from the HybriDial training dataset. To ensure fair comparison, we also compare average scores excluding HybriDial. The data and evaluation scripts for ChatRAG Bench can be found [here](https://huggingface.co/datasets/nvidia/ChatRAG-Bench).
 ## Prompt Format
@@ -383,7 +383,7 @@ Zihan Liu (zihanl@nvidia.com), Wei Ping (wping@nvidia.com)
 ## Citation
 <pre>
 @article{liu2024chatqa,
-  title={ChatQA: Building GPT-4 Level Conversational QA Models},
   author={Liu, Zihan and Ping, Wei and Roy, Rajarshi and Xu, Peng and Lee, Chankyu and Shoeybi, Mohammad and Catanzaro, Bryan},
   journal={arXiv preprint arXiv:2401.10225},
   year={2024}}

 ## Model Details
+We introduce Llama3-ChatQA-1.5, which excels at conversational question answering (QA) and retrieval-augmented generation (RAG). Llama3-ChatQA-1.5 is developed using an improved training recipe from the [ChatQA paper](https://arxiv.org/pdf/2401.10225v3) and is built on top of the [Llama-3 base model](https://huggingface.co/meta-llama/Meta-Llama-3-8B). Specifically, we incorporate more conversational QA data to enhance its tabular and arithmetic calculation capabilities. Llama3-ChatQA-1.5 has two variants: Llama3-ChatQA-1.5-8B and Llama3-ChatQA-1.5-70B. Both models were originally trained using [Megatron-LM](https://github.com/NVIDIA/Megatron-LM); we converted the checkpoints to Hugging Face format. **For more information about ChatQA, check the [website](https://chatqa-project.github.io/)!**
 ## Other Resources
+[Llama3-ChatQA-1.5-70B](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-70B)   [Evaluation Data](https://huggingface.co/datasets/nvidia/ChatRAG-Bench)   [Training Data](https://huggingface.co/datasets/nvidia/ChatQA-Training-Data)   [Retriever](https://huggingface.co/nvidia/dragon-multiturn-query-encoder)   [Website](https://chatqa-project.github.io/)   [Paper](https://arxiv.org/pdf/2401.10225v3)
 ## Benchmark Results
 Results in [ChatRAG Bench](https://huggingface.co/datasets/nvidia/ChatRAG-Bench) are as follows:
+|                             | ChatQA-1.0-7B | Command-R-Plus | Llama3-instruct-70b | GPT-4-0613 | GPT-4-Turbo | ChatQA-1.0-70B | ChatQA-1.5-8B | ChatQA-1.5-70B |
+| --: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
+| Doc2Dial                    |     37.88     |     33.51      |        37.88        |   34.16    |    35.35    |     38.90      |     39.33     |     41.26      |
+| QuAC                        |     29.69     |     34.16      |        36.96        |   40.29    |    40.10    |     41.82      |     39.73     |     38.82      |
+| QReCC                       |     46.97     |     49.77      |        51.34        |   52.01    |    51.46    |     48.05      |     49.03     |     51.40      |
+| CoQA                        |     76.61     |     69.71      |        76.98        |   77.42    |    77.73    |     78.57      |     76.46     |     78.44      |
+| DoQA                        |     41.57     |     40.67      |        41.24        |   43.39    |    41.60    |     51.94      |     49.60     |     50.67      |
+| ConvFinQA                   |     51.61     |     71.21      |        76.60        |   81.28    |    84.16    |     73.69      |     78.46     |     81.88      |
+| SQA                         |     61.87     |     74.07      |        69.61        |   79.21    |    79.98    |     69.14      |     73.28     |     83.82      |
+| TopiOCQA                    |     45.45     |     53.77      |        49.72        |   45.09    |    48.32    |     50.98      |     49.96     |     55.63      |
+| HybriDial*                  |     54.51     |     46.70      |        48.59        |   49.81    |    47.86    |     56.44      |     65.76     |     68.27      |
+| INSCIT                      |     30.96     |     35.76      |        36.23        |   36.34    |    33.75    |     31.90      |     30.10     |     32.31      |
+| Average (all)               |     47.71     |     50.93      |        52.52        |   53.90    |    54.03    |     54.14      |     55.17     |     58.25      |
+| Average (exclude HybriDial) |     46.96     |     51.40      |        52.95        |   54.35    |    54.72    |     53.89      |     53.99     |     57.14      |
+Note that ChatQA-1.5 is built on the Llama-3 base model, and ChatQA-1.0 is built on the Llama-2 base model. ChatQA-1.5 models use the HybriDial training dataset (marked with * in the table). To ensure a fair comparison, we also report average scores excluding HybriDial. The data and evaluation scripts for ChatRAG Bench can be found [here](https://huggingface.co/datasets/nvidia/ChatRAG-Bench).
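The two average rows can be checked by hand. The sketch below assumes each average is a plain unweighted mean over the per-dataset scores, rounded to two decimals; the card does not state this, but the assumption reproduces the reported ChatQA-1.5 figures. The `SCORES` dictionary and the `average` helper are illustrative names, not part of the ChatRAG Bench evaluation scripts.

```python
# Per-dataset scores copied from the ChatQA-1.5 columns of the table above.
SCORES = {
    "ChatQA-1.5-8B": {
        "Doc2Dial": 39.33, "QuAC": 39.73, "QReCC": 49.03, "CoQA": 76.46,
        "DoQA": 49.60, "ConvFinQA": 78.46, "SQA": 73.28, "TopiOCQA": 49.96,
        "HybriDial": 65.76, "INSCIT": 30.10,
    },
    "ChatQA-1.5-70B": {
        "Doc2Dial": 41.26, "QuAC": 38.82, "QReCC": 51.40, "CoQA": 78.44,
        "DoQA": 50.67, "ConvFinQA": 81.88, "SQA": 83.82, "TopiOCQA": 55.63,
        "HybriDial": 68.27, "INSCIT": 32.31,
    },
}


def average(scores, exclude=()):
    """Unweighted mean of the scores, skipping any dataset named in `exclude`."""
    kept = [value for name, value in scores.items() if name not in exclude]
    return round(sum(kept) / len(kept), 2)


for model, scores in SCORES.items():
    avg_all = average(scores)
    avg_no_hybridial = average(scores, exclude=("HybriDial",))
    print(f"{model}: all={avg_all}, excluding HybriDial={avg_no_hybridial}")
```

Excluding HybriDial lowers both ChatQA-1.5 averages by a little over one point, which is why the table reports both rows.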
 ## Prompt Format
 ## Citation
 <pre>
 @article{liu2024chatqa,
+  title={ChatQA: Surpassing GPT-4 on Conversational QA and RAG},
   author={Liu, Zihan and Ping, Wei and Roy, Rajarshi and Xu, Peng and Lee, Chankyu and Shoeybi, Mohammad and Catanzaro, Bryan},
   journal={arXiv preprint arXiv:2401.10225},
   year={2024}}