Update README.md
README.md
CHANGED
@@ -36,7 +36,7 @@ Results in ConvRAG Bench are as follows:
 | Average (all) | 47.71 | 50.93 | 52.52 | 53.90 | 54.14 | 55.17 | 58.25 |
 | Average (exclude HybriDial) | 46.96 | 51.40 | 52.95 | 54.35 | 53.89 | 53.99 | 57.14 |

-Note that ChatQA-1.5 used some samples from the HybriDial training dataset. To ensure fair comparison, we also compare average scores excluding HybriDial. The data and evaluation scripts for ConvRAG can be found [here](https://huggingface.co/datasets/nvidia/ConvRAG-Bench).
+Note that ChatQA-1.5 is built based on Llama-3 base model, and ChatQA-1.0 is built based on Llama-2 base model. We used some samples from the HybriDial training dataset. To ensure fair comparison, we also compare average scores excluding HybriDial. The data and evaluation scripts for ConvRAG can be found [here](https://huggingface.co/datasets/nvidia/ConvRAG-Bench).


 ## Prompt Format
@@ -63,7 +63,7 @@ This can be applied to the scenario where the whole document can be fitted into
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch

-model_id = "nvidia/ChatQA-1.5-8B"
+model_id = "nvidia/Llama3-ChatQA-1.5-8B"

 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
@@ -104,7 +104,7 @@ print(tokenizer.decode(response, skip_special_tokens=True))
 ```

 ### run retrieval to get top-n chunks as context
-This can be applied to the scenario when the document is very long, so that it is necessary to run retrieval. Here, we use our [Dragon-multiturn](https://huggingface.co/nvidia/dragon-multiturn-query-encoder) retriever which can handle conversational queries. In addition, we provide a few [documents](https://huggingface.co/nvidia/ChatQA-1.5-8B/tree/main/docs) for users to play with.
+This can be applied to the scenario when the document is very long, so that it is necessary to run retrieval. Here, we use our [Dragon-multiturn](https://huggingface.co/nvidia/dragon-multiturn-query-encoder) retriever which can handle conversational queries. In addition, we provide a few [documents](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B/tree/main/docs) for users to play with.

 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModel
@@ -112,7 +112,7 @@ import torch
 import json

 ## load ChatQA-1.5 tokenizer and model
-model_id = "nvidia/ChatQA-1.5-8B"
+model_id = "nvidia/Llama3-ChatQA-1.5-8B"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
