# Model Card for Faro-Yi-9B-DPO
This is the DPO version of [wenbopan/Faro-Yi-9B](https://huggingface.co/wenbopan/Faro-Yi-9B). Compared to Faro-Yi-9B and [Yi-9B-200K](https://huggingface.co/01-ai/Yi-9B-200K), the DPO model excels at many tasks, surpassing the original Yi-9B-200K by a large margin.
| **Metric** | **MMLU** | **GSM8K** | **hellaswag** | **truthfulqa** | **ai2_arc** | **winogrande** | **CMMLU** |
| --------------- | --------- | --------- | ------------- | -------------- | ----------- | -------------- | --------- |
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd3a3691d27e60db0698b0/Oa9QSbXgaYVekrYfgfaiC.png)
## How to Use
Faro-Yi-9B-DPO uses the chatml template and performs well in both short and long contexts. For longer inputs under **24GB of VRAM**, I recommend using vLLM with a maximum prompt length of 32K. Setting `kv_cache_dtype="fp8_e5m2"` allows a 48K input length, and 4-bit AWQ quantization on top of that can boost the input length to 160K, albeit with some performance impact. Adjust the `max_model_len` argument in vLLM or in `config.json` to avoid OOM.
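When calling vLLM's raw `generate` API, the chatml template has to be rendered by hand. Below is a minimal sketch; the `build_chatml_prompt` helper is my own illustration (not part of the model's code), and the commented vLLM settings mirror the VRAM guidance above:

```python
def build_chatml_prompt(messages):
    """Render [{"role": ..., "content": ...}] messages in the chatml
    format the model expects, ending with an open assistant turn."""
    rendered = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    return rendered + "<|im_start|>assistant\n"


# With vLLM on a GPU, the prompt above can be fed to the model like this:
# from vllm import LLM, SamplingParams
# llm = LLM(
#     model="wenbopan/Faro-Yi-9B-DPO",
#     max_model_len=32 * 1024,       # ~32K prompt fits under 24GB of VRAM
#     # kv_cache_dtype="fp8_e5m2",   # stretches context toward 48K
# )
# out = llm.generate(
#     [build_chatml_prompt([{"role": "user", "content": "Hello!"}])],
#     SamplingParams(max_tokens=256, stop=["<|im_end|>"]),
# )
# print(out[0].outputs[0].text)

print(build_chatml_prompt([{"role": "user", "content": "Hi!"}]))
```

The `stop=["<|im_end|>"]` sampling setting keeps generation from running past the assistant turn, since chatml delimits every message with that token.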