OPEA/DeepSeek-R1-Distill-Llama-70B-int4-gptq-sym-inc

4-bit precision


cicdatopea committed 6 days ago

Commit feaca7b (verified) · 1 parent: f086880

Update README.md

Files changed (1)

README.md +17 -16


@@ -186,8 +186,7 @@ without assuming the other birds fly away, then after
 ### Evaluate the model
-will update later
-<!-- pip3 install lm-eval==0.4.7
 we found lm-eval is very unstable for this model. Please set `add_bos_token=True `to align with the origin model. **Please use autogptq format**
 ```bash
@@ -195,20 +194,22 @@ lm-eval --model hf --model_args pretrained=OPEA/DeepSeek-R1-Distill-Llama-70B-in
 ```
 |           Metric           |   BF16   |      INT4      |
 | :------------------------ | :---------------------- | :--------------- |
-| avg                  | 0.6647 |  0.6639|
-| leaderboard_mmlu_pro | -      | -      |
-| mmlu                 | 0.7964 | 0.7928 |
-| lambada_openai       | 0.6649 | 0.6718 |
-| hellaswag            | 0.6292 | 0.6223 |
-| winogrande           | 0.7482 | 0.7482 |
-| piqa                 | 0.8058 | 0.7982 |
-| truthfulqa_mc1       | 0.3831 | 0.3905 |
-| openbookqa           | 0.3520 | 0.3520 |
-| boolq                | 0.8963 | 0.8972 |
-| arc_easy             | 0.8207 | 0.8194 |
-| arc_challenge        | 0.5503 | 0.5469 |
-| leaderboard_ifeval   | -      | -      |
-| gsm8k                | -      | -      | -->

 ### Evaluate the model
+pip3 install lm-eval==0.4.7
 we found lm-eval is very unstable for this model. Please set `add_bos_token=True `to align with the origin model. **Please use autogptq format**
 ```bash
 ```
 |           Metric           |   BF16   |      INT4      |
 | :------------------------ | :---------------------- | :--------------- |
+| avg                  | 0.6636 | 0.6678 |
+|----------------------|--------|--------|
+| leaderboard_mmlu_pro | 0.4913 | 0.4780 |
+| mmlu                 | 0.7752 | 0.7791 |
+| lambada_openai       | 0.6977 | 0.6996 |
+| hellaswag            | 0.6408 | 0.6438 |
+| winogrande           | 0.7530 | 0.7782 |
+| piqa                 | 0.8112 | 0.8194 |
+| truthfulqa_mc1       | 0.3709 | 0.3721 |
+| openbookqa           | 0.3380 | 0.3600 |
+| boolq                | 0.8847 | 0.8917 |
+| arc_easy             | 0.8131 | 0.8106 |
+| arc_challenge        | 0.5512 | 0.5239 |
+| leaderboard_ifeval   | 0.4421 | 0.4208 |
+| gsm8k                | 0.9295 | 0.9265 |
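
The README does not say how the `avg` row is computed. A quick check, sketched below in Python, suggests it is the unweighted mean of the ten tasks other than `leaderboard_mmlu_pro`, `leaderboard_ifeval`, and `gsm8k`. That exclusion is an assumption inferred from the numbers, not something the commit states; the helper name `column_average` is hypothetical. The scores are copied from the added (`+`) rows above.

```python
# Scores copied from the added (+) table rows: task -> (BF16, INT4).
scores = {
    "leaderboard_mmlu_pro": (0.4913, 0.4780),
    "mmlu": (0.7752, 0.7791),
    "lambada_openai": (0.6977, 0.6996),
    "hellaswag": (0.6408, 0.6438),
    "winogrande": (0.7530, 0.7782),
    "piqa": (0.8112, 0.8194),
    "truthfulqa_mc1": (0.3709, 0.3721),
    "openbookqa": (0.3380, 0.3600),
    "boolq": (0.8847, 0.8917),
    "arc_easy": (0.8131, 0.8106),
    "arc_challenge": (0.5512, 0.5239),
    "leaderboard_ifeval": (0.4421, 0.4208),
    "gsm8k": (0.9295, 0.9265),
}

# Assumption: these tasks are left out of the `avg` row. This is inferred
# from the numbers in the table, not documented in the README.
EXCLUDED = {"leaderboard_mmlu_pro", "leaderboard_ifeval", "gsm8k"}


def column_average(column: int) -> float:
    """Unweighted mean of one column (0 = BF16, 1 = INT4) over the included tasks."""
    values = [pair[column] for task, pair in scores.items() if task not in EXCLUDED]
    return round(sum(values) / len(values), 4)


bf16_avg = column_average(0)
int4_avg = column_average(1)
print(f"BF16 avg: {bf16_avg:.4f}")
print(f"INT4 avg: {int4_avg:.4f}")
```

Under this assumption the script reproduces the reported `avg` values, 0.6636 for BF16 and 0.6678 for INT4. Averaging all thirteen rows instead gives a different figure, so the three excluded scores should be read separately from the `avg` row.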