Update README.md
README.md
CHANGED
@@ -33,8 +33,8 @@ llama-3-8b-instruct-262k-chinese is based on [Llama-3-8B-Instruct-262k](https://hugging
 
 Quantization | Peak Usage for Encoding 2048 Tokens | Peak Usage for Generating 8192 Tokens
 -- | -- | --
-FP16/BF16 |
-Int4 |
+FP16/BF16 | 18.66GB | 24.58GB
+Int4 | 9.21GB | 14.62GB
 
 
 Disadvantages:
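
To put the figures added in this change into perspective, the following is a minimal sketch that computes how much lower the Int4 peak usage is relative to FP16/BF16. The numbers are copied from the table rows in the diff; the dictionary keys (`encode_2048`, `generate_8192`) and the function names are hypothetical labels chosen for illustration and are not part of the README.

```python
# Peak usage figures (GB) copied from the table rows added in this commit.
# The key names below are illustrative labels, not identifiers from the README.
PEAK_GB = {
    "FP16/BF16": {"encode_2048": 18.66, "generate_8192": 24.58},
    "Int4": {"encode_2048": 9.21, "generate_8192": 14.62},
}


def savings_percent(baseline_gb: float, quantized_gb: float) -> float:
    """Return the percentage reduction in peak usage relative to the baseline."""
    return (baseline_gb - quantized_gb) / baseline_gb * 100.0


def int4_savings() -> dict:
    """Compare Int4 against FP16/BF16 for each workload, rounded to 0.1%."""
    base, quant = PEAK_GB["FP16/BF16"], PEAK_GB["Int4"]
    return {k: round(savings_percent(base[k], quant[k]), 1) for k in base}


if __name__ == "__main__":
    for workload, pct in int4_savings().items():
        print(f"{workload}: Int4 peak usage is {pct}% lower than FP16/BF16")
```

With the values from the table, Int4 comes out roughly 50.6% lower when encoding 2048 tokens and roughly 40.5% lower when generating 8192 tokens.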