Update README.md
README.md
CHANGED
@@ -26,6 +26,11 @@ Among its main features are:
 - device: Nvidia A100 40G
 - batch size: 8
 
+**Since the early chatGLM version doesn't support batch inference, `original` in the table below is measured at batch_size=1**
+
+
+**According to [this discussion](https://huggingface.co/TMElyralab/lyraChatGLM/discussions/6), this bug has been fixed and the speed at batch_size=8 reaches up to 137 tokens/s**
+
 |version|speed|
 |:-:|:-:|
 |original|30 tokens/s|