TMElyralab/lyraChatGLM

bigmoyan committed on May 17, 2023

Commit 060055d · 1 Parent(s): b977cd1

Update README.md

Files changed (1)

README.md +5 -0

@@ -26,6 +26,11 @@ Among its main features are:
 - device: Nvidia A100 40G
 - batch size: 8
+**Because early versions of ChatGLM did not support batch inference, `original` in the table below is measured at batch_size=1.**
+**According to [this discussion](https://huggingface.co/TMElyralab/lyraChatGLM/discussions/6), this bug has been fixed, and the speed at batch_size=8 reaches up to 137 tokens/s.**
 |version|speed|
 |:-:|:-:|
 |original|30 tokens/s|