Update README.md
README.md CHANGED
tags:
- tensorRT
- Belle
---

## Model Card for lyraBelle

lyraBelle is currently the **fastest BELLE model** available. To the best of our knowledge, it is the **first accelerated version of Belle**.

The inference speed of lyraBelle has achieved a **10x** speedup over the original version, and we are still working hard to improve the performance further.

Among its main features are:

- device: Nvidia Ampere architecture or newer (e.g. A100)
- batch_size: compiled with dynamic batch size, max batch_size = 8

Note that:

**Some interfaces/code are reserved for future use (see the demo below).**

- **int8 mode**: not supported yet; please always set it to 0
- **data type**: only `fp16` is available.
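Since the engine is compiled with max batch_size = 8, a longer prompt list has to be fed to the model in slices. A minimal sketch of that bookkeeping (the `chunked` helper is illustrative, not part of the lyraBelle API):

```python
# Slice a prompt list into batches that respect the compiled limit of 8.
# `chunked` is an illustrative helper, not part of the lyraBelle API.
MAX_BATCH_SIZE = 8

def chunked(prompts, size=MAX_BATCH_SIZE):
    """Yield successive slices of at most `size` prompts."""
    for i in range(0, len(prompts), size):
        yield prompts[i:i + size]

batches = list(chunked([f"prompt {i}" for i in range(20)]))
print([len(b) for b in batches])  # → [8, 8, 4]
```

Each slice can then be passed to the model's generate call in turn.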

## Speed

### test environment

- **Repository:** [https://huggingface.co/BelleGroup/BELLE-7B-2M?clone=true]

## Environment

- **docker image available** at [https://hub.docker.com/repository/docker/bigmoyan/lyrallm/general], pull the image with:

```
docker pull bigmoyan/lyrallm:v0.1
```

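To work inside the pulled image, it would typically be started with GPU access and a local model directory mounted; a hypothetical invocation (the mount path and container name are illustrative choices, not prescribed by the image):

```shell
# Start the image with all GPUs visible and a local model directory mounted.
# /workspace/model and the container name "lyrabelle" are illustrative choices.
docker run --gpus all -it --rm \
  --name lyrabelle \
  -v "$(pwd)/model:/workspace/model" \
  bigmoyan/lyrallm:v0.1
```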
## Uses

```
model_dir = "./model"
model_name = "1-gpu-fp16.h5"
max_output_length = 512

# int8 mode is not supported yet; data_type only supports fp16
model = LyraBelle(model_dir, model_name, data_type, 0)
output_texts = model.generate(prompts, output_length=max_output_length, top_k=30, top_p=0.85, temperature=0.35, repetition_penalty=1.2, do_sample=True)
print(output_texts)
```
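For readers unfamiliar with the sampling knobs passed to `generate`: `top_k` keeps only the 30 most likely tokens, and `top_p` then keeps the smallest subset of those covering 85% of the probability mass before sampling. A generic sketch of that filtering (plain top-k/nucleus sampling, not lyraBelle's internal code):

```python
import math

def top_k_top_p_filter(logits, top_k=30, top_p=0.85):
    """Return the token indices kept by top-k then nucleus (top-p) filtering."""
    # Sort token indices by logit, highest first, and truncate to the top_k best.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Softmax over the surviving logits (max-subtracted for numerical stability).
    m = max(logits[i] for i in order)
    exps = [math.exp(logits[i] - m) for i in order]
    total = sum(exps)
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cum = [], 0.0
    for idx, e in zip(order, exps):
        kept.append(idx)
        cum += e / total
        if cum >= top_p:
            break
    return kept

# A peaked distribution: one token dominates, so the nucleus collapses to it.
print(top_k_top_p_filter([5.0, 1.0, 1.0, 0.5], top_k=3, top_p=0.85))  # → [0]
```

`temperature` flattens or sharpens the distribution before this filtering is applied, and `repetition_penalty` down-weights tokens that were already generated.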