Update README.md
Browse files
README.md
CHANGED
@@ -54,4 +54,73 @@ generated_ids = [
|
|
54 |
]
|
55 |
|
56 |
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
57 |
```
|
|
|
54 |
]
|
55 |
|
56 |
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
|
57 |
+
```
|
58 |
+
|
59 |
+
## VLLM
|
60 |
+
|
61 |
+
We recommend deploying our model using 4 A100 GPUs. You can start the vLLM server with the following command in the terminal:
|
62 |
+
|
63 |
+
```bash
|
64 |
+
python -m vllm.entrypoints.openai.api_server --served-model-name chemgpt --model path/to/chemgpt --gpu-memory-utilization 0.98 --tensor-parallel-size 4 --port 6000
|
65 |
+
```
|
66 |
+
|
67 |
+
Then, you can use the following code to deploy client-side:
|
68 |
+
|
69 |
+
```python
|
70 |
+
import requests
|
71 |
+
import json
|
72 |
+
|
73 |
+
def general_chemgpt_stream(inputs, history):
    """Stream a chat completion for *inputs* from the local vLLM server.

    Appends the user turn to *history* in place, streams the assistant's
    reply chunk by chunk, and finally appends the full assistant turn to
    *history* as well.

    Args:
        inputs: The user's message text.
        history: Mutable list of OpenAI-style message dicts ({"role", "content"});
            updated in place with both the user and assistant turns.

    Yields:
        str: Incremental chunks of the assistant's reply content.
    """
    # Fixed typo from the original ('loaclhost'), which made every request fail.
    url = 'http://localhost:6000/v1/chat/completions'

    history += [{"role": "user", "content": inputs}]

    headers = {"User-Agent": "vLLM Client"}

    # Must be initialised before the loop: the original referenced it with '+='
    # before any assignment, raising NameError on the first streamed chunk.
    assistant_reply = ""

    # The original also issued a second, non-streaming POST of the same
    # conversation whose response was never read; that dead request is removed.
    pload = {
        "model": "chemgpt",
        "stream": True,
        "messages": history,
    }
    response = requests.post(url,
                             headers=headers,
                             json=pload,
                             stream=True)

    for chunk in response.iter_lines(chunk_size=1,
                                     decode_unicode=False,
                                     delimiter=b"\n"):
        if chunk:
            string_data = chunk.decode("utf-8")
            try:
                # SSE lines look like 'data: {...}' — strip the 6-char prefix.
                json_data = json.loads(string_data[6:])
                delta_content = json_data["choices"][0]["delta"]["content"]
                assistant_reply += delta_content
                yield delta_content
            except KeyError:
                # The first event carries only the role, no "content" key.
                delta_content = json_data["choices"][0]["delta"]["role"]
            except json.JSONDecodeError:
                # Final 'data: [DONE]' sentinel: record the completed reply.
                history += [{
                    "role": "assistant",
                    "content": assistant_reply,
                    "tool_calls": [],
                }]
                delta_content = '[DONE]'
                assert '[DONE]' == chunk.decode("utf-8")[6:]
|
121 |
+
|
122 |
+
# Example: ask about sodium hydroxide and print the reply as it streams.
# Fixed the chemical-formula typo in the prompt ('NaoH' -> 'NaOH').
inputs = '介绍一下NaOH'
history_chem = []
for response_text in general_chemgpt_stream(inputs, history_chem):
    print(response_text, end='')
|
126 |
```
|