czczup committed on
Commit bba800e · verified · 1 Parent(s): ef43f6e

Upload folder using huggingface_hub

  1. README.md +20 -17
README.md CHANGED
@@ -36,39 +36,38 @@ This article comprises the following sections:
36
 
37
  ## Inference
38
 
39
- To deploy InternVL2, first configure the chat template by creating the following JSON file, `chat_template.json`.
40
-
41
- ```json
42
- {
43
- "model_name":"internvl-internlm2",
44
- "meta_instruction":"我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。",
45
- "stop_words":["<|im_start|>", "<|im_end|>"]
46
- }
47
- ```
48
-
49
  The following code performs batched offline inference with the quantized model:
50
 
51
  ```python
52
- from lmdeploy import pipeline
53
- from lmdeploy.model import ChatTemplateConfig
54
- from lmdeploy.messages import TurbomindEngineConfig
55
  from lmdeploy.vl import load_image
56
 
57
  model = 'OpenGVLab/InternVL2-2B-AWQ'
58
- chat_template_config = ChatTemplateConfig.from_json('chat_template.json')
59
  image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
 
 
60
  backend_config = TurbomindEngineConfig(model_format='awq')
61
  pipe = pipeline(model, chat_template_config=chat_template_config,
62
- backend_config=backend_config,
63
- log_level='INFO')
64
  response = pipe(('describe this image', image))
65
- print(response)
66
  ```
67
 
68
  For more information about the pipeline parameters, please refer to [here](https://github.com/InternLM/lmdeploy/blob/main/docs/en/inference/pipeline.md).
69
 
70
  ## Service
71
 
72
  LMDeploy's `api_server` lets you package a model as a service with a single command. The RESTful APIs it provides are compatible with OpenAI's interfaces. Below is an example of service startup.
73
 
74
  ```shell
@@ -77,6 +76,10 @@ lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-A
77
 
78
  To use the OpenAI-style interface, you need to install the `openai` Python package:
79
 
80
  Then, use the code below to make the API call:
81
 
82
  ```python
 
36
 
37
  ## Inference
38
 
39
  The following code performs batched offline inference with the quantized model:
40
 
41
  ```python
42
+ from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
 
 
43
  from lmdeploy.vl import load_image
44
 
45
  model = 'OpenGVLab/InternVL2-2B-AWQ'
46
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
47
  image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
48
+ chat_template_config = ChatTemplateConfig('internvl-internlm2')
49
+ chat_template_config.meta_instruction = system_prompt
50
  backend_config = TurbomindEngineConfig(model_format='awq')
51
  pipe = pipeline(model, chat_template_config=chat_template_config,
52
+ backend_config=backend_config)
 
53
  response = pipe(('describe this image', image))
54
+ print(response.text)
55
  ```
56
 
57
  For more information about the pipeline parameters, please refer to [here](https://github.com/InternLM/lmdeploy/blob/main/docs/en/inference/pipeline.md).
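
The README snippet sends a single `(prompt, image)` pair even though the text describes batched inference. A minimal sketch of a true batch follows, assuming the pipeline accepts a list of `(prompt, image)` tuples and returns one response with a `.text` attribute per tuple, in order. `build_batch` and `describe_images` are hypothetical helper names, not part of LMDeploy.

```python
from typing import Any, Iterable, List, Tuple


def build_batch(prompt: str, images: Iterable[Any]) -> List[Tuple[str, Any]]:
    """Pair one text prompt with every image: one (prompt, image) tuple per request."""
    return [(prompt, image) for image in images]


def describe_images(pipe: Any, images: Iterable[Any],
                    prompt: str = 'describe this image') -> List[str]:
    """Send all requests in one call and collect the generated text.

    Assumption: `pipe` takes a list of (prompt, image) tuples and returns
    one response object exposing `.text` per tuple, in the same order.
    """
    responses = pipe(build_batch(prompt, images))
    return [response.text for response in responses]
```

With the `pipe` and `load_image` from the snippet above, a call might look like `describe_images(pipe, [load_image(url_a), load_image(url_b)])`, where `url_a` and `url_b` are placeholder image URLs.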
58
 
59
  ## Service
60
 
61
+ To deploy InternVL2 as an API, first configure the chat template by creating the following JSON file, `chat_template.json`.
62
+
63
+ ```json
64
+ {
65
+ "model_name":"internvl-internlm2",
66
+ "meta_instruction":"我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。",
67
+ "stop_words":["<|im_start|>", "<|im_end|>"]
68
+ }
69
+ ```
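
If you would rather generate `chat_template.json` from a script than write it by hand, the sketch below produces a file with the same three keys using only the standard library. `write_chat_template` is a hypothetical helper name, and the values are copied from the JSON above.

```python
import json
from pathlib import Path


def write_chat_template(path, model_name, meta_instruction, stop_words):
    """Write a chat template config with the three keys shown in the README."""
    config = {
        'model_name': model_name,
        'meta_instruction': meta_instruction,
        'stop_words': list(stop_words),
    }
    # ensure_ascii=False keeps the Chinese system prompt readable in the file.
    Path(path).write_text(
        json.dumps(config, ensure_ascii=False, indent=2), encoding='utf-8')
    return config


if __name__ == '__main__':
    write_chat_template(
        'chat_template.json',
        model_name='internvl-internlm2',
        meta_instruction='我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。',
        stop_words=['<|im_start|>', '<|im_end|>'],
    )
```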
70
+
71
  LMDeploy's `api_server` lets you package a model as a service with a single command. The RESTful APIs it provides are compatible with OpenAI's interfaces. Below is an example of service startup.
72
 
73
  ```shell
 
76
 
77
  To use the OpenAI-style interface, you need to install the `openai` Python package:
78
 
79
+ ```shell
80
+ pip install openai
81
+ ```
82
+
83
  Then, use the code below to make the API call:
84
 
85
  ```python