czczup committed · Commit 65abe2e · verified · 1 Parent(s): bcc0556

Upload folder using huggingface_hub

README.md CHANGED
@@ -11,7 +11,7 @@ pipeline_tag: image-text-to-text
11
 
12
  ## Introduction
13
 
14
- We are excited to announce the release of InternVL 2.0, the latest addition to the InternVL series of multimodal large language models. InternVL 2.0 features a variety of **instruction-tuned models**, ranging from 2 billion to 108 billion parameters. This repository contains the instruction-tuned InternVL2-4B model.
15
 
16
  Compared to the state-of-the-art open-source multimodal large language models, InternVL 2.0 surpasses most open-source models. It demonstrates competitive performance on par with proprietary commercial models across various capabilities, including document and chart comprehension, infographics QA, scene text understanding and OCR tasks, scientific and mathematical problem solving, as well as cultural understanding and integrated multimodal capabilities.
17
 
@@ -60,8 +60,8 @@ InternVL 2.0 is a multimodal large language model series, featuring models of va
60
  | Model Size | 4B | 7B | 4.2B | 4.2B |
61
  | | | | | |
62
  | MVBench | 55.1 | 60.4 | 46.9 | 63.7 |
63
- | Video-MME<br>wo subs | - | 42.3 | TBD | TBD |
64
- | Video-MME<br>w/ subs | - | 54.6 | TBD | TBD |
65
 
66
  - We evaluate our models on MVBench by extracting 16 frames from each video and resizing each frame to a 448x448 image.
67
 
@@ -71,6 +71,8 @@ Limitations: Although we have made efforts to ensure the safety of the model dur
71
 
72
  We provide example code to run InternVL2-4B using `transformers`.
73
 
74
  > Please use transformers==4.37.2 to ensure the model works normally.
75
 
76
  ```python
@@ -330,7 +332,7 @@ from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
330
  from lmdeploy.vl import load_image
331
 
332
  model = 'OpenGVLab/InternVL2-4B'
333
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
334
  image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
335
  chat_template_config = ChatTemplateConfig('internvl-phi3')
336
  chat_template_config.meta_instruction = system_prompt
@@ -346,13 +348,15 @@ If `ImportError` occurs while executing this case, please install the required d
346
 
347
  When dealing with multiple images, you can put them all in one list. Keep in mind that multiple images will lead to a higher number of input tokens, and as a result, the size of the context window typically needs to be increased.
348
 
349
  ```python
350
  from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
351
  from lmdeploy.vl import load_image
352
  from lmdeploy.vl.constants import IMAGE_TOKEN
353
 
354
  model = 'OpenGVLab/InternVL2-4B'
355
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
356
  chat_template_config = ChatTemplateConfig('internvl-phi3')
357
  chat_template_config.meta_instruction = system_prompt
358
  pipe = pipeline(model, chat_template_config=chat_template_config,
@@ -378,7 +382,7 @@ from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
378
  from lmdeploy.vl import load_image
379
 
380
  model = 'OpenGVLab/InternVL2-4B'
381
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
382
  chat_template_config = ChatTemplateConfig('internvl-phi3')
383
  chat_template_config.meta_instruction = system_prompt
384
  pipe = pipeline(model, chat_template_config=chat_template_config,
@@ -402,7 +406,7 @@ from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig, Generati
402
  from lmdeploy.vl import load_image
403
 
404
  model = 'OpenGVLab/InternVL2-4B'
405
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
406
  chat_template_config = ChatTemplateConfig('internvl-phi3')
407
  chat_template_config.meta_instruction = system_prompt
408
  pipe = pipeline(model, chat_template_config=chat_template_config,
@@ -418,29 +422,63 @@ print(sess.response.text)
418
 
419
  #### Service
420
 
421
- For lmdeploy v0.5.0, please configure the chat template config first. Create the following JSON file `chat_template.json`.
422
 
423
  ```json
424
  {
425
- "model_name":"internlm2",
426
- "meta_instruction":"我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。",
427
- "stop_words":["<|im_start|>", "<|im_end|>"]
428
  }
429
  ```
430
 
431
  LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of service startup:
432
 
433
  ```shell
434
- lmdeploy serve api_server OpenGVLab/InternVL2-4B --backend pytorch --chat-template chat_template.json
435
  ```
436
 
437
- The default port of `api_server` is `23333`. After the server is launched, you can communicate with server on terminal through `api_client`:
438
 
439
  ```shell
440
- lmdeploy serve api_client http://0.0.0.0:23333
441
  ```
442
 
443
- You can overview and try out `api_server` APIs online by swagger UI at `http://0.0.0.0:23333`, or you can also read the API specification from [here](https://github.com/InternLM/lmdeploy/blob/main/docs/en/serving/restful_api.md).
444
 
445
  ## License
446
 
@@ -467,7 +505,7 @@ If you find this project useful in your research, please consider citing:
467
 
468
  ## 简介
469
 
470
- 我们很高兴宣布 InternVL 2.0 的发布,这是 InternVL 系列多模态大语言模型的最新版本。InternVL 2.0 提供了多种**指令微调**的模型,参数从 20 亿到 1080 亿不等。此仓库包含经过指令微调的 InternVL2-4B 模型。
471
 
472
  与最先进的开源多模态大语言模型相比,InternVL 2.0 超越了大多数开源模型。它在各种能力上表现出与闭源商业模型相媲美的竞争力,包括文档和图表理解、信息图表问答、场景文本理解和 OCR 任务、科学和数学问题解决,以及文化理解和综合多模态能力。
473
 
@@ -516,8 +554,8 @@ InternVL 2.0 是一个多模态大语言模型系列,包含各种规模的模
516
  | 模型大小 | 4B | 7B | 4.2B | 4.2B |
517
  | | | | | |
518
  | MVBench | 55.1 | 60.4 | 46.9 | 63.7 |
519
- | Video-MME<br>wo subs | - | 42.3 | TBD | TBD |
520
- | Video-MME<br>w/ subs | - | 54.6 | TBD | TBD |
521
 
522
  - 我们通过从每个视频中提取16帧来评估我们的模型在MVBench上的性能,每个视频帧被调整为448x448的图像。
523
 
@@ -527,6 +565,8 @@ InternVL 2.0 是一个多模态大语言模型系列,包含各种规模的模
527
 
528
  我们提供了一个示例代码,用于使用 `transformers` 运行 InternVL2-4B。
529
 
530
  > 请使用 transformers==4.37.2 以确保模型正常运行。
531
 
532
  示例代码请[点击这里](#quick-start)。
@@ -550,7 +590,7 @@ from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
550
  from lmdeploy.vl import load_image
551
 
552
  model = 'OpenGVLab/InternVL2-4B'
553
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
554
  image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
555
  chat_template_config = ChatTemplateConfig('internvl-phi3')
556
  chat_template_config.meta_instruction = system_prompt
@@ -572,7 +612,7 @@ from lmdeploy.vl import load_image
572
  from lmdeploy.vl.constants import IMAGE_TOKEN
573
 
574
  model = 'OpenGVLab/InternVL2-4B'
575
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
576
  chat_template_config = ChatTemplateConfig('internvl-phi3')
577
  chat_template_config.meta_instruction = system_prompt
578
  pipe = pipeline(model, chat_template_config=chat_template_config,
@@ -597,7 +637,7 @@ from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
597
  from lmdeploy.vl import load_image
598
 
599
  model = 'OpenGVLab/InternVL2-4B'
600
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
601
  chat_template_config = ChatTemplateConfig('internvl-phi3')
602
  chat_template_config.meta_instruction = system_prompt
603
  pipe = pipeline(model, chat_template_config=chat_template_config,
@@ -621,7 +661,7 @@ from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig, Generati
621
  from lmdeploy.vl import load_image
622
 
623
  model = 'OpenGVLab/InternVL2-4B'
624
- system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。'
625
  chat_template_config = ChatTemplateConfig('internvl-phi3')
626
  chat_template_config.meta_instruction = system_prompt
627
  pipe = pipeline(model, chat_template_config=chat_template_config,
@@ -637,29 +677,63 @@ print(sess.response.text)
637
 
638
  #### API部署
639
 
640
- 对于 lmdeploy v0.5.0,请先配置聊天模板配置文件。创建如下的 JSON 文件 `chat_template.json`。
641
 
642
  ```json
643
  {
644
- "model_name":"internlm2",
645
- "meta_instruction":"我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。",
646
- "stop_words":["<|im_start|>", "<|im_end|>"]
647
  }
648
  ```
649
 
650
  LMDeploy 的 `api_server` 使模型能够通过一个命令轻松打包成服务。提供的 RESTful API 与 OpenAI 的接口兼容。以下是服务启动的示例:
651
 
652
  ```shell
653
- lmdeploy serve api_server OpenGVLab/InternVL2-4B --backend pytorch --chat-template chat_template.json
654
  ```
655
 
656
- `api_server` 的默认端口是 `23333`。服务器启动后,你可以通过 `api_client` 在终端与服务器通信:
657
 
658
  ```shell
659
- lmdeploy serve api_client http://0.0.0.0:23333
660
  ```
661
 
662
- 你可以通过 `http://0.0.0.0:23333` 的 swagger UI 在线查看和试用 `api_server` 的 API,也可以从 [这里](https://github.com/InternLM/lmdeploy/blob/main/docs/en/serving/restful_api.md) 阅读 API 规范。
663
 
664
  ## 开源许可证
665
 
 
11
 
12
  ## Introduction
13
 
14
+ We are excited to announce the release of InternVL 2.0, the latest addition to the InternVL series of multimodal large language models. InternVL 2.0 features a variety of **instruction-tuned models**, ranging from 1 billion to 108 billion parameters. This repository contains the instruction-tuned InternVL2-4B model.
15
 
16
  Compared to the state-of-the-art open-source multimodal large language models, InternVL 2.0 surpasses most open-source models. It demonstrates competitive performance on par with proprietary commercial models across various capabilities, including document and chart comprehension, infographics QA, scene text understanding and OCR tasks, scientific and mathematical problem solving, as well as cultural understanding and integrated multimodal capabilities.
17
 
 
60
  | Model Size | 4B | 7B | 4.2B | 4.2B |
61
  | | | | | |
62
  | MVBench | 55.1 | 60.4 | 46.9 | 63.7 |
63
+ | Video-MME<br>wo subs | - | 42.3 | TODO | TODO |
64
+ | Video-MME<br>w/ subs | - | 54.6 | TODO | TODO |
65
 
66
  - We evaluate our models on MVBench by extracting 16 frames from each video and resizing each frame to a 448x448 image.
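
The README does not say how the 16 frames are chosen. As a minimal sketch, assuming uniform sampling (one frame from the middle of each of 16 equal-length segments), the index selection could look like the following. The function name `sample_frame_indices` is hypothetical and not part of the InternVL code base, and the 448x448 resizing step is left out.

```python
from typing import List


def sample_frame_indices(num_frames: int, num_segments: int = 16) -> List[int]:
    """Pick one frame index from the middle of each of `num_segments` equal spans.

    Assumption: uniform midpoint sampling; the actual evaluation script may differ.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    segment = num_frames / num_segments
    # Midpoint of each segment, clamped to the last valid frame index.
    return [
        min(int(segment * i + segment / 2), num_frames - 1)
        for i in range(num_segments)
    ]


indices = sample_frame_indices(num_frames=160)
print(indices)
```

For videos with fewer frames than segments, this sketch simply repeats indices rather than failing.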
67
 
 
71
 
72
  We provide example code to run InternVL2-4B using `transformers`.
73
 
74
+ We also welcome you to experience the InternVL2 series models in our [online demo](https://internvl.opengvlab.com/). Currently, due to the limited GPU resources with public IP addresses, we can only deploy models up to a maximum of 26B. We will expand soon and deploy larger models to the online demo.
75
+
76
  > Please use transformers==4.37.2 to ensure the model works normally.
77
 
78
  ```python
 
332
  from lmdeploy.vl import load_image
333
 
334
  model = 'OpenGVLab/InternVL2-4B'
335
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
336
  image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
337
  chat_template_config = ChatTemplateConfig('internvl-phi3')
338
  chat_template_config.meta_instruction = system_prompt
 
348
 
349
  When dealing with multiple images, you can put them all in one list. Keep in mind that multiple images will lead to a higher number of input tokens, and as a result, the size of the context window typically needs to be increased.
350
 
351
+ > Warning: Due to the scarcity of multi-image conversation data, the performance on multi-image tasks may be unstable, and it may require multiple attempts to achieve satisfactory results.
352
+
353
  ```python
354
  from lmdeploy import pipeline, PytorchEngineConfig, ChatTemplateConfig
355
  from lmdeploy.vl import load_image
356
  from lmdeploy.vl.constants import IMAGE_TOKEN
357
 
358
  model = 'OpenGVLab/InternVL2-4B'
359
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
360
  chat_template_config = ChatTemplateConfig('internvl-phi3')
361
  chat_template_config.meta_instruction = system_prompt
362
  pipe = pipeline(model, chat_template_config=chat_template_config,
 
382
  from lmdeploy.vl import load_image
383
 
384
  model = 'OpenGVLab/InternVL2-4B'
385
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
386
  chat_template_config = ChatTemplateConfig('internvl-phi3')
387
  chat_template_config.meta_instruction = system_prompt
388
  pipe = pipeline(model, chat_template_config=chat_template_config,
 
406
  from lmdeploy.vl import load_image
407
 
408
  model = 'OpenGVLab/InternVL2-4B'
409
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
410
  chat_template_config = ChatTemplateConfig('internvl-phi3')
411
  chat_template_config.meta_instruction = system_prompt
412
  pipe = pipeline(model, chat_template_config=chat_template_config,
 
422
 
423
  #### Service
424
 
425
+ To deploy InternVL2 as an API, please configure the chat template config first. Create the following JSON file `chat_template.json`.
426
 
427
  ```json
428
  {
429
+ "model_name":"internlm2-phi3",
430
+ "meta_instruction":"我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。",
431
+ "stop_words":["<|end|>"]
432
  }
433
  ```
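
If you prefer to generate the file from a script instead of writing it by hand, a sketch like the one below may help. It uses only the standard `json` module and the same three keys shown in the JSON above; the helper name `write_chat_template` is an assumption made for illustration. Passing `ensure_ascii=False` keeps the Chinese system prompt readable in the file instead of escaping it.

```python
import json

# Same keys and values as the chat_template.json shown in the README.
chat_template = {
    "model_name": "internlm2-phi3",
    "meta_instruction": "我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。",
    "stop_words": ["<|end|>"],
}


def write_chat_template(path: str = "chat_template.json") -> None:
    """Write the template as UTF-8 JSON (hypothetical helper, not part of LMDeploy)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(chat_template, f, ensure_ascii=False, indent=2)


write_chat_template()

# Read the file back to confirm it is valid JSON.
with open("chat_template.json", encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded["model_name"])
```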
434
 
435
  LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of service startup:
436
 
437
  ```shell
438
+ lmdeploy serve api_server OpenGVLab/InternVL2-4B --model-name InternVL2-4B --backend pytorch --server-port 23333 --chat-template chat_template.json
439
  ```
440
 
441
+ To use the OpenAI-style interface, you need to install OpenAI:
442
 
443
  ```shell
444
+ pip install openai
445
+ ```
446
+
447
+ Then, use the code below to make the API call:
448
+
449
+ ```python
450
+ from openai import OpenAI
451
+
452
+ client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')
453
+ model_name = client.models.list().data[0].id
454
+ response = client.chat.completions.create(
455
+ model="InternVL2-4B",
456
+ messages=[{
457
+ 'role':
458
+ 'user',
459
+ 'content': [{
460
+ 'type': 'text',
461
+ 'text': 'describe this image',
462
+ }, {
463
+ 'type': 'image_url',
464
+ 'image_url': {
465
+ 'url':
466
+ 'https://modelscope.oss-cn-beijing.aliyuncs.com/resource/tiger.jpeg',
467
+ },
468
+ }],
469
+ }],
470
+ temperature=0.8,
471
+ top_p=0.8)
472
+ print(response)
473
  ```
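
The nested `messages` structure in the call above is easy to get wrong when written inline. As a small sketch, it can be built by a helper; `build_image_message` is a hypothetical name introduced here, not something provided by `openai` or LMDeploy. The block runs without a server because it only constructs the payload.

```python
from typing import Any, Dict, List


def build_image_message(prompt: str, image_url: str) -> List[Dict[str, Any]]:
    """Build a single-turn user message containing a text part and an image URL part."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }]


messages = build_image_message(
    "describe this image",
    "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/tiger.jpeg",
)
print(messages)
```

The result can be passed as `messages=messages` to `client.chat.completions.create(...)`. Passing `model=model_name` (the id already fetched from `client.models.list()`) instead of a hard-coded string should also work, assuming the server was started with a single model.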
474
 
475
+ ### vLLM
476
+
477
+ TODO
478
+
479
+ ### Ollama
480
+
481
+ TODO
482
 
483
  ## License
484
 
 
505
 
506
  ## 简介
507
 
508
+ 我们很高兴宣布 InternVL 2.0 的发布,这是 InternVL 系列多模态大语言模型的最新版本。InternVL 2.0 提供了多种**指令微调**的模型,参数从 10 亿到 1080 亿不等。此仓库包含经过指令微调的 InternVL2-4B 模型。
509
 
510
  与最先进的开源多模态大语言模型相比,InternVL 2.0 超越了大多数开源模型。它在各种能力上表现出与闭源商业模型相媲美的竞争力,包括文档和图表理解、信息图表问答、场景文本理解和 OCR 任务、科学和数学问题解决,以及文化理解和综合多模态能力。
511
 
 
554
  | 模型大小 | 4B | 7B | 4.2B | 4.2B |
555
  | | | | | |
556
  | MVBench | 55.1 | 60.4 | 46.9 | 63.7 |
557
+ | Video-MME<br>wo subs | - | 42.3 | TODO | TODO |
558
+ | Video-MME<br>w/ subs | - | 54.6 | TODO | TODO |
559
 
560
  - 我们通过从每个视频中提取16帧来评估我们的模型在MVBench上的性能,每个视频帧被调整为448x448的图像。
561
 
 
565
 
566
  我们提供了一个示例代码,用于使用 `transformers` 运行 InternVL2-4B。
567
 
568
+ 我们也欢迎你在我们的[在线demo](https://internvl.opengvlab.com/)中体验InternVL2的系列模型。目前,由于具备公网IP地址的GPU资源有限,我们目前只能部署最大到26B的模型。我们会在不久之后进行扩容,把更大的模型部署到在线demo上,敬请期待。
569
+
570
  > 请使用 transformers==4.37.2 以确保模型正常运行。
571
 
572
  示例代码请[点击这里](#quick-start)。
 
590
  from lmdeploy.vl import load_image
591
 
592
  model = 'OpenGVLab/InternVL2-4B'
593
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
594
  image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
595
  chat_template_config = ChatTemplateConfig('internvl-phi3')
596
  chat_template_config.meta_instruction = system_prompt
 
612
  from lmdeploy.vl.constants import IMAGE_TOKEN
613
 
614
  model = 'OpenGVLab/InternVL2-4B'
615
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
616
  chat_template_config = ChatTemplateConfig('internvl-phi3')
617
  chat_template_config.meta_instruction = system_prompt
618
  pipe = pipeline(model, chat_template_config=chat_template_config,
 
637
  from lmdeploy.vl import load_image
638
 
639
  model = 'OpenGVLab/InternVL2-4B'
640
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
641
  chat_template_config = ChatTemplateConfig('internvl-phi3')
642
  chat_template_config.meta_instruction = system_prompt
643
  pipe = pipeline(model, chat_template_config=chat_template_config,
 
661
  from lmdeploy.vl import load_image
662
 
663
  model = 'OpenGVLab/InternVL2-4B'
664
+ system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
665
  chat_template_config = ChatTemplateConfig('internvl-phi3')
666
  chat_template_config.meta_instruction = system_prompt
667
  pipe = pipeline(model, chat_template_config=chat_template_config,
 
677
 
678
  #### API部署
679
 
680
+ 为了将InternVL2部署成API,请先配置聊天模板配置文件。创建如下的 JSON 文件 `chat_template.json`。
681
 
682
  ```json
683
  {
684
+ "model_name":"internlm2-phi3",
685
+ "meta_instruction":"我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。",
686
+ "stop_words":["<|end|>"]
687
  }
688
  ```
689
 
690
  LMDeploy 的 `api_server` 使模型能够通过一个命令轻松打包成服务。提供的 RESTful API 与 OpenAI 的接口兼容。以下是服务启动的示例:
691
 
692
  ```shell
693
+ lmdeploy serve api_server OpenGVLab/InternVL2-4B --model-name InternVL2-4B --backend pytorch --server-port 23333 --chat-template chat_template.json
694
  ```
695
 
696
+ 为了使用OpenAI风格的API接口,您需要安装OpenAI:
697
 
698
  ```shell
699
+ pip install openai
700
  ```
701
 
702
+ 然后,使用下面的代码进行API调用:
703
+
704
+ ```python
705
+ from openai import OpenAI
706
+
707
+ client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')
708
+ model_name = client.models.list().data[0].id
709
+ response = client.chat.completions.create(
710
+ model="InternVL2-4B",
711
+ messages=[{
712
+ 'role':
713
+ 'user',
714
+ 'content': [{
715
+ 'type': 'text',
716
+ 'text': 'describe this image',
717
+ }, {
718
+ 'type': 'image_url',
719
+ 'image_url': {
720
+ 'url':
721
+ 'https://modelscope.oss-cn-beijing.aliyuncs.com/resource/tiger.jpeg',
722
+ },
723
+ }],
724
+ }],
725
+ temperature=0.8,
726
+ top_p=0.8)
727
+ print(response)
728
+ ```
729
+
730
+ ### vLLM
731
+
732
+ TODO
733
+
734
+ ### Ollama
735
+
736
+ TODO
737
 
738
  ## 开源许可证
739
 
config.json CHANGED
@@ -193,7 +193,7 @@
193
  "tie_word_embeddings": false,
194
  "tokenizer_class": null,
195
  "top_k": 50,
196
- "top_p": null,
197
  "torch_dtype": "bfloat16",
198
  "torchscript": false,
199
  "transformers_version": "4.37.2",
 
193
  "tie_word_embeddings": false,
194
  "tokenizer_class": null,
195
  "top_k": 50,
196
+ "top_p": 1.0,
197
  "torch_dtype": "bfloat16",
198
  "torchscript": false,
199
  "transformers_version": "4.37.2",
configuration_intern_vit.py CHANGED
@@ -1,6 +1,6 @@
1
  # --------------------------------------------------------
2
  # InternVL
3
- # Copyright (c) 2023 OpenGVLab
4
  # Licensed under The MIT License [see LICENSE for details]
5
  # --------------------------------------------------------
6
  import os
 
1
  # --------------------------------------------------------
2
  # InternVL
3
+ # Copyright (c) 2024 OpenGVLab
4
  # Licensed under The MIT License [see LICENSE for details]
5
  # --------------------------------------------------------
6
  import os
configuration_internvl_chat.py CHANGED
@@ -1,6 +1,6 @@
1
  # --------------------------------------------------------
2
  # InternVL
3
- # Copyright (c) 2023 OpenGVLab
4
  # Licensed under The MIT License [see LICENSE for details]
5
  # --------------------------------------------------------
6
 
 
1
  # --------------------------------------------------------
2
  # InternVL
3
+ # Copyright (c) 2024 OpenGVLab
4
  # Licensed under The MIT License [see LICENSE for details]
5
  # --------------------------------------------------------
6
 
conversation.py CHANGED
@@ -330,13 +330,16 @@ def get_conv_template(name: str) -> Conversation:
330
  return conv_templates[name].copy()
331
 
332
 
333
- # Note that for inference, using the Hermes-2 and internlm2-chat templates is equivalent.
334
  register_conv_template(
335
  Conversation(
336
  name='Hermes-2',
337
  system_template='<|im_start|>system\n{system_message}',
338
  # note: The new system prompt was not used here to avoid changes in benchmark performance.
339
- # system_message='我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。',
340
  system_message='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。',
341
  roles=('<|im_start|>user\n', '<|im_start|>assistant\n'),
342
  sep_style=SeparatorStyle.MPT,
@@ -357,7 +360,7 @@ register_conv_template(
357
  name='internlm2-chat',
358
  system_template='<|im_start|>system\n{system_message}',
359
  # note: The new system prompt was not used here to avoid changes in benchmark performance.
360
- # system_message='我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。',
361
  system_message='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。',
362
  roles=('<|im_start|>user\n', '<|im_start|>assistant\n'),
363
  sep_style=SeparatorStyle.MPT,
@@ -376,7 +379,7 @@ register_conv_template(
376
  name='phi3-chat',
377
  system_template='<|system|>\n{system_message}',
378
  # note: The new system prompt was not used here to avoid changes in benchmark performance.
379
- # system_message='我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。人工智能实验室致力于原始技术创新,开源开放,共享共创,推动科技进步和产业发展。',
380
  system_message='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。',
381
  roles=('<|user|>\n', '<|assistant|>\n'),
382
  sep_style=SeparatorStyle.MPT,
 
330
  return conv_templates[name].copy()
331
 
332
 
333
+ # Both Hermes-2 and internlm2-chat are chatml-format conversation templates. The difference
334
+ # is that during training, the preprocessing function for the Hermes-2 template doesn't add
335
+ # <s> at the beginning of the tokenized sequence, while the internlm2-chat template does.
336
+ # Therefore, they are completely equivalent during inference.
337
  register_conv_template(
338
  Conversation(
339
  name='Hermes-2',
340
  system_template='<|im_start|>system\n{system_message}',
341
  # note: The new system prompt was not used here to avoid changes in benchmark performance.
342
+ # system_message='我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。',
343
  system_message='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。',
344
  roles=('<|im_start|>user\n', '<|im_start|>assistant\n'),
345
  sep_style=SeparatorStyle.MPT,
 
360
  name='internlm2-chat',
361
  system_template='<|im_start|>system\n{system_message}',
362
  # note: The new system prompt was not used here to avoid changes in benchmark performance.
363
+ # system_message='我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。',
364
  system_message='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。',
365
  roles=('<|im_start|>user\n', '<|im_start|>assistant\n'),
366
  sep_style=SeparatorStyle.MPT,
 
379
  name='phi3-chat',
380
  system_template='<|system|>\n{system_message}',
381
  # note: The new system prompt was not used here to avoid changes in benchmark performance.
382
+ # system_message='我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。',
383
  system_message='你是由上海人工智能实验室联合商汤科技开发的书生多模态大模型,英文名叫InternVL, 是一个有用无害的人工智能助手。',
384
  roles=('<|user|>\n', '<|assistant|>\n'),
385
  sep_style=SeparatorStyle.MPT,
modeling_intern_vit.py CHANGED
@@ -1,6 +1,6 @@
1
  # --------------------------------------------------------
2
  # InternVL
3
- # Copyright (c) 2023 OpenGVLab
4
  # Licensed under The MIT License [see LICENSE for details]
5
  # --------------------------------------------------------
6
  from typing import Optional, Tuple, Union
 
1
  # --------------------------------------------------------
2
  # InternVL
3
+ # Copyright (c) 2024 OpenGVLab
4
  # Licensed under The MIT License [see LICENSE for details]
5
  # --------------------------------------------------------
6
  from typing import Optional, Tuple, Union