myownskyW7's picture
Update README.md
57a2a43 verified
|
raw
history blame
2.39 kB
metadata
license: other
pipeline_tag: text-generation

InternLM-XComposer2

InternLM-XComposer2 is a vision-language large model (VLLM) based on InternLM2 for advanced text-image comprehension and composition.

We release InternLM-XComposer2 series in two versions:

  • InternLM-XComposer2-VL: The pretrained VLLM model with InternLM2 as the initialization of the LLM, achieving strong performance on various multimodal benchmarks.
  • InternLM-XComposer2: The finetuned VLLM for Free-from Interleaved Text-Image Composition.

Import from Transformers

To load the InternLM-XComposer2-VL-7B model using Transformers, use the following code:

import torch
from PIL import image
from transformers import AutoTokenizer, AutoModelForCausalLM
ckpt_path = "internlm/internlm-xcomposer2-vl-7b"
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True).cuda()
# Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()

通过 Transformers 加载

通过以下的代码加载 InternLM-XComposer2-VL-7B 模型

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
ckpt_path = "internlm/internlm-xcomposer2-vl-7b"
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True).cuda()
# `torch_dtype=torch.float16` 可以令模型以 float16 精度加载,否则 transformers 会将模型加载为 float32,导致显存不足
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()

Open Source License

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English)/申请表(中文). For other questions or collaborations, please contact internlm@pjlab.org.cn.