mjbuehler committed on
Commit
97bcd10
1 Parent(s): abc2829

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -36,11 +36,11 @@ The model is developed to process diverse inputs, including images and text, fac
 
 Cephalo provides a robust framework for multimodal interaction and understanding, including the development of complex generative pipelines to create 2D and 3D renderings of material microstructures as input for additive manufacturing methods.
 
-This version of Cephalo, lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha, is based on the HuggingFaceM4/idefics2-8b-chatty model. The model was trained on a combination of scientific text-image data extracted from Wikipedia and scientific papers. For further details on the base model, see: https://huggingface.co/HuggingFaceM4/idefics2-8b-chatty. More details about technical aspects of the model, training and example applications to materials science problems are provided in the paper (reference at the bottom).
+This version of Cephalo, lamm-mit/Cephalo-Idefics-2-vision-8b-beta, is based on the HuggingFaceM4/idefics2-8b-chatty model. The model was trained on a combination of scientific text-image data extracted from Wikipedia and scientific papers. For further details on the base model, see: https://huggingface.co/HuggingFaceM4/idefics2-8b-chatty. More details about technical aspects of the model, training and example applications to materials science problems are provided in the paper (reference at the bottom).
 
 ### Chat Format
 
-The lamm-mit/Cephalo-Idefics-2-vision-8b-alpha is suitable for one or more image inputs, with prompts using the chat format as follows:
+The lamm-mit/Cephalo-Idefics-2-vision-8b-beta is suitable for one or more image inputs, with prompts using the chat format as follows:
 
 ```raw
 User: You carefully study the image, and respond accurately, but succinctly. Think step-by-step.
@@ -76,7 +76,7 @@ DEVICE='cuda:0'
 from transformers import AutoProcessor, Idefics2ForConditionalGeneration
 from tqdm.notebook import tqdm
 
-model_id='lamm-mit/Cephalo-Idefics-2-vision-8b-alpha'
+model_id='lamm-mit/Cephalo-Idefics-2-vision-8b-beta'
 
 model = Idefics2ForConditionalGeneration.from_pretrained( model_id,
     torch_dtype=torch.bfloat16, #if your GPU allows
@@ -256,7 +256,7 @@ If your GPU allows, load and run inference in half precision (`torch.float16` or
 
 ```diff
 model = AutoModelForVision2Seq.from_pretrained(
-    "lamm-mit/Cephalo-Idefics-2-vision-8b-alpha",
+    "lamm-mit/Cephalo-Idefics-2-vision-8b-beta",
 +    torch_dtype=torch.float16,
 ).to(DEVICE)
 ```
@@ -277,7 +277,7 @@ Make sure to install `flash-attn`. Refer to the [original repository of Flash Att
 
 ```diff
 model = AutoModelForVision2Seq.from_pretrained(
-    "lamm-mit/Cephalo-Idefics-2-vision-8b-alpha",
+    "lamm-mit/Cephalo-Idefics-2-vision-8b-beta",
 +    torch_dtype=torch.bfloat16,
 +    _attn_implementation="flash_attention_2",
 ).to(DEVICE)
@@ -300,7 +300,7 @@ quantization_config = BitsAndBytesConfig(
     bnb_4bit_compute_dtype=torch.bfloat16
 )
 model = AutoModelForVision2Seq.from_pretrained(
-    "lamm-mit/Cephalo-Idefics-2-vision-8b-alpha",
+    "lamm-mit/Cephalo-Idefics-2-vision-8b-beta",
 +    torch_dtype=torch.bfloat16,
 +    quantization_config=quantization_config,
 ).to(DEVICE)
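The chat format shown in the README's "Chat Format" section corresponds to the `messages` structure that Hugging Face chat templates (e.g. `processor.apply_chat_template` for Idefics2) consume. A minimal sketch of building such a message, assuming the standard Idefics2 convention of `{"type": "image"}` placeholders followed by a `{"type": "text"}` entry (the helper name `build_messages` is illustrative, not part of the library):

```python
# Sketch: build a single-turn user message for an Idefics2-style chat template.
# Assumption: one {"type": "image"} placeholder per input image, then the text prompt.

def build_messages(question: str, n_images: int = 1) -> list:
    """Return a messages list with n_images image slots followed by the prompt text."""
    content = [{"type": "image"} for _ in range(n_images)]
    content.append({"type": "text", "text": question})
    return [{"role": "user", "content": content}]

messages = build_messages(
    "You carefully study the image, and respond accurately, but succinctly. Think step-by-step.",
    n_images=1,
)
```

This list would then be rendered to a prompt string via the processor's chat template and passed to `processor(...)` together with the PIL images.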
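The half-precision and 4-bit options in the diff above trade accuracy for memory. A back-of-the-envelope sketch of why they matter for an ~8B-parameter model (the parameter count is an assumption from the model name; this ignores activations, the KV cache, and quantization overhead such as scales and zero-points):

```python
# Rough weight-memory estimate: parameters x bits-per-parameter, converted to GB.
# Assumption: ~8e9 parameters; overheads (activations, KV cache, quant metadata) ignored.

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

n_params = 8e9
full_bf16 = weight_memory_gb(n_params, 16)  # bfloat16/float16 weights -> 16.0 GB
four_bit = weight_memory_gb(n_params, 4)    # bitsandbytes 4-bit weights -> 4.0 GB
```

So 4-bit loading cuts weight memory roughly fourfold versus bfloat16, which is why the README pairs `quantization_config` with `bnb_4bit_compute_dtype=torch.bfloat16` for GPUs that cannot hold the full-precision weights.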