RaushanTurganbay committed
Commit 03b8560 · verified · 1 Parent(s): 1396728

Update pipeline example

Files changed (1):
README.md +7 -17
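
Note: the commit switches the card's `pipeline_tag` from `image-to-text` to `image-text-to-text` and drops `inference: false`, so the task can now be resolved from the Hub metadata alone. A quick sanity check of that behaviour (a sketch, not part of the commit; assumes a transformers release that ships the `image-text-to-text` pipeline and network access to the Hub):

```python
from transformers import pipeline

# Task argument omitted on purpose: pipeline() infers the task from the
# model card's pipeline_tag, which this commit sets to "image-text-to-text".
pipe = pipeline(model="llava-hf/vip-llava-13b-hf")
print(pipe.task)  # expected: "image-text-to-text"
```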
README.md CHANGED
@@ -1,8 +1,7 @@
 ---
 language:
 - en
-pipeline_tag: image-to-text
-inference: false
+pipeline_tag: image-text-to-text
 arxiv: 2304.08485
 tags:
 - vision
@@ -51,30 +50,21 @@ Where `<prompt>` denotes the prompt asked by the user
 
 ```python
 from transformers import pipeline
-from PIL import Image
-import requests
-
-model_id = "llava-hf/vip-llava-13b-hf"
-pipe = pipeline("image-to-text", model=model_id)
-url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg"
-image = Image.open(requests.get(url, stream=True).raw)
 
-# Define a chat histiry and use `apply_chat_template` to get correctly formatted prompt
-# Each value in "content" has to be a list of dicts with types ("text", "image")
-conversation = [
+pipe = pipeline("image-text-to-text", model="llava-hf/vip-llava-13b-hf")
+messages = [
     {
-
         "role": "user",
         "content": [
+            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg"},
             {"type": "text", "text": "What does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud"},
-            {"type": "image"},
         ],
     },
 ]
-prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
 
-outputs = pipe(image, prompt=prompt, generate_kwargs={"max_new_tokens": 200})
-print(outputs)
+out = pipe(text=messages, max_new_tokens=20)
+print(out)
+>>> [{'input_text': [{'role': 'user', 'content': [{'type': 'image', 'url': 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg'}, {'type': 'text', 'text': 'What does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud'}]}], 'generated_text': 'Lava'}]
 ```
 
 ### Using pure `transformers`:
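
The "Using pure `transformers`" section that follows in the README is untouched by this commit and therefore not included in the diff. For orientation, a minimal sketch of the equivalent direct-model flow, assuming the standard `AutoProcessor` and `VipLlavaForConditionalGeneration` classes and the checkpoint's chat template (a rough equivalent, not the README's exact code):

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, VipLlavaForConditionalGeneration

model_id = "llava-hf/vip-llava-13b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = VipLlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Same chat-template message format as the updated pipeline example above.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What does the label 15 represent? (1) lava (2) core (3) tunnel (4) ash cloud"},
        ],
    },
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/ai2d-demo.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device, torch.float16)
output = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(output[0], skip_special_tokens=True))
```

The `image-text-to-text` pipeline wraps exactly these steps: chat-template formatting, image preprocessing, generation, and decoding.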