Update README.md
README.md
CHANGED
@@ -1,3 +1,156 @@
---
license: mit
---

# **Phi-3.5-vision-instruct-onnx-cpu**

<b><u>Note: This is an unofficial version, intended only for testing and development.</u></b>

This is the ONNX format FP16/FP32 version of Microsoft Phi-3.5-vision-instruct for CPU inference. You can follow the steps below to convert the model yourself.

**Convert step by step**

1. Installation

```bash
pip install torch transformers onnx onnxruntime
pip install --pre onnxruntime-genai
```
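
As a quick optional check (not part of the original steps), you can confirm the packages import correctly before continuing:

```bash
python -c "import torch, transformers, onnxruntime, onnxruntime_genai; print(onnxruntime.__version__)"
```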

2. Set up a working directory in the terminal

```bash
mkdir models
cd models
```

3. Download **microsoft/Phi-3.5-vision-instruct** into the models folder (one way to do this is sketched below)

[https://huggingface.co/microsoft/Phi-3.5-vision-instruct](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)
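
For example, one way to download the model (an assumption, since the original only links the model page) is with the Hugging Face CLI, saving it into the models folder:

```bash
# Download the original model weights into ./Phi-3.5-vision-instruct
huggingface-cli download microsoft/Phi-3.5-vision-instruct --local-dir ./Phi-3.5-vision-instruct
```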

4. Download these files into your Phi-3.5-vision-instruct folder

https://huggingface.co/lokinfey/Phi-3.5-vision-instruct-onnx-cpu/resolve/main/onnx/config.json

https://huggingface.co/lokinfey/Phi-3.5-vision-instruct-onnx-cpu/blob/main/onnx/image_embedding_phi3_v_for_onnx.py

https://huggingface.co/lokinfey/Phi-3.5-vision-instruct-onnx-cpu/blob/main/onnx/modeling_phi3_v.py

5. Download this file into the models folder (a scripted way to fetch all of these files is sketched below)

https://huggingface.co/lokinfey/Phi-3.5-vision-instruct-onnx-cpu/blob/main/onnx/build.py
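
If you prefer to script these downloads rather than fetch them through the browser, one possible approach (not part of the original instructions; it assumes the files live under `onnx/` in this repo, as the links above suggest, and that the model was downloaded to `./Phi-3.5-vision-instruct`) is:

```bash
# Pull the helper files and build.py from this repo; --local-dir keeps the onnx/ sub-path
huggingface-cli download lokinfey/Phi-3.5-vision-instruct-onnx-cpu \
  onnx/config.json onnx/image_embedding_phi3_v_for_onnx.py onnx/modeling_phi3_v.py onnx/build.py \
  --local-dir ./tmp-onnx-files

# config.json and the two .py files go into the Phi-3.5-vision-instruct folder
cp ./tmp-onnx-files/onnx/config.json ./Phi-3.5-vision-instruct/
cp ./tmp-onnx-files/onnx/image_embedding_phi3_v_for_onnx.py ./Phi-3.5-vision-instruct/
cp ./tmp-onnx-files/onnx/modeling_phi3_v.py ./Phi-3.5-vision-instruct/

# build.py goes into the models folder itself
cp ./tmp-onnx-files/onnx/build.py .
```

After these steps the models folder should contain `build.py` plus the `Phi-3.5-vision-instruct` folder with the three extra files copied into it.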

6. Go back to the terminal and run the conversion

Convert to ONNX with FP16 support

```bash
python build.py -i ".\Your Phi-3.5-vision-instruct Path" -o .\vision-cpu-fp16 -p f16 -e cpu
```

Convert to ONNX with FP32 support

```bash
python build.py -i ".\Your Phi-3.5-vision-instruct Path" -o .\vision-cpu-fp32 -p f32 -e cpu
```
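
Whichever precision you choose, the output folder (`vision-cpu-fp16` or `vision-cpu-fp32`) is the path you pass as `model_path` in the next section. A quick way to confirm the conversion produced output:

```bash
# List the converted model folders
ls ./vision-cpu-fp16
ls ./vision-cpu-fp32
```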

**Running it with ONNX Runtime GenAI**

```python
import onnxruntime_genai as og

# Path to the converted ONNX model folder (e.g. ./vision-cpu-fp16 or ./vision-cpu-fp32)
model_path = './Your Phi-3.5-vision-instruct Path'

# Path to the image file that will be sent to the model
img_path = './Your Image Path'

# Load the ONNX model
model = og.Model(model_path)

# Create a multimodal processor that handles both text and image inputs
processor = model.create_multimodal_processor()

# Create a streaming tokenizer for decoding tokens as they are generated
tokenizer_stream = processor.create_stream()

text = "Your Prompt"

# Build the chat prompt: user tag, image placeholder, user text, then the assistant tag
prompt = "<|user|>\n"
prompt += "<|image_1|>\n"
prompt += f"{text}<|end|>\n"
prompt += "<|assistant|>\n"

# Load the image and process it together with the prompt
image = og.Images.open(img_path)
inputs = processor(prompt, images=image)

# Set up generation parameters: bind the processed inputs and cap the output length
params = og.GeneratorParams(model)
params.set_inputs(inputs)
params.set_search_options(max_length=3072)

generator = og.Generator(model, params)

# Accumulate the decoded response here (initialized before the loop so += works)
response = ''

# Loop until the generator has finished generating tokens
while not generator.is_done():
    # Compute the logits for the next token
    generator.compute_logits()

    # Select the next token based on the computed logits
    generator.generate_next_token()

    # Retrieve the newly generated token
    new_token = generator.get_next_tokens()[0]

    # Decode the token once, append it to the response,
    # and stream it to the console without a trailing newline
    decoded = tokenizer_stream.decode(new_token)
    response += decoded
    print(decoded, end='', flush=True)
```
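
To try it, save the script to a file (for example `run_phi35v.py`, a name used here only for illustration), point `model_path` at one of the converted folders such as `./vision-cpu-fp32`, set `img_path` and the prompt, and run:

```bash
python run_phi35v.py
```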