tjellm committed
Commit 1c9319a
Parent: ad345a0

Update README.md

Files changed (1):
README.md: +23 -3
README.md CHANGED
@@ -20,6 +20,26 @@ tags:
inference: false

---
+ # Phi-3-vision-128k-instruct ONNX models for CPU and CUDA
+ This repository hosts the optimized versions of [microsoft/Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/) to accelerate inference with ONNX Runtime.
+ This repository is a clone of [microsoft/Phi-3-vision-128k-instruct-onnx-cpu](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct-onnx-cpu), with the extra files needed to deploy the model behind OpenAI-API-compatible endpoints through the [`embeddedllm`](https://github.com/EmbeddedLLM/embeddedllm) PyPI library.
+
+ ## Usage on Windows (Intel / AMD / Nvidia / Qualcomm)
+ ```powershell
+ conda create -n onnx python=3.10
+ conda activate onnx
+ winget install -e --id GitHub.GitLFS
+ pip install huggingface-hub[cli]
+ huggingface-cli download EmbeddedLLM/Phi-3-vision-128k-instruct-onnx --include='onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4' --local-dir .\Phi-3-vision-128k-instruct-onnx
+ pip install numpy==1.26.4
+ Invoke-WebRequest -Uri "https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3v.py" -OutFile "phi3v.py"
+ pip install onnxruntime
+ pip install --pre onnxruntime-genai==0.3.0rc2
+ python phi3v.py -m .\Phi-3-vision-128k-instruct-onnx
+ ```
+
+ # UPSTREAM README.md
+
# Phi-3-vision-128k-instruct ONNX

This repository hosts the optimized versions of [microsoft/Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/) to accelerate inference with DirectML and ONNX Runtime.
@@ -78,14 +98,14 @@ pip install huggingface-hub[cli]

4. **Download the model:**
```sh
- huggingface-cli download EmbeddedLLM/Phi-3-vision-128k-instruct-onnx --include="onnx/directml/*" --local-dir .\Phi-3-vision-128k-instruct
+ huggingface-cli download EmbeddedLLM/Phi-3-vision-128k-instruct-onnx --include="onnx/cpu_and_mobile/*" --local-dir .\Phi-3-vision-128k-instruct
```

5. **Install necessary Python packages:**
```sh
pip install numpy==1.26.4
- pip install onnxruntime-directml
- pip install --pre onnxruntime-genai-directml
+ pip install onnxruntime
+ pip install --pre onnxruntime-genai==0.3.0rc2
```

6. **Install Visual Studio 2015 runtime:**
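
Both download commands in this diff filter the repository with a glob before fetching. The same filter can be applied from Python through `huggingface_hub`'s `snapshot_download`; a minimal sketch, with the repo id and pattern taken from the new command above and the local directory name kept the same:

```python
# Mirror of: huggingface-cli download EmbeddedLLM/Phi-3-vision-128k-instruct-onnx
#            --include="onnx/cpu_and_mobile/*" --local-dir .\Phi-3-vision-128k-instruct
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="EmbeddedLLM/Phi-3-vision-128k-instruct-onnx",
    allow_patterns=["onnx/cpu_and_mobile/*"],  # equivalent of --include
    local_dir="Phi-3-vision-128k-instruct",    # equivalent of --local-dir
)
```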
 
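The `phi3v.py` script fetched in the new usage block drives generation through the `onnxruntime-genai` Python API. A condensed sketch of that flow, assuming the 0.3.x API pinned above; the model directory, image file name, and prompt text here are illustrative:

```python
import onnxruntime_genai as og

# Load the ONNX model folder downloaded earlier (path is illustrative).
model = og.Model("Phi-3-vision-128k-instruct-onnx")
processor = model.create_multimodal_processor()
tokenizer_stream = processor.create_stream()

# Phi-3-vision chat template with one image placeholder.
image = og.Images.open("image.png")
prompt = "<|user|>\n<|image_1|>\nDescribe this image.<|end|>\n<|assistant|>\n"
inputs = processor(prompt, images=image)

params = og.GeneratorParams(model)
params.set_inputs(inputs)
params.set_search_options(max_length=3072)

# Stream tokens to stdout as they are generated.
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    token = generator.get_next_tokens()[0]
    print(tokenizer_stream.decode(token), end="", flush=True)
```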
 
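The new description mentions serving the model behind OpenAI-API-compatible endpoints through the `embeddedllm` library. Once such an endpoint is running, any OpenAI-style client can call it; a minimal sketch, where the base URL, port, and model name are assumptions rather than values documented in this repository:

```python
import requests

# Assumed endpoint; the actual host, port, and model name depend on how
# embeddedllm is launched.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "phi-3-vision-128k-instruct-onnx",
        "messages": [
            {"role": "user", "content": "Describe ONNX Runtime in one sentence."}
        ],
        "max_tokens": 64,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```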