pdufour committed on
Commit
b9ab9cc
1 Parent(s): 5fca4b3

Update EXPORT.md

Files changed (1)
  1. EXPORT.md +48 -2
EXPORT.md CHANGED
@@ -2,7 +2,7 @@
 
  The original model was exported using the following process:
 
- THe following repos were used:
  - https://github.com/pdufour/Native-LLM-for-Android
  - https://github.com/pdufour/transformers.js/tree/add-block-list
  -
@@ -11,4 +11,50 @@ If you close this repo and the above 2 to the same directory you can run the fol
  From `Qwen2-VL-2B-Instruct-ONNX-Q4-F16`:
  `make all-in-one`
 
- This will create an export of the onnx models.
 
  The original model was exported using the following process:
 
+ The following repos were used:
  - https://github.com/pdufour/Native-LLM-for-Android
  - https://github.com/pdufour/transformers.js/tree/add-block-list
  -
 
  From `Qwen2-VL-2B-Instruct-ONNX-Q4-F16`:
  `make all-in-one`
 
+ This will create an export of the onnx models.
+
+ The following commands are all available:
+
+ **General Commands**
+
+ **all-in-one**
+ Runs all steps (exporting, slimming, quantizing, cleaning, fixing GPU buffers) to produce fully prepared ONNX models.
+
+ **Export Commands**
+
+ **export**
+ Combines export-abcd and export-e to generate ONNX models for all parts.
+
+ **export-abcd**
+ Exports model parts A, B, C, and D by running QwenVL_Export_ABCD.py.
+
+ **export-e**
+ Exports model part E by running QwenVL_Export_E.py.
+
+ **Slimming Commands**
+
+ **slim**
+ Reduces ONNX model size by removing unnecessary elements for optimized deployment.
+
+ **Quantization Commands**
+
+ **quantize**
+ Quantizes all model parts (A, B, C, D, and E) to optimize size and performance.
+
+ **quantize-%**
+ Quantizes a specific model part (% can be A, B, C, D, or E) with targeted configurations.
+
+ **Cleanup Commands**
+
+ **clean-large-files**
+ Deletes ONNX files larger than 2GB from the destination directory to retain appropriately sized models.
+
+ **GPU Buffer Fix Command**
+
+ **fix-gpu-buffers**
+ Applies fixes to GPU buffers in ONNX files for part E to ensure GPU memory compatibility.
+
+ **Combined Target**
+
+ **all**
+ Alias for all-in-one to run the full ONNX model preparation pipeline.
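
The `clean-large-files` step is described only in prose, so here is a minimal Python sketch of what such a cleanup could look like. It is not the Makefile's actual recipe, which this diff does not show. The function name `clean_large_files`, the `dry_run` flag, and the reading of "2GB" as 2 GiB are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "2GB" is read as 2 GiB; the real Makefile may use a different cutoff.
TWO_GIB = 2 * 1024**3


def clean_large_files(dest_dir, max_bytes=TWO_GIB, dry_run=True):
    """Find .onnx files under dest_dir that are larger than max_bytes.

    Returns the sorted list of oversized paths. The files are deleted
    only when dry_run is False, so the default call is non-destructive.
    """
    oversized = sorted(
        path
        for path in Path(dest_dir).rglob("*.onnx")
        if path.is_file() and path.stat().st_size > max_bytes
    )
    if not dry_run:
        for path in oversized:
            path.unlink()
    return oversized
```

Calling `clean_large_files("path/to/output")` first and inspecting the returned list before rerunning with `dry_run=False` keeps the deletion explicit. The directory name here is a placeholder, not the project's real destination directory.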