Update EXPORT.md
EXPORT.md
CHANGED
@@ -15,46 +15,51 @@ This will create an export of the onnx models.
 
 The following commands are all available:
 
-
+## General Commands
 
 **all-in-one**
 Runs all steps (exporting, slimming, quantizing, cleaning, fixing GPU buffers) to produce fully prepared ONNX models.
 
-
+## Export Commands
 
 **export**
+
 Combines export-abcd and export-e to generate ONNX models for all parts.
 
 **export-abcd**
+
 Exports model parts A, B, C, and D by running QwenVL_Export_ABCD.py.
 
 **export-e**
+
 Exports model part E by running QwenVL_Export_E.py.
 
-
+## Slimming Commands
 
 **slim**
+
 Reduces ONNX model size by removing unnecessary elements for optimized deployment.
 
-
+## Quantization Commands
 
 **quantize**
+
 Quantizes all model parts (A, B, C, D, and E) to optimize size and performance.
 
 **quantize-%**
 Quantizes a specific model part (% can be A, B, C, D, or E) with targeted configurations.
 
-
+## Cleanup Commands
 
 **clean-large-files**
 Deletes ONNX files larger than 2GB from the destination directory to retain appropriately sized models.
 
-
+## GPU Buffer Fix Command
 
 **fix-gpu-buffers**
 Applies fixes to GPU buffers in ONNX files for part E to ensure GPU memory compatibility.
 
-
+## Combined Target
 
-all
+**all**
 Alias for all-in-one to run the full ONNX model preparation pipeline.
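The behaviour described for **clean-large-files** can be sketched in Python as follows. This is a minimal illustration, not the project's actual implementation: the function name `clean_large_files`, the `dry_run` flag, the reading of "2GB" as 2 GiB, and the choice to scan only the top level of the directory are all assumptions made for the example.

```python
from pathlib import Path

# Assumption: "2GB" is interpreted as 2 GiB; the real command may use another threshold.
SIZE_LIMIT_BYTES = 2 * 1024**3


def clean_large_files(dest_dir, limit=SIZE_LIMIT_BYTES, dry_run=True):
    """Find .onnx files in dest_dir larger than `limit` bytes.

    Returns the list of oversized files. They are deleted only when
    dry_run is False, so the default call is safe to run as a preview.
    """
    oversized = []
    # Assumption: only the top level of dest_dir is scanned (no recursion).
    for path in sorted(Path(dest_dir).glob("*.onnx")):
        if path.is_file() and path.stat().st_size > limit:
            oversized.append(path)
            if not dry_run:
                path.unlink()
    return oversized
```

Calling `clean_large_files("some/output/dir")` with the default `dry_run=True` only reports which files exceed the limit, so the list can be reviewed before running again with `dry_run=False`.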