Update EXPORT.md
EXPORT.md
@@ -15,13 +15,10 @@ This will create an export of the onnx models.
 
 The following commands are all available:
 
-## General Commands
-
 **all-in-one**
 
 Runs all steps (exporting, slimming, quantizing, cleaning, fixing GPU buffers) to produce fully prepared ONNX models.
 
-## Export Commands
 
 **export**
 
@@ -35,13 +32,10 @@ Exports model parts A, B, C, and D by running QwenVL_Export_ABCD.py.
 
 Exports model part E by running QwenVL_Export_E.py.
 
-## Slimming Commands
-
 ***slim**
 
 Reduces ONNX model size by removing unnecessary elements for optimized deployment.
 
-## Quantization Commands
 
 **quantize**
 
@@ -50,17 +44,11 @@ Quantizes all model parts (A, B, C, D, and E) to optimize size and performance.
 **quantize-%**
 Quantizes a specific model part (% can be A, B, C, D, or E) with targeted configurations.
 
-## Cleanup Commands
-
 **clean-large-files**
-Deletes ONNX files larger than 2GB from the destination directory to retain
-
-## GPU Buffer Fix Command
+Deletes ONNX files larger than 2GB from the destination directory to retain models that will work for onnx environments.
 
 **fix-gpu-buffers**
 Applies fixes to GPU buffers in ONNX files for part E to ensure GPU memory compatibility.
 
-## Combined Target
-
 **all**
 Alias for all-in-one to run the full ONNX model preparation pipeline.
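The only line this commit adds is the completed description of `clean-large-files`. As a rough illustration of what such a step might do, here is a minimal Python sketch. It is not the repository's implementation: the function name `clean_large_files`, the `dest_dir` and `limit_bytes` parameters, the `*.onnx` glob, and reading "2GB" as 2 GiB are all assumptions made for this example.

```python
from pathlib import Path

# Assumption: "2GB" is interpreted as 2 GiB; the real target may use another cutoff.
DEFAULT_LIMIT_BYTES = 2 * 1024**3


def clean_large_files(dest_dir, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Delete *.onnx files directly inside dest_dir that exceed limit_bytes.

    Hypothetical helper: returns the sorted names of the files it removed.
    """
    removed = []
    for path in Path(dest_dir).glob("*.onnx"):
        # Only regular files above the threshold are deleted; everything else stays.
        if path.is_file() and path.stat().st_size > limit_bytes:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Calling `clean_large_files("onnx_out")` on a hypothetical output directory would leave only the ONNX files at or below the threshold, which matches the stated goal of retaining models that work in ONNX environments.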