Update EXPORT.md
EXPORT.md
CHANGED
@@ -36,19 +36,22 @@ Exports model part E by running QwenVL_Export_E.py.
 
 Reduces ONNX model size by removing unnecessary elements for optimized deployment.
 
-
 **quantize**
 
 Quantizes all model parts (A, B, C, D, and E) to optimize size and performance.
 
 **quantize-%**
+
 Quantizes a specific model part (% can be A, B, C, D, or E) with targeted configurations.
 
 **clean-large-files**
+
 Deletes ONNX files larger than 2GB from the destination directory to retain models that will work for onnx environments.
 
 **fix-gpu-buffers**
+
 Applies fixes to GPU buffers in ONNX files for part E to ensure GPU memory compatibility.
 
 **all**
+
 Alias for all-in-one to run the full ONNX model preparation pipeline.