pdufour/Qwen2-VL-2B-Instruct-ONNX-Q4-F16

pdufour committed 15 days ago

Commit 6dfb44c (1 parent: b2b04a4)

Update EXPORT.md

Files changed (1)

EXPORT.md +1 -13

@@ -15,13 +15,10 @@ This will create an export of the onnx models.
 The following commands are all available:
-## General Commands
 **all-in-one**
 Runs all steps (exporting, slimming, quantizing, cleaning, fixing GPU buffers) to produce fully prepared ONNX models.
-## Export Commands
 **export**
@@ -35,13 +32,10 @@ Exports model parts A, B, C, and D by running QwenVL_Export_ABCD.py.
 Exports model part E by running QwenVL_Export_E.py.
-## Slimming Commands
 ***slim**
 Reduces ONNX model size by removing unnecessary elements for optimized deployment.
-## Quantization Commands
 **quantize**
@@ -50,17 +44,11 @@ Quantizes all model parts (A, B, C, D, and E) to optimize size and performance.
 **quantize-%**
 Quantizes a specific model part (% can be A, B, C, D, or E) with targeted configurations.
-## Cleanup Commands
 **clean-large-files**
-Deletes ONNX files larger than 2GB from the destination directory to retain appropriately sized models.
+Deletes ONNX files larger than 2GB from the destination directory to retain models that will work for onnx environments.
-## GPU Buffer Fix Command
 **fix-gpu-buffers**
 Applies fixes to GPU buffers in ONNX files for part E to ensure GPU memory compatibility.
-## Combined Target
 **all**
 Alias for all-in-one to run the full ONNX model preparation pipeline.
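
For reviewers who want to see what the reworded `clean-large-files` description amounts to, here is a minimal Python sketch of the behavior it describes. The diff does not show the target's actual implementation, so the function name `clean_large_files`, the `*.onnx` glob, the `dry_run` flag, and reading "2GB" as 2 GiB are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "2GB" is read as 2 GiB here; the real target may use a different threshold.
SIZE_LIMIT_BYTES = 2 * 1024**3


def clean_large_files(dest_dir, limit=SIZE_LIMIT_BYTES, dry_run=False):
    """Delete *.onnx files directly under dest_dir that exceed `limit` bytes.

    Hypothetical helper, not the repository's code. Returns the affected paths.
    """
    removed = []
    for path in sorted(Path(dest_dir).glob("*.onnx")):
        if path.is_file() and path.stat().st_size > limit:
            if not dry_run:
                path.unlink()  # permanently removes the oversized file
            removed.append(path)
    return removed
```

Calling `clean_large_files("onnx", dry_run=True)` (the directory name is a placeholder) lists the files that would be deleted without touching them, which is a cheap check before running the destructive step.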