pdufour
/

Qwen2-VL-2B-Instruct-ONNX-Q4-F16

pdufour committed 15 days ago

Commit a8fa7ff (1 parent: b9ab9cc)

Update EXPORT.md

Files changed (1)

EXPORT.md +13 -8

@@ -15,46 +15,51 @@ This will create an export of the onnx models.
 The following commands are all available:
-**General Commands**
+## General Commands
 **all-in-one**
 Runs all steps (exporting, slimming, quantizing, cleaning, fixing GPU buffers) to produce fully prepared ONNX models.
-**Export Commands**
+## Export Commands
 **export**
 Combines export-abcd and export-e to generate ONNX models for all parts.
 **export-abcd**
 Exports model parts A, B, C, and D by running QwenVL_Export_ABCD.py.
 **export-e**
 Exports model part E by running QwenVL_Export_E.py.
-**Slimming Commands**
+## Slimming Commands
 ***slim**
 Reduces ONNX model size by removing unnecessary elements for optimized deployment.
-**Quantization Commands**
+## Quantization Commands
 **quantize**
 Quantizes all model parts (A, B, C, D, and E) to optimize size and performance.
 **quantize-%**
 Quantizes a specific model part (% can be A, B, C, D, or E) with targeted configurations.
-**Cleanup Commands**
+## Cleanup Commands
 **clean-large-files**
 Deletes ONNX files larger than 2GB from the destination directory to retain appropriately sized models.
-**GPU Buffer Fix Command**
+## GPU Buffer Fix Command
 **fix-gpu-buffers**
 Applies fixes to GPU buffers in ONNX files for part E to ensure GPU memory compatibility.
-**Combined Target**
-all
+## Combined Target
+**all**
 Alias for all-in-one to run the full ONNX model preparation pipeline.
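
As a rough illustration of what the **clean-large-files** step describes, here is a minimal Python sketch. It is not the repository's actual implementation, which this diff does not show. The function name `clean_large_files`, the `dest_dir` parameter, the reading of "2GB" as 2 GiB, and the choice to scan only the top level of the directory are all assumptions made for the example.

```python
from pathlib import Path

# Assumption: "2GB" is read as 2 GiB; the real target may use a different threshold.
TWO_GB = 2 * 1024**3


def clean_large_files(dest_dir, limit_bytes=TWO_GB):
    """Delete *.onnx files directly inside dest_dir that exceed limit_bytes.

    Hypothetical helper for illustration only. Returns the names of the
    files that were removed, in sorted order.
    """
    removed = []
    for path in sorted(Path(dest_dir).glob("*.onnx")):
        # Only regular files strictly larger than the limit are deleted.
        if path.is_file() and path.stat().st_size > limit_bytes:
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `clean_large_files("onnx")` on a hypothetical output directory would then leave only the ONNX files at or below the limit, which is the outcome the command description states.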