Error while converting PaliGemma 2

#6
by NSTiwari - opened

I'm trying to convert the PaliGemma 2 model (google/paligemma2-3b-pt-224), but I get an error when running the conversion script:

!python3 -m scripts.convert --quantize --model_id "google/paligemma2-3b-pt-224"

2024-12-30 15:10:39.561699: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-30 15:10:39.584457: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-30 15:10:39.591256: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-30 15:10:39.607611: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-30 15:10:40.829757: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/content/transformers.js/scripts/convert.py", line 454, in <module>
    main()
  File "/content/transformers.js/scripts/convert.py", line 341, in main
    main_export(**export_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/optimum/exporters/onnx/main.py", line 303, in main_export
    model = TasksManager.get_model_from_task(
  File "/usr/local/lib/python3.10/dist-packages/optimum/exporters/tasks.py", line 2071, in get_model_from_task
    model_class = TasksManager.get_model_class_for_task(
  File "/usr/local/lib/python3.10/dist-packages/optimum/exporters/tasks.py", line 1394, in get_model_class_for_task
    raise KeyError(
KeyError: "Unknown task: image-text-to-text. Possible values are: audio-classification for AutoModelForAudioClassification, audio-frame-classification for AutoModelForAudioFrameClassification, audio-xvector for AutoModelForAudioXVector, automatic-speech-recognition for ('AutoModelForSpeechSeq2Seq', 'AutoModelForCTC'), depth-estimation for AutoModelForDepthEstimation, feature-extraction for AutoModel, fill-mask for AutoModelForMaskedLM, image-classification for AutoModelForImageClassification, image-segmentation for ('AutoModelForImageSegmentation', 'AutoModelForSemanticSegmentation'), image-to-image for AutoModelForImageToImage, image-to-text for AutoModelForVision2Seq, mask-generation for AutoModel, masked-im for AutoModelForMaskedImageModeling, multiple-choice for AutoModelForMultipleChoice, object-detection for AutoModelForObjectDetection, question-answering for AutoModelForQuestionAnswering, semantic-segmentation for AutoModelForSemanticSegmentation, text-to-audio for ('AutoModelForTextToSpectrogram', 'AutoModelForTextToWaveform'), text-generation for AutoModelForCausalLM, text2text-generation for AutoModelForSeq2SeqLM, text-classification for AutoModelForSequenceClassification, token-classification for AutoModelForTokenClassification, zero-shot-image-classification for AutoModelForZeroShotImageClassification, zero-shot-object-detection for AutoModelForZeroShotObjectDetection"
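
My reading of the traceback is that the task used for this model, image-text-to-text, is missing from the task table that optimum consults, so the lookup fails before any export starts. Below is a minimal sketch of that kind of lookup, assuming a plain dictionary. The dictionary and function here are hypothetical stand-ins that I wrote for illustration, not optimum's actual code; only the task and class names are taken from the error message above.

```python
# Simplified illustration only; this is NOT optimum's actual implementation.
# The task and class names are copied from the KeyError message above.
TASK_TO_MODEL_CLASS = {
    "image-to-text": "AutoModelForVision2Seq",
    "text-generation": "AutoModelForCausalLM",
    "feature-extraction": "AutoModel",
}


def get_model_class_for_task(task: str) -> str:
    """Hypothetical stand-in for the lookup that fails in the traceback."""
    if task not in TASK_TO_MODEL_CLASS:
        possible = ", ".join(
            f"{name} for {cls}" for name, cls in TASK_TO_MODEL_CLASS.items()
        )
        raise KeyError(f"Unknown task: {task}. Possible values are: {possible}")
    return TASK_TO_MODEL_CLASS[task]


if __name__ == "__main__":
    # A task that is in the table resolves normally.
    print(get_model_class_for_task("image-to-text"))

    # A task that is not in the table raises, as in the log above.
    try:
        get_model_class_for_task("image-text-to-text")
    except KeyError as err:
        print(f"KeyError: {err}")
```

If that reading is right, the fix would need the task to be registered on the exporter side (or a different task to be selected), rather than any change to the model itself.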

Reproduction
I ran the command below:
!python3 -m scripts.convert --quantize --model_id "google/paligemma2-3b-pt-224"

I've also opened a GitHub issue for this:
https://github.com/huggingface/transformers.js/issues/1126

@Xenova could you please help with this?
