onnx-community/paligemma2-3b-pt-896

Image-Text-to-Text

Transformers.js

Xenova (HF staff) committed on Dec 6, 2024

Commit 4058e59 · verified · 1 Parent(s): 8a33316

Update README.md

Files changed (1)

README.md +45 -0

@@ -5,4 +5,49 @@ base_model: google/paligemma2-3b-pt-896
 https://huggingface.co/google/paligemma2-3b-pt-896 with ONNX weights to be compatible with Transformers.js.
+## Usage (Transformers.js)
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+**Example:** Image captioning with `onnx-community/paligemma2-3b-pt-896`.
+```js
+import { AutoProcessor, PaliGemmaForConditionalGeneration, load_image } from '@huggingface/transformers';
+// Load processor and model
+const model_id = 'onnx-community/paligemma2-3b-pt-896';
+const processor = await AutoProcessor.from_pretrained(model_id);
+const model = await PaliGemmaForConditionalGeneration.from_pretrained(model_id, {
+    dtype: {
+        embed_tokens: 'fp16', // or 'q8'
+        vision_encoder: 'q4', // or 'fp16', 'q8'
+        decoder_model_merged: 'q4', // or 'q4f16'
+    },
+});
+// Prepare inputs
+const url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg';
+const raw_image = await load_image(url);
+const prompt = '<image>'; // Caption, by default
+const inputs = await processor(raw_image, prompt);
+// Generate a response
+const output = await model.generate({
+    ...inputs,
+    max_new_tokens: 100,
+});
+const generated_ids = output.slice(null, [inputs.input_ids.dims[1], null]);
+const answer = processor.batch_decode(
+    generated_ids,
+    { skip_special_tokens: true },
+);
+console.log(answer[0]);
+// a classic car parked in front of a house
+```
+---
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
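
The example added in this commit slices `output` from `inputs.input_ids.dims[1]` onward before decoding, which suggests that the generated sequences begin with the prompt tokens and that only the tail holds the new text. The following is a minimal sketch of that trimming step using plain arrays instead of library tensors; `stripPrompt` and the token ids are hypothetical and are not part of Transformers.js.

```js
// Hypothetical helper (not a Transformers.js API): drop the first
// `promptLength` token ids from every sequence in a batch.
function stripPrompt(sequences, promptLength) {
  return sequences.map((ids) => ids.slice(promptLength));
}

// Made-up token ids: three prompt tokens followed by two generated tokens.
const promptLength = 3;
const batch = [[101, 102, 103, 7, 8]];

const generatedOnly = stripPrompt(batch, promptLength);
console.log(generatedOnly); // [ [ 7, 8 ] ]
```

In the README example the same idea is expressed on the output tensor, with the prompt length read from the second dimension of `inputs.input_ids`.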