metadata
base_model: jinaai/jina-embeddings-v2-base-zh
language:
- zh
- en
library_name: transformers.js
license: apache-2.0
tags:
- feature-extraction
- sentence-similarity
- mteb
- sentence_transformers
- transformers
inference: false
This repository provides https://huggingface.co/jinaai/jina-embeddings-v2-base-zh with ONNX weights, making it compatible with Transformers.js.
Usage (Transformers.js)
If you haven't already, you can install the Transformers.js JavaScript library from NPM using:
npm i @xenova/transformers
You can then use the model to compute embeddings, as follows:
import { pipeline, cos_sim } from '@xenova/transformers';
// Create a feature extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/jina-embeddings-v2-base-zh', {
    quantized: false, // Comment out this line to use the quantized version
});
// Compute sentence embeddings
const texts = ['How is the weather today?', '今天天气怎么样?'];
const output = await extractor(texts, { pooling: 'mean', normalize: true });
// Tensor {
// dims: [2, 768],
// type: 'float32',
// data: Float32Array(1536)[...],
// size: 1536
// }
// Compute cosine similarity between the two embeddings
const score = cos_sim(output[0].data, output[1].data);
console.log(score);
// 0.7860610759096025
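To make the similarity step above more concrete, here is a minimal sketch of the computation a cosine-similarity function performs, written in plain JavaScript with no dependencies. The helpers `cosineSimilarity` and `splitRows` are hypothetical names introduced for illustration and are not part of Transformers.js. The toy object only mimics the `dims` and `data` fields shown in the output above, and the sketch assumes `data` is a flat, row-major `Float32Array`.

```javascript
// Hypothetical helper (not a Transformers.js API): cosine similarity
// between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: split flat row-major data of shape [n, d] into n row views.
// Assumes `data` is a typed array, since it relies on `subarray`.
function splitRows(data, dims) {
  const [n, d] = dims;
  return Array.from({ length: n }, (_, i) => data.subarray(i * d, (i + 1) * d));
}

// Toy stand-in for a [2, 3] embedding tensor (assumed values, not model output).
const toy = { dims: [2, 3], data: new Float32Array([1, 0, 0, 0.6, 0.8, 0]) };
const [first, second] = splitRows(toy.data, toy.dims);
console.log(cosineSimilarity(first, second)); // approximately 0.6
```

Because the pipeline call above passes `normalize: true`, each embedding should already have unit length, so the denominator is close to 1 and the score is effectively the dot product of the two vectors.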
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named onnx).