
metadata

base_model: jinaai/jina-embeddings-v2-base-zh
language:
  - zh
  - en
library_name: transformers.js
license: apache-2.0
tags:
  - feature-extraction
  - sentence-similarity
  - mteb
  - sentence_transformers
  - transformers
inference: false

This repository provides jinaai/jina-embeddings-v2-base-zh (https://huggingface.co/jinaai/jina-embeddings-v2-base-zh) with ONNX weights, for compatibility with Transformers.js.

Usage (Transformers.js)

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

npm i @xenova/transformers

You can then use the model to compute embeddings, as follows:

import { pipeline, cos_sim } from '@xenova/transformers';

// Create a feature extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/jina-embeddings-v2-base-zh', {
    quantized: false, // Comment out this line to use the quantized version
});

// Compute sentence embeddings
const texts = ['How is the weather today?', '今天天气怎么样?'];
const output = await extractor(texts, { pooling: 'mean', normalize: true });
// Tensor {
//   dims: [2, 768],
//   type: 'float32',
//   data: Float32Array(1536)[...],
//   size: 1536
// }

// Compute cosine similarity between the two embeddings
const score = cos_sim(output[0].data, output[1].data);
console.log(score);
// 0.7860610759096025
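
To make the `pooling: 'mean'`, `normalize: true`, and `cos_sim` steps above more concrete, here is a minimal, dependency-free sketch of what they compute conceptually. It is not the library's implementation: the helper names (`meanPool`, `l2Normalize`, `dot`, `cosineSimilarity`) and the toy 3-dimensional vectors are hypothetical and chosen only for illustration, and the use of a padding mask in the averaging step is an assumption of this sketch.

```javascript
// Conceptual sketch only; helper names and toy data are hypothetical.

// Average the token vectors of one sentence, ignoring padded positions.
// tokenVectors: array of equal-length numeric arrays (one per token)
// mask: 1 for a real token, 0 for padding
function meanPool(tokenVectors, mask) {
    const dim = tokenVectors[0].length;
    const sum = new Array(dim).fill(0);
    let count = 0;
    tokenVectors.forEach((vec, i) => {
        if (!mask[i]) return; // skip padding tokens
        count += 1;
        for (let d = 0; d < dim; ++d) sum[d] += vec[d];
    });
    return sum.map((x) => x / Math.max(count, 1));
}

// Scale a vector to unit length (L2 norm of 1).
function l2Normalize(vec) {
    const norm = Math.hypot(...vec);
    return norm === 0 ? vec.slice() : vec.map((x) => x / norm);
}

// Dot product of two equal-length vectors.
function dot(a, b) {
    return a.reduce((acc, x, i) => acc + x * b[i], 0);
}

// Cosine similarity: dot product divided by the product of the norms.
function cosineSimilarity(a, b) {
    return dot(a, b) / (Math.hypot(...a) * Math.hypot(...b));
}

// Toy "token embeddings" for two sentences; the third token of the
// first sentence is padding and must not affect the result.
const sentenceA = l2Normalize(meanPool([[1, 0, 0], [0, 1, 0], [9, 9, 9]], [1, 1, 0]));
const sentenceB = l2Normalize(meanPool([[1, 1, 0], [1, 0, 1]], [1, 1]));

console.log(cosineSimilarity(sentenceA, sentenceB)); // approximately 0.866
console.log(dot(sentenceA, sentenceB));              // same value, up to rounding
```

Because both vectors are scaled to unit length, their cosine similarity equals their plain dot product, which is why normalizing once up front is convenient when comparing many embeddings.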

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named onnx).