Commit 182195d (parent: 1fdda36) by bwang0911

Update README.md

Files changed (1): README.md (+2, -6)
README.md CHANGED
@@ -1081,16 +1081,12 @@ model-index:
 `jina-embeddings-v2-base-zh` is a Chinese/English bilingual text **embedding model** supporting **8192 sequence length**.
 It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence length.
 We have designed it for high performance in monolingual & cross-language applications and trained it specifically to support mixed Chinese-English input without bias.
+Additionally, we provide the following embedding models:
 
 `jina-embeddings-v2-base-zh` 是支持中英双语的文本向量模型,它支持长达8192字符的文本编码。
 该模型的研发基于BERT架构(JinaBERT),JinaBERT是在BERT架构基础上的改进,首次将[ALiBi](https://arxiv.org/abs/2108.12409)应用到编码器架构中以支持更长的序列。
 不同于以往的单语言/多语言向量模型,我们设计双语模型来更好的支持单语言(中搜中)以及跨语言(中搜英)文档检索。
-
-The embedding model was trained using 512 sequence length, but extrapolates to 8k sequence length (or even longer) thanks to ALiBi.
-This makes our model useful for a range of use cases, especially when processing long documents is needed, including long document retrieval, semantic textual similarity, text reranking, recommendation, RAG and LLM-based generative search, etc.
-
-With a standard size of 161 million parameters, the model enables fast inference while delivering better performance than our small model. It is recommended to use a single GPU for inference.
-Additionally, we provide the following embedding models:
+除此之外,我们也提供其它向量模型:
 
 - [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en): 33 million parameters.
 - [`jina-embeddings-v2-base-en`](https://huggingface.co/jinaai/jina-embeddings-v2-base-en): 137 million parameters.
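
The README text above describes encoding mixed Chinese/English input at up to 8192 tokens. A minimal usage sketch follows; it assumes the `trust_remote_code` loading path and the `encode` convenience method that Jina's v2 embedding models expose on the Hugging Face Hub, and the example sentences and `max_length` value are illustrative only.

```python
from numpy.linalg import norm
from transformers import AutoModel

# Cosine similarity between two embedding vectors.
cos_sim = lambda a, b: (a @ b) / (norm(a) * norm(b))

# trust_remote_code=True pulls in the custom JinaBERT modeling code;
# the bidirectional ALiBi variant is not part of stock transformers.
model = AutoModel.from_pretrained(
    "jinaai/jina-embeddings-v2-base-zh", trust_remote_code=True
)

# encode() is a convenience method added by the custom model class:
# it tokenizes, runs the model, and pools into one vector per input text.
embeddings = model.encode(
    ["How is the weather today?", "今天天气怎么样?"],
    max_length=8192,  # ALiBi extrapolates beyond the 512-token training length
)

# A bilingual paraphrase pair should score high despite the language switch.
print(cos_sim(embeddings[0], embeddings[1]))
```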