michaelfeil committed on
Commit
4bada5a
1 Parent(s): 045e0b3

Upload intfloat/e5-small-v2 ctranslate fp16 weights

Files changed (1)
  1. README.md +20 -15
README.md CHANGED
@@ -2608,21 +2608,11 @@ Speedup inference while reducing memory by 2x-4x using int8 inference in C++ on

  quantized version of [intfloat/e5-small-v2](https://huggingface.co/intfloat/e5-small-v2)
  ```bash
- pip install hf-hub-ctranslate2>=2.0.8 ctranslate2>=3.16.0
- ```
- Converted on 2023-06-15 using
- ```
- ct2-transformers-converter --model intfloat/e5-small-v2 --output_dir ~/tmp-ct2fast-e5-small-v2 --force --copy_files tokenizer.json README.md tokenizer_config.json vocab.txt special_tokens_map.json .gitattributes --quantization float16 --trust_remote_code
+ pip install hf-hub-ctranslate2>=2.10.0 ctranslate2>=3.16.0
  ```

- Checkpoint compatible to [ctranslate2>=3.16.0](https://github.com/OpenNMT/CTranslate2)
- and [hf-hub-ctranslate2>=2.0.8](https://github.com/michaelfeil/hf-hub-ctranslate2)
- - `compute_type=int8_float16` for `device="cuda"`
- - `compute_type=int8` for `device="cpu"`
-
  ```python
- from transformers import AutoTokenizer
-
+ # from transformers import AutoTokenizer
  model_name = "michaelfeil/ct2fast-e5-small-v2"

  from hf_hub_ctranslate2 import EncoderCT2fromHfHub
@@ -2633,10 +2623,25 @@ model = EncoderCT2fromHfHub(
      compute_type="float16",
      # tokenizer=AutoTokenizer.from_pretrained("{ORG}/{NAME}")
  )
- outputs = model.generate(
-     text=["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
+ embeddings = model.encode(
+     ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
+     batch_size=32,
+     convert_to_numpy=True,
+     normalize_embeddings=True,
  )
- print(outputs.shape, outputs)
+ print(embeddings.shape, embeddings)
+ scores = (embeddings @ embeddings.T) * 100
+
+ ```
+
+ Checkpoint compatible to [ctranslate2>=3.16.0](https://github.com/OpenNMT/CTranslate2)
+ and [hf-hub-ctranslate2>=2.10.0](https://github.com/michaelfeil/hf-hub-ctranslate2)
+ - `compute_type=int8_float16` for `device="cuda"`
+ - `compute_type=int8` for `device="cpu"`
+
+ Converted on 2023-06-16 using
+ ```
+ ct2-transformers-converter --model intfloat/e5-small-v2 --output_dir ~/tmp-ct2fast-e5-small-v2 --force --copy_files tokenizer.json README.md tokenizer_config.json vocab.txt special_tokens_map.json .gitattributes --quantization float16 --trust_remote_code
  ```

  # Licence and other remarks:
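
Note on the added `scores` line: with `normalize_embeddings=True` the rows returned by `model.encode` should have unit length, so `(embeddings @ embeddings.T) * 100` amounts to pairwise cosine similarity scaled to a 0-100 range. The sketch below illustrates that arithmetic on made-up vectors rather than real model output. The names `cosine_scores` and `toy_embeddings` are hypothetical and not part of the commit, and the explicit normalization step is an assumption added so the example also works for rows that are not already unit length.

```python
import numpy as np


def cosine_scores(embeddings: np.ndarray) -> np.ndarray:
    """Return pairwise cosine similarity between rows, scaled by 100."""
    # Normalize each row to unit length (a no-op for already-normalized rows).
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    # For unit vectors the dot product equals the cosine of the angle between them.
    return (unit @ unit.T) * 100


# Made-up 3-dimensional vectors standing in for real sentence embeddings.
toy_embeddings = np.array(
    [
        [1.0, 0.0, 0.0],
        [0.8, 0.6, 0.0],
        [0.0, 0.0, 1.0],
    ]
)

scores = cosine_scores(toy_embeddings)
print(scores.round(1))
```

Each diagonal entry is a vector compared with itself, and the matrix is symmetric, so only the upper triangle carries distinct pair scores.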