## How to get embeddings

Currently, we offer a beta version of the embedding API in two versions, v1 and v2.
To get embeddings, send your text to the API endpoint.
You can send either a single sentence or multiple sentences.
The embeddings that correspond to the inputs will be returned.

API Endpoint: https://api.sionic.ai/v1/embedding

### Command line Example
Request:
```shell
curl https://api.sionic.ai/v1/embedding \
-H "Content-Type: application/json" \
-d '{
  "inputs": ["first query", "second query"]
}'
```

Response:
```json
{
  "embedding": [
    ...
  ]
}
```
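
The JSON body shown above can be handled with the standard library alone. A minimal parsing sketch; the placeholder float values and the two-sentence batch are assumptions for illustration, not real API output:

```python
import json

# Simulated response body with the same shape as the example above.
raw = '{"embedding": [[0.1, 0.2], [0.3, 0.4]]}'
data = json.loads(raw)
vectors = data["embedding"]
print(len(vectors))     # number of input sentences
print(len(vectors[0]))  # embedding dimension
```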
|

### Python code Example
Get embeddings by directly calling Sionic's embedding API.
```python
from typing import List

import numpy as np
import requests

def get_embedding(queries: List[str], url: str) -> np.ndarray:
    # Send the queries to the embedding endpoint and parse the result.
    response = requests.post(url=url, json={'inputs': queries})
    return np.asarray(response.json()['embedding'], dtype=np.float32)

url = "https://api.sionic.ai/v1/embedding"
inputs1 = ["first query", "second query"]
inputs2 = ["third query", "fourth query"]
embedding1 = get_embedding(inputs1, url=url)
embedding2 = get_embedding(inputs2, url=url)
similarity = embedding1 @ embedding2.T
print(similarity)
```
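
Note that the dot product above equals cosine similarity only when the embedding vectors are unit-normalized. This sketch assumes the API may return unnormalized vectors and normalizes explicitly; the toy arrays stand in for real embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Normalize each row to unit length, then take pairwise dot products.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy vectors standing in for API embeddings.
e1 = np.asarray([[1.0, 0.0], [0.0, 2.0]], dtype=np.float32)
e2 = np.asarray([[2.0, 0.0]], dtype=np.float32)
print(cosine_similarity(e1, e2))
```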

Using the pre-defined [SionicEmbeddingModel]() to obtain embeddings.

```python
import SionicEmbeddingModel

inputs1 = ["first query", "second query"]
inputs2 = ["third query", "fourth query"]
model = SionicEmbeddingModel(url="https://api.sionic.ai/v1/embedding",
                             dimension=2048)
embedding1 = model.encode(inputs1)
embedding2 = model.encode(inputs2)
similarity = embedding1 @ embedding2.T
print(similarity)
```

Inspired by [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding), we also apply an instruction to encode short queries for retrieval tasks.
Using `encode_queries()`, you can encode queries with the instruction prepended to each of them.
The instruction to use for both the v1 and v2 models is `"query: "`.

```python
import SionicEmbeddingModel

query = ["first query", "second query"]
passage = ["This is a passage related to the first query", "This is a passage related to the second query"]
model = SionicEmbeddingModel(url="https://api.sionic.ai/v1/embedding",
                             instruction="query: ",
                             dimension=2048)
query_embedding = model.encode_queries(query)
passage_embedding = model.encode(passage)
similarity = query_embedding @ passage_embedding.T
print(similarity)
```
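
If you call the raw API directly instead of the client, prepending the instruction is simple to replicate by hand. A minimal sketch; the helper name is hypothetical, not part of Sionic's client:

```python
from typing import List

def with_instruction(queries: List[str], instruction: str = "query: ") -> List[str]:
    # Prepend the retrieval instruction to each query string.
    return [instruction + q for q in queries]

print(with_instruction(["first query", "second query"]))
```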

## Massive Text Embedding Benchmark (MTEB) Evaluation

Both versions of Sionic AI's embeddings show state-of-the-art performance on the MTEB!