refactor: add docstring
modeling_clip.py  CHANGED  (+31, -1)
@@ -260,7 +260,37 @@ class JinaCLIPModel(JinaCLIPPreTrainedModel):
         normalize_embeddings: bool = False,
         **tokenizer_kwargs,
     ) -> Union[List[torch.Tensor], np.ndarray, torch.Tensor]:
-
+        """
+        Computes sentence embeddings.
+        Args:
+            sentences(`str` or `List[str]`):
+                Sentence or sentences to be encoded
+            batch_size(`int`, *optional*, defaults to 32):
+                Batch size for the computation
+            show_progress_bar(`bool`, *optional*, defaults to None):
+                Show a progress bar when encoding sentences.
+                If set to None, the progress bar is shown only when `logger.level == logging.INFO` or `logger.level == logging.DEBUG`.
+            output_value(`str`, *optional*, defaults to 'sentence_embedding'):
+                The default, 'sentence_embedding', returns sentence embeddings.
+                Can be set to 'token_embeddings' to get wordpiece token embeddings.
+                Set to None to get all output values.
+            convert_to_numpy(`bool`, *optional*, defaults to True):
+                If true, the output is a list of numpy vectors.
+                Otherwise, it is a list of pytorch tensors.
+            convert_to_tensor(`bool`, *optional*, defaults to False):
+                If true, a single stacked tensor is returned.
+                Overwrites any setting from convert_to_numpy.
+            device(`torch.device`, *optional*, defaults to None):
+                Which torch.device to use for the computation.
+            normalize_embeddings(`bool`, *optional*, defaults to False):
+                If set to true, returned vectors will have length 1. In that case, the faster dot-product (util.dot_score) can be used instead of cosine similarity.
+            tokenizer_kwargs(`Dict[str, Any]`, *optional*, defaults to {}):
+                Keyword arguments for the tokenizer
+        Returns:
+            By default, a list of tensors is returned.
+            If convert_to_tensor, a stacked tensor is returned.
+            If convert_to_numpy, a numpy matrix is returned.
+        """
         self.eval()
 
         if show_progress_bar is None:
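For context, here is a minimal sketch of how the documented parameters might be used once this docstring lands. The method name `encode_text`, the `jinaai/jina-clip-v1` checkpoint id, and loading via `AutoModel` with `trust_remote_code=True` are assumptions not shown in this hunk, which only contains the tail of the signature and the docstring; adjust them to the actual API exposed by `JinaCLIPModel`.

```python
# Hedged usage sketch: method name and checkpoint id are assumptions,
# since this diff only shows the parameter list and the new docstring.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "jinaai/jina-clip-v1", trust_remote_code=True  # assumed checkpoint id
)

sentences = ["A photo of a cat", "A photo of a dog"]

# With the documented defaults (convert_to_numpy=True), a numpy matrix is returned.
embeddings = model.encode_text(sentences, batch_size=32, show_progress_bar=True)

# convert_to_tensor=True overrides convert_to_numpy and returns one stacked tensor;
# normalize_embeddings=True yields unit-length vectors, so a plain dot product
# equals cosine similarity.
unit_embeddings = model.encode_text(
    sentences, convert_to_tensor=True, normalize_embeddings=True
)
similarity = unit_embeddings @ unit_embeddings.T
```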