Zeitstaub committed
Commit d2f8be4
1 Parent(s): 9c4539c

Update app.py

Files changed (1): app.py (+6 −4)
app.py CHANGED
@@ -53,11 +53,11 @@ def find_similar_texts(model_name, input_text):
 
 # Create Gradio interface using Blocks
 with gr.Blocks() as demo:
-    gr.Markdown("## Sentence-Transformer based Patent-Abstract Search")
+    gr.Markdown("## Sentence-Transformer based AI-Generated-Patent-Abstract Search")
     with gr.Row():
         with gr.Column():
             model_selector = gr.Dropdown(choices=list(model_options.keys()), label="Choose Sentence-Transformer")
-            text_input = gr.Textbox(lines=2, placeholder="machine learning for drug dosing", label="input_text (example: machine learning for drug dosing. Remember, this is only a small selection of machine learning patents!)")
+            text_input = gr.Textbox(lines=2, placeholder="machine learning for drug dosing", label="input_text (example: machine learning for drug dosing. Remark: This is only a small number of AI-generated machine learning patents!)")
             submit_button = gr.Button("search")
 
         with gr.Column():
@@ -68,7 +68,7 @@ with gr.Blocks() as demo:
 
     gr.Markdown("""
 ### Description
-This demo app leverages several Sentence Transformer models to compute the semantic distance between user input and a small number of patent abstracts in the field of machine learning and AI.
+This demo app leverages several Sentence Transformer models to compute the semantic distance between user input and a small number of AI-generated patent abstracts in the field of machine learning and AI.
 
 - 'all-MiniLM-L6-v2': embedding size is 384. [More info](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) and [here](https://sbert.net/).
 - 'intfloat/e5-large-v2': Text Embeddings by Weakly-Supervised Contrastive Pre-training, embedding size is 1024. [More info](https://huggingface.co/intfloat/e5-large-v2).
@@ -76,7 +76,7 @@ with gr.Blocks() as demo:
 - 'thenlper/gte-large': General Text Embeddings (GTE) model, embedding size is 1024. [More info](https://huggingface.co/thenlper/gte-large) and [here](https://arxiv.org/abs/2308.03281).
 - 'avsolatorio/GIST-large-Embedding-v0': Fine-tuned on top of the BAAI/bge-large-en-v1.5 using the MEDI dataset augmented with mined triplets from the MTEB Classification training dataset, embedding size is 1024. [More info](https://huggingface.co/avsolatorio/GIST-large-Embedding-v0) and [here](https://arxiv.org/abs/2402.16829).
 
-The patents can be viewed at [Espacenet](https://worldwide.espacenet.com/?locale=en_EP), the free online service by the European Patent Office.
+
 
 Please note: The data used in this demo contains only a very limited subset of patent abstracts and is intended only for demonstration purposes. It does by far not cover all patents or their complete data.
 """)
@@ -84,3 +84,5 @@ Please note: The data used in this demo contains only a very limited subset of p
     text_input.submit(find_similar_texts, inputs=[model_selector, text_input], outputs=output)
 
 demo.launch()
+
+# The patents can be viewed at [Espacenet](https://worldwide.espacenet.com/?locale=en_EP), the free online service by the European Patent Office.
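The Dropdown's choices are read from a `model_options` mapping that is defined outside the hunks shown here. A plausible shape, assuming display names keyed to the Hugging Face model IDs listed in the description (the dict below is an illustrative guess, not the actual contents of app.py):

```python
# Hypothetical reconstruction of the model_options mapping used by the
# gr.Dropdown; the real dict in app.py is not visible in this diff.
model_options = {
    "all-MiniLM-L6-v2": "sentence-transformers/all-MiniLM-L6-v2",
    "e5-large-v2": "intfloat/e5-large-v2",
    "gte-large": "thenlper/gte-large",
    "GIST-large-Embedding-v0": "avsolatorio/GIST-large-Embedding-v0",
}

# The Dropdown would show the keys; the values select the model to load.
print(list(model_options.keys()))
```

Keeping display names separate from model IDs lets the UI stay readable while the loader still receives the full repository path.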
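`find_similar_texts` itself lies outside the hunks above, but the ranking such a search performs can be sketched independently of any model: embed the query, embed the abstracts, and sort by cosine similarity. The sketch below is a minimal, model-free illustration using toy vectors in place of real sentence-transformer embeddings; `rank_by_cosine` and the toy data are assumptions, not code from app.py.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_cosine(query_vec, doc_vecs):
    """Indices of doc_vecs sorted by descending similarity to query_vec."""
    sims = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: -sims[i])

# Toy 3-dimensional "embeddings" standing in for sentence-transformer output.
abstracts = [
    [1.0, 0.0, 0.0],  # abstract 0: nearly parallel to the query
    [0.9, 0.1, 0.0],  # abstract 1: close, but slightly off-axis
    [0.0, 1.0, 0.0],  # abstract 2: an orthogonal topic
]
query = [1.0, 0.05, 0.0]

ranking = rank_by_cosine(query, abstracts)
print(ranking)  # → [0, 1, 2]
```

In the real app the toy lists would be replaced by `model.encode(...)` output from whichever model the dropdown selects; the sorting step is unchanged.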