ml6team/keyphrase-extraction-distilbert-openkp

@@ -32,7 +32,7 @@ Keyphrase extraction is a technique in text analysis where you extract the impor
 ## 📓 Model Description
-This model is a fine-tuned distilbert model on the openkp dataset. More information can be found here: https://huggingface.co/distilbert-base-uncased.
 The model is fine-tuned as a token classification problem where the text is labeled using the BIO scheme.
@@ -79,18 +79,20 @@ class KeyphraseExtractionPipeline(TokenClassificationPipeline):
 ```python
 # Load pipeline
-model_name = "DeDeckerThomas/keyphrase-extraction-distilbert-openkp"
 extractor = KeyphraseExtractionPipeline(model=model_name)
 ```
 ```python
 # Inference
 text = """
 Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
-Since this is a time-consuming process, Artificial Intelligence is used to automate it.
-Currently, classical machine learning methods, that use statistics and linguistics, are widely used for the extraction process.
-The fact that these methods have been widely used in the community has the advantage that there are many easy-to-use libraries.
-Now with the recent innovations in deep learning methods (such as recurrent neural networks and transformers, GANS, …),
-keyphrase extraction can be improved. These new methods also focus on the semantics and context of a document, which is quite an improvement.
 """.replace(
     "\n", ""
 )
@@ -102,10 +104,7 @@ print(keyphrases)
 ```
 # Output
-['Artificial Intelligence' 'GANS' 'Keyphrase extraction'
- 'classical machine learning' 'deep learning methods'
- 'keyphrase extraction' 'linguistics' 'recurrent neural networks'
- 'semantics' 'statistics' 'text analysis' 'transformers']
 ```
 ## 📚 Training Dataset
@@ -163,7 +162,7 @@ def preprocess_fuction(all_samples_per_split):
 ```
 ### Postprocessing
-For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive B and Is. As last you strip the keyphrase to ensure all spaces are removed.
 ```python
 # Define post_process functions
 def concat_tokens_by_tag(keyphrases):
@@ -207,4 +206,4 @@ The model achieves the following results on the OpenKP test set:
 For more information on the evaluation process, you can take a look at the keyphrase extraction evaluation notebook.
 ## 🚨 Issues
-Please feel free to contact Thomas De Decker for any problems with this model.

 ## 📓 Model Description
+This model is a fine-tuned distilbert model on the OpenKP dataset. More information can be found here: https://huggingface.co/distilbert-base-uncased.
 The model is fine-tuned as a token classification problem where the text is labeled using the BIO scheme.
 ```python
 # Load pipeline
+model_name = "ml6team/keyphrase-extraction-distilbert-openkp"
 extractor = KeyphraseExtractionPipeline(model=model_name)
 ```
 ```python
 # Inference
 text = """
 Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
+Since this is a time-consuming process, Artificial Intelligence is used to automate it.
+Currently, classical machine learning methods, that use statistics and linguistics,
+are widely used for the extraction process. The fact that these methods have been widely used in the community
+has the advantage that there are many easy-to-use libraries. Now with the recent innovations in NLP,
+transformers can be used to improve keyphrase extraction. Transformers also focus on the semantics
+and context of a document, which is quite an improvement.
 """.replace(
     "\n", ""
 )
 ```
 # Output
+['keyphrase extraction', 'text analysis']
 ```
 ## 📚 Training Dataset
 ```
 ### Postprocessing
+For the post-processing, you will need to filter out the B and I labeled tokens and concat the consecutive Bs and Is. As last you strip the keyphrase to ensure all spaces are removed.
 ```python
 # Define post_process functions
 def concat_tokens_by_tag(keyphrases):
 For more information on the evaluation process, you can take a look at the keyphrase extraction evaluation notebook.
 ## 🚨 Issues
+Please feel free to start discussions in the Community Tab.
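
The post-processing step described above (keep the B and I labeled tokens, join consecutive ones, then strip the result) can be illustrated with a minimal sketch. This is not the model card's `concat_tokens_by_tag` function, whose body is not shown in this diff. The function name `extract_keyphrases_from_bio`, the bare `"B"`/`"I"`/`"O"` label strings, and the choice to drop an `I` token that has no preceding `B` are all assumptions made for illustration.

```python
def extract_keyphrases_from_bio(tokens, labels):
    """Group B/I-labeled tokens into keyphrases (hypothetical helper).

    Assumes `tokens` and `labels` are parallel lists and that labels are
    the plain strings "B", "I" and "O".
    """
    keyphrases = []
    current = []  # tokens of the keyphrase currently being built

    def flush():
        # Join the collected tokens and strip surrounding whitespace.
        if current:
            keyphrases.append(" ".join(current).strip())
            current.clear()

    for token, label in zip(tokens, labels):
        if label == "B":
            # A new keyphrase starts; close any open one first.
            flush()
            current.append(token)
        elif label == "I" and current:
            # Continue the open keyphrase.
            current.append(token)
        else:
            # "O", or a stray "I" without a preceding "B" (dropped here
            # by assumption): close any open keyphrase.
            flush()

    flush()  # close a keyphrase that runs to the end of the text
    return keyphrases


tokens = ["keyphrase", "extraction", "is", "a", "technique", "in", "text", "analysis"]
labels = ["B", "I", "O", "O", "O", "O", "B", "I"]
print(extract_keyphrases_from_bio(tokens, labels))
# ['keyphrase extraction', 'text analysis']
```

A real pipeline would also need to merge subword pieces produced by the tokenizer and deduplicate the results, which this sketch leaves out.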