tner/twitter-roberta-base-dec2021-tweetner7-2020


asahi417 committed on Sep 26, 2022

Commit 7425566 • 1 Parent(s): b4c884e

model update

Files changed (1)

README.md +24 -4


@@ -73,7 +73,7 @@ model-index:
 pipeline_tag: token-classification
 widget:
-- text: "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from {{@Herbie Hancock@}} via {{USERNAME}} link below: {{URL}}"
   example_title: "NER Example 1"
 ---
 # tner/twitter-roberta-base-dec2021-tweetner7-2020
@@ -112,15 +112,34 @@ Full evaluation can be found at [metric file of NER](https://huggingface.co/tner
 and [metric file of entity span](https://huggingface.co/tner/twitter-roberta-base-dec2021-tweetner7-2020/raw/main/eval/metric_span.json).
 ### Usage
-This model can be used through the [tner library](https://github.com/asahi417/tner). Install the library via pip
 ```shell
 pip install tner
 ```
-and activate model as below.
 ```python
 from tner import TransformersNER
 model = TransformersNER("tner/twitter-roberta-base-dec2021-tweetner7-2020")
-model.predict(["Jacob Collier is a Grammy awarded English artist from London"])
 ```
 It can be used via transformers library but it is not recommended as CRF layer is not supported at the moment.
@@ -166,3 +185,4 @@ If you use any resource from T-NER, please consider to cite our [paper](https://
 }
 ```

 pipeline_tag: token-classification
 widget:
+- text: "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from {@herbiehancock@} via {@bluenoterecords@} link below: {{URL}}"
   example_title: "NER Example 1"
 ---
 # tner/twitter-roberta-base-dec2021-tweetner7-2020
 and [metric file of entity span](https://huggingface.co/tner/twitter-roberta-base-dec2021-tweetner7-2020/raw/main/eval/metric_span.json).
 ### Usage
+This model can be used through the [tner library](https://github.com/asahi417/tner). Install the library via pip.
 ```shell
 pip install tner
 ```
+[TweetNER7](https://huggingface.co/datasets/tner/tweetner7) pre-processed tweets where the account name and URLs are
+converted into special formats (see the dataset page for more detail), so we process tweets accordingly and then run the model prediction as below.
 ```python
+import re
+from urlextract import URLExtract
 from tner import TransformersNER
+extractor = URLExtract()
+def format_tweet(tweet):
+    # mask web urls
+    urls = extractor.find_urls(tweet)
+    for url in urls:
+        tweet = tweet.replace(url, "{{URL}}")
+    # format twitter account
+    tweet = re.sub(r"\b(\s*)(@[\S]+)\b", r'\1{\2@}', tweet)
+    return tweet
+text = "Get the all-analog Classic Vinyl Edition of `Takin' Off` Album from @herbiehancock via @bluenoterecords link below: http://bluenote.lnk.to/AlbumOfTheWeek"
+text_format = format_tweet(text)
 model = TransformersNER("tner/twitter-roberta-base-dec2021-tweetner7-2020")
+model.predict([text_format])
 ```
 It can be used via transformers library but it is not recommended as CRF layer is not supported at the moment.
 }
 ```
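
The `format_tweet` helper added in this commit anchors its mention pattern with `\b`, which needs a word character before the optional whitespace, so a mention at the very start of a tweet is left unformatted. The following is a minimal sketch of a stdlib-only variant, not the author's implementation. The name `format_tweet_sketch`, the two compiled patterns, and the simplified `https?://` URL rule (standing in for `urlextract`) are assumptions made for illustration; the output markers `{{URL}}` and `{@...@}` are the ones used in the commit.

```python
import re

# Assumption: a plain http(s) pattern stands in for urlextract's URL detection.
# It will also swallow punctuation that directly follows a URL.
URL_PATTERN = re.compile(r"https?://\S+")

# Assumption: handles consist of letters, digits, and underscores.
# The lookbehind skips e-mail-like strings such as "name@example.com"
# while still allowing a mention at the start of the tweet.
MENTION_PATTERN = re.compile(r"(?<![\w@])@(\w+)")


def format_tweet_sketch(tweet: str) -> str:
    """Mask URLs as {{URL}} and wrap account mentions as {@name@}."""
    tweet = URL_PATTERN.sub("{{URL}}", tweet)
    return MENTION_PATTERN.sub(r"{@\1@}", tweet)


if __name__ == "__main__":
    example = "@herbiehancock via @bluenoterecords link below: http://bluenote.lnk.to/AlbumOfTheWeek"
    print(format_tweet_sketch(example))
```

The formatted string can then be passed to `model.predict([...])` in the same way as `text_format` in the README example.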