---
tags:
- spacy
- token-classification
language:
- en
license: apache-2.0
model-index:
- name: en_roberta_base_plant_ner_case
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.9697542533
    - name: NER Recall
      type: recall
      value: 0.9752851711
    - name: NER F Score
      type: f_score
      value: 0.9725118483
widget:
  - text: "I bought some bananas, apples and oranges from the market"
  - text: "He snacked on some grapes and sliced an apple during the movie."
  - text: "Pineapple is a tropical fruit with a sweet and juicy flesh and a tangy, tropical flavour."
---
| Feature | Description |
| --- | --- |
| **Name** | `en_roberta_base_plant_ner_case` |
| **Version** | `1.0.0` |
| **spaCy** | `>=3.5.2,<3.6.0` |
| **Default Pipeline** | `transformer`, `ner` |
| **Components** | `transformer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | n/a |
| **License** | `Apache-2.0` |
| **Author** | [Mohammad Othman](https://mohammadothman.com) |
| **GitHub** | [GitHub](https://github.com/OthmanMohammad) |


## Model Architecture and Training

The Named Entity Recognition (NER) model uses a pipeline architecture with two components: a transformer and an NER head. The transformer component wraps the pre-trained **RoBERTa-base** model (a BERT-style architecture with an improved pretraining procedure), uses a fast tokenizer, and processes input text in strided spans with a window of **128 tokens** and a stride of **96 tokens**.

The NER component is a **Transition-Based Parser (v2)** with a hidden width of **64** and maxout pieces set to **2**. It uses a Transformer Listener for the tok2vec layer with a grad_factor of **1.0** and mean pooling. 
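These settings ship in the model package's `config.cfg`. As a minimal sketch (assuming the standard spaCy transformer configuration keys), they can be inspected from the loaded pipeline:

```python
import spacy

nlp = spacy.load("en_roberta_base_plant_ner_case")

# Span getter used by the transformer component (strided spans: window / stride)
print(nlp.config["components"]["transformer"]["model"]["get_spans"])

# NER model settings (transition-based parser: hidden width, maxout pieces, tok2vec listener)
print(nlp.config["components"]["ner"]["model"])
```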

Training was performed on a **Tesla V100 GPU**. The optimizer was **Adam** with a warmup-linear learning-rate schedule, **L2 regularization of 0.01**, and gradient clipping at **1.0**. A batch size of **128** was used, with gradients accumulated over **3** steps and a dropout rate of **0.1**. Training ran with a patience of **1,600** steps, a maximum of **20,000** steps, and an evaluation frequency of **200** steps.

The learning rate was warmed up linearly over the first **250** steps to a peak of **0.00005**, then decayed linearly over the remaining steps up to the **20,000**-step limit. This setup produced strong results in both accuracy and training efficiency.
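As a rough sketch (not the original training configuration), the optimizer and schedule described above map onto Thinc's API roughly as follows:

```python
from thinc.api import Adam, warmup_linear

# Adam with L2 regularization 0.01 and gradient clipping at 1.0, as listed above
optimizer = Adam(
    learn_rate=0.00005,
    L2=0.01,
    grad_clip=1.0,
)

# Warmup-linear schedule: 250 warmup steps up to 5e-5, then linear decay until step 20,000
lr_schedule = warmup_linear(initial_rate=0.00005, warmup_steps=250, total_steps=20000)
```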


## Model Capabilities

This model identifies more than 500 different fruits and vegetables, including many varieties and cultivars, under a single `PLANT` label. On its evaluation set it reaches approximately **0.970** precision, **0.975** recall, and a **0.973** F-score (see the metrics above).


### Requirements

- **spaCy**: `>=3.5.2,<3.6.0`
- **spaCy Transformers**: `>=1.2.3,<1.3.0`


### Example Usage
Install the packaged model wheel directly from the Hugging Face Hub:

```bash
pip install https://huggingface.co/MohammadOthman/en_roberta_base_plant_ner_case/resolve/main/en_roberta_base_plant_ner_case-any-py3-none-any.whl
```

```python
import spacy
from spacy import displacy

# Load the packaged pipeline
nlp = spacy.load("en_roberta_base_plant_ner_case")

text = "I bought some bananas, apples, and oranges from the market."
doc = nlp(text)

# Visualize the recognized PLANT entities (inside a Jupyter notebook)
displacy.render(doc, style="ent", jupyter=True)
```
<img src="https://i.ibb.co/8cmhVG7/displacy-output.png" alt="Displacy Example" width="600"/>
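Outside of a notebook, the recognized entities can also be read directly from `doc.ents` (continuing the usage example above):

```python
# Each entity exposes its text, label, and character offsets
for ent in doc.ents:
    print(ent.text, ent.label_, ent.start_char, ent.end_char)
```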


### Label Scheme

| Component | Labels |
| --- | --- |
| **`ner`** | `PLANT` |
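
The label set can also be inspected programmatically from a loaded pipeline (continuing the usage example above):

```python
# Entity labels exposed by the NER component, e.g. ('PLANT',)
print(nlp.get_pipe("ner").labels)
```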


### Citation

If you use this model in your research or applications, please cite it as follows:

```bibtex
@misc{othman2023en_roberta_base_plant_ner_case,
  author = {Mohammad Othman},
  title = {en_roberta_base_plant_ner_case: A Named Entity Recognition Model for Identifying Plant Names},
  year = {2023},
  publisher = {Hugging Face Model Hub},
  url = {https://huggingface.co/MohammadOthman/en_roberta_base_plant_ner_case}
}
```

### Feedback and Support

For any questions, issues, or suggestions related to this model, please feel free to start a discussion on the [model's discussion board](https://huggingface.co/MohammadOthman/en_roberta_base_plant_ner_case/discussions).

If you need further assistance or would like to provide feedback directly to the author, you can contact Mohammad Othman via email at [Mo@MohammadOthman.com](mailto:Mo@MohammadOthman.com).