---
tags:
- spacy
- token-classification
language:
- en
model-index:
- name: en_spacy_pii_distilbert
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.9530385872
    - name: NER Recall
      type: recall
      value: 0.9554103008
    - name: NER F Score
      type: f_score
      value: 0.9542229703
widget:
- text: >-
    SELECT shipping FROM users WHERE shipping = '201 Thayer St Providence RI 02912'
datasets:
- beki/privy
---

| Feature | Description |
| --- | --- |
| **Name** | `en_spacy_pii_distilbert` |
| **Version** | `0.0.1` |
| **spaCy** | `>=3.4.1,<=3.8.2` |
| **Default Pipeline** | `transformer`, `ner` |
| **Components** | `transformer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | Trained on a new [dataset for structured PII](https://huggingface.co/datasets/beki/privy) generated by [Privy](https://github.com/pixie-io/pixie/tree/main/src/datagen/pii/privy). For more details, see this [blog post](https://blog.px.dev/detect-pii/) |
| **License** | MIT |
| **Author** | [Benjamin Kilimnik](https://www.linkedin.com/in/benkilimnik/) |

### Label Scheme
View label scheme (5 labels for 1 component)

| Component | Labels |
| --- | --- |
| **`ner`** | `DATE_TIME`, `LOC`, `NRP`, `ORG`, `PER` |
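
As a rough illustration of how the five labels might be used downstream, the sketch below replaces labeled character spans with `<LABEL>` placeholders. The `redact` helper and the `PII_LABELS` set are hypothetical names introduced here, not part of spaCy or of this model, and the span in the example is written by hand rather than predicted by the model. With the pipeline installed, equivalent spans could be read from spaCy's `doc.ents` via `ent.start_char`, `ent.end_char`, and `ent.label_`.

```python
# The five entity labels listed in the label scheme above.
PII_LABELS = {"DATE_TIME", "LOC", "NRP", "ORG", "PER"}


def redact(text, spans):
    """Replace each (start, end, label) character span with a <LABEL> placeholder.

    Hypothetical helper for illustration; not part of spaCy or this model.
    """
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label not in PII_LABELS:
            raise ValueError(f"unexpected label: {label}")
        if start < cursor:
            raise ValueError("overlapping spans are not supported")
        out.append(text[cursor:start])  # text before the entity, unchanged
        out.append(f"<{label}>")        # placeholder instead of the entity text
        cursor = end
    out.append(text[cursor:])           # trailing text after the last entity
    return "".join(out)


# With the model installed, spans would come from doc.ents, e.g.
#   [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
# The span below is hand-written for illustration; it is NOT model output.
query = "SELECT shipping FROM users WHERE shipping = '201 Thayer St Providence RI 02912'"
address = "201 Thayer St Providence RI 02912"
start = query.index(address)
spans = [(start, start + len(address), "LOC")]

print(redact(query, spans))
# SELECT shipping FROM users WHERE shipping = '<LOC>'
```

Working from character offsets keeps the helper independent of how the spans were produced, so the same function could be reused with any tagger that emits these labels.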
### Accuracy

| Type | Score |
| --- | --- |
| `ENTS_F` | 95.42 |
| `ENTS_P` | 95.30 |
| `ENTS_R` | 95.54 |
| `TRANSFORMER_LOSS` | 61154.85 |
| `NER_LOSS` | 56001.88 |
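
The three entity scores are related: assuming the standard F1 definition (the harmonic mean of precision and recall), `ENTS_F` can be recomputed from the unrounded precision and recall values in the model-index metadata. The short check below is a sketch of that arithmetic; the variable names are chosen here for illustration.

```python
# Unrounded values from the model-index metadata of this card.
precision = 0.9530385872
recall = 0.9554103008

# Standard F1: harmonic mean of precision and recall.
f_score = 2 * precision * recall / (precision + recall)

print(round(100 * f_score, 2))
# 95.42
```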