---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
model-index:
- name: distilbert_finetuned_ai4privacy_v2
  results: []
datasets:
- ai4privacy/pii-masking-200k
pipeline_tag: token-classification
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# distilbert_finetuned_ai4privacy_v2

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the `ai4privacy/pii-masking-200k` dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0451
- Overall Precision: 0.9438
- Overall Recall: 0.9663
- Overall F1: 0.9549
- Overall Accuracy: 0.9838
- Accountname F1: 0.9946
- Accountnumber F1: 0.9940
- Age F1: 0.9624
- Amount F1: 0.9643
- Bic F1: 0.9929
- Bitcoinaddress F1: 0.9948
- Buildingnumber F1: 0.9845
- City F1: 0.9955
- Companyname F1: 0.9962
- County F1: 0.9877
- Creditcardcvv F1: 0.9643
- Creditcardissuer F1: 0.9953
- Creditcardnumber F1: 0.9793
- Currency F1: 0.7811
- Currencycode F1: 0.8850
- Currencyname F1: 0.2281
- Currencysymbol F1: 0.9562
- Date F1: 0.9061
- Dob F1: 0.7914
- Email F1: 1.0
- Ethereumaddress F1: 1.0
- Eyecolor F1: 0.9837
- Firstname F1: 0.9846
- Gender F1: 0.9971
- Height F1: 0.9910
- Iban F1: 0.9906
- Ip F1: 0.4349
- Ipv4 F1: 0.8126
- Ipv6 F1: 0.7679
- Jobarea F1: 0.9880
- Jobtitle F1: 0.9991
- Jobtype F1: 0.9777
- Lastname F1: 0.9684
- Litecoinaddress F1: 0.9721
- Mac F1: 1.0
- Maskednumber F1: 0.9635
- Middlename F1: 0.9330
- Nearbygpscoordinate F1: 1.0
- Ordinaldirection F1: 0.9910
- Password F1: 1.0
- Phoneimei F1: 0.9918
- Phonenumber F1: 0.9962
- Pin F1: 0.9477
- Prefix F1: 0.9546
- Secondaryaddress F1: 0.9892
- Sex F1: 0.9876
- Ssn F1: 0.9976
- State F1: 0.9893
- Street F1: 0.9873
- Time F1: 0.9889
- Url F1: 1.0
- Useragent F1: 0.9953
- Username F1: 0.9975
- Vehiclevin F1: 1.0
- Vehiclevrm F1: 1.0
- Zipcode F1: 0.9873
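
The overall F1 above is the harmonic mean of the overall precision and recall. The short check below recomputes it from the two reported values; the helper name `f1_score` is illustrative and is not part of the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported in the list above.
overall_f1 = f1_score(0.9438, 0.9663)
print(round(overall_f1, 4))  # 0.9549, matching the reported Overall F1
```

The per-entity scores are not uniform: Currencyname (0.2281) and Ip (0.4349) are far below the overall figure, so the aggregate alone overstates performance on those labels.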

## Model description

More information needed

## Intended uses & limitations

More information needed
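
Since the card is tagged `token-classification` and the training data is a PII-masking dataset, one plausible use is redacting detected spans. The sketch below assumes predictions shaped like the output of a `transformers` token-classification pipeline run with an aggregation strategy (dictionaries carrying `entity_group`, `start`, and `end`). The predictions here are hand-written for illustration, not real model output, and the label spellings `FIRSTNAME` and `EMAIL` are assumptions.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict]) -> str:
    """Replace each detected character span with a [LABEL] placeholder."""
    # Process spans from right to left so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        text = text[: ent["start"]] + placeholder + text[ent["end"]:]
    return text


text = "Contact Jane at jane@example.com"

# Hypothetical predictions, written by hand for this example.
predictions = [
    {"entity_group": "FIRSTNAME", "start": 8, "end": 12},
    {"entity_group": "EMAIL", "start": 16, "end": 32},
]

print(redact(text, predictions))  # Contact [FIRSTNAME] at [EMAIL]
```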

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 5
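
To make the scheduler settings concrete, here is a rough pure-Python sketch of a cosine-with-restarts schedule with linear warmup. It assumes 5440 total optimizer steps (the final step in the training results table) and a single cosine cycle, which is believed to be the default when no cycle count is configured; treat both as assumptions rather than confirmed settings of this run. The function name `lr_at` is illustrative.

```python
import math

BASE_LR = 5e-05                          # learning_rate from the list above
TOTAL_STEPS = 5440                       # assumed: last step in the results table
WARMUP_STEPS = round(0.2 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio = 0.2 -> 1088


def lr_at(step: int, num_cycles: int = 1) -> float:
    """Learning rate at a given optimizer step (num_cycles=1 is an assumption)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # Cosine decay; the modulo restarts the curve once per cycle.
    cosine = math.cos(math.pi * ((num_cycles * progress) % 1.0))
    return BASE_LR * max(0.0, 0.5 * (1.0 + cosine))


for step in (0, 544, 1088, 3264, 5440):
    print(step, lr_at(step))
```

Under these assumptions the warmup covers 1088 steps, which is exactly the first epoch in the results table, and the rate then decays to zero by step 5440.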

### Training results

| Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy | Accountname F1 | Accountnumber F1 | Age F1 | Amount F1 | Bic F1 | Bitcoinaddress F1 | Buildingnumber F1 | City F1 | Companyname F1 | County F1 | Creditcardcvv F1 | Creditcardissuer F1 | Creditcardnumber F1 | Currency F1 | Currencycode F1 | Currencyname F1 | Currencysymbol F1 | Date F1 | Dob F1 | Email F1 | Ethereumaddress F1 | Eyecolor F1 | Firstname F1 | Gender F1 | Height F1 | Iban F1 | Ip F1  | Ipv4 F1 | Ipv6 F1 | Jobarea F1 | Jobtitle F1 | Jobtype F1 | Lastname F1 | Litecoinaddress F1 | Mac F1 | Maskednumber F1 | Middlename F1 | Nearbygpscoordinate F1 | Ordinaldirection F1 | Password F1 | Phoneimei F1 | Phonenumber F1 | Pin F1 | Prefix F1 | Secondaryaddress F1 | Sex F1 | Ssn F1 | State F1 | Street F1 | Time F1 | Url F1 | Useragent F1 | Username F1 | Vehiclevin F1 | Vehiclevrm F1 | Zipcode F1 |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|:--------------:|:----------:|:----------------:|:--------------:|:----------------:|:------:|:---------:|:------:|:-----------------:|:-----------------:|:-------:|:--------------:|:---------:|:----------------:|:-------------------:|:-------------------:|:-----------:|:---------------:|:---------------:|:-----------------:|:-------:|:------:|:--------:|:------------------:|:-----------:|:------------:|:---------:|:---------:|:-------:|:------:|:-------:|:-------:|:----------:|:-----------:|:----------:|:-----------:|:------------------:|:------:|:---------------:|:-------------:|:----------------------:|:-------------------:|:-----------:|:------------:|:--------------:|:------:|:---------:|:-------------------:|:------:|:------:|:--------:|:---------:|:-------:|:------:|:------------:|:-----------:|:-------------:|:-------------:|:----------:|
| 0.6445        | 1.0   | 1088 | 0.3322          | 0.6449            | 0.7003         | 0.6714     | 0.8900           | 0.7607         | 0.8733           | 0.6576 | 0.1766    | 0.25   | 0.6783            | 0.3621            | 0.6005  | 0.6909         | 0.5586    | 0.0              | 0.2449              | 0.7095              | 0.2889      | 0.0             | 0.0             | 0.3902            | 0.7720  | 0.0    | 0.9862   | 0.8011             | 0.5088      | 0.7740       | 0.7118    | 0.5434    | 0.8088  | 0.0    | 0.8303  | 0.7562  | 0.5318     | 0.7294      | 0.4681     | 0.6779      | 0.0                | 0.8909 | 0.0             | 0.0107        | 0.9985                 | 0.4000              | 0.7307      | 0.9057       | 0.8618         | 0.0    | 0.9127    | 0.8235              | 0.9211 | 0.8026 | 0.4656   | 0.6390    | 0.9383  | 0.9775 | 0.8868       | 0.8201      | 0.4526        | 0.0550        | 0.5368     |
| 0.222         | 2.0   | 2176 | 0.1259          | 0.8170            | 0.8747         | 0.8449     | 0.9478           | 0.9708         | 0.9813           | 0.7638 | 0.7427    | 0.7837 | 0.8908            | 0.8833            | 0.8747  | 0.9814         | 0.8749    | 0.7601           | 0.9777              | 0.8834              | 0.5372      | 0.4828          | 0.0056          | 0.7785            | 0.8149  | 0.3140 | 0.9956   | 0.9935             | 0.9101      | 0.9270       | 0.9450    | 0.9853    | 0.9253  | 0.0650 | 0.0084  | 0.7962  | 0.9013     | 0.9446      | 0.9203     | 0.8555      | 0.6885             | 1.0    | 0.7152          | 0.6442        | 1.0                    | 0.9623              | 0.9349      | 0.9905       | 0.9782         | 0.7656 | 0.9324    | 0.9903              | 0.9736 | 0.9274 | 0.8520   | 0.9138    | 0.9678  | 0.9922 | 0.9893       | 0.9804      | 0.9646        | 0.8556        | 0.8385     |
| 0.1331        | 3.0   | 3264 | 0.0773          | 0.9133            | 0.9371         | 0.9250     | 0.9654           | 0.9822         | 0.9815           | 0.9196 | 0.8852    | 0.9718 | 0.9785            | 0.9215            | 0.9757  | 0.9935         | 0.9651    | 0.8742           | 0.9921              | 0.9438              | 0.7568      | 0.7710          | 0.0             | 0.8998            | 0.7895  | 0.6578 | 0.9994   | 1.0                | 0.9554      | 0.9525       | 0.9823    | 0.9910    | 0.9866  | 0.0435 | 0.8293  | 0.7824  | 0.9671     | 0.9794      | 0.9571     | 0.9447      | 0.9141             | 1.0    | 0.8825          | 0.7988        | 1.0                    | 0.9797              | 0.9921      | 0.9932       | 0.9943         | 0.8726 | 0.9401    | 0.9860              | 0.9792 | 0.9928 | 0.9740   | 0.9604    | 0.9730  | 0.9983 | 0.9964       | 0.9959      | 0.9890        | 0.9774        | 0.9247     |
| 0.0847        | 4.0   | 4352 | 0.0503          | 0.9368            | 0.9614         | 0.9489     | 0.9789           | 0.9955         | 0.9949           | 0.9573 | 0.9480    | 0.9929 | 0.9846            | 0.9808            | 0.9927  | 0.9962         | 0.9811    | 0.9436           | 0.9953              | 0.9695              | 0.7826      | 0.8713          | 0.1653          | 0.9458            | 0.8782  | 0.7996 | 1.0      | 1.0                | 0.9809      | 0.9816       | 0.9941    | 0.9910    | 0.9906  | 0.3389 | 0.8364  | 0.7066  | 0.9862     | 1.0         | 0.9795     | 0.9637      | 0.9429             | 1.0    | 0.9438          | 0.9165        | 1.0                    | 0.9864              | 1.0         | 0.9932       | 0.9962         | 0.9352 | 0.9483    | 0.9860              | 0.9866 | 0.9976 | 0.9884   | 0.9827    | 0.9881  | 1.0    | 0.9953       | 0.9975      | 0.9945        | 0.9915        | 0.9841     |
| 0.0557        | 5.0   | 5440 | 0.0451          | 0.9438            | 0.9663         | 0.9549     | 0.9838           | 0.9946         | 0.9940           | 0.9624 | 0.9643    | 0.9929 | 0.9948            | 0.9845            | 0.9955  | 0.9962         | 0.9877    | 0.9643           | 0.9953              | 0.9793              | 0.7811      | 0.8850          | 0.2281          | 0.9562            | 0.9061  | 0.7914 | 1.0      | 1.0                | 0.9837      | 0.9846       | 0.9971    | 0.9910    | 0.9906  | 0.4349 | 0.8126  | 0.7679  | 0.9880     | 0.9991      | 0.9777     | 0.9684      | 0.9721             | 1.0    | 0.9635          | 0.9330        | 1.0                    | 0.9910              | 1.0         | 0.9918       | 0.9962         | 0.9477 | 0.9546    | 0.9892              | 0.9876 | 0.9976 | 0.9893   | 0.9873    | 0.9889  | 1.0    | 0.9953       | 0.9975      | 1.0           | 1.0           | 0.9873     |


### Framework versions

- Transformers 4.35.0
- PyTorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.14.1