Isotonic/mdeberta-v3-base_finetuned_ai4privacy_v2

@@ -1,11 +1,24 @@
 ---
-license: mit
 base_model: microsoft/mdeberta-v3-base
-tags:
-- generated_from_trainer
 model-index:
 - name: mdeberta-v3-base_finetuned_ai4privacy_v2
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -13,69 +26,17 @@ should probably proofread and complete it, then remove this comment. -->
 # mdeberta-v3-base_finetuned_ai4privacy_v2
-This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0323
 - Overall Precision: 0.9636
 - Overall Recall: 0.9731
 - Overall F1: 0.9683
 - Overall Accuracy: 0.9896
-- Accountname F1: 0.9998
-- Accountnumber F1: 0.9973
-- Age F1: 0.9878
-- Amount F1: 0.9495
-- Bic F1: 0.9932
-- Bitcoinaddress F1: 0.9704
-- Buildingnumber F1: 0.9648
-- City F1: 0.9887
-- Companyname F1: 0.9942
-- County F1: 0.9940
-- Creditcardcvv F1: 0.9820
-- Creditcardissuer F1: 0.9985
-- Creditcardnumber F1: 0.9570
-- Currency F1: 0.8750
-- Currencycode F1: 0.9888
-- Currencyname F1: 0.7416
-- Currencysymbol F1: 0.9819
-- Date F1: 0.9295
-- Dob F1: 0.8946
-- Email F1: 0.9998
-- Ethereumaddress F1: 0.9965
-- Eyecolor F1: 0.9984
-- Firstname F1: 0.9886
-- Gender F1: 0.9962
-- Height F1: 1.0
-- Iban F1: 0.9966
-- Ip F1: 0.6284
-- Ipv4 F1: 0.8884
-- Ipv6 F1: 0.8015
-- Jobarea F1: 0.9940
-- Jobtitle F1: 0.9973
-- Jobtype F1: 0.9970
-- Lastname F1: 0.9653
-- Litecoinaddress F1: 0.9109
-- Mac F1: 0.9992
-- Maskednumber F1: 0.9524
-- Middlename F1: 0.9347
-- Nearbygpscoordinate F1: 1.0
-- Ordinaldirection F1: 0.9984
-- Password F1: 0.9936
-- Phoneimei F1: 0.9998
-- Phonenumber F1: 0.9992
-- Pin F1: 0.9857
-- Prefix F1: 0.9801
-- Secondaryaddress F1: 0.9988
-- Sex F1: 0.9979
-- Ssn F1: 0.9983
-- State F1: 0.9944
-- Street F1: 0.9953
-- Time F1: 0.9974
-- Url F1: 1.0
-- Useragent F1: 1.0
-- Username F1: 0.9966
-- Vehiclevin F1: 0.9936
-- Vehiclevrm F1: 0.9917
-- Zipcode F1: 0.9727
 ## Model description
@@ -95,11 +56,11 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.2
 - num_epochs: 5
@@ -119,4 +80,4 @@ The following hyperparameters were used during training:
 - Transformers 4.35.2
 - Pytorch 2.1.0+cu121
 - Datasets 2.16.1
-- Tokenizers 0.15.0

 ---
 base_model: microsoft/mdeberta-v3-base
 model-index:
 - name: mdeberta-v3-base_finetuned_ai4privacy_v2
   results: []
+datasets:
+- ai4privacy/pii-masking-200k
+- Isotonic/pii-masking-200k
+language:
+- en
+- de
+- fr
+- it
+metrics:
+- accuracy
+- f1
+- precision
+- recall
+library_name: transformers
+pipeline_tag: token-classification
+license: cc-by-nc-4.0
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->
 # mdeberta-v3-base_finetuned_ai4privacy_v2
+This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the [ai4privacy/pii-masking-200k](https://huggingface.co/datasets/ai4privacy/pii-masking-200k) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0323
 - Overall Precision: 0.9636
 - Overall Recall: 0.9731
 - Overall F1: 0.9683
 - Overall Accuracy: 0.9896
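
As a quick consistency check, the overall F1 should be the harmonic mean of the overall precision and recall. A minimal sketch, assuming the card's rounded figures are used as inputs (the variable names are illustrative):

```python
# Overall precision and recall as reported in the card (rounded to 4 places).
precision = 0.9636
recall = 0.9731

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 4))
```

Rounded to four places, this agrees with the reported overall F1 of 0.9683.
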
+## Usage
+GitHub Implementation: [Ai4Privacy](https://github.com/Sripaad/ai4privacy)
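
For readers who want to try the checkpoint directly, a minimal sketch of PII redaction with the `transformers` token-classification pipeline might look like the following. It is not taken from the linked GitHub implementation. The helper names `load_pii_tagger` and `redact` are hypothetical, the labels `FIRSTNAME` and `EMAIL` in the offline demo are assumed, and the sketch assumes the pipeline returns character offsets (`start`, `end`), which requires a fast tokenizer.

```python
def load_pii_tagger(model_id="Isotonic/mdeberta-v3-base_finetuned_ai4privacy_v2"):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so that redact() can be used without transformers installed.
    from transformers import pipeline
    # aggregation_strategy="simple" merges sub-word tokens into entity spans.
    return pipeline("token-classification", model=model_id,
                    aggregation_strategy="simple")


def redact(text, entities):
    """Replace each detected span with a [LABEL] placeholder."""
    # Work from right to left so that earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        text = text[:ent["start"]] + placeholder + text[ent["end"]:]
    return text


def fake_entity(text, word, label):
    """Hand-built stand-in for one pipeline result (for the offline demo only)."""
    start = text.index(word)
    return {"entity_group": label, "start": start, "end": start + len(word)}


# Offline demo with hypothetical entities shaped like aggregated pipeline output.
sample = "Contact Anna at anna@example.com."
fake_entities = [
    fake_entity(sample, "Anna", "FIRSTNAME"),        # assumed label name
    fake_entity(sample, "anna@example.com", "EMAIL"),  # assumed label name
]
print(redact(sample, fake_entities))
```

With `transformers` installed and network access, `redact(text, load_pii_tagger()(text))` would run the model itself; the label names it emits should be read from the model's config rather than assumed.
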
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 32
 - seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
+- lr_scheduler_type: cosine_with_restarts
 - lr_scheduler_warmup_ratio: 0.2
 - num_epochs: 5
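
The updated hyperparameters can be written out as keyword arguments, together with a sketch of the learning-rate curve that `cosine_with_restarts` and a warmup ratio of 0.2 would produce. This is an illustration under assumptions: the dictionary keys are what I take to be the matching `transformers.TrainingArguments` names, training on a single device is assumed, `total_steps` and `num_cycles` are hypothetical because the card does not report them, and `lr_at` is a hand-written approximation of the library's scheduler rather than the library code.

```python
import math

# Hyperparameters from the card, keyed by the assumed TrainingArguments names.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-6,
    "lr_scheduler_type": "cosine_with_restarts",
    "warmup_ratio": 0.2,
    "num_train_epochs": 5,
}


def lr_at(step, total_steps, kwargs=training_kwargs, num_cycles=1):
    """Learning rate at `step`: linear warmup, then cosine decay with hard restarts."""
    base_lr = kwargs["learning_rate"]
    warmup_steps = math.ceil(total_steps * kwargs["warmup_ratio"])
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Each cycle restarts the cosine decay from the base learning rate.
    cycle_pos = (num_cycles * progress) % 1.0
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_pos))


total_steps = 1000  # hypothetical; the real value depends on dataset size
for step in (0, 100, 200, 600, 1000):
    print(step, lr_at(step, total_steps))
```

With a single cycle the curve is a plain cosine decay after warmup; raising `num_cycles` adds restarts back to the peak rate.
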
 - Transformers 4.35.2
 - Pytorch 2.1.0+cu121
 - Datasets 2.16.1
+- Tokenizers 0.15.0