fedcsis-slot_baseline-xlm_r-es

This model is a fine-tuned version of xlm-roberta-base on the leyzer-fedcsis dataset.
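
A minimal usage sketch follows, assuming the checkpoint is published on the Hugging Face Hub as cartesinus/fedcsis-slot_baseline-xlm_r-es and exposes a token-classification head (the per-slot metrics below suggest this, but the card does not state it). The helper names load_slot_tagger and group_slots are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def load_slot_tagger(model_id="cartesinus/fedcsis-slot_baseline-xlm_r-es"):
    """Build a token-classification pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so group_slots() can be used without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into whole spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def group_slots(predictions):
    """Collect aggregated pipeline output into {slot_name: [surface strings]}."""
    slots = defaultdict(list)
    for span in predictions:
        # Aggregated spans carry the label under "entity_group" and the text under "word".
        slots[span["entity_group"]].append(span["word"].strip())
    return dict(slots)
```

With network access, something like group_slots(load_slot_tagger()("envía un correo a juan")) should return the detected slots for that (made-up) Spanish utterance. The exact label names depend on the checkpoint's configuration.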

Results on the test set:

  • Precision: 0.9696
  • Recall: 0.9686
  • F1: 0.9691
  • Accuracy: 0.9904

The model achieves the following results on the evaluation set:

  • Loss: 0.0521
  • Precision: 0.9728
  • Recall: 0.9711
  • F1: 0.9720
  • Accuracy: 0.9914
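
As a sanity check on the figures above, F1 can be recomputed as the harmonic mean of precision and recall. This is a sketch that assumes the reported F1 is derived from the reported (rounded) precision and recall in the usual way. The card does not say which evaluation library produced the numbers, so small differences in the fourth decimal place are expected.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the two result lists above.
reported = {
    "test": (0.9696, 0.9686, 0.9691),
    "evaluation": (0.9728, 0.9711, 0.9720),
}

for split, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{split}: recomputed F1 = {recomputed:.4f}, reported F1 = {f1:.4f}")
```

Both recomputed values agree with the reported ones to within the rounding of the inputs.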

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
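
To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and uses 9410 total steps, the final Step value in the training results. In Transformers this shape corresponds to get_linear_schedule_with_warmup; the function linear_lr here is a hypothetical stand-in written in plain Python.

```python
LEARNING_RATE = 2e-05  # learning_rate from the list above
TOTAL_STEPS = 9410     # last Step value in the training results (10 epochs x 941 steps)
WARMUP_STEPS = 0       # assumption: no warmup is listed on the card


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` updates under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (941 steps per epoch).
for epoch in range(0, 11):
    print(epoch, linear_lr(epoch * 941))
```

Under these assumptions the rate starts at 2e-05, is halved by the end of epoch 5, and reaches zero at step 9410.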

Training results

Training Loss  Epoch  Step  Validation Loss  Precision  Recall  F1      Accuracy
0.7183         1.0    941   0.1287           0.9389     0.9429  0.9409  0.9802
0.0792         2.0    1882  0.0698           0.9551     0.9609  0.9580  0.9876
0.0502         3.0    2823  0.0586           0.9623     0.9624  0.9624  0.9886
0.0312         4.0    3764  0.0511           0.9697     0.9668  0.9682  0.9904
0.0229         5.0    4705  0.0494           0.9715     0.9687  0.9701  0.9913
0.021          6.0    5646  0.0447           0.9697     0.9680  0.9689  0.9911
0.0139         7.0    6587  0.0512           0.9715     0.9691  0.9703  0.9915
0.0126         8.0    7528  0.0507           0.9713     0.9699  0.9706  0.9913
0.01           9.0    8469  0.0500           0.9720     0.9702  0.9711  0.9913
0.0072         10.0   9410  0.0521           0.9728     0.9711  0.9720  0.9914
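
The table can be read in two ways when choosing a checkpoint: by lowest validation loss or by highest F1. The sketch below, using values copied from the table, shows that the two criteria pick different epochs. The evaluation-set figures reported at the top of this card (Loss 0.0521, F1 0.9720) match the epoch 10 row, which suggests the final checkpoint was the one evaluated, though the card does not say so explicitly.

```python
# (epoch, validation_loss, f1) copied from the training results table.
history = [
    (1, 0.1287, 0.9409),
    (2, 0.0698, 0.9580),
    (3, 0.0586, 0.9624),
    (4, 0.0511, 0.9682),
    (5, 0.0494, 0.9701),
    (6, 0.0447, 0.9689),
    (7, 0.0512, 0.9703),
    (8, 0.0507, 0.9706),
    (9, 0.0500, 0.9711),
    (10, 0.0521, 0.9720),
]

# Pick the row with the smallest validation loss and the row with the largest F1.
best_by_loss = min(history, key=lambda row: row[1])
best_by_f1 = max(history, key=lambda row: row[2])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest F1: epoch", best_by_f1[0])
```

Validation loss bottoms out at epoch 6 while F1 keeps improving slightly through epoch 10, so the choice of selection criterion matters here.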

Per slot evaluation on test set

slot_name precision recall f1 tc_size
album 0.9500 0.9135 0.9314 104
all_lang 0.7500 1.0000 0.8571 3
artist 0.9556 0.9685 0.9620 222
av_alias 1.0000 1.0000 1.0000 18
caption 0.9565 0.9362 0.9462 47
category 0.9091 1.0000 0.9524 10
channel 0.7857 0.7857 0.7857 14
channel_id 0.9500 1.0000 0.9744 19
count 1.0000 1.0000 1.0000 8
date 0.9762 0.9762 0.9762 42
date_day 1.0000 1.0000 1.0000 6
date_month 1.0000 1.0000 1.0000 7
device_name 0.9770 1.0000 0.9884 85
email 1.0000 0.9740 0.9868 192
event_name 1.0000 1.0000 1.0000 35
file_name 1.0000 1.0000 1.0000 10
file_size 1.0000 1.0000 1.0000 2
filter 1.0000 1.0000 1.0000 15
hashtag 1.0000 0.9565 0.9778 46
img_query 0.9843 0.9843 0.9843 764
label 1.0000 1.0000 1.0000 7
location 0.9753 0.9875 0.9814 80
mail 1.0000 1.0000 1.0000 5
message 0.9577 0.9607 0.9592 636
mime_type 1.0000 1.0000 1.0000 1
name 0.9677 0.9677 0.9677 31
percent 0.8571 1.0000 0.9231 6
phone_number 0.9429 0.9763 0.9593 169
phone_type 1.0000 0.6667 0.8000 3
picture_url 1.0000 0.9286 0.9630 42
playlist 0.9701 0.9630 0.9665 135
portal 1.0000 0.9940 0.9970 168
priority 1.0000 1.0000 1.0000 3
purpose 0.0000 0.0000 0.0000 1
query 0.9259 0.8929 0.9091 28
rating 1.0000 1.0000 1.0000 3
review_count 0.7500 0.7500 0.7500 4
section 1.0000 1.0000 1.0000 134
seek_time 1.0000 1.0000 1.0000 2
sender 0.0000 0.0000 0.0000 1
sender_address 1.0000 1.0000 1.0000 6
song 0.9314 0.9628 0.9468 296
src_lang 0.9872 1.0000 0.9935 77
status 0.8462 0.9565 0.8980 23
subject 0.9555 0.9567 0.9561 785
text 0.9798 0.9798 0.9798 99
time 1.0000 1.0000 1.0000 32
to 0.9760 0.9651 0.9705 802
topic 1.0000 1.0000 1.0000 1
translator 1.0000 1.0000 1.0000 52
trg_lang 0.9886 1.0000 0.9943 87
txt_query 1.0000 0.8947 0.9444 19
username 1.0000 1.0000 1.0000 6
value 0.9318 0.9535 0.9425 43
weight 1.0000 1.0000 1.0000 1
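
Several slots in the table have very few test examples, so their scores are noisy: a single error moves the F1 of a slot with tc_size 1 from 1.0000 to 0.0000. The sketch below parses an excerpt of the table and flags such slots. It assumes tc_size is the number of test occurrences of the slot; the threshold of 3 and the helper names are hypothetical choices made for illustration.

```python
# A few rows copied verbatim from the per-slot table above.
PER_SLOT_EXCERPT = """\
album 0.9500 0.9135 0.9314 104
all_lang 0.7500 1.0000 0.8571 3
channel 0.7857 0.7857 0.7857 14
phone_type 1.0000 0.6667 0.8000 3
purpose 0.0000 0.0000 0.0000 1
sender 0.0000 0.0000 0.0000 1
to 0.9760 0.9651 0.9705 802
"""


def parse_rows(text):
    """Parse 'slot precision recall f1 tc_size' lines into dictionaries."""
    rows = []
    for line in text.splitlines():
        name, precision, recall, f1, size = line.split()
        rows.append({
            "slot": name,
            "precision": float(precision),
            "recall": float(recall),
            "f1": float(f1),
            "tc_size": int(size),
        })
    return rows


def low_support(rows, max_size=3):
    """Return slot names whose test support is at most max_size."""
    return [row["slot"] for row in rows if row["tc_size"] <= max_size]


rows = parse_rows(PER_SLOT_EXCERPT)
print(low_support(rows))
```

Pasting the full table into PER_SLOT_EXCERPT extends the check to every slot without other changes.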

Framework versions

  • Transformers 4.27.4
  • Pytorch 1.13.1+cu116
  • Datasets 2.11.0
  • Tokenizers 0.13.2

Citation

If you use this model, please cite the following:

@inproceedings{kubis2023caiccaic,
    author={Marek Kubis and Paweł Skórzewski and Marcin Sowański and Tomasz Ziętkiewicz},
    pages={1319--1324},
    title={Center for Artificial Intelligence Challenge on Conversational AI Correctness},
    booktitle={Proceedings of the 18th Conference on Computer Science and Intelligence Systems},
    year={2023},
    doi={10.15439/2023B6058},
    url={http://dx.doi.org/10.15439/2023B6058},
    volume={35},
    series={Annals of Computer Science and Information Systems}
}
