tatiana-merz committed on
Commit 4490708
1 Parent(s): 672d32c

Update README.md

Files changed (1)
  1. README.md +54 -20
README.md CHANGED
@@ -1,46 +1,80 @@
  model-index:
  - name: turkic-cyrillic-classifier
  results: []
- ---
-
- This model card has been generated automatically according to the information the Trainer had access to. You
-
- # turkic-cyrillic-classifier
-
- This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0139
- - Accuracy: 0.9971
-
- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed


- ## Training procedure

  - Transformers 4.27.0
  - Pytorch 1.13.1+cu116
- - Datasets 2.10.1
- - Tokenizers 0.13.2

+ ---
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ metrics:
+ - accuracy
+ - f1
  model-index:
  - name: turkic-cyrillic-classifier
  results: []
+ language:
+ - ba
+ - cv
+ - sah
+ - tt
+ - ky
+ - kk
+ - tyv
+ - krc
+ - ru
+ datasets:
+ - tatiana-merz/cyrillic_turkic_langs
+ pipeline_tag: text-classification
+ ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # turkic-cyrillic-classifier

+ This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on the tatiana-merz/cyrillic_turkic_langs dataset.
+ It achieves the following results on the evaluation set:


+ {'test_loss': 0.013604652136564255,
+ 'test_accuracy': 0.997,
+ 'test_f1': 0.9969996069718668,
+ 'test_runtime': 60.5479,
+ 'test_samples_per_second': 148.643,
+ 'test_steps_per_second': 2.329}
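
The runtime and throughput figures above imply the approximate size of the evaluation run. The following is a minimal sketch of that arithmetic, not something stated by the card itself. It assumes the evaluation ran on a single device with the `eval_batch_size` of 64 listed under the training hyperparameters; the variable names are illustrative.

```python
import math

# Values copied from the evaluation output in this card.
test_runtime = 60.5479          # seconds
samples_per_second = 148.643
steps_per_second = 2.329
eval_batch_size = 64            # assumption: eval_batch_size from the card, single device

# Throughput multiplied by wall-clock time gives approximate counts.
approx_samples = round(test_runtime * samples_per_second)
approx_steps = round(test_runtime * steps_per_second)

# Cross-check: number of batches needed to cover that many samples.
expected_steps = math.ceil(approx_samples / eval_batch_size)

print(approx_samples, approx_steps, expected_steps)
```

Under these assumptions the script prints `9000 141 141`: roughly 9,000 evaluation samples, and the step count derived from the throughput agrees with the count derived from the batch size.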

+ ## Model description

+ The model classifies a given text by language, for Turkic languages written in Cyrillic script.
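
A minimal usage sketch follows. It relies on the `text-classification` pipeline tag declared in the metadata. The repository id `tatiana-merz/turkic-cyrillic-classifier` is an assumption inferred from the author and model name, and the helper names `load_classifier` and `top_labels` are hypothetical. The label strings the model returns are not documented in this card, so the sketch passes them through unchanged.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: repo id inferred from the author and model name; not confirmed by the card.
MODEL_ID = "tatiana-merz/turkic-cyrillic-classifier"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so that top_labels() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_labels(texts: Sequence[str], classifier: Callable) -> List[Tuple[str, str, float]]:
    """Return (text, label, score) for each input text.

    `classifier` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts, as a text-classification pipeline does.
    """
    texts = list(texts)
    results = classifier(texts)
    return [(text, res["label"], res["score"]) for text, res in zip(texts, results)]


# Example (requires network access and the transformers package):
#   clf = load_classifier()
#   for text, label, score in top_labels(["Сәлем, әлем!"], clf):
#       print(text, label, score)
```

Keeping the pipeline construction separate from `top_labels` lets the formatting logic be exercised with a stub classifier before any weights are downloaded.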

+ ## Intended uses & limitations



+ ## Training and evaluation data

+ [cyrillic_turkic_langs](https://huggingface.co/datasets/tatiana-merz/cyrillic_turkic_langs/)


+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 64
+ - eval_batch_size: 64
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 2
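
To make the `linear` scheduler entry concrete, the sketch below computes the learning rate at a few steps of a linear warmup-then-decay schedule in plain Python. Two things here are assumptions rather than facts from the card: that no warmup was used (the card lists none), and that the run lasted 2000 optimizer steps (the last step in the training results table). The function name `linear_lr` is illustrative.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 2000  # assumption: final step reported in the training results table

for step in (0, 500, 1000, 1500, 2000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With these assumptions the rate starts at 2e-05, is halved by the end of the first epoch (step 1000), and reaches zero at step 2000.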

+ ### Training results

+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
+ | 0.1063 | 1.0 | 1000 | 0.0204 | 0.9950 | 0.9950 |
+ | 0.0126 | 2.0 | 2000 | 0.0136 | 0.9970 | 0.9970 |
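
Two quantities can be read off the table with a little arithmetic: the approximate training-set size and the relative drop in validation error between epochs. This is a rough sketch, assuming single-device training without gradient accumulation and a final batch that may be partial. Neither assumption is confirmed by the card.

```python
# Values copied from the training results table and the hyperparameter list.
steps_per_epoch = 1000
train_batch_size = 64   # assumption: single device, no gradient accumulation

# The last batch of an epoch may be partial, so this gives a range, not an exact count.
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

# Error rate is 1 - accuracy; compare epoch 1 with epoch 2.
err_epoch1 = 1 - 0.9950
err_epoch2 = 1 - 0.9970
relative_reduction = (err_epoch1 - err_epoch2) / err_epoch1

print(min_examples, max_examples, round(relative_reduction, 2))
```

Under these assumptions the training split holds between 63,937 and 64,000 examples, and the second epoch cuts the validation error rate by about 40% (from 0.5% to 0.3%).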


+ ### Framework versions

  - Transformers 4.27.0
  - Pytorch 1.13.1+cu116
+ - Datasets 2.10.1