---
license: mit
tags:
- generated_from_keras_callback
model-index:
- name: Zemulax/masked-lm-tpu
results: []
---
# Zemulax/masked-lm-tpu
This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset.
It achieves the following results at the final epoch:
- Train Loss: 7.7770
- Train Accuracy: 0.0241
- Validation Loss: 7.7589
- Validation Accuracy: 0.0230
- Epoch: 98
## Model description
A masked language model with the `roberta-base` architecture, initialized from that checkpoint and trained with Keras, apparently on TPU (as the repository name and the `generated_from_keras_callback` tag suggest). No further details were provided.
## Intended uses & limitations
Intended for masked-token (fill-mask) prediction. Note that the final validation accuracy is only ~0.023 (epoch 98), so this checkpoint is best treated as a training artifact rather than a model ready for downstream use.
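For reference, here is a minimal fill-mask sketch. It assumes the Hub repository `Zemulax/masked-lm-tpu` ships TF weights (plausible given the TensorFlow framework version listed below); the example sentence is illustrative only.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForMaskedLM

# Assumption: the repo contains TF weights; otherwise pass from_pt=True.
tokenizer = AutoTokenizer.from_pretrained("Zemulax/masked-lm-tpu")
model = TFAutoModelForMaskedLM.from_pretrained("Zemulax/masked-lm-tpu")

# RoBERTa tokenizers use <mask> as the mask token.
inputs = tokenizer("The capital of France is <mask>.", return_tensors="tf")
logits = model(**inputs).logits  # (batch, seq_len, vocab_size)

# Find the mask position and take the highest-scoring vocabulary id.
mask_pos = int(tf.where(inputs["input_ids"][0] == tokenizer.mask_token_id)[0][0])
pred_id = int(tf.argmax(logits[0, mask_pos]))
print(tokenizer.decode([pred_id]))
```

Given the validation accuracy above, expect near-random predictions; the snippet demonstrates the loading path, not useful output.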
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: `AdamWeightDecay`
  - learning_rate: `WarmUp` wrapping `PolynomialDecay`
    - initial_learning_rate: 0.0001
    - warmup_steps: 11750
    - decay_steps: 223250
    - end_learning_rate: 0.0
    - power: 1.0
    - cycle: False
  - beta_1: 0.9
  - beta_2: 0.999
  - epsilon: 1e-08
  - amsgrad: False
  - weight_decay_rate: 0.001
- training_precision: float32
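The serialized optimizer above matches the schedule produced by transformers' TF helper `create_optimizer`: linear warmup over 11,750 steps to a peak of 1e-4, then polynomial decay (power 1.0, i.e. linear) to 0 over 223,250 total steps, wrapped in `AdamWeightDecay` with a weight-decay rate of 0.001. A sketch of rebuilding it follows; whether the original run called this helper directly is an assumption.

```python
from transformers import create_optimizer

# Values copied from the serialized optimizer config above.
optimizer, lr_schedule = create_optimizer(
    init_lr=1e-4,             # peak learning rate after warmup
    num_train_steps=223250,   # PolynomialDecay decay_steps
    num_warmup_steps=11750,   # WarmUp warmup_steps
    weight_decay_rate=0.001,  # AdamWeightDecay weight_decay_rate
    power=1.0,                # linear decay to end_learning_rate=0.0
)
# beta_1=0.9, beta_2=0.999, epsilon=1e-08 are the helper's defaults.
# For a Keras model, this plugs in as usual: model.compile(optimizer=optimizer)
```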
### Training results
| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 10.2868    | 0.0000         | 10.2891         | 0.0000              | 0     |
| 10.2817    | 0.0000         | 10.2764         | 0.0000              | 1     |
| 10.2772    | 0.0000         | 10.2667         | 0.0000              | 2     |
| 10.2604    | 0.0000         | 10.2521         | 0.0000              | 3     |
| 10.2421    | 0.0000         | 10.2282         | 0.0000              | 4     |
| 10.2219    | 0.0000         | 10.2010         | 0.0000              | 5     |
| 10.1957    | 0.0000         | 10.1669         | 0.0000              | 6     |
| 10.1667 | 0.0000 | 10.1388 | 0.0000 | 7 |
| 10.1278 | 0.0000 | 10.0908 | 0.0000 | 8 |
| 10.0848 | 0.0000 | 10.0405 | 0.0001 | 9 |
| 10.0496 | 0.0002 | 9.9921 | 0.0007 | 10 |
| 9.9940 | 0.0010 | 9.9422 | 0.0039 | 11 |
| 9.9424 | 0.0035 | 9.8765 | 0.0110 | 12 |
| 9.8826 | 0.0092 | 9.8156 | 0.0182 | 13 |
| 9.8225 | 0.0155 | 9.7461 | 0.0209 | 14 |
| 9.7670 | 0.0201 | 9.6768 | 0.0222 | 15 |
| 9.7065 | 0.0219 | 9.6127 | 0.0222 | 16 |
| 9.6352 | 0.0227 | 9.5445 | 0.0220 | 17 |
| 9.5757 | 0.0226 | 9.4795 | 0.0219 | 18 |
| 9.4894 | 0.0232 | 9.3985 | 0.0222 | 19 |
| 9.4277 | 0.0234 | 9.3386 | 0.0222 | 20 |
| 9.3676 | 0.0229 | 9.2753 | 0.0220 | 21 |
| 9.2980 | 0.0229 | 9.2170 | 0.0219 | 22 |
| 9.2361 | 0.0233 | 9.1518 | 0.0219 | 23 |
| 9.1515 | 0.0236 | 9.0827 | 0.0223 | 24 |
| 9.1171 | 0.0228 | 9.0406 | 0.0218 | 25 |
| 9.0447 | 0.0234 | 8.9867 | 0.0218 | 26 |
| 9.0119 | 0.0229 | 8.9307 | 0.0221 | 27 |
| 8.9625 | 0.0229 | 8.8969 | 0.0221 | 28 |
| 8.9098 | 0.0230 | 8.8341 | 0.0223 | 29 |
| 8.8726 | 0.0227 | 8.8118 | 0.0220 | 30 |
| 8.8574 | 0.0223 | 8.7910 | 0.0219 | 31 |
| 8.7798 | 0.0231 | 8.7506 | 0.0221 | 32 |
| 8.7535 | 0.0231 | 8.7055 | 0.0222 | 33 |
| 8.7333 | 0.0228 | 8.6801 | 0.0223 | 34 |
| 8.6985 | 0.0231 | 8.6837 | 0.0220 | 35 |
| 8.6816 | 0.0229 | 8.6243 | 0.0223 | 36 |
| 8.6356 | 0.0228 | 8.6323 | 0.0217 | 37 |
| 8.6392 | 0.0225 | 8.5603 | 0.0225 | 38 |
| 8.5802 | 0.0233 | 8.5722 | 0.0219 | 39 |
| 8.5825 | 0.0228 | 8.5548 | 0.0220 | 40 |
| 8.5625 | 0.0228 | 8.5272 | 0.0220 | 41 |
| 8.5415 | 0.0228 | 8.5200 | 0.0222 | 42 |
| 8.5124 | 0.0230 | 8.4787 | 0.0222 | 43 |
| 8.4999 | 0.0229 | 8.4819 | 0.0218 | 44 |
| 8.4561 | 0.0235 | 8.4453 | 0.0221 | 45 |
| 8.4854 | 0.0223 | 8.4378 | 0.0220 | 46 |
| 8.4367 | 0.0229 | 8.4212 | 0.0222 | 47 |
| 8.4096 | 0.0232 | 8.4033 | 0.0221 | 48 |
| 8.4162 | 0.0228 | 8.3869 | 0.0221 | 49 |
| 8.4005 | 0.0229 | 8.3768 | 0.0218 | 50 |
| 8.3583 | 0.0235 | 8.3470 | 0.0224 | 51 |
| 8.3428 | 0.0235 | 8.3540 | 0.0221 | 52 |
| 8.3491 | 0.0231 | 8.3201 | 0.0225 | 53 |
| 8.3551 | 0.0231 | 8.3382 | 0.0221 | 54 |
| 8.3186 | 0.0231 | 8.3136 | 0.0219 | 55 |
| 8.3139 | 0.0226 | 8.2844 | 0.0222 | 56 |
| 8.3170 | 0.0229 | 8.2740 | 0.0221 | 57 |
| 8.2886 | 0.0231 | 8.2485 | 0.0223 | 58 |
| 8.2648 | 0.0233 | 8.2336 | 0.0223 | 59 |
| 8.2714 | 0.0225 | 8.2321 | 0.0221 | 60 |
| 8.2446 | 0.0233 | 8.2135 | 0.0223 | 61 |
| 8.2303 | 0.0230 | 8.1980 | 0.0223 | 62 |
| 8.2022 | 0.0237 | 8.1996 | 0.0222 | 63 |
| 8.2222 | 0.0227 | 8.1822 | 0.0222 | 64 |
| 8.1690 | 0.0236 | 8.2005 | 0.0220 | 65 |
| 8.1741 | 0.0233 | 8.1446 | 0.0226 | 66 |
| 8.1990 | 0.0224 | 8.1586 | 0.0219 | 67 |
| 8.1395 | 0.0236 | 8.1243 | 0.0225 | 68 |
| 8.1675 | 0.0229 | 8.1275 | 0.0222 | 69 |
| 8.1432 | 0.0229 | 8.1374 | 0.0217 | 70 |
| 8.1197 | 0.0234 | 8.1078 | 0.0221 | 71 |
| 8.1046 | 0.0232 | 8.0991 | 0.0221 | 72 |
| 8.1013 | 0.0231 | 8.0794 | 0.0222 | 73 |
| 8.0887 | 0.0228 | 8.0720 | 0.0221 | 74 |
| 8.0661 | 0.0233 | 8.0573 | 0.0222 | 75 |
| 8.0548 | 0.0231 | 8.0313 | 0.0226 | 76 |
| 8.0307 | 0.0235 | 8.0278 | 0.0222 | 77 |
| 8.0626 | 0.0226 | 8.0084 | 0.0224 | 78 |
| 8.0276 | 0.0229 | 8.0099 | 0.0221 | 79 |
| 8.0213 | 0.0231 | 7.9930 | 0.0222 | 80 |
| 7.9798 | 0.0237 | 7.9742 | 0.0224 | 81 |
| 8.0135 | 0.0226 | 7.9857 | 0.0218 | 82 |
| 7.9500 | 0.0235 | 7.9505 | 0.0223 | 83 |
| 7.9519 | 0.0234 | 7.9711 | 0.0217 | 84 |
| 7.9616 | 0.0228 | 7.9288 | 0.0223 | 85 |
| 7.9803 | 0.0225 | 7.8997 | 0.0226 | 86 |
| 7.9369 | 0.0227 | 7.9015 | 0.0225 | 87 |
| 7.9309 | 0.0229 | 7.9010 | 0.0224 | 88 |
| 7.9367 | 0.0226 | 7.8988 | 0.0220 | 89 |
| 7.8840 | 0.0230 | 7.8774 | 0.0216 | 90 |
| 7.8785 | 0.0233 | 7.8527 | 0.0225 | 91 |
| 7.8998 | 0.0226 | 7.8509 | 0.0219 | 92 |
| 7.8451 | 0.0232 | 7.8488 | 0.0221 | 93 |
| 7.8596 | 0.0231 | 7.8310 | 0.0222 | 94 |
| 7.8434 | 0.0231 | 7.8168 | 0.0229 | 95 |
| 7.7929 | 0.0238 | 7.7815 | 0.0233 | 96 |
| 7.8174 | 0.0236 | 7.7857 | 0.0232 | 97 |
| 7.7770 | 0.0241 | 7.7589 | 0.0230 | 98 |
### Framework versions
- Transformers 4.30.1
- TensorFlow 2.12.0
- Tokenizers 0.13.3