---
base_model: mistralai/Mistral-7B-v0.3
library_name: peft
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: legal_mistral
    results: []
---

# legal_mistral

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.3](https://huggingface.co/mistralai/Mistral-7B-v0.3) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.6537
- Law Precision: 0.7614
- Law Recall: 0.9054
- Law F1: 0.8272
- Law Number: 74
- Violated by Precision: 0.6383
- Violated by Recall: 0.8219
- Violated by F1: 0.7186
- Violated by Number: 73
- Violated on Precision: 0.3768
- Violated on Recall: 0.4727
- Violated on F1: 0.4194
- Violated on Number: 55
- Violation Precision: 0.4510
- Violation Recall: 0.6273
- Violation F1: 0.5247
- Violation Number: 601
- Overall Precision: 0.4876
- Overall Recall: 0.6600
- Overall F1: 0.5608
- Overall Accuracy: 0.9389
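
The per-type scores above are consistent with entity-level (seqeval-style) evaluation, where a predicted span counts as correct only if both its boundaries and its type match a gold span exactly. A minimal sketch of that scoring scheme follows; the BIO label names are assumptions inferred from the metric names, not taken from the training code:

```python
# Entity-level scoring sketch with seqeval; the label names (LAW, VIOLATION)
# are assumptions inferred from the metric names in this card.
from seqeval.metrics import classification_report

y_true = [["B-LAW", "I-LAW", "O", "B-VIOLATION", "I-VIOLATION"]]
y_pred = [["B-LAW", "I-LAW", "O", "B-VIOLATION", "O"]]

# LAW is matched exactly and counts as correct; the predicted VIOLATION span
# has the wrong right boundary, so under strict span matching it counts as
# both a false positive and a false negative.
print(classification_report(y_true, y_pred, digits=4))
```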

## Model description

More information needed

## Intended uses & limitations

More information needed
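
Pending further documentation, the adapter should load on top of the base model with PEFT. A minimal sketch, assuming the adapter repo id is `tthhanh/legal_mistral` and that it was trained with a token-classification head (both inferred from this card, not confirmed by it):

```python
# Loading sketch; the repo id, head type, and num_labels are assumptions.
import torch
from peft import PeftModel
from transformers import AutoModelForTokenClassification, AutoTokenizer

base = AutoModelForTokenClassification.from_pretrained(
    "mistralai/Mistral-7B-v0.3",
    num_labels=9,  # assumed: 4 entity types in BIO tagging, plus "O"
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base, "tthhanh/legal_mistral")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.3")
```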

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
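
As a rough mapping of these values onto `transformers.TrainingArguments` (a sketch: `output_dir` is a placeholder, per-epoch evaluation is inferred from the epoch-wise results table below, and the listed Adam settings are the `Trainer` defaults):

```python
from transformers import TrainingArguments

# Sketch reproducing the listed hyperparameters; unlisted options stay at
# library defaults (including Adam betas=(0.9, 0.999), epsilon=1e-08).
training_args = TrainingArguments(
    output_dir="legal_mistral",      # placeholder, not from the card
    learning_rate=1e-4,
    per_device_train_batch_size=16,  # assumes a single device
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    eval_strategy="epoch",           # inferred from the per-epoch table below
)
```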

### Training results

| Training Loss | Epoch | Step | Validation Loss | Law Precision | Law Recall | Law F1 | Law Number | Violated by Precision | Violated by Recall | Violated by F1 | Violated by Number | Violated on Precision | Violated on Recall | Violated on F1 | Violated on Number | Violation Precision | Violation Recall | Violation F1 | Violation Number | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No log | 1.0 | 45 | 0.3581 | 0.3158 | 0.2432 | 0.2748 | 74 | 0.0 | 0.0 | 0.0 | 73 | 0.0 | 0.0 | 0.0 | 55 | 0.1694 | 0.3111 | 0.2194 | 601 | 0.1739 | 0.2553 | 0.2069 | 0.8813 |
| No log | 2.0 | 90 | 0.2947 | 0.4224 | 0.6622 | 0.5158 | 74 | 0.46 | 0.6301 | 0.5318 | 73 | 0.2 | 0.1455 | 0.1684 | 55 | 0.2202 | 0.3228 | 0.2618 | 601 | 0.2612 | 0.3699 | 0.3062 | 0.9038 |
| No log | 3.0 | 135 | 0.2478 | 0.5167 | 0.8378 | 0.6392 | 74 | 0.4701 | 0.7534 | 0.5789 | 73 | 0.2237 | 0.3091 | 0.2595 | 55 | 0.3313 | 0.5507 | 0.4138 | 601 | 0.3544 | 0.5791 | 0.4397 | 0.9209 |
| No log | 4.0 | 180 | 0.2199 | 0.6329 | 0.6757 | 0.6536 | 74 | 0.725 | 0.7945 | 0.7582 | 73 | 0.2222 | 0.3273 | 0.2647 | 55 | 0.3990 | 0.5491 | 0.4622 | 601 | 0.4274 | 0.5679 | 0.4877 | 0.9364 |
| No log | 5.0 | 225 | 0.3222 | 0.8310 | 0.7973 | 0.8138 | 74 | 0.7143 | 0.5479 | 0.6202 | 73 | 0.2791 | 0.2182 | 0.2449 | 55 | 0.3404 | 0.5108 | 0.4085 | 601 | 0.3899 | 0.5205 | 0.4459 | 0.9271 |
| No log | 6.0 | 270 | 0.3166 | 0.775 | 0.8378 | 0.8052 | 74 | 0.6622 | 0.6712 | 0.6667 | 73 | 0.3519 | 0.3455 | 0.3486 | 55 | 0.4220 | 0.5807 | 0.4888 | 601 | 0.4628 | 0.5965 | 0.5212 | 0.9372 |
| No log | 7.0 | 315 | 0.2988 | 0.7024 | 0.7973 | 0.7468 | 74 | 0.6923 | 0.7397 | 0.7152 | 73 | 0.48 | 0.4364 | 0.4571 | 55 | 0.3734 | 0.5840 | 0.4555 | 601 | 0.4236 | 0.6077 | 0.4992 | 0.9354 |
| No log | 8.0 | 360 | 0.3663 | 0.7831 | 0.8784 | 0.8280 | 74 | 0.6774 | 0.8630 | 0.7590 | 73 | 0.3239 | 0.4182 | 0.3651 | 55 | 0.4180 | 0.6023 | 0.4935 | 601 | 0.4609 | 0.6389 | 0.5355 | 0.9376 |
| No log | 9.0 | 405 | 0.4296 | 0.7143 | 0.8784 | 0.7879 | 74 | 0.6667 | 0.8493 | 0.7470 | 73 | 0.3889 | 0.5091 | 0.4409 | 55 | 0.4095 | 0.6323 | 0.4971 | 601 | 0.4519 | 0.6663 | 0.5385 | 0.9350 |
| No log | 10.0 | 450 | 0.3842 | 0.7021 | 0.8919 | 0.7857 | 74 | 0.7632 | 0.7945 | 0.7785 | 73 | 0.4262 | 0.4727 | 0.4483 | 55 | 0.3948 | 0.5774 | 0.4689 | 601 | 0.4477 | 0.6189 | 0.5196 | 0.9366 |
| No log | 11.0 | 495 | 0.4852 | 0.7561 | 0.8378 | 0.7949 | 74 | 0.6264 | 0.7808 | 0.6951 | 73 | 0.4545 | 0.4545 | 0.4545 | 55 | 0.4298 | 0.6057 | 0.5028 | 601 | 0.4726 | 0.6326 | 0.5410 | 0.9371 |
| 0.2514 | 12.0 | 540 | 0.4601 | 0.7033 | 0.8649 | 0.7758 | 74 | 0.6484 | 0.8082 | 0.7195 | 73 | 0.3768 | 0.4727 | 0.4194 | 55 | 0.4076 | 0.5724 | 0.4761 | 601 | 0.4502 | 0.6139 | 0.5195 | 0.9344 |
| 0.2514 | 13.0 | 585 | 0.5308 | 0.7558 | 0.8784 | 0.8125 | 74 | 0.6176 | 0.8630 | 0.7200 | 73 | 0.375 | 0.5455 | 0.4444 | 55 | 0.3433 | 0.4975 | 0.4062 | 601 | 0.4012 | 0.5691 | 0.4706 | 0.9284 |
| 0.2514 | 14.0 | 630 | 0.5586 | 0.7529 | 0.8649 | 0.8050 | 74 | 0.7093 | 0.8356 | 0.7673 | 73 | 0.3784 | 0.5091 | 0.4341 | 55 | 0.4246 | 0.6090 | 0.5003 | 601 | 0.4688 | 0.6463 | 0.5435 | 0.9384 |
| 0.2514 | 15.0 | 675 | 0.4173 | 0.8767 | 0.8649 | 0.8707 | 74 | 0.7922 | 0.8356 | 0.8133 | 73 | 0.3731 | 0.4545 | 0.4098 | 55 | 0.3991 | 0.5824 | 0.4736 | 601 | 0.4570 | 0.6227 | 0.5271 | 0.9369 |
| 0.2514 | 16.0 | 720 | 0.4812 | 0.825 | 0.8919 | 0.8571 | 74 | 0.7590 | 0.8630 | 0.8077 | 73 | 0.3378 | 0.4545 | 0.3876 | 55 | 0.3875 | 0.5474 | 0.4538 | 601 | 0.4448 | 0.6015 | 0.5114 | 0.9341 |
| 0.2514 | 17.0 | 765 | 0.5224 | 0.7805 | 0.8649 | 0.8205 | 74 | 0.75 | 0.8630 | 0.8025 | 73 | 0.3662 | 0.4727 | 0.4127 | 55 | 0.4446 | 0.6406 | 0.5249 | 601 | 0.4878 | 0.6700 | 0.5645 | 0.9382 |
| 0.2514 | 18.0 | 810 | 0.5306 | 0.7711 | 0.8649 | 0.8153 | 74 | 0.7326 | 0.8630 | 0.7925 | 73 | 0.3425 | 0.4545 | 0.3906 | 55 | 0.4505 | 0.6057 | 0.5167 | 601 | 0.4914 | 0.6426 | 0.5569 | 0.9393 |
| 0.2514 | 19.0 | 855 | 0.5059 | 0.7619 | 0.8649 | 0.8101 | 74 | 0.6854 | 0.8356 | 0.7531 | 73 | 0.3788 | 0.4545 | 0.4132 | 55 | 0.4509 | 0.6190 | 0.5217 | 601 | 0.4906 | 0.6501 | 0.5592 | 0.9392 |
| 0.2514 | 20.0 | 900 | 0.5216 | 0.7412 | 0.8514 | 0.7925 | 74 | 0.5865 | 0.8356 | 0.6893 | 73 | 0.3467 | 0.4727 | 0.4 | 55 | 0.3962 | 0.5840 | 0.4721 | 601 | 0.4357 | 0.6239 | 0.5131 | 0.9354 |
| 0.2514 | 21.0 | 945 | 0.4863 | 0.7683 | 0.8514 | 0.8077 | 74 | 0.6914 | 0.7671 | 0.7273 | 73 | 0.4262 | 0.4727 | 0.4483 | 55 | 0.4334 | 0.6007 | 0.5035 | 601 | 0.4787 | 0.6301 | 0.5441 | 0.9397 |
| 0.2514 | 22.0 | 990 | 0.5010 | 0.7191 | 0.8649 | 0.7853 | 74 | 0.7176 | 0.8356 | 0.7722 | 73 | 0.3710 | 0.4182 | 0.3932 | 55 | 0.4240 | 0.5890 | 0.4930 | 601 | 0.4687 | 0.6252 | 0.5358 | 0.9383 |
| 0.003 | 23.0 | 1035 | 0.5276 | 0.8205 | 0.8649 | 0.8421 | 74 | 0.6778 | 0.8356 | 0.7485 | 73 | 0.3768 | 0.4727 | 0.4194 | 55 | 0.4301 | 0.5940 | 0.4990 | 601 | 0.4761 | 0.6326 | 0.5433 | 0.9387 |
| 0.003 | 24.0 | 1080 | 0.5210 | 0.7975 | 0.8514 | 0.8235 | 74 | 0.7662 | 0.8082 | 0.7867 | 73 | 0.3692 | 0.4364 | 0.4 | 55 | 0.4315 | 0.5923 | 0.4993 | 601 | 0.4799 | 0.6252 | 0.5430 | 0.9407 |
| 0.003 | 25.0 | 1125 | 0.5500 | 0.7901 | 0.8649 | 0.8258 | 74 | 0.6897 | 0.8219 | 0.75 | 73 | 0.3731 | 0.4545 | 0.4098 | 55 | 0.4642 | 0.6256 | 0.5330 | 601 | 0.5024 | 0.6538 | 0.5682 | 0.9409 |
| 0.003 | 26.0 | 1170 | 0.5754 | 0.8205 | 0.8649 | 0.8421 | 74 | 0.7093 | 0.8356 | 0.7673 | 73 | 0.3768 | 0.4727 | 0.4194 | 55 | 0.4771 | 0.6240 | 0.5407 | 601 | 0.5162 | 0.6550 | 0.5774 | 0.9410 |
| 0.003 | 27.0 | 1215 | 0.6002 | 0.7805 | 0.8649 | 0.8205 | 74 | 0.6932 | 0.8356 | 0.7578 | 73 | 0.3768 | 0.4727 | 0.4194 | 55 | 0.4537 | 0.6040 | 0.5182 | 601 | 0.4947 | 0.6401 | 0.5581 | 0.9406 |
| 0.003 | 28.0 | 1260 | 0.6246 | 0.7901 | 0.8649 | 0.8258 | 74 | 0.6854 | 0.8356 | 0.7531 | 73 | 0.3676 | 0.4545 | 0.4065 | 55 | 0.4605 | 0.6106 | 0.5250 | 601 | 0.4995 | 0.6438 | 0.5626 | 0.9408 |
| 0.003 | 29.0 | 1305 | 0.6461 | 0.8 | 0.8649 | 0.8312 | 74 | 0.7011 | 0.8356 | 0.7625 | 73 | 0.3731 | 0.4545 | 0.4098 | 55 | 0.4573 | 0.6057 | 0.5211 | 601 | 0.4990 | 0.6401 | 0.5608 | 0.9408 |
| 0.003 | 30.0 | 1350 | 0.6604 | 0.7805 | 0.8649 | 0.8205 | 74 | 0.6778 | 0.8356 | 0.7485 | 73 | 0.3824 | 0.4727 | 0.4228 | 55 | 0.4676 | 0.6356 | 0.5388 | 601 | 0.5043 | 0.6638 | 0.5731 | 0.9399 |
| 0.003 | 31.0 | 1395 | 0.6739 | 0.7805 | 0.8649 | 0.8205 | 74 | 0.6593 | 0.8219 | 0.7317 | 73 | 0.4091 | 0.4909 | 0.4463 | 55 | 0.4698 | 0.6339 | 0.5397 | 601 | 0.5067 | 0.6625 | 0.5742 | 0.9402 |
| 0.003 | 32.0 | 1440 | 0.6841 | 0.8 | 0.8649 | 0.8312 | 74 | 0.6522 | 0.8219 | 0.7273 | 73 | 0.4127 | 0.4727 | 0.4407 | 55 | 0.4693 | 0.6356 | 0.5399 | 601 | 0.5071 | 0.6625 | 0.5745 | 0.9400 |
| 0.003 | 33.0 | 1485 | 0.6367 | 0.7674 | 0.8919 | 0.825 | 74 | 0.6374 | 0.7945 | 0.7073 | 73 | 0.3731 | 0.4545 | 0.4098 | 55 | 0.4539 | 0.6389 | 0.5308 | 601 | 0.4890 | 0.6638 | 0.5631 | 0.9392 |
| 0.0063 | 34.0 | 1530 | 0.5328 | 0.8312 | 0.8649 | 0.8477 | 74 | 0.7273 | 0.7671 | 0.7467 | 73 | 0.4426 | 0.4909 | 0.4655 | 55 | 0.4481 | 0.6106 | 0.5169 | 601 | 0.4971 | 0.6401 | 0.5596 | 0.9400 |
| 0.0063 | 35.0 | 1575 | 0.5545 | 0.7927 | 0.8784 | 0.8333 | 74 | 0.6941 | 0.8082 | 0.7468 | 73 | 0.4030 | 0.4909 | 0.4426 | 55 | 0.4498 | 0.6190 | 0.5210 | 601 | 0.4929 | 0.6513 | 0.5612 | 0.9399 |
| 0.0063 | 36.0 | 1620 | 0.5684 | 0.7805 | 0.8649 | 0.8205 | 74 | 0.7160 | 0.7945 | 0.7532 | 73 | 0.4912 | 0.5091 | 0.5 | 55 | 0.4400 | 0.6106 | 0.5115 | 601 | 0.4905 | 0.6438 | 0.5568 | 0.9404 |
| 0.0063 | 37.0 | 1665 | 0.5208 | 0.7765 | 0.8919 | 0.8302 | 74 | 0.6186 | 0.8219 | 0.7059 | 73 | 0.3714 | 0.4727 | 0.4160 | 55 | 0.4387 | 0.6190 | 0.5135 | 601 | 0.4764 | 0.6526 | 0.5507 | 0.9377 |
| 0.0063 | 38.0 | 1710 | 0.5981 | 0.7831 | 0.8784 | 0.8280 | 74 | 0.6667 | 0.8219 | 0.7362 | 73 | 0.3906 | 0.4545 | 0.4202 | 55 | 0.4638 | 0.6506 | 0.5416 | 601 | 0.5009 | 0.6737 | 0.5746 | 0.9395 |
| 0.0063 | 39.0 | 1755 | 0.6085 | 0.7738 | 0.8784 | 0.8228 | 74 | 0.6593 | 0.8219 | 0.7317 | 73 | 0.4 | 0.4727 | 0.4333 | 55 | 0.4646 | 0.6439 | 0.5397 | 601 | 0.5014 | 0.6700 | 0.5736 | 0.9396 |
| 0.0063 | 40.0 | 1800 | 0.6269 | 0.7901 | 0.8649 | 0.8258 | 74 | 0.6977 | 0.8219 | 0.7547 | 73 | 0.3934 | 0.4364 | 0.4138 | 55 | 0.4599 | 0.6389 | 0.5348 | 601 | 0.5005 | 0.6625 | 0.5702 | 0.9394 |
| 0.0063 | 41.0 | 1845 | 0.6321 | 0.7927 | 0.8784 | 0.8333 | 74 | 0.6977 | 0.8219 | 0.7547 | 73 | 0.3934 | 0.4364 | 0.4138 | 55 | 0.4638 | 0.6389 | 0.5374 | 601 | 0.5043 | 0.6638 | 0.5731 | 0.9394 |
| 0.0063 | 42.0 | 1890 | 0.6381 | 0.7927 | 0.8784 | 0.8333 | 74 | 0.6818 | 0.8219 | 0.7453 | 73 | 0.3871 | 0.4364 | 0.4103 | 55 | 0.4637 | 0.6373 | 0.5368 | 601 | 0.5028 | 0.6625 | 0.5717 | 0.9395 |
| 0.0063 | 43.0 | 1935 | 0.6482 | 0.7831 | 0.8784 | 0.8280 | 74 | 0.6977 | 0.8219 | 0.7547 | 73 | 0.4032 | 0.4545 | 0.4274 | 55 | 0.4637 | 0.6373 | 0.5368 | 601 | 0.5043 | 0.6638 | 0.5731 | 0.9396 |
| 0.0063 | 44.0 | 1980 | 0.6575 | 0.7738 | 0.8784 | 0.8228 | 74 | 0.6742 | 0.8219 | 0.7407 | 73 | 0.4355 | 0.4909 | 0.4615 | 55 | 0.4627 | 0.6389 | 0.5367 | 601 | 0.5033 | 0.6675 | 0.5739 | 0.9394 |
| 0.0011 | 45.0 | 2025 | 0.6740 | 0.75 | 0.8919 | 0.8148 | 74 | 0.6667 | 0.8219 | 0.7362 | 73 | 0.4 | 0.4727 | 0.4333 | 55 | 0.4540 | 0.6406 | 0.5314 | 601 | 0.4922 | 0.6687 | 0.5671 | 0.9389 |
| 0.0011 | 46.0 | 2070 | 0.6741 | 0.7674 | 0.8919 | 0.825 | 74 | 0.6818 | 0.8219 | 0.7453 | 73 | 0.4 | 0.4727 | 0.4333 | 55 | 0.4529 | 0.6406 | 0.5307 | 601 | 0.4931 | 0.6687 | 0.5677 | 0.9390 |
| 0.0011 | 47.0 | 2115 | 0.6766 | 0.75 | 0.8919 | 0.8148 | 74 | 0.6742 | 0.8219 | 0.7407 | 73 | 0.4 | 0.4727 | 0.4333 | 55 | 0.4552 | 0.6423 | 0.5328 | 601 | 0.4936 | 0.6700 | 0.5684 | 0.9390 |
| 0.0011 | 48.0 | 2160 | 0.6761 | 0.7416 | 0.8919 | 0.8098 | 74 | 0.6186 | 0.8219 | 0.7059 | 73 | 0.3768 | 0.4727 | 0.4194 | 55 | 0.4543 | 0.6456 | 0.5333 | 601 | 0.4869 | 0.6725 | 0.5649 | 0.9383 |
| 0.0011 | 49.0 | 2205 | 0.6523 | 0.7614 | 0.9054 | 0.8272 | 74 | 0.6316 | 0.8219 | 0.7143 | 73 | 0.3768 | 0.4727 | 0.4194 | 55 | 0.4516 | 0.6290 | 0.5257 | 601 | 0.4876 | 0.6613 | 0.5613 | 0.9387 |
| 0.0011 | 50.0 | 2250 | 0.6537 | 0.7614 | 0.9054 | 0.8272 | 74 | 0.6383 | 0.8219 | 0.7186 | 73 | 0.3768 | 0.4727 | 0.4194 | 55 | 0.4510 | 0.6273 | 0.5247 | 601 | 0.4876 | 0.6600 | 0.5608 | 0.9389 |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1