distilbert-base-uncased-logline-v3
This model is a fine-tuned version of distilbert-base-uncased, trained on the AIT Log Data Set V2.0 [1] (https://zenodo.org/records/5789064). It achieves the following results on the evaluation set:
- Loss: 0.0022
- Accuracy: 0.9995
- F1: 0.9994
Model description
This model performs text classification on log files for network intrusion detection. The Python package that runs this model is available at https://github.com/Isaacwilliam4/INSyT. As described on the dataset's page, the training data covers the following log types: Apache access and error logs, authentication logs, DNS logs, VPN logs, audit logs, Suricata logs, network traffic packet captures, Horde logs, Exim logs, syslog, and system monitoring logs.
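As a rough illustration of how the model might be used outside the INSyT package, the sketch below wraps a Transformers text-classification pipeline and keeps only the lines that are not predicted as benign. The repository id in `MODEL_ID`, the helper names `load_classifier` and `flag_suspicious`, and the expectation that the model config maps class ids to the label names in the table below are all assumptions, not details confirmed by this card.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: the Hub repository id for this model; verify before use.
MODEL_ID = "isaacwilliam4/insyt"


def load_classifier():
    """Build a text-classification pipeline (downloads the model on first call)."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def flag_suspicious(
    lines: List[str],
    classify: Callable[[List[str]], List[Dict]],
    benign_label: str = "benign",
) -> List[Tuple[str, str, float]]:
    """Return (line, label, score) for every line not predicted as benign.

    `classify` is any callable that takes a list of strings and returns one
    {"label": ..., "score": ...} dict per string, which is the shape a
    Transformers text-classification pipeline returns for a list input.
    """
    results = classify(lines)
    return [
        (line, result["label"], result["score"])
        for line, result in zip(lines, results)
        if result["label"] != benign_label
    ]
```

A call such as `flag_suspicious(log_lines, load_classifier())` would then return the flagged lines. If the model config does not carry the label names, the pipeline emits generic names such as `LABEL_12`, and `benign_label` would need to be adjusted accordingly.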
Labels
Label | Label Name |
---|---|
0 | attacker:dnsteal:dnsteal-dropped |
1 | attacker:dnsteal:dnsteal-received |
2 | attacker:dnsteal:exfiltration-service |
3 | attacker_change_user:escalate |
4 | attacker_change_user:escalate:escalated_command:escalated_sudo_command |
5 | attacker_http:dirb:foothold |
6 | attacker_http:foothold:service_scan |
7 | attacker_http:foothold:webshell_cmd |
8 | attacker_http:foothold:webshell_upload |
9 | attacker_http:foothold:wpscan |
10 | attacker_vpn:escalate |
11 | attacker_vpn:foothold |
12 | benign |
13 | crack_passwords:escalate |
14 | dirb:foothold |
15 | dns_scan:foothold |
16 | escalate:escalated_command:escalated_sudo_command |
17 | escalate:escalated_command:escalated_sudo_command:escalated_sudo_session |
18 | escalate:webshell_cmd |
19 | foothold:network_scan |
20 | foothold:service_scan |
21 | foothold:traceroute |
22 | foothold:wpscan |
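For post-processing predictions, the table above can be written as a lookup structure. This is a minimal sketch: the names `LABELS`, `ID2LABEL`, `LABEL2ID`, `is_attack`, and `tags` are hypothetical, and treating each label name as a colon-separated list of tags is an assumption based only on how the names in the table look.

```python
from typing import Dict, List

# Label names in class-id order, copied from the table above.
LABELS: List[str] = [
    "attacker:dnsteal:dnsteal-dropped",
    "attacker:dnsteal:dnsteal-received",
    "attacker:dnsteal:exfiltration-service",
    "attacker_change_user:escalate",
    "attacker_change_user:escalate:escalated_command:escalated_sudo_command",
    "attacker_http:dirb:foothold",
    "attacker_http:foothold:service_scan",
    "attacker_http:foothold:webshell_cmd",
    "attacker_http:foothold:webshell_upload",
    "attacker_http:foothold:wpscan",
    "attacker_vpn:escalate",
    "attacker_vpn:foothold",
    "benign",
    "crack_passwords:escalate",
    "dirb:foothold",
    "dns_scan:foothold",
    "escalate:escalated_command:escalated_sudo_command",
    "escalate:escalated_command:escalated_sudo_command:escalated_sudo_session",
    "escalate:webshell_cmd",
    "foothold:network_scan",
    "foothold:service_scan",
    "foothold:traceroute",
    "foothold:wpscan",
]

ID2LABEL: Dict[int, str] = dict(enumerate(LABELS))
LABEL2ID: Dict[str, int] = {name: idx for idx, name in ID2LABEL.items()}


def is_attack(label_id: int) -> bool:
    """True for every class except 'benign'."""
    return ID2LABEL[label_id] != "benign"


def tags(label_id: int) -> List[str]:
    """Split a label name into its colon-separated parts (assumed to be tags)."""
    return ID2LABEL[label_id].split(":")
```

For example, `tags(9)` yields `["attacker_http", "foothold", "wpscan"]`, which can be handy when grouping alerts by a shared tag such as `foothold` or `escalate`.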
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
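To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes the total of 18822 steps from the final row of the training results table; the function name `linear_lr` is hypothetical.

```python
BASE_LR = 2e-05      # learning_rate from the list above
TOTAL_STEPS = 18822  # last step reported in the training results (3 epochs x 6274 steps)
WARMUP_STEPS = 0     # assumption: no warmup, since the card does not mention any


def linear_lr(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        # Ramp up linearly from 0 to BASE_LR during warmup (unused when WARMUP_STEPS is 0).
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the rate at the start and at each epoch boundary.
    for step in (0, 6274, 12548, 18822):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, falls to about two thirds of that after the first epoch, and reaches zero at the final step.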
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
0.0435 | 1.0 | 6274 | 0.0120 | 0.9965 | 0.9965 |
0.0059 | 2.0 | 12548 | 0.0032 | 0.9993 | 0.9992 |
0.0023 | 3.0 | 18822 | 0.0022 | 0.9995 | 0.9994 |
Test results
Test Loss | Test Accuracy | Test F1 |
---|---|---|
0.0020 | 0.9994 | 0.9994 |
Five-Fold Cross-Validation Mean Test Confusion Matrix
Framework versions
- Transformers 4.38.2
- Pytorch 2.0.0+cu117
- Datasets 2.18.0
- Tokenizers 0.15.1
Citations
[1] M. Landauer, F. Skopik, M. Frank, W. Hotwagner, M. Wurzenberger, and A. Rauber, “AIT Log Data Set V2.0”. Zenodo, Feb. 24, 2022. doi: 10.5281/zenodo.5789064.