mastavtsev committed on
Commit 24dd3b4
1 Parent(s): d66695c

Update README.md

Files changed (1)
  1. README.md +19 -15
README.md CHANGED
@@ -2,21 +2,25 @@
  license: apache-2.0
  ---
 
- This is the model for my Course Project at 3 Year of HSE FCS. Python notebooks for my
- research can be found here: https://github.com/mastavtsev/PM_NLP/tree/main.
-
- Model is specified for the task of unsupervised anomaly detection using SqueezeBERT architecture.
- The model is trained using masked language modeling using traces of normal program execution. Thus, the model produces a representation
- of trace tokens corresponding to normal program execution.
-
- Meta info:
- - Tokenizer LOA 13 - 20000 dictionary size, 300 max. token length
- - 512 context window size
- - 2.5e-3 learning rate
- - LAMB optimizer
- - 300 epochs
- - 43.6 million parameters
- - 1.5 hours in Google Colab with GPU A100
-
- Results of the model on the test data:
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/661915f3fdcb7df8a6e2fdaa/bsZ8dXWN6IoAH4zXfwzi2.png)
+ ## SqueezeBERT Model for Unsupervised Anomaly Detection
+
+ ### Overview
+ This model was developed as a third-year course project at the HSE Faculty of Computer Science (FCS). It uses the SqueezeBERT architecture for unsupervised anomaly detection: trained with masked language modeling on traces of normal program execution, the model learns representations of trace tokens that characterize normal behavior, so traces it fails to predict well can be flagged as anomalous.
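+
+ As a minimal illustration of that scoring idea (a hypothetical sketch, not the project's confirmed code: the Hub repo ID is a placeholder, and pseudo-log-likelihood is one common way to turn an MLM into an anomaly score), a trace could be scored with `transformers` like so:
+
+ ```python
+ import torch
+ from transformers import AutoModelForMaskedLM, AutoTokenizer
+
+ MODEL_ID = "mastavtsev/..."  # placeholder: substitute this repository's Hub ID
+
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+ model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)
+ model.eval()
+
+ def trace_anomaly_score(trace: str) -> float:
+     """Mask each token in turn and average the model's loss; a model
+     trained only on normal traces should score normal traces low."""
+     enc = tokenizer(trace, return_tensors="pt", truncation=True, max_length=512)
+     input_ids = enc["input_ids"][0]
+     losses = []
+     for i in range(1, input_ids.size(0) - 1):  # skip the special tokens at the ends
+         masked = input_ids.clone()
+         masked[i] = tokenizer.mask_token_id
+         labels = torch.full_like(input_ids, -100)  # -100 marks ignored positions
+         labels[i] = input_ids[i]
+         with torch.no_grad():
+             out = model(input_ids=masked.unsqueeze(0),
+                         attention_mask=enc["attention_mask"],
+                         labels=labels.unsqueeze(0))
+         losses.append(out.loss.item())
+     return sum(losses) / max(len(losses), 1)
+
+ # Scores well above those of held-out normal traces indicate anomalies.
+ print(trace_anomaly_score("trace tokens go here"))
+ ```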
+
+ ### Research Notebooks
+ Python notebooks documenting the research and methodology are available on GitHub: [mastavtsev/PM_NLP](https://github.com/mastavtsev/PM_NLP/tree/main).
+
+
+ ### Model Configuration
+ - **Architecture**: SqueezeBERT, adapted for masked language modeling.
+ - **Tokenizer**: LOA 13, with a 20,000-entry vocabulary and a maximum token length of 300.
+ - **Context Window Size**: 512 tokens.
+ - **Learning Rate**: 2.5e-3.
+ - **Optimizer**: LAMB (see the sketch after this list).
+ - **Training Duration**: 300 epochs.
+ - **Parameters**: 43.6 million.
+ - **Training Environment**: Google Colab with an A100 GPU; about 1.5 hours of training time.
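+
+ As a rough sketch of how these settings fit together (assumptions: LAMB comes from the third-party `torch-optimizer` package, a synthetic batch stands in for the real trace dataset, and unlisted config fields keep library defaults, so this sketch's parameter count will not match 43.6M exactly):
+
+ ```python
+ import torch
+ import torch_optimizer  # pip install torch-optimizer (assumed source of LAMB)
+ from transformers import SqueezeBertConfig, SqueezeBertForMaskedLM
+
+ # Settings from the list above; everything else is a library default.
+ config = SqueezeBertConfig(vocab_size=20000, max_position_embeddings=512)
+ model = SqueezeBertForMaskedLM(config)
+ optimizer = torch_optimizer.Lamb(model.parameters(), lr=2.5e-3)
+
+ # Stand-in batch; the real run iterated masked batches of normal traces.
+ input_ids = torch.randint(0, 20000, (2, 128))
+ labels = input_ids.clone()
+
+ for epoch in range(1):  # the actual training used 300 epochs
+     optimizer.zero_grad()
+     loss = model(input_ids=input_ids, labels=labels).loss
+     loss.backward()
+     optimizer.step()
+     print(f"epoch {epoch}: loss {loss.item():.3f}")
+ ```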
+
+ ### Model Performance
+ Results on the test data show how well the model separates normal from anomalous execution traces; see the plot [here](https://cdn-uploads.huggingface.co/production/uploads/661915f3fdcb7df8a6e2fdaa/bsZ8dXWN6IoAH4zXfwzi2.png).
+
+ The configuration and performance details above are provided to facilitate replication and further experimentation. The Apache-2.0 license permits both academic and commercial use, encouraging wider adoption and contributions to the model's development.