mastavtsev/SqueezeBERT_PM_CLR

mastavtsev committed on Apr 12

Commit 24dd3b4 • 1 Parent(s): d66695c

Update README.md

Files changed (1)

README.md +19 -15

@@ -2,21 +2,25 @@
 license: apache-2.0
 ---
-This is the model for my Course Project at 3 Year of HSE FCS. Python notebooks for my
-research can be found here: https://github.com/mastavtsev/PM_NLP/tree/main.
-Model is specified for the task of unsupervised anomaly detection using SqueezeBERT architecture.
-The model is trained using masked language modeling using traces of normal program execution. Thus, the model produces a representation
-of trace tokens corresponding to normal program execution.
-Meta info:
-- Tokenizer LOA 13 - 20000 dictionary size, 300 max. token length
-- 512 context window size
-- 2.5e-3 learning rate
-- LAMB optimizer
-- 300 epochs
-- 43.6 million parameters
-- 1.5 hours in Google Colab with GPU A100
-Results of the model on the test data:
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/661915f3fdcb7df8a6e2fdaa/bsZ8dXWN6IoAH4zXfwzi2.png)

+## SqueezeBERT Model for Unsupervised Anomaly Detection
+### Overview
+This model was developed as a Course Project during my third year at the HSE Faculty of Computer Science (FCS). It uses the SqueezeBERT architecture for unsupervised anomaly detection. The model is trained with masked language modeling on traces of normal program execution, so it learns representations of the trace tokens that occur in normal runs.
+### Research Notebooks
+Detailed Python notebooks documenting the research and methodology are available on GitHub: [Visit GitHub Repository](https://github.com/mastavtsev/PM_NLP/tree/main).
+### Model Configuration
+- **Architecture**: SqueezeBERT, adapted for masked language modeling.
+- **Tokenizer**: LOA 13 with a dictionary size of 20,000 and a maximum token length of 300.
+- **Context Window Size**: 512 tokens.
+- **Learning Rate**: 2.5e-3.
+- **Optimizer**: LAMB.
+- **Epochs**: 300.
+- **Parameters**: 43.6 million.
+- **Training Environment**: Google Colab, utilizing an A100 GPU, with a training time of approximately 1.5 hours.
+### Model Performance
+Results on the test data, showing how the model separates normal from anomalous execution traces, are available [here](https://cdn-uploads.huggingface.co/production/uploads/661915f3fdcb7df8a6e2fdaa/bsZ8dXWN6IoAH4zXfwzi2.png).
+The configuration and results are provided to support replication and further experimentation by the community. The model is released under the Apache-2.0 license, which permits both academic and commercial use.
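
Note on the updated README: it says the model learns normal execution through masked language modeling, but it does not say how a trace is turned into an anomaly score. One common approach, shown here only as a sketch and not as the author's method, is to mask each token in turn and average the negative log-probability the model gives to the true token. The names `trace_anomaly_score` and `make_bigram_stand_in` are hypothetical, and the bigram counter is a toy stand-in for the real masked LM so the example runs without any model download.

```python
import math
from collections import Counter
from typing import Callable, Sequence

# Hypothetical interface (an assumption, not from the README): given a trace
# and a position, return the probability assigned to the true token there.
TokenProb = Callable[[Sequence[str], int], float]


def trace_anomaly_score(trace: Sequence[str], token_prob: TokenProb,
                        eps: float = 1e-9) -> float:
    """Mean negative log-likelihood over all positions of the trace."""
    if not trace:
        raise ValueError("trace must contain at least one token")
    # Clamp probabilities at eps so unseen tokens give a large finite penalty.
    nll = [-math.log(max(token_prob(trace, i), eps)) for i in range(len(trace))]
    return sum(nll) / len(nll)


def make_bigram_stand_in(normal_traces: Sequence[Sequence[str]]) -> TokenProb:
    """Toy stand-in for the masked LM: P(token | previous token) from counts."""
    pair_counts: Counter = Counter()
    prev_counts: Counter = Counter()
    for trace in normal_traces:
        prev = "<s>"  # start-of-trace marker
        for tok in trace:
            pair_counts[(prev, tok)] += 1
            prev_counts[prev] += 1
            prev = tok

    def token_prob(trace: Sequence[str], i: int) -> float:
        prev = trace[i - 1] if i > 0 else "<s>"
        if prev_counts[prev] == 0:
            return 0.0  # context never seen in normal traces
        return pair_counts[(prev, trace[i])] / prev_counts[prev]

    return token_prob


if __name__ == "__main__":
    # Made-up traces for illustration only.
    normal = [["open", "read", "close"],
              ["open", "read", "read", "close"],
              ["open", "write", "close"]]
    prob = make_bigram_stand_in(normal)
    print(trace_anomaly_score(["open", "read", "close"], prob))
    print(trace_anomaly_score(["open", "close", "read"], prob))
```

A trace would then be flagged when its score exceeds a threshold chosen on held-out normal traces. The threshold, like the scoring rule itself, is an assumption here; the linked notebooks are the place to check what the project actually does.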