Canstralian committed: Update README.md
```python
nlp = pipeline("text-classification", model=model_name)

# Example usage
result = nlp("Example shell command or exploit input")
print(result)
```
## Training Details

### Training Data

The model was fine-tuned on the following datasets:

- Canstralian/ShellCommands: A collection of shell commands used in cybersecurity contexts.
- Canstralian/CyberExploitDB: A curated set of known exploits and vulnerabilities.

Further details on the preprocessing of these datasets can be found in their respective dataset cards.
## Training Procedure

### Preprocessing

The data was preprocessed to remove any sensitive or personally identifiable information. Text normalization and tokenization were applied to ensure consistency across the datasets.
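A minimal sketch of what such normalization and tokenization could look like. The rules below (Unicode normalization, lowercasing, whitespace collapsing, and a naive token pattern) are illustrative assumptions, not the card's actual preprocessing pipeline:

```python
import re
import unicodedata

def normalize(text: str) -> str:
    # Unicode-normalize, lowercase, and collapse runs of whitespace.
    # (Illustrative rules; the card does not specify the exact steps.)
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text: str) -> list[str]:
    # Naive shell-flavored tokenizer standing in for the model's
    # real subword tokenizer: keep flags, paths, and words intact.
    return re.findall(r"[a-z0-9_./-]+|\S", normalize(text))

print(tokenize("  RM  -rf   /tmp/payload  "))
```

In practice the real tokenization would come from the model's own tokenizer; this sketch only shows the kind of consistency the preprocessing step aims for.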
### Training Hyperparameters

- Training regime: fp16 mixed precision
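The card states only the training regime; a hypothetical `transformers.TrainingArguments` configuration with fp16 enabled might look like the following, where every value other than `fp16` is a placeholder, not a reported hyperparameter:

```python
from transformers import TrainingArguments

# Hypothetical fine-tuning setup; only fp16 mixed precision is
# stated on the card -- all other values are placeholders.
args = TrainingArguments(
    output_dir="./checkpoints",       # placeholder path
    fp16=True,                        # stated: fp16 mixed precision
    per_device_train_batch_size=16,   # assumed, not from the card
    num_train_epochs=3,               # assumed, not from the card
    learning_rate=5e-5,               # assumed, not from the card
)
```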
## Evaluation

### Testing Data, Factors & Metrics

Testing was performed on both synthetic and real-world shell command and exploit datasets, focusing on the model's ability to correctly parse shell commands and identify exploit signatures.
### Factors

The evaluation factors included:

- Model performance across different types of shell commands and exploits.
- Accuracy, precision, recall, and F1-score in detecting known exploits.
### Metrics

Metrics used for evaluation include:

- Accuracy: Percentage of correct predictions made by the model.
- Precision: The proportion of relevant instances among the retrieved instances.
- Recall: The proportion of relevant instances that were retrieved.
- F1-score: The harmonic mean of precision and recall.
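The four metrics above can all be computed from the confusion counts of a binary exploit/benign labeling. A minimal sketch (the labels below are toy data for illustration, not the model's reported results):

```python
def binary_metrics(y_true, y_pred):
    # Confusion counts for the positive ("exploit") class.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))

    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# 1 = exploit, 0 = benign (toy labels)
print(binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```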
### Results

The model performs well on standard shell command parsing tasks and exploit detection, with high accuracy for common exploits. However, its performance may degrade on newer or less common exploits.
## Summary

The model is well-suited for cybersecurity applications involving shell command and exploit detection. While it excels in these areas, users should monitor its performance for emerging threats and unusual attack patterns.