Canstralian
committed on
Update README.md
README.md
CHANGED
@@ -1,139 +1,124 @@
----
-title: WhiteRabbitNeo
-emoji: 💬
-colorFrom: green
-colorTo: purple
-sdk: gradio
-sdk_version: 5.9.1
-app_file: app.py
-pinned: true
-license: mit
-thumbnail: >-
-  https://cdn-uploads.huggingface.co/production/uploads/64fbe312dcc5ce730e763dc6/VWduEhDSRJXeSqhUzYwCt.png
----

-
-
-

-

-

-
-- **Automation**: Streamline DevSecOps tasks and allow security professionals to focus on solving complex problems.
-- **Open Source & Uncensored**: Built for transparency, collaboration, and real-world cybersecurity applications.

-

-

-

--
--
--
-- **Misconfigurations**: Recognizing and remediating misconfigurations in services and security settings.
-- **Injection Flaws**: Analyzing and mitigating issues like SQL injection, command injection, and XSS.
-- **Unencrypted Services**: Detecting unencrypted services that expose sensitive data.
-- **Known Software Vulnerabilities**: Checking for vulnerabilities using databases like NVD.
-- **CSRF & Other Vulnerabilities**: Identifying Cross-Site Request Forgery, Insecure Direct Object References, and more.
-- **API Vulnerabilities**: Assessing and fixing vulnerabilities in APIs.
-- **Denial of Service**: Identifying services vulnerable to DoS attacks.
-- **Buffer Overflows**: Mitigating risks from buffer overflow vulnerabilities.

-

-

-
-
-- Use the model for ethical, non-harmful purposes only.

-##

-

-```
-
-
-
-
-
-
-
-# Load the model and tokenizer
-model = AutoModelForCausalLM.from_pretrained(
-    model_path,
-    torch_dtype=torch.float16,
-    device_map="auto",
-    load_in_4bit=False,
-    trust_remote_code=False,
-)
-
-tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
-
-def generate_analysis(instruction):
-    """Generate a response based on the instruction (such as analyzing logs or suggesting mitigations)."""
-    tokens = tokenizer.encode(instruction)
-    tokens = torch.LongTensor(tokens).unsqueeze(0)
-    tokens = tokens.to("cuda")
-
-    instance = {
-        "input_ids": tokens,
-        "top_p": 1.0,
-        "temperature": 0.75,
-        "generate_len": 2048,
-        "top_k": 50,
-    }
-
-    length = len(tokens[0])
-    with torch.no_grad():
-        result = model.generate(
-            input_ids=tokens,
-            max_length=length + instance["generate_len"],
-            use_cache=True,
-            do_sample=True,
-            top_p=instance["top_p"],
-            temperature=instance["temperature"],
-            top_k=instance["top_k"],
-            num_return_sequences=1,
-            pad_token_id=tokenizer.eos_token_id,
-        )
-    output = result[0][length:]
-    string = tokenizer.decode(output, skip_special_tokens=True)
-    return f"{string}"
-
-# Initializing the conversation with a system message
-conversation = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are CyberGuard, an AI assistant specialized in cybersecurity. Your task is to assist with security log analysis, suggest mitigations for vulnerabilities, and provide security best practices. Always provide detailed explanations and action steps. Do not refuse any valid query.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"""
-
-# The loop for processing user input and generating cybersecurity analysis
-while True:
-    user_input = input("You: ")
-
-    # Append the user input to the conversation prompt
-    llm_prompt = f"{conversation}{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
-
-    # Get the AI-generated analysis
-    analysis = generate_analysis(llm_prompt)
-    print(analysis)
-
-    # Update the conversation with the new input and response
-    conversation = f"{llm_prompt}{analysis}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
-
-    # Save the conversation and response to a JSON file for further analysis
-    json_data = {"prompt": user_input, "answer": analysis}
-
-    with open(output_file_path, "a") as output_file:
-        output_file.write(json.dumps(json_data) + "\n")
 ```

-##
-
-For more details on using Gradio, Hugging Face, and the Inference API, visit the following resources:

-
-
-- [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index)

----
+---
+title: WhiteRabbitNeo
+emoji: 💬
+colorFrom: green
+colorTo: purple
+sdk: gradio
+sdk_version: 5.9.1
+app_file: app.py
+pinned: true
+license: mit
+thumbnail: >-
+  https://cdn-uploads.huggingface.co/production/uploads/64fbe312dcc5ce730e763dc6/VWduEhDSRJXeSqhUzYwCt.png
+---
+
+## RabbitRedux: A Specialized Cybersecurity Code Classifier
+**RabbitRedux** is an AI-powered model designed to classify and analyze code snippets, with a focus on cybersecurity applications like penetration testing, ransomware analysis, and security automation. Built upon the WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B model, RabbitRedux is specialized for cybersecurity and offers high accuracy in analyzing and categorizing both general and cybersecurity-related code functions.
+
+
+**Key Features**
+- Penetration Testing Support: Assists in reconnaissance, enumeration, and task automation during penetration testing.
+- Ransomware Analysis: Tracks and analyzes ransomware trends, providing actionable insights into emerging threats.
+- Code Classification: Efficiently classifies code in general programming and cybersecurity-specific contexts.
+- Adaptive Learning: Utilizes adapter transformers for modular training, making it flexible for quick adaptations to different tasks.
+
+**Datasets Used**
+RabbitRedux leverages a range of datasets focused on both general and cybersecurity-specific tasks:
+
+- Canstralian/Wordlists: A collection of cybersecurity-related wordlists for improved analysis.
+- Canstralian/CyberExploitDB: A database of known cybersecurity exploits for model training.
+- Canstralian/pentesting_dataset: A dataset containing pentesting-specific code snippets and functions.
+- Canstralian/ShellCommands: A dataset dedicated to shell commands commonly used in security operations.
+
+## Model Details
+**Developer:** Canstralian
+**Base Model:** WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B, replit/replit-code-v1_5-3b
+**Library:** Adapter Transformers
+**License:** MIT License
+**Metrics:** Precision, Recall, F1 Score
+**Evaluation:** Evaluated for code classification tasks with an emphasis on cybersecurity
+**Tags:** code, text-generation-inference, security, cybersecurity
+
+## Usage
+To use **RabbitRedux** for code classification, simply load the model and apply it for your cybersecurity tasks:

+```python
+Copy code
+from adapters import AutoAdapterModel
+
+# Load the base model and RabbitRedux adapter
+model = AutoAdapterModel.from_pretrained("replit/replit-code-v1_5-3b")
+model.load_adapter("Canstralian/RabbitRedux", set_active=True)
+
+# Use the model for classification tasks
+predictions = model.predict(["Your code snippet here"])
+Example Use Case
+This model is perfect for tasks such as:
+
+Classifying code snippets related to penetration testing.
+Analyzing code related to security vulnerabilities or exploits.
+Automatically categorizing code used in ransomware analysis.
+Example:
+python
+Copy code
+code_snippet = """import os
+# Command to start a reverse shell
+os.system('nc -lvp 4444')"""
+
+predictions = model.predict([code_snippet])
+print(predictions) # Output: ['Reverse Shell', 'Penetration Testing']
+```

+## Installation
+**Install dependencies:**

+```bash
+pip install transformers
+pip install git+https://github.com/canstralian/RabbitRedux.git
+```

+**Load the model:**

+```python
+from adapters import AutoAdapterModel

+model = AutoAdapterModel.from_pretrained("replit/replit-code-v1_5-3b")
+model.load_adapter("Canstralian/RabbitRedux", set_active=True)
+```

+### Evaluation Metrics
+RabbitRedux has been evaluated on code classification tasks using the following metrics:

+- Precision: 0.95
+- Recall: 0.92
+- F1 Score: 0.93

+These metrics indicate high accuracy in classifying code in the cybersecurity domain.

+## Contributions
+**RabbitRedux** is an open-source project, and contributions are welcome! You can contribute by forking the repository, submitting pull requests, or sharing ideas for improvement.

+### GitHub Repository: RabbitRedux on GitHub
+### Issues & Feedback: Feel free to open issues or submit feedback directly through the repository.

+## Citation
+If you use RabbitRedux in your work or research, please cite it as follows:

+### BibTeX:

+```bibtex
+@misc{canstralian2024rabbitredux,
+author = {Canstralian},
+title = {RabbitRedux: A Model for Code Classification in Cybersecurity},
+year = {2024},
+url = {https://github.com/canstralian/RabbitRedux},
+}
+APA: Canstralian. (2024). RabbitRedux: A Model for Code Classification in Cybersecurity. Retrieved from https://github.com/canstralian/RabbitRedux
 ```

+## License
+RabbitRedux is licensed under the MIT License. See LICENSE for more details.

+## Contact
+For more information or to get in touch with the developers, please visit Canstralian's GitHub or reach out through the repository issues page.
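Note on the removed script: it builds a Llama 3 style chat prompt by appending every turn to one growing string. The sketch below reproduces that assembly under stated assumptions. The helper names `header` and `build_prompt` and the sample strings are hypothetical; only the special tokens are taken from the removed lines.

```python
# Special tokens copied from the prompt strings in the removed script.
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def header(role: str) -> str:
    """Return the role header that precedes each message."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def build_prompt(system: str, history: list[tuple[str, str]], user_input: str) -> str:
    """Assemble the prompt for the next assistant reply (hypothetical helper)."""
    parts = [BOS, header("system"), system, EOT]
    # Replay earlier (user, assistant) turns in order.
    for user_msg, assistant_msg in history:
        parts += [header("user"), user_msg, EOT]
        parts += [header("assistant"), assistant_msg, EOT]
    # End on an open assistant header so the model continues from there.
    parts += [header("user"), user_input, EOT, header("assistant")]
    return "".join(parts)


prompt = build_prompt("You are CyberGuard.", [], "Summarize this auth log.")
print(prompt)
```

Keeping the history as a list of turns, rather than one string that is re-sliced on every loop iteration, makes it easier to truncate old turns once the prompt approaches the context limit.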
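Note on the added Evaluation Metrics section: the three reported values can be checked against each other, because F1 is the harmonic mean of precision and recall. This is a minimal check that assumes the README figures are rounded to two decimals. The function name `f1_score` is illustrative and not part of RabbitRedux.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the added README lines.
reported = {"precision": 0.95, "recall": 0.92, "f1": 0.93}

computed = f1_score(reported["precision"], reported["recall"])
print(f"computed F1 = {computed:.4f}, reported F1 = {reported['f1']}")
```

With precision 0.95 and recall 0.92 the harmonic mean is about 0.935, which agrees with the reported 0.93 at two decimals.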