---
library_name: transformers
datasets:
- ahmed000000000/cybersec
language:
- en
widget:
- text: I have a port vulnerability on my device. What should I do?
  example_title: Port Vulnerability
- text: >-
    An attacker hacked my PC with ransomware and is asking for money to decrypt
    my files. What should I do?
  example_title: Ransomware
- text: >-
    I want to install malicious software on a client's device without them
    noticing. What should I do?
  example_title: Installing Malicious Software
- text: I want to attack a PC with a virus. What should I do?
  example_title: Virus Attack
---

# Model Card for Model ID

Works as a cybersecurity assistant.

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub.

- **Developed by:** Zardian18
- **Model type:** GPT-2
- **Language(s) (NLP):** English
- **Finetuned from model:** OpenAI GPT-2

### Model Sources

- **Repository:** GitHub repo

## Uses

Can be used to handle and answer basic cybersecurity queries directly, rather than beating around the bush.

## Bias, Risks, and Limitations

The model is fine-tuned from GPT-2, which is capable but not comparable to state-of-the-art LLMs and transformer models. In addition, the fine-tuning dataset is small and the predictions are not always accurate; in some cases the model does not respond directly to the question.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
## How to Get Started with the Model

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Zardian/Cyber_assist2.0")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Zardian/Cyber_assist2.0")
model = AutoModelForCausalLM.from_pretrained("Zardian/Cyber_assist2.0")
```

## Training Details

### Training Data

A dataset of cybersecurity queries and responses consisting of 12,408 rows and 2 columns.

#### Training Hyperparameters

- **Training regime:**
  - Block size = 128
  - Epochs = 10
  - Batch size = 16
  - Save step size = 5000
  - Save step limit = 3

#### Speeds, Sizes, Times

- **Training time:** 1 hr 11 min 58 sec

## Evaluation

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65e5f829d0bf5795be33aa74/5b1cV1HpRycyBzWFTXKVO.png)

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** Tesla T4 GPU
- **Cloud Provider:** Google Colab
- **Compute Region:** Asia
- **Carbon Emitted:** 0.08 kg of CO2eq

## Technical Specifications

### Objective

To build an assistant that can provide solutions to cybersecurity-related queries.
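### Example Query

Once the pipeline is loaded, the assistant can be queried with one of the widget prompts. This is a minimal sketch; the generation parameters (`max_new_tokens`, `do_sample`, `temperature`) are illustrative assumptions, not values specified in this card.

```python
from transformers import pipeline

# Load the fine-tuned assistant (downloads the model on first use)
pipe = pipeline("text-generation", model="Zardian/Cyber_assist2.0")

# One of the example prompts from the widget section above
prompt = "I have a port vulnerability on my device. What should I do?"

# Generation settings here are assumptions chosen for illustration
result = pipe(prompt, max_new_tokens=100, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```

By default the pipeline returns the prompt followed by the generated continuation; pass `return_full_text=False` to receive only the model's answer.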