---
datasets:
- theeseus-ai/RiskClassifier
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---

# RiskClassifier: Fine-Tuned LLaMA 3.1 8B Model

## Model Summary

**RiskClassifier** is a fine-tuned version of the **meta-llama/Llama-3.1-8B-Instruct** model, designed to evaluate risk levels across diverse scenarios using structured critical thinking. It is fine-tuned on the **theeseus-ai/RiskClassifier** dataset, which focuses on assessing and labeling risk scores while maintaining detailed reasoning explanations.

This model is optimized for tasks requiring risk classification, fraud detection, and analytical reasoning.

## Model Details

- **Base Model**: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- **Fine-tuned Dataset**: [theeseus-ai/RiskClassifier](https://huggingface.co/datasets/theeseus-ai/RiskClassifier)
- **Model Size**: 8 Billion Parameters
- **Language**: English
- **License**: Apache 2.0
- **Use Case**: Risk assessment, fraud detection, critical thinking tasks

## Dataset Information

The **RiskClassifier** dataset provides structured scenarios with:

- **Context**: A description of the event requiring analysis.
- **Query**: A critical-thinking question tied to the scenario.
- **Answers**: Four risk level options ("Low risk," "Moderate risk," "High risk," "Very high risk").
- **Risk Score**: A numeric value (0–100) representing the raw risk assessment.
- **Conversations**: Reformatted data in ShareGPT-style conversation format to train the model for reasoning and structured responses.

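The fields listed above can be assembled into a conversation record along the following lines. This is a minimal sketch, not the dataset's actual preprocessing script: `to_conversation` is a hypothetical helper, and the chosen label is passed in explicitly as `risk_label` because the field list does not say how the label is stored alongside the numeric score.

```python
import json

# System prompt as it appears in the dataset's example record.
SYSTEM_PROMPT = "You are a helpful AI that assesses risk levels and provides explanations."


def to_conversation(context, query, answers, risk_label, risk_score):
    """Build one ShareGPT-style record (hypothetical helper, for illustration only)."""
    if risk_label not in answers:
        raise ValueError(f"{risk_label!r} is not one of the answer options")
    # The user turn packs context, question, and answer options into one message.
    user = (
        f"Context: {context}\n"
        f"Question: {query}\n"
        f"Answers: [{', '.join(answers)}]"
    )
    # The assistant turn states the label together with the raw score.
    assistant = f"Risk Level: {risk_label} (Score: {risk_score})"
    return {
        "context": context,
        "query": query,
        "answers": answers,
        "risk_score": risk_score,
        "conversations": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ],
    }


record = to_conversation(
    context="A customer used a credit card in a high-fraud region for a large purchase.",
    query="What is the risk level of this transaction?",
    answers=["Low risk", "Moderate risk", "High risk", "Very high risk"],
    risk_label="Very high risk",
    risk_score=85,
)
print(json.dumps(record, indent=2))
```
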
Example Reformatted Output:

```json
{
  "context": "A customer used a credit card in a high-fraud region for a large purchase.",
  "query": "What is the risk level of this transaction?",
  "answers": ["Low risk", "Moderate risk", "High risk", "Very high risk"],
  "risk_score": 85,
  "conversations": [
    {"role": "system", "content": "You are a helpful AI that assesses risk levels and provides explanations."},
    {"role": "user", "content": "Context: A customer used a credit card in a high-fraud region for a large purchase.\nQuestion: What is the risk level of this transaction?\nAnswers: [Low risk, Moderate risk, High risk, Very high risk]"},
    {"role": "assistant", "content": "Risk Level: Very high risk (Score: 85)"}
  ]
}
```

## Intended Use

### Applications

- **Fraud Detection**: Evaluating suspicious transactions and identifying high-risk activities.
- **Risk Analysis**: Assessing scenarios with probabilistic evaluations for financial and operational decisions.
- **Critical Thinking Tasks**: Enhancing AI's ability to reason about uncertainty and complex situations.
- **Educational Tools**: Training AI systems to provide explanations for risk assessments.

### Limitations

- **Context Dependency**: Accuracy may degrade with ambiguous or incomplete context.
- **Bias Risk**: Outputs may inherit biases present in training data; manual review is advised for high-impact decisions.
- **Numeric Risk Scores**: The numerical scores may require post-processing to fit domain-specific thresholds.

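The post-processing mentioned in the last limitation could look roughly like the sketch below. It assumes the model's reply follows the `Risk Level: <label> (Score: <n>)` pattern shown in the dataset example, which real generations may not always do. The helpers `parse_risk` and `relabel` and the cut-off values are hypothetical and are not taken from the dataset or the model; pick cut-offs that suit your own domain.

```python
import re
from bisect import bisect_right

LABELS = ["Low risk", "Moderate risk", "High risk", "Very high risk"]

# Hypothetical cut-offs (an assumption, not from the dataset): a score at or
# above a cut-off moves into the next band.
DEFAULT_CUTOFFS = [25, 50, 75]

# Matches replies shaped like "Risk Level: Very high risk (Score: 85)".
_PATTERN = re.compile(
    r"Risk Level:\s*(?P<label>[^()\n]+?)\s*\(Score:\s*(?P<score>\d{1,3})\)"
)


def parse_risk(text):
    """Return (label, score) from a model reply, or None if it does not match."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    score = int(match.group("score"))
    if not 0 <= score <= 100:
        return None
    return match.group("label"), score


def relabel(score, cutoffs=DEFAULT_CUTOFFS):
    """Map a 0-100 score onto the four labels using domain-specific cut-offs."""
    return LABELS[bisect_right(cutoffs, score)]


parsed = parse_risk("Risk Level: Very high risk (Score: 85)")
if parsed is not None:
    label, score = parsed
    # Compare the model's own label with the label implied by local cut-offs.
    print(label, score, relabel(score))
```

Passing a stricter list, for example `relabel(45, cutoffs=[10, 30, 60])`, shifts the same score into a higher band without retraining the model. Replies that `parse_risk` cannot match are best routed to manual review rather than guessed at.
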
## How to Use

### Example Code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "theeseus-ai/RiskClassifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the weights in bf16, the precision used during training, to halve memory use versus fp32.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

prompt = "Context: A large transaction flagged for manual review.\nQuestion: What is the risk level?"
inputs = tokenizer(prompt, return_tensors="pt")

# max_new_tokens caps only the generated text; max_length would also count the prompt tokens.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Evaluation Metrics

- **Accuracy**: Verified predictions against labeled risk levels.
- **Reasoning Completeness**: Evaluated explanations for clarity and alignment with context.
- **Risk Score Consistency**: Checked correlation between numeric risk scores and label predictions.

## Training Configuration

- **Optimizer**: AdamW
- **Batch Size**: 32
- **Learning Rate**: 2e-5
- **Epochs**: 3
- **Hardware**: NVIDIA A100 GPUs
- **Precision**: bf16 mixed precision

## Environmental Impact

- **Hardware**: NVIDIA A100 GPUs
- **Training Hours**: ~2 hours
- **Carbon Emissions**: Estimated using [ML CO2 Calculator](https://mlco2.github.io/impact)

## Citation

```
@misc{RiskClassifier2024,
  title={RiskClassifier: Fine-Tuned LLaMA 3.1 8B Model for Risk Assessment},
  author={Theeseus AI},
  year={2024},
  howpublished={\url{https://huggingface.co/theeseus-ai/RiskClassifier}}
}
```

## Contact

For inquiries, please reach out to **theeseus@protonmail.com** or visit [LinkedIn](https://www.linkedin.com/in/theeseus).