---
|
library_name: transformers |
|
tags: |
|
- unsloth |
|
- trl |
|
- sft |
|
base_model: |
|
- meta-llama/Llama-3.1-8B-Instruct |
|
--- |
|
|
|
# Model Card for Critical Thinker |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
The **Critical Thinker** model is a fine-tuned version of **meta-llama/Llama-3.1-8B-Instruct**, optimized for developing and evaluating **critical thinking** and **investigative reasoning** skills. It is specifically trained on the **Critical Thinking Synthetic Dataset**, which focuses on logical reasoning, forensic investigation, and multi-layered decision-making scenarios. |
|
|
|
- **Developed by:** Theeseus AI |
|
- **Funded by:** Independent Research Grant
|
- **Shared by:** [Theeseus AI](https://www.linkedin.com/in/theeseus/) |
|
- **Model type:** Transformer-based Language Model |
|
- **Language(s):** English |
|
- **License:** Apache 2.0 |
|
- **Finetuned from model:** meta-llama/Llama-3.1-8B-Instruct |
|
|
|
### Model Sources |
|
- **Repository:** [Critical Thinker on HuggingFace](https://huggingface.co/theeseus-ai/CriticalThinker)
|
- **Dataset:** [Critical Thinking Dataset](https://huggingface.co/datasets/theeseus-ai/CriticalThinker) |
|
|
|
--- |
|
|
|
## Uses |
|
|
|
### Direct Use |
|
- **Critical Thinking Assessments:** Evaluating logical reasoning and problem-solving capabilities. |
|
- **Digital Forensics Investigations:** Testing AI capabilities in analyzing logs, metadata, and cybersecurity incidents. |
|
- **AI Research:** Studying and benchmarking multi-step reasoning and decision-making models. |
|
|
|
### Downstream Use |
|
- **Cybersecurity Training Programs:** Training AI models to detect vulnerabilities, analyze logs, and identify attack patterns. |
|
- **Question-Answering Applications:** Developing reasoning-focused QA systems for educational and research tools. |
|
- **AI Decision Support Systems:** Building AI assistants for forensic investigations and cybersecurity monitoring. |
|
|
|
### Out-of-Scope Use |
|
- Tasks requiring **real-time decision-making** under tight latency or safety constraints.
|
- Applications involving **medical diagnosis** or **legal interpretations** without human oversight. |
|
|
|
--- |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
### Known Limitations |
|
- May **misinterpret ambiguous evidence** or scenarios that lack sufficient context.

- Performance may degrade on **multilingual inputs**, as the training data is primarily in **English**.

- Model output can include **false positives** when assessing evidence in forensic cases.
|
|
|
### Recommendations |
|
- Use outputs as **supporting evidence**, not definitive conclusions. |
|
- Perform **manual validation** for high-stakes decision-making. |
|
- Implement **bias-checking algorithms** when deploying in production environments. |
|
|
|
--- |
|
|
|
## How to Get Started with the Model |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("theeseus-ai/CriticalThinker")
model = AutoModelForCausalLM.from_pretrained("theeseus-ai/CriticalThinker")

# As a Llama-3.1-Instruct derivative, the model expects the chat template.
messages = [
    {"role": "user", "content": "Investigate unusual logins from multiple IP addresses in a network."}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping special tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
|
|
|
--- |
|
|
|
## Training Details |
|
|
|
### Training Data |
|
The model is fine-tuned on the **Critical Thinking Synthetic Dataset** available at [HuggingFace](https://huggingface.co/datasets/theeseus-ai/CriticalThinker). The dataset simulates digital forensics, cybersecurity incidents, and logical deduction scenarios. |
|
|
|
### Training Procedure |
|
#### Preprocessing |
|
- Cleaned and validated JSONL format. |
|
- Schema enforcement to ensure consistency. |
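The cleaning and schema-enforcement steps above can be sketched as a JSONL validation pass. Note that the field names (`prompt`, `response`) are assumptions for illustration; the dataset's actual schema may differ:

```python
import json

# Assumed schema for illustration -- adjust to the dataset's real fields.
REQUIRED_KEYS = {"prompt", "response"}

def validate_jsonl(lines):
    """Parse each JSONL line and keep only records matching the schema."""
    valid = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop malformed lines
        if isinstance(record, dict) and REQUIRED_KEYS <= record.keys():
            valid.append(record)
    return valid

sample = [
    '{"prompt": "Who logged in?", "response": "admin"}',
    'not json',
    '{"prompt": "missing response"}',
]
clean = validate_jsonl(sample)
```

Malformed lines and records missing required fields are silently dropped, so `clean` contains only the first sample record.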
|
|
|
#### Hyperparameters |
|
- **Optimizer:** AdamW |
|
- **Batch Size:** 16 |
|
- **Learning Rate:** 2e-5 |
|
- **Epochs:** 3 |
|
- **Precision:** bfloat16 (bf16) mixed precision |
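These hyperparameters map naturally onto HuggingFace `TrainingArguments` fields; the sketch below shows them as a plain dict keyed by the corresponding argument names (the exact trainer invocation used is not specified in this card):

```python
# Hyperparameters from the list above, keyed by TrainingArguments names.
training_args = {
    "optim": "adamw_torch",              # AdamW optimizer
    "per_device_train_batch_size": 16,   # batch size
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "bf16": True,                        # bfloat16 mixed precision
}
```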
|
|
|
#### Compute Resources |
|
- **Hardware:** NVIDIA A100 (80 GB) GPU |
|
- **Training Time:** ~24 hours |
|
|
|
--- |
|
|
|
## Evaluation |
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
#### Testing Data |
|
The dataset was split into **80% training**, **10% validation**, and **10% testing** sets. |
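A split along these lines can be reproduced with a deterministic shuffle (a sketch only; the seed and dataset object are illustrative):

```python
import random

def split_dataset(examples, seed=42, train=0.8, val=0.1):
    """Shuffle deterministically, then slice into train/validation/test."""
    rng = random.Random(seed)
    examples = examples[:]  # copy so the caller's list is untouched
    rng.shuffle(examples)
    n = len(examples)
    n_train = int(n * train)
    n_val = int(n * val)
    return (examples[:n_train],
            examples[n_train:n_train + n_val],
            examples[n_train + n_val:])

train_set, val_set, test_set = split_dataset(list(range(1000)))
```

With 1,000 examples this yields 800/100/100 train/validation/test records, and the fixed seed makes the split reproducible across runs.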
|
|
|
#### Metrics |
|
- **Accuracy:** Measures correctness of predictions. |
|
- **F1 Score:** Evaluates precision and recall balance. |
|
- **Log-likelihood Loss:** Assesses model confidence and robustness. |
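For reference, accuracy and binary F1 over label lists can be computed with the standard library alone (a minimal sketch; the labels below are illustrative, not drawn from the actual evaluation set):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions matching the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
```

On these toy labels both metrics come out to 0.8 (one false negative, no false positives).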
|
|
|
### Results |
|
- **Accuracy:** 89.4% |
|
- **F1 Score:** 88.7% |
|
- **Log-likelihood Loss:** 0.21 |
|
|
|
#### Summary |
|
The model demonstrates high performance in **logical deduction tasks** and **multi-choice reasoning problems**. It is particularly effective in identifying **patterns in digital forensics scenarios**. |
|
|
|
--- |
|
|
|
## Environmental Impact |
|
|
|
Carbon emissions were estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute):
|
- **Hardware Type:** NVIDIA A100 GPU |
|
- **Hours Used:** 24 |
|
- **Cloud Provider:** AWS |
|
- **Compute Region:** US-East |
|
- **Carbon Emitted:** ~30 kg CO2eq |
|
|
|
--- |
|
|
|
## Technical Specifications |
|
|
|
### Model Architecture and Objective |
|
- **Architecture:** Transformer-based autoregressive model (decoder-only). |
|
- **Objective:** Minimize cross-entropy loss for sequence prediction. |
|
|
|
### Compute Infrastructure |
|
- **Hardware:** NVIDIA A100 (80 GB) GPUs. |
|
- **Frameworks:** PyTorch and HuggingFace Transformers. |
|
|
|
--- |
|
|
|
## Citation |
|
If you use this model, please cite it as follows: |
|
```
@misc{critical_thinker,
  author    = {Theeseus AI},
  title     = {Critical Thinker Model},
  year      = {2024},
  version   = {1.0},
  publisher = {HuggingFace Models},
  url       = {https://huggingface.co/theeseus-ai/CriticalThinker}
}
```
|
|
|
--- |
|
|
|
## Contact |
|
For questions or contributions, contact: |
|
- **Email:** theeseus@protonmail.com |
|
- **LinkedIn:** [Theeseus](https://www.linkedin.com/in/theeseus/) |
|
|
|
|
|
|
|
|