theeseus-ai's picture
Update README.md
9d3c9aa verified
metadata
library_name: transformers
tags:
  - unsloth
  - trl
  - sft
base_model:
  - meta-llama/Llama-3.1-8B-Instruct

Model Card for Critical Thinker

Model Details

Model Description

The Critical Thinker model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct, optimized for developing and evaluating critical thinking and investigative reasoning skills. It is specifically trained on the Critical Thinking Synthetic Dataset, which focuses on logical reasoning, forensic investigation, and multi-layered decision-making scenarios.

  • Developed by: Theeseus AI
  • Funded by [optional]: Independent Research Grant
  • Shared by: Theeseus AI
  • Model type: Transformer-based Language Model
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: meta-llama/Llama-3.1-8B-Instruct

Model Sources


Uses

Direct Use

  • Critical Thinking Assessments: Evaluating logical reasoning and problem-solving capabilities.
  • Digital Forensics Investigations: Testing AI capabilities in analyzing logs, metadata, and cybersecurity incidents.
  • AI Research: Studying and benchmarking multi-step reasoning and decision-making models.

Downstream Use

  • Cybersecurity Training Programs: Training AI models to detect vulnerabilities, analyze logs, and identify attack patterns.
  • Question-Answering Applications: Developing reasoning-focused QA systems for educational and research tools.
  • AI Decision Support Systems: Building AI assistants for forensic investigations and cybersecurity monitoring.

Out-of-Scope Use

  • Tasks requiring real-time decision-making under high constraints.
  • Applications involving medical diagnosis or legal interpretations without human oversight.

Bias, Risks, and Limitations

Known Limitations

  • May misinterpret ambiguous evidence or scenarios that lack sufficient context.
  • Performance may degrade when analyzing multi-lingual inputs as the training data is primarily in English.
  • Model output can include false positives when assessing evidence in forensic cases.

Recommendations

  • Use outputs as supporting evidence, not definitive conclusions.
  • Perform manual validation for high-stakes decision-making.
  • Implement bias-checking algorithms when deploying in production environments.

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("theeseus-ai/CriticalThinker")
model = AutoModelForCausalLM.from_pretrained("theeseus-ai/CriticalThinker")

input_text = "Investigate unusual logins from multiple IP addresses in a network."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))

Training Details

Training Data

The model is fine-tuned on the Critical Thinking Synthetic Dataset available at HuggingFace. The dataset simulates digital forensics, cybersecurity incidents, and logical deduction scenarios.

Training Procedure

Preprocessing

  • Cleaned and validated JSONL format.
  • Schema enforcement to ensure consistency.

Hyperparameters

  • Optimizer: AdamW
  • Batch Size: 16
  • Learning Rate: 2e-5
  • Epochs: 3
  • Precision: bfloat16 (bf16) mixed precision

Compute Resources

  • Hardware: NVIDIA A100 (80 GB) GPU
  • Training Time: ~24 hours

Evaluation

Testing Data, Factors & Metrics

Testing Data

The dataset was split into 80% training, 10% validation, and 10% testing sets.

Metrics

  • Accuracy: Measures correctness of predictions.
  • F1 Score: Evaluates precision and recall balance.
  • Log-likelihood Loss: Assesses model confidence and robustness.

Results

  • Accuracy: 89.4%
  • F1 Score: 88.7%
  • Log-likelihood Loss: 0.21

Summary

The model demonstrates high performance in logical deduction tasks and multi-choice reasoning problems. It is particularly effective in identifying patterns in digital forensics scenarios.


Environmental Impact

Carbon emissions estimated using the Machine Learning Impact calculator:

  • Hardware Type: NVIDIA A100 GPU
  • Hours Used: 24
  • Cloud Provider: AWS
  • Compute Region: US-East
  • Carbon Emitted: ~30 kg CO2eq

Technical Specifications

Model Architecture and Objective

  • Architecture: Transformer-based autoregressive model (decoder-only).
  • Objective: Minimize cross-entropy loss for sequence prediction.

Compute Infrastructure

  • Hardware: NVIDIA A100 (80 GB) GPUs.
  • Frameworks: PyTorch and HuggingFace Transformers.

Citation

If you use this model, please cite it as follows:

@model{critical_thinker,
  author       = {Theeseus AI},
  title        = {Critical Thinker Model},
  year         = {2024},
  version      = {1.0},
  publisher    = {HuggingFace Models},
  url          = {https://huggingface.co/datasets/theeseus-ai/CriticalThinker}
}

Contact

For questions or contributions, contact: