GTD-Finetuned Llama 3.1 8B - GGUF q8_0

Model Type: GGUF quantized (q8_0)
Base Model: unsloth/meta-llama-3.1-8b-bnb-4bit
Quantization Details:
- Method: q8_0
- 8-bit quantization, which stays closer to the unquantized weights than lower-bit GGUF variants
- Larger file than lower-bit quantizations, in exchange for better output quality
- Recommended for accuracy-critical applications
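
As a rough check on the size trade-off, the sketch below estimates the footprint of a q8_0 file. It relies on the GGML q8_0 block layout (32 int8 weights plus one 16-bit scale per block, i.e. 8.5 bits per weight). The parameter count of about 8.03 billion is an approximation, and a real GGUF file can differ because some tensors may be stored at other precisions and the file also carries metadata.

```python
# Rough size estimate for a q8_0 GGUF file.
# Assumption: the caller supplies an approximate parameter count
# (about 8.03e9 for Llama 3.1 8B).
Q8_0_BLOCK_WEIGHTS = 32      # weights stored per q8_0 block
Q8_0_BLOCK_BYTES = 32 + 2    # 32 int8 values + one fp16 scale


def q8_0_bits_per_weight() -> float:
    """Average storage cost of one weight in q8_0, in bits."""
    return Q8_0_BLOCK_BYTES * 8 / Q8_0_BLOCK_WEIGHTS


def estimate_size_gb(n_params: float) -> float:
    """Approximate file size in GB (10**9 bytes), ignoring metadata."""
    return n_params * q8_0_bits_per_weight() / 8 / 1e9
```

With these assumptions, `estimate_size_gb(8.03e9)` comes to roughly 8.5 GB.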

Training Data

Dataset: Global Terrorism Database (GTD)
Time Period: Events before January 1, 2017
Format: Multi-label classification with probabilities
Labels:
- Assassination
- Armed Assault
- Bombing/Explosion
- Hijacking
- Hostage Taking (Barricade Incident)
- Hostage Taking (Kidnapping)
- Facility/Infrastructure Attack
- Unarmed Assault
- Unknown

Data Processing

Date Filtering:
- Kept only events that occurred before 2017-01-01
- Handled records with missing month/day values
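
A minimal sketch of such a date filter, assuming the GTD column names `iyear`, `imonth`, and `iday` and the GTD convention of recording an unknown month or day as 0. Mapping an unknown component to 1 is an assumption of this sketch, not necessarily the rule used for this model, and the helper names are hypothetical.

```python
from datetime import date

CUTOFF = date(2017, 1, 1)


def event_date(row: dict) -> date:
    """Build a date from GTD-style fields.

    Assumption: an unknown month or day (0 or missing) is mapped to 1.
    """
    month = row.get("imonth") or 1
    day = row.get("iday") or 1
    return date(int(row["iyear"]), int(month), int(day))


def before_cutoff(row: dict) -> bool:
    """True if the event falls strictly before the training cutoff."""
    return event_date(row) < CUTOFF
```

Because the cutoff is the first day of a year, the choice of fallback month and day does not change which events pass the filter.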
Data Cleaning:
- Removed entries with missing summaries
- Removed entries with missing primary attack types
- Handled multi-label cases (up to 3 labels per event)
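
The cleaning rules above could look roughly like the following. The column names `summary`, `attacktype1_txt`, `attacktype2_txt`, and `attacktype3_txt` follow the GTD codebook; the function names and the decision to skip duplicate labels are assumptions made for illustration.

```python
def is_missing(value) -> bool:
    """Treat None, NaN, and blank strings as missing."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN is not equal to itself
        return True
    return isinstance(value, str) and not value.strip()


def clean_event(row: dict):
    """Return (summary, labels), or None if the row should be dropped."""
    if is_missing(row.get("summary")) or is_missing(row.get("attacktype1_txt")):
        return None
    labels = []
    # Collect up to three attack types in primary, secondary, tertiary order.
    for col in ("attacktype1_txt", "attacktype2_txt", "attacktype3_txt"):
        value = row.get(col)
        if not is_missing(value) and value not in labels:
            labels.append(value)
    return row["summary"].strip(), labels
```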
Label Processing:
- Primary attack type: Assigned the highest weight (0.8)
- Secondary attack type: Assigned a medium weight (0.5)
- Tertiary attack type: Assigned the lowest weight (0.3)
- Weights normalized so that each event's probabilities sum to 1.0
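
The label weighting above can be sketched as follows. The values 0.8, 0.5, and 0.3 come from this card; the function name and the error handling are illustrative assumptions.

```python
RANK_WEIGHTS = (0.8, 0.5, 0.3)  # primary, secondary, tertiary


def label_probabilities(labels: list[str]) -> dict[str, float]:
    """Map 1 to 3 ordered labels to probabilities that sum to 1.0."""
    if not 1 <= len(labels) <= len(RANK_WEIGHTS):
        raise ValueError("expected between 1 and 3 labels")
    weights = RANK_WEIGHTS[: len(labels)]
    total = sum(weights)
    return {label: w / total for label, w in zip(labels, weights)}
```

An event with all three labels ends up with 0.8/1.6 = 0.5, 0.5/1.6 = 0.3125, and 0.3/1.6 = 0.1875, while an event with a single label gets probability 1.0.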
Training Format:
- Input: Event summary in natural language
- Output: JSON object mapping attack types to probabilities
- Instruction: The same classification prompt for every example
- EOS token appended to each example so that generation terminates cleanly
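
For illustration, one training example could be assembled as below. The instruction wording and the Alpaca-style template are hypothetical, since the actual prompt is not reproduced in this card. The EOS string shown is the Llama 3.1 base model's end-of-text token; in practice it would normally be read from `tokenizer.eos_token`.

```python
import json

# Llama 3.1 base EOS token; normally taken from tokenizer.eos_token.
EOS_TOKEN = "<|end_of_text|>"

# Hypothetical instruction text, not the prompt actually used in training.
INSTRUCTION = (
    "Classify the attack type(s) described in the event summary and "
    "return a JSON object mapping each attack type to its probability."
)

# Assumed Alpaca-style layout.
TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{summary}\n\n"
    "### Response:\n{response}"
)


def format_example(summary: str, probabilities: dict) -> str:
    """Render one instruction/input/output example and append EOS."""
    response = json.dumps(probabilities)
    text = TEMPLATE.format(
        instruction=INSTRUCTION, summary=summary, response=response
    )
    return text + EOS_TOKEN
```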

Training Details

Optimizer: AdamW 8-bit
Learning Rate: 2e-4
Batch Size: 2 per device
Gradient Accumulation Steps: 4
LR Scheduler: Linear
Weight Decay: 0.01
LoRA Configuration:
- Rank: 16
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Alpha: 16
- Dropout: 0
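
A few quantities follow directly from the settings above. The sketch below derives them, assuming standard LoRA scaling (alpha divided by rank) and layer shapes taken from the public Llama 3.1 8B configuration (hidden size 4096, key/value projection size 1024, MLP size 14336, 32 layers). Those shapes are not stated in this card.

```python
# Values from this card.
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 4
LORA_RANK = 16
LORA_ALPHA = 16

# Assumed from the public Llama 3.1 8B config:
# (in_features, out_features) for each target module.
HIDDEN, KV_DIM, INTERMEDIATE, N_LAYERS = 4096, 1024, 14336, 32
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def effective_batch_per_device() -> int:
    """Examples contributing to each optimizer step, per device."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS


def lora_scaling() -> float:
    """Standard LoRA scales the low-rank update by alpha / rank."""
    return LORA_ALPHA / LORA_RANK


def lora_trainable_params() -> int:
    """Each adapted matrix adds A (rank x in) and B (out x rank)."""
    per_layer = sum(LORA_RANK * (i + o) for i, o in TARGET_SHAPES.values())
    return per_layer * N_LAYERS
```

Under these assumptions the effective batch size is 8 per device, the LoRA scaling factor is 1.0, and the adapters add 41,943,040 trainable parameters, roughly 0.5% of the base model.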

Intended Use

This model is designed for:

- Multi-label classification of terrorist events
- Probability estimation for different attack types
- Research purposes in terrorism studies and security analysis
- Comparative analysis with ConfliBERT and other models
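
For downstream analysis, the generated text has to be turned back into label probabilities. The following is a hedged sketch that assumes the model emits a flat JSON object of the form `{label: probability}`. The function name, the choice to discard labels outside the nine listed attack types, and the renormalization step are assumptions of this example.

```python
import json

LABELS = {
    "Assassination",
    "Armed Assault",
    "Bombing/Explosion",
    "Hijacking",
    "Hostage Taking (Barricade Incident)",
    "Hostage Taking (Kidnapping)",
    "Facility/Infrastructure Attack",
    "Unarmed Assault",
    "Unknown",
}


def parse_prediction(raw: str) -> dict[str, float]:
    """Extract the JSON object from generated text and renormalize it."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start : end + 1])
    # Keep only known labels with positive numeric scores.
    scores = {
        k: float(v)
        for k, v in data.items()
        if k in LABELS and isinstance(v, (int, float)) and v > 0
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no known attack types in model output")
    return {k: v / total for k, v in scores.items()}
```

Malformed JSON raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so callers can catch a single exception type.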

Limitations

- Training data limited to pre-2017 events
- May not capture recent changes in attack patterns
- Performance dependent on quality of event descriptions
- Subject to biases present in the GTD dataset

Evaluation Results

[To be added after evaluation]

Ethical Considerations

- Model trained on sensitive data about terrorist events
- Should be used responsibly for research and analysis
- Not intended for operational security decisions
- Results should be interpreted with appropriate context

Citation and Attribution

@misc{gtd-llama,
  author = {Meher, Shreyas},
  title = {GTD-Finetuned Llama 3.1 8B},
  year = {2024},
  publisher = {Hugging Face},
  note = {Based on Meta's Llama 3.1 and GTD Dataset}
}

Acknowledgments

- Unsloth for the optimization framework
- Hugging Face for the transformers and TRL libraries
- Global Terrorism Database team
- Meta AI for the Llama 3.1 base model