GTD-Finetuned Llama 3.1 8B - GGUF q8_0
- Model Type: GGUF quantized (q8_0)
- Base Model: unsloth/meta-llama-3.1-8b-bnb-4bit
- Quantization Details:
  - Method: q8_0
  - 8-bit quantization, which stays closer to the original weights than lower-bit formats
  - Larger file size than lower-bit quantizations, but better output quality
  - Recommended for accuracy-critical applications
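As a rough illustration of the file-size trade-off, the sketch below estimates GGUF file sizes from average bits per weight. It assumes about 8.03 billion parameters for Llama 3.1 8B and the llama.cpp block layouts in which q8_0 and q4_0 each store 32 weights plus one 16-bit scale per block. Real files will differ somewhat because of metadata and tensors kept at other precisions.

```python
# Rough size estimate only; the parameter count and block layouts are assumptions.
PARAMS = 8.03e9  # approximate parameter count of Llama 3.1 8B


def bits_per_weight(block_size: int, weight_bits: int, scale_bits: int = 16) -> float:
    """Average bits per weight for a block format with one scale per block."""
    return (block_size * weight_bits + scale_bits) / block_size


def estimated_gb(params: float, bpw: float) -> float:
    """Estimated file size in gigabytes (10^9 bytes)."""
    return params * bpw / 8 / 1e9


q8_0_bpw = bits_per_weight(block_size=32, weight_bits=8)  # 8.5 bits per weight
q4_0_bpw = bits_per_weight(block_size=32, weight_bits=4)  # 4.5 bits per weight

print(f"q8_0: ~{estimated_gb(PARAMS, q8_0_bpw):.1f} GB")
print(f"q4_0: ~{estimated_gb(PARAMS, q4_0_bpw):.1f} GB")
```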
Training Data
- Dataset: Global Terrorism Database (GTD)
- Time Period: Events before January 1, 2017
- Format: Multi-label classification with probabilities
- Labels:
  - Assassination
  - Armed Assault
  - Bombing/Explosion
  - Hijacking
  - Hostage Taking (Barricade Incident)
  - Hostage Taking (Kidnapping)
  - Facility/Infrastructure Attack
  - Unarmed Assault
  - Unknown
Data Processing
Date Filtering:
- Kept only events occurring before 2017-01-01
- Properly handled missing month/day values
Data Cleaning:
- Removed entries with missing summaries
- Removed entries with missing primary attack types
- Handled multi-label cases (up to 3 labels per event)
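The filtering and cleaning steps above could look roughly like the following sketch. It is not the author's actual pipeline. The column names (`iyear`, `imonth`, `iday`, `summary`, `attacktype1_txt` and so on) are assumed to follow the public GTD export, where an unknown month or day is coded as 0, and the sample rows are made up for illustration.

```python
from typing import Optional

CUTOFF_YEAR = 2017  # keep events before 2017-01-01


def clean_event(row: dict) -> Optional[dict]:
    """Return a cleaned record, or None if the row should be dropped."""
    # The cutoff falls on January 1, so the year alone decides membership.
    # An unknown month/day (assumed to be coded as 0) therefore cannot
    # misplace an event relative to the cutoff.
    if int(row["iyear"]) >= CUTOFF_YEAR:
        return None

    summary = (row.get("summary") or "").strip()
    primary = (row.get("attacktype1_txt") or "").strip()
    if not summary or not primary:
        return None  # missing summary or missing primary attack type

    # Collect up to three labels, in primary/secondary/tertiary order.
    labels = [primary]
    for col in ("attacktype2_txt", "attacktype3_txt"):
        value = (row.get(col) or "").strip()
        if value:
            labels.append(value)
    return {"summary": summary, "labels": labels}


# Made-up rows for illustration only.
rows = [
    {"iyear": 2015, "imonth": 0, "iday": 0, "summary": "Illustrative summary.",
     "attacktype1_txt": "Bombing/Explosion", "attacktype2_txt": "Armed Assault",
     "attacktype3_txt": None},
    {"iyear": 2017, "imonth": 3, "iday": 1, "summary": "Too recent.",
     "attacktype1_txt": "Hijacking"},
    {"iyear": 2010, "imonth": 5, "iday": 2, "summary": None,
     "attacktype1_txt": "Assassination"},
    {"iyear": 2012, "imonth": 7, "iday": 9, "summary": "No attack type.",
     "attacktype1_txt": ""},
]

cleaned = [record for record in map(clean_event, rows) if record is not None]
print(cleaned)
```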
Label Processing:
- Primary attack type: Assigned higher probability (0.8)
- Secondary attack type: Assigned medium probability (0.5)
- Tertiary attack type: Assigned lower probability (0.3)
- Assigned values then normalized so that each event's probabilities sum to 1.0
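A minimal sketch of this weighting and normalization, assuming each event carries one to three labels in primary, secondary, tertiary order. The function name `label_probabilities` is hypothetical.

```python
RAW_WEIGHTS = (0.8, 0.5, 0.3)  # primary, secondary, tertiary


def label_probabilities(labels: list[str]) -> dict[str, float]:
    """Map ordered attack-type labels to probabilities that sum to 1.0."""
    if not 1 <= len(labels) <= len(RAW_WEIGHTS):
        raise ValueError("expected between one and three labels")
    weights = RAW_WEIGHTS[: len(labels)]
    total = sum(weights)
    return {label: weight / total for label, weight in zip(labels, weights)}


three = label_probabilities(["Bombing/Explosion", "Armed Assault", "Assassination"])
two = label_probabilities(["Bombing/Explosion", "Armed Assault"])
one = label_probabilities(["Hijacking"])
print(three, two, one)
```

With three labels the raw weights sum to 1.6, so the normalized values are about 0.5, 0.3125, and 0.1875. With two labels they are about 0.615 and 0.385, and a single label always receives 1.0.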
Training Format:
- Input: Event summaries in natural language
- Output: JSON object with attack types and probabilities
- Instruction: Consistent prompt for classification task
- Added EOS tokens for proper generation
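The sketch below shows how one training example in this format might be assembled. The instruction wording and the Alpaca-style template are assumptions, not the prompt actually used, and the EOS string is a placeholder. In practice it would be read from the tokenizer (`tokenizer.eos_token`).

```python
import json

# Hypothetical instruction wording; the card does not give the actual prompt.
INSTRUCTION = (
    "Classify the attack type(s) of the following event and give a "
    "probability for each as a JSON object."
)
EOS_TOKEN = "<|end_of_text|>"  # assumed placeholder; take it from the tokenizer in practice

# Assumed Alpaca-style layout.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{summary}\n\n"
    "### Response:\n{response}"
)


def format_example(summary: str, probabilities: dict[str, float]) -> str:
    """Build one training string: fixed instruction, summary, JSON answer, EOS."""
    response = json.dumps(probabilities)
    text = PROMPT_TEMPLATE.format(
        instruction=INSTRUCTION, summary=summary, response=response
    )
    # Appending EOS teaches the model to stop once the JSON object is complete.
    return text + EOS_TOKEN


example = format_example(
    "Illustrative event summary.",
    {"Bombing/Explosion": 0.6154, "Armed Assault": 0.3846},
)
print(example)
```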
Training Details
- Optimizer: AdamW 8-bit
- Learning Rate: 2e-4
- Batch Size: 2 per device
- Gradient Accumulation Steps: 4
- LR Scheduler: Linear
- Weight Decay: 0.01
- LoRA Configuration:
  - Rank: 16
  - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  - Alpha: 16
  - Dropout: 0
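Two derived quantities follow from the settings above, shown here as a small sketch. The dictionary keys are illustrative rather than any library's argument names, the device count is an assumption, and the scaling factor assumes standard LoRA scaling (alpha divided by rank).

```python
# Values copied from the list above; key names are illustrative only.
TRAINING = {
    "learning_rate": 2e-4,
    "per_device_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "weight_decay": 0.01,
}
LORA = {
    "rank": 16,
    "alpha": 16,
    "dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device * grad_accum * num_devices


batch = effective_batch_size(
    TRAINING["per_device_batch_size"],
    TRAINING["gradient_accumulation_steps"],
)  # 8 on a single device (device count is an assumption)

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scale = LORA["alpha"] / LORA["rank"]  # 1.0

print(batch, lora_scale)
```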
Intended Use
This model is designed for:
- Multi-label classification of terrorist events
- Probability estimation for different attack types
- Research purposes in terrorism studies and security analysis
- Comparative analysis with ConfliBERT and other models
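When the model is used for classification, its generated JSON can be checked against the label set before any downstream analysis. The following is a minimal sketch. The function name `parse_prediction` is hypothetical, and the sample output string is invented for illustration rather than produced by the model.

```python
import json

LABELS = {
    "Assassination",
    "Armed Assault",
    "Bombing/Explosion",
    "Hijacking",
    "Hostage Taking (Barricade Incident)",
    "Hostage Taking (Kidnapping)",
    "Facility/Infrastructure Attack",
    "Unarmed Assault",
    "Unknown",
}


def parse_prediction(generated: str) -> dict[str, float]:
    """Extract the JSON object from generated text and validate its labels."""
    start = generated.find("{")
    end = generated.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    raw = json.loads(generated[start : end + 1])
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    unexpected = set(raw) - LABELS
    if unexpected:
        raise ValueError(f"unexpected labels: {sorted(unexpected)}")
    return {label: float(prob) for label, prob in raw.items()}


# Invented output string, for illustration only.
sample_output = '### Response:\n{"Bombing/Explosion": 0.62, "Armed Assault": 0.38}'
prediction = parse_prediction(sample_output)
print(prediction)
```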
Limitations
- Training data limited to pre-2017 events
- May not capture recent changes in attack patterns
- Performance dependent on quality of event descriptions
- Subject to biases present in the GTD dataset
Evaluation Results
[To be added after evaluation]
Ethical Considerations
- Model trained on sensitive data about terrorist events
- Should be used responsibly for research and analysis
- Not intended for operational security decisions
- Results should be interpreted with appropriate context
Citation and Attribution
@misc{gtd-llama,
  author = {Meher, Shreyas},
  title = {GTD-Finetuned Llama 3.1 8B},
  year = {2024},
  publisher = {HuggingFace},
  note = {Based on Meta's Llama 3.1 and GTD Dataset}
}
Acknowledgments
- Unsloth for the optimization framework
- Hugging Face for the Transformers and TRL libraries
- Global Terrorism Database team
- Meta AI for the Llama 3.1 base model