---
license: mit
datasets:
- trendmicro-ailab/Primus-Reasoning
- trendmicro-ailab/Primus-Seed
- trendmicro-ailab/Primus-FineWeb
- trendmicro-ailab/Primus-Instruct
language:
- en
base_model:
- trendmicro-ailab/Llama-Primus-Merged
pipeline_tag: text-generation
library_name: transformers
tags:
- cybersecurity
- pretraining
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
    - Research
    - Commercial
    - label: Other
      value: other
  Job title:
    type: select
    options:
    - Student
    - Research graduate
    - AI researcher
    - AI developer/engineer
    - Cybersecurity researcher
    - Reporter
    - Other
  geo: ip_location
---

# Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training

<img src="https://i.imgur.com/PtqeTZw.png" alt="Primus Overview" width="60%">

**The first cybersecurity reasoning model!**

>TL;DR: Llama-Primus-Reasoning is a reasoning model distilled from reasoning-and-reflection traces generated by o1-preview on cybersecurity tasks (_Primus-Reasoning_), built on top of Llama-Primus-Merged. It achieves a 🚀**10%** improvement on the CISSP security-certification benchmark.

**🔥 For more details, please refer to the paper: [[📄Paper]](https://arxiv.org/abs/2502.11191).**

## Introduction

Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, with promising applications in specialized domains such as finance, law, and biomedicine. In cybersecurity, however, open-source datasets designed specifically for LLM pre-training are scarce, even though prior research has shown that LLMs acquire most of their knowledge during pre-training. To fill this gap, we present a collection of datasets covering multiple stages of cybersecurity LLM training: pre-training (_Primus-Seed_ and _Primus-FineWeb_), instruction fine-tuning (_Primus-Instruct_), and reasoning data for distillation (_Primus-Reasoning_). Based on these datasets and Llama-3.1-8B-Instruct, we developed _Llama-Primus-Base_, _Llama-Primus-Merged_, and _Llama-Primus-Reasoning_. This model card is for **Llama-Primus-Reasoning**.

  >  **Note:** No Trend Micro customer information is included.
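
## Quickstart

A minimal inference sketch with 🤗 Transformers. The repository id below is an assumption inferred from the organization's other artifacts under `trendmicro-ailab`; adjust it if the published name differs.

```python
# Minimal inference sketch (assumed repo id; verify the exact name on the Hub).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trendmicro-ailab/Llama-Primus-Reasoning"  # assumption

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Llama-3.1-style chat formatting via the tokenizer's chat template.
messages = [
    {
        "role": "user",
        "content": "A security analyst finds an unfamiliar scheduled task "
                   "running PowerShell with an encoded command. What are "
                   "the first investigation steps? Think step by step.",
    },
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:],
                       skip_special_tokens=True))
```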

## Cybersecurity Benchmark Results


| Model                               | CISSP                | Avg. Tokens |
|--------------------------------------|----------------------|-------------|
| **w/o CoT, 5-shot**                 |                      |             |
| Llama-3.1-8B-Instruct               | 0.7073               | 1           |
| Llama-Primus-Merged                 | 0.7191 ↑1.67%        | 1           |
| **w/ CoT, 0-shot**                   |                      |             |
| Llama-3.1-8B-Instruct               | 0.7288 ↑3.03%        | 279.69      |
| DeepSeek-R1-Distill-Llama-8B        | 0.7399 ↑4.61%        | 1542.10     |
| Llama-Primus-Merged                 | 0.7603 ↑7.49%        | 241.92      |
| **Finetuned on Primus-Reasoning**   |                      |             |
| Llama-3.1-8B-Reasoning              | 0.7583 ↑7.21%        | 646.94      |
| Llama-Primus-Reasoning              | 0.7780 ↑**10.0%**    | 726.96      |
| **Reference**                        |                      |             |
| o1-preview                          | 0.8035               | 1054.91     |



Effect of _Primus-Reasoning_ fine-tuning, evaluated on CISSP. ↑ indicates the percentage improvement over Llama-3.1-8B-Instruct in the 5-shot, no-CoT setting. The best improvement is highlighted in **bold**.
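
For the **Avg. Tokens** column, the sketch below shows the bookkeeping such a statistic implies: count only the tokens generated after the prompt, averaged over questions. This is an illustrative reconstruction, not the paper's evaluation harness.

```python
# Illustrative sketch: average number of generated (CoT) tokens per answer.
# Not the paper's evaluation code; `model` and `tokenizer` are assumed to be
# a loaded causal LM and its chat-template-aware tokenizer.
def average_generated_tokens(model, tokenizer, questions, max_new_tokens=2048):
    total_generated = 0
    for question in questions:
        messages = [{"role": "user", "content": question}]
        input_ids = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
        # Exclude the prompt: count only the newly generated tokens.
        total_generated += output_ids.shape[-1] - input_ids.shape[-1]
    return total_generated / len(questions)
```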

## About _Primus_
Primus is Trend Micro's pioneering family of lightweight, state-of-the-art open cybersecurity language models and datasets. Developed through our cutting-edge research initiatives and advanced technology, these resources share the innovative foundation that powers our enterprise-class [Trend Cybertron](https://newsroom.trendmicro.com/2025-02-25-Trend-Micro-Puts-Industry-Ahead-of-Cyberattacks-with-Industrys-First-Proactive-Cybersecurity-AI) solution. As an industry leader in cybersecurity, Trend Micro is proud to contribute these powerful, efficiency-optimized models and datasets to the community, while maintaining the excellence and reliability that define our global security standards.

## License
This model is released under the MIT license. However, because it is built on Llama 3.1, you must also comply with the Llama 3.1 Community License Agreement.