
CTINexus is a novel framework that leverages optimized in-context learning of LLMs to enable data-efficient extraction of cyber threat intelligence and the construction of high-quality cybersecurity knowledge graphs.


6armyflag6/CTINexus

 
 


Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

This is the repository of CTINexus, a novel framework that leverages optimized in-context learning (ICL) of large language models (LLMs) for data-efficient CTI knowledge extraction and high-quality cybersecurity knowledge graph (CSKG) construction. CTINexus requires neither extensive data nor parameter tuning, and it can adapt to various ontologies with minimal annotated examples.

[Figure: CTINexus framework]

News

🔥 [2025/04/21] We released the camera-ready paper on arXiv.

🔥 [2025/02/12] CTINexus has been accepted to the 2025 IEEE European Symposium on Security and Privacy (Euro S&P).

Introduction

CTINexus consists of the following modules:

  • IE: An automatic prompt construction strategy with optimal demonstration retrieval, designed to extract a wide range of cybersecurity entities and relations.
  • Hierarchical entity alignment: A technique that canonicalizes the extracted knowledge and removes redundancy, in two steps:
    • ET: Groups mentions of the same type.
    • EM: Merges mentions that refer to the same entity, with IOC protection.
  • LP: A long-distance relation prediction technique that completes the CSKG with missing links.
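
The two alignment steps listed above (ET, then EM with IOC protection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the IOC patterns, the function names, the similarity threshold, and the use of `difflib` string similarity as a stand-in for whatever matching CTINexus actually uses. It is not the repository's implementation.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical IOC patterns (assumption); a real system may recognize many more kinds.
IOC_PATTERNS = [
    re.compile(r"^\d{1,3}(\.\d{1,3}){3}$"),       # IPv4-like address
    re.compile(r"^[0-9a-fA-F]{32,64}$"),          # hex digest of 32 to 64 characters
    re.compile(r"^CVE-\d{4}-\d{4,}$", re.I),      # CVE identifier
]


def is_ioc(mention):
    """Return True if the mention looks like an indicator of compromise."""
    return any(p.match(mention) for p in IOC_PATTERNS)


def group_by_type(mentions):
    """Coarse step: bucket (text, entity_type) mentions by entity type."""
    groups = defaultdict(list)
    for text, etype in mentions:
        groups[etype].append(text)
    return dict(groups)


def merge_group(texts, threshold=0.8):
    """Fine step: greedily cluster similar mentions within one type.

    IOC protection (as assumed here): an IOC is merged only on exact match,
    so two near-identical IP addresses are never collapsed into one entity.
    """
    clusters = []
    for text in texts:
        for cluster in clusters:
            rep = cluster[0]  # first mention acts as the cluster representative
            if is_ioc(text) or is_ioc(rep):
                same = text == rep
            else:
                ratio = SequenceMatcher(None, text.lower(), rep.lower()).ratio()
                same = ratio >= threshold
            if same:
                cluster.append(text)
                break
        else:
            clusters.append([text])
    return clusters


mentions = [
    ("Lazarus Group", "Attacker"),
    ("Lazarus group", "Attacker"),
    ("192.168.1.10", "Indicator"),
    ("192.168.1.11", "Indicator"),
]
groups = group_by_type(mentions)
aligned = {etype: merge_group(texts) for etype, texts in groups.items()}
```

In this example the two spellings of the attacker name end up in one cluster, while the two IP addresses stay separate even though they differ by a single character. That is the behavior the IOC protection is meant to guarantee.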

Getting Started

1. Datasets

2. Cybersecurity Triplet Extraction

  1. Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to perform triplet extraction:
    sh tools/scripts/ie.sh
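
To give a feel for what demonstration retrieval in the extraction step involves, here is a minimal sketch that picks the most similar annotated examples for a new report and assembles a few-shot prompt. It assumes token-overlap (Jaccard) similarity, a made-up prompt layout, and hypothetical names such as `retrieve_demonstrations` and `build_prompt`. The retrieval and prompt templates used by CTINexus are likely different.

```python
def tokenize(text):
    """Lowercase whitespace tokenization into a set of tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_demonstrations(query, pool, k=2):
    """Return the k pool entries whose text is most similar to the query."""
    q = tokenize(query)
    ranked = sorted(pool, key=lambda d: jaccard(q, tokenize(d["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, demos):
    """Assemble a few-shot prompt: instruction, demonstrations, then the query."""
    parts = ["Extract (subject, relation, object) triplets from the report."]
    for d in demos:
        parts.append(f"Report: {d['text']}\nTriplets: {d['triplets']}")
    parts.append(f"Report: {query}\nTriplets:")
    return "\n\n".join(parts)


# Hypothetical annotated pool (assumption, not taken from the CTINexus datasets).
pool = [
    {"text": "The malware connects to a command server",
     "triplets": "(malware, connects to, command server)"},
    {"text": "Attackers exploited a vulnerability in the VPN appliance",
     "triplets": "(Attackers, exploited, vulnerability)"},
    {"text": "The quarterly report covers sales figures",
     "triplets": "(report, covers, sales figures)"},
]
query = "Attackers exploited a vulnerability in the mail server"
demos = retrieve_demonstrations(query, pool, k=2)
prompt = build_prompt(query, demos)
```

The resulting `prompt` string would then be sent to the LLM. Retrieving only the closest examples keeps the prompt short while still showing the model annotations that resemble the input.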

3. Hierarchical Entity Alignment

3.1 Coarse-grained Entity Typing

  1. Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to perform entity typing:
    sh tools/scripts/et.sh

3.2 Fine-grained Entity Merging

  1. Update the configuration files (config1, config2). To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to perform entity alignment:
    sh tools/scripts/em.sh

4. Long-Distance Relation Prediction

  1. Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to predict long-distance relations:
    sh tools/scripts/lp.sh
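
The four stages above can be chained with a small driver. The sketch below assumes that the scripts are run from the repository root, that each stage picks up the previous stage's output through its own configuration, and that a non-zero exit code means failure. None of this is confirmed by the README, and the function names are hypothetical.

```python
import subprocess

# Stage scripts in the order the sections above list them.
STAGE_SCRIPTS = [
    "tools/scripts/ie.sh",  # triplet extraction
    "tools/scripts/et.sh",  # coarse-grained entity typing
    "tools/scripts/em.sh",  # fine-grained entity merging
    "tools/scripts/lp.sh",  # long-distance relation prediction
]


def build_commands(scripts=STAGE_SCRIPTS):
    """Return one `sh <script>` command per stage, in pipeline order."""
    return [["sh", script] for script in scripts]


def run_pipeline(dry_run=True):
    """Run every stage in order; with dry_run=True, only print the commands."""
    for cmd in build_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError, stopping at the first failing stage.
            subprocess.run(cmd, check=True)


commands = build_commands()
```

Calling `run_pipeline()` prints the commands without executing them. `run_pipeline(dry_run=False)` executes them once the configuration files contain a valid API key.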

Citation

We hope our work serves as a foundation for further LLM applications in the CTI analysis community. If you find it helpful for your research, please consider citing our paper! ❤️

@inproceedings{cheng2025ctinexusautomaticcyberthreat,
      title={CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models}, 
      author={Yutong Cheng and Osama Bajaber and Saimon Amanuel Tsegai and Dawn Song and Peng Gao},
      booktitle={2025 IEEE European Symposium on Security and Privacy (EuroS\&P)},
      year={2025},
      organization={IEEE}
}
