Commit 7dc1fa4

Proper ordering for publications
1 parent 3c1b635 commit 7dc1fa4


_data/publications.yml

Lines changed: 19 additions & 18 deletions
@@ -2283,6 +2283,25 @@
   year: 2025
   doi:

+- id: FGCS2025
+  id_iris: 364408
+  title: "A comparative benchmark study of LLM-based threat elicitation tools"
+  authors:
+    - DimitriVanLanduyt
+    - MajidMollaeefar
+    - MarioRaciti
+    - StefVerreydt
+    - AbdulazizKalash
+    - AndreaBissoli
+    - DavyPreuveneers
+    - GiampaoloBella
+    - SilvioRanise
+  abstract: >
+    Threat modeling refers to the software design activity that involves the proactive identification, evaluation, and mitigation of specific potential threat scenarios. Recently, attention has been growing around the potential to automate the threat elicitation process using Large Language Models (LLMs), and different tools have emerged that are capable of generating threats based on system models and other descriptive system documentation. This paper presents the outcomes of an experimental evaluation study of LLM-based threat elicitation tools, which we apply to two complex and contemporary application cases that involve biometric authentication. The comparative benchmark is based on a grounded approach to establish four distinct baselines that are representative of the results of human threat modelers, both novices and experts. In support of scale and reproducibility, the evaluation approach itself is maximally automated, using sentence transformer models to perform threat mapping. Our study evaluates 56 distinct threat models generated by 6 LLM-based threat elicitation tools. While the generated threats are somewhat similar to the threats documented by human threat modelers, relative performance is low. The evaluated LLM-based threat elicitation tools prove particularly inefficient at eliciting threats at the expert level. Furthermore, we show that performance differences between these tools can be attributed in similar measure to both the prompting approach (e.g., multi-shot, knowledge pre-prompting, role prompting) and the actual reasoning capabilities of the underlying LLMs.
+  destination: FGCS
+  year: 2025
+  doi: 10.1016/j.future.2025.108243
+
 - id: IWBF2025
   id_iris: 362127
   title: "Spotting Tell-Tale Visual Artifacts in Face Swapping Videos: Strengths and Pitfalls of CNN Detectors"
@@ -2384,22 +2403,4 @@
   year: 2025
   doi:

-- id: FGCS2025
-  id_iris: 364408
-  title: "A comparative benchmark study of LLM-based threat elicitation tools"
-  authors:
-    - DimitriVanLanduyt
-    - MajidMollaeefar
-    - MarioRaciti
-    - StefVerreydt
-    - AbdulazizKalash
-    - AndreaBissoli
-    - DavyPreuveneers
-    - GiampaoloBella
-    - SilvioRanise
-  abstract: >
-    Threat modeling refers to the software design activity that involves the proactive identification, evaluation, and mitigation of specific potential threat scenarios. Recently, attention has been growing around the potential to automate the threat elicitation process using Large Language Models (LLMs), and different tools have emerged that are capable of generating threats based on system models and other descriptive system documentation. This paper presents the outcomes of an experimental evaluation study of LLM-based threat elicitation tools, which we apply to two complex and contemporary application cases that involve biometric authentication. The comparative benchmark is based on a grounded approach to establish four distinct baselines that are representative of the results of human threat modelers, both novices and experts. In support of scale and reproducibility, the evaluation approach itself is maximally automated, using sentence transformer models to perform threat mapping. Our study evaluates 56 distinct threat models generated by 6 LLM-based threat elicitation tools. While the generated threats are somewhat similar to the threats documented by human threat modelers, relative performance is low. The evaluated LLM-based threat elicitation tools prove particularly inefficient at eliciting threats at the expert level. Furthermore, we show that performance differences between these tools can be attributed in similar measure to both the prompting approach (e.g., multi-shot, knowledge pre-prompting, role prompting) and the actual reasoning capabilities of the underlying LLMs.
-  destination: FGCS
-  year: 2025
-  doi: 10.1016/j.future.2025.108243
 # PLEASE KEEP ALPHABETICAL ORDER BY ID WITHIN YEARS
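The ordering rule in that trailing comment, which this commit enforces by moving FGCS2025 ahead of IWBF2025, can be checked mechanically. Below is a minimal sketch of such a check, assuming _data/publications.yml is a top-level YAML list of mappings that each carry id and year keys (as the entries in this diff do); the script name and the consecutive-pair comparison are illustrative choices, not part of this repository.

# check_order.py -- illustrative sketch: verify "alphabetical order by id
# within years" by comparing each entry with the previous one of the same year.
import sys
import yaml  # PyYAML

with open("_data/publications.yml") as f:
    pubs = yaml.safe_load(f)

ok = True
prev_year, prev_id = None, None
for pub in pubs:
    year, pub_id = pub.get("year"), pub.get("id", "")
    # Only ids within the same year need to be alphabetical.
    if year == prev_year and prev_id is not None and pub_id.lower() < prev_id.lower():
        print(f"out of order within {year}: {pub_id!r} after {prev_id!r}")
        ok = False
    prev_year, prev_id = year, pub_id

sys.exit(0 if ok else 1)

On this diff the check passes for the new placement, since "FGCS2025" sorts before "IWBF2025"; a sketch like this would have flagged the old placement at the end of the 2025 block.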

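The abstract added above notes that the paper's evaluation automates threat mapping with sentence transformer models. The paper's actual pipeline is not reproduced here; the following is only a minimal sketch of that general technique, using the sentence-transformers library with an assumed model name, an assumed similarity threshold, and toy threat lists.

# Illustrative sketch: map generated threats onto baseline threats by
# embedding similarity, in the spirit of the sentence-transformer mapping
# the abstract describes (model, threshold, and data are assumptions).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model, not the paper's

baseline = [
    "An attacker replays a captured biometric sample to the verification service.",
    "Stored biometric templates are exfiltrated from the database.",
]
generated = [
    "Replay of recorded face data against the authentication endpoint.",
    "Tampering with audit logs to hide unauthorized access.",
]

# Encode both threat lists and compute pairwise cosine similarity.
sim = util.cos_sim(model.encode(generated, convert_to_tensor=True),
                   model.encode(baseline, convert_to_tensor=True))

THRESHOLD = 0.5  # assumed cut-off; the paper's threshold is not given here
for i, threat in enumerate(generated):
    score, j = sim[i].max().item(), sim[i].argmax().item()
    match = baseline[j] if score >= THRESHOLD else None
    print(f"{threat!r} -> {match!r} (score={score:.2f})")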