Commit 23ad810

Updates LLM matrix (#883)
9.0 component of #738. Updates the LLM performance matrix to reflect the latest testing. Preview: [LLM performance matrix](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/883/solutions/security/ai/large-language-model-performance-matrix)
1 parent 0552db8 commit 23ad810

1 file changed: +15 -13 lines changed

solutions/security/ai/large-language-model-performance-matrix.md

Lines changed: 15 additions & 13 deletions
@@ -12,8 +12,8 @@ applies_to:
 
 This page describes the performance of various large language models (LLMs) for different use cases in {{elastic-sec}}, based on our internal testing. To learn more about these use cases, refer to [Attack discovery](/solutions/security/ai/attack-discovery.md) or [AI Assistant](/solutions/security/ai/ai-assistant.md).
 
-::::{note}
-`Excellent` is the best rating, followed by `Great`, then by `Good`, and finally by `Poor`.
+::::{important}
+`Excellent` is the best rating, followed by `Great`, then by `Good`, and finally by `Poor`. Models rated `Excellent` or `Great` should produce quality results. Models rated `Good` or `Poor` are not recommended for that use case.
 ::::
 
 
@@ -22,17 +22,19 @@ This page describes the performance of various large language models (LLMs) for
 
 Models from third-party LLM providers.
 
-| **Feature** | | **Assistant - General** | **Assistant - {{esql}} generation** | **Assistant - Alert questions** | **Assistant - Knowledge retrieval** | **Attack Discovery** |
-| --- | --- | --- | --- | --- | --- | --- |
-| **Model** | **Claude 3: Opus** | Excellent | Excellent | Excellent | Good | Great |
-| | **Claude 3.5: Sonnet v2** | Excellent | Excellent | Excellent | Excellent | Great |
-| | **Claude 3.5: Sonnet** | Excellent | Excellent | Excellent | Excellent | Excellent |
-| | **Claude 3.5: Haiku** | Excellent | Excellent | Excellent | Excellent | Poor |
-| | **Claude 3: Haiku** | Excellent | Excellent | Excellent | Excellent | Poor |
-| | **GPT-4o** | Excellent | Excellent | Excellent | Excellent | Great |
-| | **GPT-4o-mini** | Excellent | Great | Great | Great | Poor |
-| | **Gemini 1.5 Pro 002** | Excellent | Excellent | Excellent | Excellent | Excellent |
-| | **Gemini 1.5 Flash 002** | Excellent | Poor | Good | Excellent | Poor |
+| **Feature** | - | **Assistant - General** | **Assistant - {{esql}} generation** | **Assistant - Alert questions** | **Assistant - Knowledge retrieval** | **Attack Discovery** | **AI-powered SIEM migration** |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| **Model** | **Claude 3: Opus** | Excellent | Excellent | Excellent | Good | Great | Good
+| | **Claude 3.7: Sonnet** | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
+| | **Claude 3.5: Sonnet v2** | Excellent | Excellent | Excellent | Excellent | Great | Excellent
+| | **Claude 3.5: Sonnet** | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
+| | **Claude 3.5: Haiku** | Excellent | Excellent | Excellent | Excellent | Poor | Poor
+| | **Claude 3: Haiku** | Excellent | Excellent | Excellent | Excellent | Poor | Poor
+| | **GPT-4o** | Excellent | Excellent | Excellent | Excellent | Great | Great
+| | **GPT-4o-mini** | Excellent | Great | Great | Great | Poor | Good
+| | **Gemini 1.5 Pro 002** | Excellent | Excellent | Excellent | Excellent | Excellent | Great
+| | **Gemini 1.5 Flash 002** | Excellent | Poor | Good | Excellent | Poor | Excellent
+| | **Gemini 2.0 Flash 001** | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
 
 
 ## Open-source models [_open_source_models]
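
The recommendation rule added in this commit (`Excellent` or `Great` is recommended; `Good` or `Poor` is not) can be applied mechanically to the updated matrix. The following is a minimal Python sketch, not part of the commit: the names `USE_CASES`, `RATINGS`, `RECOMMENDED`, and `recommended_models` are hypothetical, and only four rows of the new table are copied in for brevity.

```python
# Ratings that the updated note treats as recommended for a use case.
RECOMMENDED = {"Excellent", "Great"}

# Column order of the updated table in this commit.
USE_CASES = [
    "Assistant - General",
    "Assistant - {{esql}} generation",
    "Assistant - Alert questions",
    "Assistant - Knowledge retrieval",
    "Attack Discovery",
    "AI-powered SIEM migration",
]

# A subset of rows copied from the "+" side of the diff (illustrative, not exhaustive).
RATINGS = {
    "Claude 3: Opus": ["Excellent", "Excellent", "Excellent", "Good", "Great", "Good"],
    "GPT-4o-mini": ["Excellent", "Great", "Great", "Great", "Poor", "Good"],
    "Gemini 1.5 Flash 002": ["Excellent", "Poor", "Good", "Excellent", "Poor", "Excellent"],
    "Gemini 2.0 Flash 001": ["Excellent"] * 6,
}


def recommended_models(use_case: str) -> list[str]:
    """Return the models rated Excellent or Great for the given use case."""
    column = USE_CASES.index(use_case)  # raises ValueError for an unknown use case
    return [
        model
        for model, ratings in RATINGS.items()
        if ratings[column] in RECOMMENDED
    ]


if __name__ == "__main__":
    for use_case in USE_CASES:
        print(f"{use_case}: {', '.join(recommended_models(use_case))}")
```

Keeping the ratings as plain lists in column order mirrors the table layout, so a row can be pasted in with minimal editing; a larger tool would more likely parse the Markdown table directly.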
