Commit 930e586

[8.x] [Security Solution] Fixes multi-line diff algorithm performance in the `upgrade/_review` endpoint (#199388) (#200096)
# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security Solution] Fixes multi-line diff algorithm performance in the `upgrade/_review` endpoint (#199388)](#199388)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Backport metadata: source commit 4f6d3570c59d30951368f601b2b59ab3b7a1ae4c by Davis Plumlee, committed 2024-11-13T20:59:15Z on `main`; source pull request https://github.com/elastic/kibana/pull/199388 (merged, label `v9.0.0`); suggested target branches `8.x` (label `v8.17.0`) and `8.16` (label `v8.16.1`).

Source commit message:

[Security Solution] Fixes multi-line diff algorithm performance in the `upgrade/_review` endpoint (#199388)

**Fixes https://github.com/elastic/kibana/issues/199290**

## Summary

The current multi-line string algorithm uses a very inefficient regex to split and analyze string fields, and exponentially increases in time complexity when the strings are long. This PR substitutes a much simpler comparison regex for far better efficiency as shown in the table below.

### Performance between different regex options using sample prebuilt rule setup guide string

| | `/(\S+\|\s+)/g` (original) | `/(\s+)/g` | `/(\n)/g` | `/(\r\n\|\n\|\r)/g` |
|-----------------------|---------------|----------|---------|-------------------|
| Unit test speed | `986ms` | `96ms` | `1ms` | `2ms` |
| FTR test with 1 rule | `3.0s` | `2.8s` | `2.0s` | `2.0s` |
| FTR test with 5 rules | `11.6s` | `6.8s` | `6.1s` | |

### Performance between different regex options using intentionally long strings (25k characters)

| | `/(\S+\|\s+)/g` | `/(\r\n\|\n\|\r)/g` |
|----------------------|-----------------------|---------------------|
| Unit test speed | `1049414ms` (17 min) | `58ms` |
| FTR test with 1 rule | `>360000ms` (Timeout) | `2.1s` |

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed

### For maintainers

- [ ] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#_add_your_labels)
- [ ] This will appear in the **Release Notes** and follow the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Georgii Gorbachev <[email protected]>
Co-authored-by: Davis Plumlee <[email protected]>
1 parent 123c8e4 commit 930e586
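
The difference between the two separators in the summary can be seen by counting the chunks each one produces. The following is a minimal sketch, assuming the separator is used to split each version into chunks before diffing; `countTokens` and `sample` are hypothetical names introduced here for illustration and are not part of the commit.

```typescript
// Hypothetical helper (not from the commit): counts the non-empty chunks
// produced by splitting `text` on `separator`. Because the separator has a
// capturing group, String.prototype.split keeps the matched text as chunks.
const countTokens = (text: string, separator: RegExp): number =>
  text.split(separator).filter((token) => token !== '').length;

// Illustrative input with both "\n" and "\r\n" line endings.
const sample = 'My description.\nThis is a second line.\r\nThird line.';

const originalSeparator = /(\S+|\s+)/g; // one chunk per word and per whitespace run
const newSeparator = /(\r\n|\n|\r)/g; // one chunk per line and per line break

const wordLevel = countTokens(sample, originalSeparator);
const lineLevel = countTokens(sample, newSeparator);

console.log({ wordLevel, lineLevel });
```

With the line-level separator the number of chunks grows with the number of lines rather than the number of words, which is consistent with the much smaller timings reported in the tables above.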

4 files changed: 202 additions and 26 deletions

x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/diff/calculation/algorithms/multi_line_string_diff_algorithm.mock.ts

Lines changed: 64 additions & 0 deletions
Large diffs are not rendered by default.

x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/diff/calculation/algorithms/multi_line_string_diff_algorithm.test.ts

Lines changed: 81 additions & 25 deletions
@@ -13,13 +13,23 @@ import {
   ThreeWayDiffConflict,
 } from '../../../../../../../../common/api/detection_engine';
 import { multiLineStringDiffAlgorithm } from './multi_line_string_diff_algorithm';
+import {
+  TEXT_M_A,
+  TEXT_M_B,
+  TEXT_M_C,
+  TEXT_M_MERGED,
+  TEXT_XL_A,
+  TEXT_XL_B,
+  TEXT_XL_C,
+  TEXT_XL_MERGED,
+} from './multi_line_string_diff_algorithm.mock';

 describe('multiLineStringDiffAlgorithm', () => {
   it('returns current_version as merged output if there is no update - scenario AAA', () => {
     const mockVersions: ThreeVersionsOf<string> = {
-      base_version: 'My description.\nThis is a second line.',
-      current_version: 'My description.\nThis is a second line.',
-      target_version: 'My description.\nThis is a second line.',
+      base_version: TEXT_M_A,
+      current_version: TEXT_M_A,
+      target_version: TEXT_M_A,
     };

     const result = multiLineStringDiffAlgorithm(mockVersions);
@@ -36,9 +46,9 @@ describe('multiLineStringDiffAlgorithm', () => {

   it('returns current_version as merged output if current_version is different and there is no update - scenario ABA', () => {
     const mockVersions: ThreeVersionsOf<string> = {
-      base_version: 'My description.\nThis is a second line.',
-      current_version: 'My GREAT description.\nThis is a second line.',
-      target_version: 'My description.\nThis is a second line.',
+      base_version: TEXT_M_A,
+      current_version: TEXT_M_B,
+      target_version: TEXT_M_A,
     };

     const result = multiLineStringDiffAlgorithm(mockVersions);
@@ -55,9 +65,9 @@ describe('multiLineStringDiffAlgorithm', () => {

   it('returns target_version as merged output if current_version is the same and there is an update - scenario AAB', () => {
     const mockVersions: ThreeVersionsOf<string> = {
-      base_version: 'My description.\nThis is a second line.',
-      current_version: 'My description.\nThis is a second line.',
-      target_version: 'My GREAT description.\nThis is a second line.',
+      base_version: TEXT_M_A,
+      current_version: TEXT_M_A,
+      target_version: TEXT_M_B,
     };

     const result = multiLineStringDiffAlgorithm(mockVersions);
@@ -74,9 +84,9 @@ describe('multiLineStringDiffAlgorithm', () => {

   it('returns current_version as merged output if current version is different but it matches the update - scenario ABB', () => {
     const mockVersions: ThreeVersionsOf<string> = {
-      base_version: 'My description.\nThis is a second line.',
-      current_version: 'My GREAT description.\nThis is a second line.',
-      target_version: 'My GREAT description.\nThis is a second line.',
+      base_version: TEXT_M_A,
+      current_version: TEXT_M_B,
+      target_version: TEXT_M_B,
     };

     const result = multiLineStringDiffAlgorithm(mockVersions);
@@ -92,32 +102,53 @@ describe('multiLineStringDiffAlgorithm', () => {
   });

   describe('if all three versions are different - scenario ABC', () => {
-    it('returns a computated merged version without a conflict if 3 way merge is possible', () => {
+    it('returns a computated merged version with a solvable conflict if 3 way merge is possible (real-world example)', () => {
       const mockVersions: ThreeVersionsOf<string> = {
-        base_version: `My description.\f\nThis is a second\u2001 line.\f\nThis is a third line.`,
-        current_version: `My GREAT description.\f\nThis is a second\u2001 line.\f\nThis is a third line.`,
-        target_version: `My description.\f\nThis is a second\u2001 line.\f\nThis is a GREAT line.`,
+        base_version: TEXT_M_A,
+        current_version: TEXT_M_B,
+        target_version: TEXT_M_C,
       };

-      const expectedMergedVersion = `My GREAT description.\f\nThis is a second\u2001 line.\f\nThis is a GREAT line.`;
+      const result = multiLineStringDiffAlgorithm(mockVersions);
+
+      expect(result).toEqual(
+        expect.objectContaining({
+          merged_version: TEXT_M_MERGED,
+          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
+          conflict: ThreeWayDiffConflict.SOLVABLE,
+          merge_outcome: ThreeWayMergeOutcome.Merged,
+        })
+      );
+    });
+
+    it('returns a computated merged version with a solvable conflict if 3 way merge is possible (simplified example)', () => {
+      // 3 way merge is possible when changes are made to different lines of text
+      // (in other words, there are no different changes made to the same line of text).
+      const mockVersions: ThreeVersionsOf<string> = {
+        base_version: 'My description.\nThis is a second line.',
+        current_version: 'My MODIFIED description.\nThis is a second line.',
+        target_version: 'My description.\nThis is a MODIFIED second line.',
+      };

       const result = multiLineStringDiffAlgorithm(mockVersions);

       expect(result).toEqual(
         expect.objectContaining({
-          merged_version: expectedMergedVersion,
+          merged_version: 'My MODIFIED description.\nThis is a MODIFIED second line.',
           diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
           conflict: ThreeWayDiffConflict.SOLVABLE,
           merge_outcome: ThreeWayMergeOutcome.Merged,
         })
       );
     });

-    it('returns the current_version with a conflict if 3 way merge is not possible', () => {
+    it('returns the current_version with a non-solvable conflict if 3 way merge is not possible (simplified example)', () => {
+      // It's enough to have different changes made to the same line of text
+      // to trigger a NON_SOLVABLE conflict. This behavior is similar to how Git works.
       const mockVersions: ThreeVersionsOf<string> = {
         base_version: 'My description.\nThis is a second line.',
-        current_version: 'My GREAT description.\nThis is a third line.',
-        target_version: 'My EXCELLENT description.\nThis is a fourth.',
+        current_version: 'My GREAT description.\nThis is a second line.',
+        target_version: 'My EXCELLENT description.\nThis is a second line.',
       };

       const result = multiLineStringDiffAlgorithm(mockVersions);
@@ -131,14 +162,39 @@ describe('multiLineStringDiffAlgorithm', () => {
         })
       );
     });
+
+    it('does not exceed performance limits when diffing and merging extra large input texts', () => {
+      const mockVersions: ThreeVersionsOf<string> = {
+        base_version: TEXT_XL_A,
+        current_version: TEXT_XL_B,
+        target_version: TEXT_XL_C,
+      };
+
+      const startTime = performance.now();
+      const result = multiLineStringDiffAlgorithm(mockVersions);
+      const endTime = performance.now();
+
+      // If the regex merge in this function takes over 500ms, this test fails
+      // Performance measurements: https://github.com/elastic/kibana/pull/199388
+      expect(endTime - startTime).toBeLessThan(500);
+
+      expect(result).toEqual(
+        expect.objectContaining({
+          merged_version: TEXT_XL_MERGED,
+          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
+          conflict: ThreeWayDiffConflict.SOLVABLE,
+          merge_outcome: ThreeWayMergeOutcome.Merged,
+        })
+      );
+    });
   });

   describe('if base_version is missing', () => {
     it('returns current_version as merged output if current_version and target_version are the same - scenario -AA', () => {
       const mockVersions: ThreeVersionsOf<string> = {
         base_version: MissingVersion,
-        current_version: 'My description.\nThis is a second line.',
-        target_version: 'My description.\nThis is a second line.',
+        current_version: TEXT_M_A,
+        target_version: TEXT_M_A,
       };

       const result = multiLineStringDiffAlgorithm(mockVersions);
@@ -158,8 +214,8 @@ describe('multiLineStringDiffAlgorithm', () => {
     it('returns target_version as merged output if current_version and target_version are different - scenario -AB', () => {
       const mockVersions: ThreeVersionsOf<string> = {
         base_version: MissingVersion,
-        current_version: `My GREAT description.\nThis is a second line.`,
-        target_version: `My description.\nThis is a second line, now longer.`,
+        current_version: TEXT_M_A,
+        target_version: TEXT_M_B,
       };

       const result = multiLineStringDiffAlgorithm(mockVersions);
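
The scenario labels used in the test names above (AAA, ABA, AAB, ABB, ABC, -AA, -AB) encode which of the base, current and target versions are equal. As a rough sketch of that naming scheme only, the function below derives the label from three values; `classifyScenario` is a hypothetical helper, and `null` stands in for the `MissingVersion` value used in the tests. It is not the plugin's implementation.

```typescript
// Hypothetical helper (not from the commit): maps three versions to the
// scenario label used in the test names. `null` is an assumed stand-in for
// a missing base version.
function classifyScenario(base: string | null, current: string, target: string): string {
  if (base === null) {
    // Without a base version only current and target can be compared.
    return current === target ? '-AA' : '-AB';
  }
  if (base === current && current === target) return 'AAA'; // nothing changed
  if (base === current) return 'AAB'; // only the target (update) changed
  if (base === target) return 'ABA'; // only the current (customized) value changed
  if (current === target) return 'ABB'; // customization already matches the update
  return 'ABC'; // all three differ, so a three-way merge is attempted
}

console.log(classifyScenario('a', 'b', 'c'));
```

Only the ABC scenario reaches the merge step that the separator change in this commit affects; the other scenarios are resolved by plain equality checks in this sketch.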

x-pack/plugins/security_solution/server/lib/detection_engine/prebuilt_rules/logic/diff/calculation/algorithms/multi_line_string_diff_algorithm.ts

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ const mergeVersions = ({
   // TS does not realize that in ABC scenario, baseVersion cannot be missing
   // Missing baseVersion scenarios were handled as -AA and -AB.
   const mergedVersion = merge(currentVersion, baseVersion ?? '', targetVersion, {
-    stringSeparator: /(\S+|\s+)/g, // Retains all whitespace, which we keep to preserve formatting
+    stringSeparator: /(\r\n|\n|\r)/g, // Separates strings by new lines
   });

   return mergedVersion.conflict
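
To illustrate what merging at line granularity means for the simplified ABC tests, here is a deliberately reduced sketch. It is not the `merge` function called in the diff above: `mergeByLine` and `MergeResult` are hypothetical names, and the sketch assumes all three versions have the same number of lines, so it does not handle inserted or deleted lines the way a real three-way merge does.

```typescript
// Hypothetical result shape for this sketch only.
interface MergeResult {
  merged: string;
  conflict: boolean;
}

// Simplified line-by-line three-way merge. Assumption: all versions split
// into the same number of chunks; otherwise a conflict is reported.
function mergeByLine(base: string, current: string, target: string): MergeResult {
  // Same alternatives as the separator in the commit. String.prototype.split
  // ignores the g flag, so it is omitted here. The capturing group keeps the
  // line breaks as chunks, so joining the chunks restores the original text.
  const separator = /(\r\n|\n|\r)/;
  const b = base.split(separator);
  const c = current.split(separator);
  const t = target.split(separator);

  if (b.length !== c.length || b.length !== t.length) {
    return { merged: current, conflict: true };
  }

  const out: string[] = [];
  for (let i = 0; i < b.length; i++) {
    if (c[i] === t[i]) {
      out.push(c[i]); // both sides agree
    } else if (c[i] === b[i]) {
      out.push(t[i]); // only target changed this line
    } else if (t[i] === b[i]) {
      out.push(c[i]); // only current changed this line
    } else {
      return { merged: current, conflict: true }; // both changed the same line differently
    }
  }
  return { merged: out.join(''), conflict: false };
}

const merged = mergeByLine(
  'My description.\nThis is a second line.',
  'My MODIFIED description.\nThis is a second line.',
  'My description.\nThis is a MODIFIED second line.'
);
console.log(merged);
```

Changes on different lines combine cleanly, while different changes to the same line fall back to the current version with a conflict, which mirrors the behaviour the simplified tests describe.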

x-pack/test/security_solution_api_integration/test_suites/detections_response/rules_management/prebuilt_rules/management/trial_license_complete_tier/upgrade_review_prebuilt_rules.multi_line_string_fields.ts

Lines changed: 56 additions & 0 deletions
@@ -10,6 +10,12 @@ import {
   ThreeWayDiffOutcome,
   ThreeWayMergeOutcome,
 } from '@kbn/security-solution-plugin/common/api/detection_engine';
+import {
+  TEXT_XL_A,
+  TEXT_XL_B,
+  TEXT_XL_C,
+  TEXT_XL_MERGED,
+} from '@kbn/security-solution-plugin/server/lib/detection_engine/prebuilt_rules/logic/diff/calculation/algorithms/multi_line_string_diff_algorithm.mock';
 import { FtrProviderContext } from '../../../../../../ftr_provider_context';
 import {
   deleteAllTimelines,
@@ -249,6 +255,56 @@ export default ({ getService }: FtrProviderContext): void => {
         expect(reviewResponse.stats.num_rules_with_conflicts).toBe(1);
         expect(reviewResponse.stats.num_rules_with_non_solvable_conflicts).toBe(0);
       });
+
+      it('should handle long multi-line strings without timing out', async () => {
+        // Install base prebuilt detection rule
+        await createHistoricalPrebuiltRuleAssetSavedObjects(es, [
+          createRuleAssetSavedObject({
+            rule_id: 'rule-1',
+            version: 1,
+            description: TEXT_XL_A,
+          }),
+        ]);
+        await installPrebuiltRules(es, supertest);
+
+        // Customize a multi line string field on the installed rule
+        await patchRule(supertest, log, {
+          rule_id: 'rule-1',
+          description: TEXT_XL_B,
+        });
+
+        // Increment the version of the installed rule, update a multi line string field, and create the new rule assets
+        const updatedRuleAssetSavedObjects = [
+          createRuleAssetSavedObject({
+            rule_id: 'rule-1',
+            version: 2,
+            description: TEXT_XL_C,
+          }),
+        ];
+        await createHistoricalPrebuiltRuleAssetSavedObjects(es, updatedRuleAssetSavedObjects);
+
+        // Call the upgrade review prebuilt rules endpoint and check that one rule is eligible for update
+        // and multi line string field update has no conflict
+        const reviewResponse = await reviewPrebuiltRulesToUpgrade(supertest);
+        expect(reviewResponse.rules[0].diff.fields.description).toEqual({
+          base_version: TEXT_XL_A,
+          current_version: TEXT_XL_B,
+          target_version: TEXT_XL_C,
+          merged_version: TEXT_XL_MERGED,
+          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
+          merge_outcome: ThreeWayMergeOutcome.Merged,
+          conflict: ThreeWayDiffConflict.SOLVABLE,
+          has_update: true,
+          has_base_version: true,
+        });
+        expect(reviewResponse.rules[0].diff.num_fields_with_updates).toBe(2);
+        expect(reviewResponse.rules[0].diff.num_fields_with_conflicts).toBe(1);
+        expect(reviewResponse.rules[0].diff.num_fields_with_non_solvable_conflicts).toBe(0);
+
+        expect(reviewResponse.stats.num_rules_to_upgrade_total).toBe(1);
+        expect(reviewResponse.stats.num_rules_with_conflicts).toBe(1);
+        expect(reviewResponse.stats.num_rules_with_non_solvable_conflicts).toBe(0);
+      });
     });

     describe('when all versions are not mergable', () => {
