Commit 06f0778

[system tests] Add new settings to wait for documents being ingested (#2429)
This PR includes new system test settings to allow for different conditions when waiting for the required/needed documents to be ingested into the data stream. It adds two new settings `assert.min_count` and `assert.fields_present`.
1 parent cf9a75c commit 06f0778

33 files changed: +825 -51 lines changed

docs/howto/system_testing.md

Lines changed: 28 additions & 1 deletion
@@ -434,6 +434,10 @@ for system tests.
 | skip_transform_validation | boolean | | Disable or enable the transforms validation performed in system tests. |
 | vars | dictionary | | Package level variables to set (i.e. declared in `$package_root/manifest.yml`). If not specified the defaults from the manifest are used. |
 | wait_for_data_timeout | duration | | Amount of time to wait for data to be present in Elasticsearch. Defaults to 10m. |
+| assert.hit_count | integer | | Exact number of documents to wait for being ingested. |
+| assert.min_count | integer | | Minimum number of documents to wait for being ingested. |
+| assert.fields_present | []string| | List of fields that must be present in the documents to stop waiting for new documents. |
+| assert.ingestion_idle_time | duration | | Minimum time elapsed since the last document was ingested. |
 
 For example, the `apache/access` data stream's `test-access-log-config.yml` is
 shown below.
@@ -470,7 +474,25 @@ you can use the `input` option to select the stream to test. The first stream
 whose input type matches the `input` value will be tested. By default, the first
 stream declared in the manifest will be tested.
 
-To add an assertion on the number of hits in a given system test, consider this example from the `httpjson/generic` data stream's `test-expected-hit-count-config.yml`, shown below.
+#### Available assertions to wait for documents
+
+System tests allow to define different conditions to collect data from the integration service and index it into the correct Elasticsearch data stream.
+
+By default, `elastic-package` waits until there are more than zero documents ingested. The exact number of documents to be
+validated in this default scenario depends on how fast the documents are ingested.
+
+There are other 4 options available:
+- Wait for collecting exactly `assert.hit_count` documents into the data stream.
+  - It will fail if the final number of documents ingested into Elasticsearch is different from `assert.hit_count` documents.
+- Wait for collecting at least `assert.min_count` documents into the data stream.
+  - Once there have been `assert.min_count` or more documents ingested, `elastic-package` will proceed to validate the documents.
+  - This could be used to ensure that a wide range of different documents have been ingested into Elasticsearch.
+- Collect data into the data stream until all the fields defined in the list `assert.fields_present` are present in any of the documents.
+  - Each field in that list could be present in different documents.
+
+The following example shows how to add an assertion on the number of hits in a given system test using `assert.hit_count`.
+
+Consider this example from the `httpjson/generic` data stream's `test-expected-hit-count-config.yml`, shown below.
 
 ```yaml
 input: httpjson
@@ -519,6 +541,11 @@ inserts the value of `response_split` from the test configuration into the integ
 
 Returning to `test-expected-hit-count-config.yml`, when `assert.hit_count` is defined and `> 0` the test will assert that the number of hits in the array matches that value and fail when this is not true.
 
+#### Defining new Elastic Agents for a given test
+
+System tests allow to create specific an Elsatic Agent for each test with custom settings or additional software.
+Elastic Agents can be customized by defining the needed `agent.*` settings.
+
 As an example to add settings to create a new Elastic Agent in a given test,
 the`auditd_manager/audtid` data stream's `test-default-config.yml` is shown below:
 

internal/testrunner/runners/system/test_config.go

Lines changed: 7 additions & 1 deletion
@@ -45,8 +45,14 @@ type testConfig struct {
 	SkipTransformValidation bool `config:"skip_transform_validation"`
 
 	Assert struct {
-		// Expected number of hits for a given test
+		// HitCount expected number of hits for a given test
 		HitCount int `config:"hit_count"`
+
+		// MinCount minimum number of hits for a given test
+		MinCount int `config:"min_count"`
+
+		// FieldsPresent list of fields that must be present in any of documents ingested
+		FieldsPresent []string `config:"fields_present"`
 	} `config:"assert"`
 
 	// NumericKeywordFields holds a list of fields that have keyword
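
As a rough illustration of how these fields behave when left unset, the following Go sketch treats the zero value of each field as "not enabled" and rejects counts larger than a query size limit, in the spirit of the checks added to `tester.go` in this commit. The type `assertSettings`, its methods, and the `maxQuerySize` parameter are assumptions made for this example; they are not part of `elastic-package`.

```go
package main

import "fmt"

// assertSettings is a stand-in for the Assert struct shown in the diff.
// (Hypothetical type, used only for this illustration.)
type assertSettings struct {
	HitCount      int
	MinCount      int
	FieldsPresent []string
}

// enabled lists the assertions that are switched on, treating the zero value
// of each field as "not set".
func (a assertSettings) enabled() []string {
	var names []string
	if a.HitCount > 0 {
		names = append(names, "hit_count")
	}
	if a.MinCount > 0 {
		names = append(names, "min_count")
	}
	if len(a.FieldsPresent) > 0 {
		names = append(names, "fields_present")
	}
	return names
}

// validate rejects counts that a single query returning at most maxQuerySize
// documents could never satisfy. maxQuerySize is a hypothetical parameter.
func (a assertSettings) validate(maxQuerySize int) error {
	if a.HitCount > maxQuerySize {
		return fmt.Errorf("assert.hit_count (%d) exceeds the query size (%d)", a.HitCount, maxQuerySize)
	}
	if a.MinCount > maxQuerySize {
		return fmt.Errorf("assert.min_count (%d) exceeds the query size (%d)", a.MinCount, maxQuerySize)
	}
	return nil
}

func main() {
	a := assertSettings{MinCount: 20, FieldsPresent: []string{"target.file"}}
	fmt.Println(a.enabled())
	fmt.Println(a.validate(500))
	fmt.Println(assertSettings{HitCount: 501}.validate(500))
}
```

A value equal to the limit passes in this sketch, matching the `>` comparison used in the commit.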

internal/testrunner/runners/system/tester.go

Lines changed: 112 additions & 49 deletions
@@ -1355,51 +1355,7 @@ func (r *tester) prepareScenario(ctx context.Context, config *testConfig, stackC
 		return &scenario, nil
 	}
 
-	// Use custom timeout if the service can't collect data immediately.
-	waitForDataTimeout := waitForDataDefaultTimeout
-	if config.WaitForDataTimeout > 0 {
-		waitForDataTimeout = config.WaitForDataTimeout
-	}
-
-	// (TODO in future) Optionally exercise service to generate load.
-	logger.Debugf("checking for expected data in data stream (%s)...", waitForDataTimeout)
-	var hits *hits
-	oldHits := 0
-	passed, waitErr := wait.UntilTrue(ctx, func(ctx context.Context) (bool, error) {
-		var err error
-		hits, err = r.getDocs(ctx, scenario.dataStream)
-		if err != nil {
-			return false, err
-		}
-
-		if r.checkFailureStore {
-			failureStore, err := r.getFailureStoreDocs(ctx, scenario.dataStream)
-			if err != nil {
-				return false, fmt.Errorf("failed to check failure store: %w", err)
-			}
-			if n := len(failureStore); n > 0 {
-				// Interrupt loop earlier if there are failures in the document store.
-				logger.Debugf("Found %d hits in the failure store for %s", len(failureStore), scenario.dataStream)
-				return true, nil
-			}
-		}
-
-		if config.Assert.HitCount > 0 {
-			if hits.size() < config.Assert.HitCount {
-				return false, nil
-			}
-
-			ret := hits.size() == oldHits
-			if !ret {
-				oldHits = hits.size()
-				time.Sleep(4 * time.Second)
-			}
-
-			return ret, nil
-		}
-
-		return hits.size() > 0, nil
-	}, 1*time.Second, waitForDataTimeout)
+	hits, waitErr := r.waitForDocs(ctx, config, scenario.dataStream)
 
 	// before checking "waitErr" error , it is necessary to check if the service has finished with error
 	// to report it as a test case failed
@@ -1417,10 +1373,6 @@ func (r *tester) prepareScenario(ctx context.Context, config *testConfig, stackC
 		return nil, waitErr
 	}
 
-	if !passed {
-		return nil, testrunner.ErrTestCaseFailed{Reason: fmt.Sprintf("could not find hits in %s data stream", scenario.dataStream)}
-	}
-
 	// Get deprecation warnings after ensuring that there are ingested docs and thus the
 	// data stream exists.
 	scenario.deprecationWarnings, err = r.getDeprecationWarnings(ctx, scenario.dataStream)
@@ -1583,6 +1535,117 @@ func (r *tester) createServiceStateDir() error {
 	return nil
 }
 
+func (r *tester) waitForDocs(ctx context.Context, config *testConfig, dataStream string) (*hits, error) {
+	// Use custom timeout if the service can't collect data immediately.
+	waitForDataTimeout := waitForDataDefaultTimeout
+	if config.WaitForDataTimeout > 0 {
+		waitForDataTimeout = config.WaitForDataTimeout
+	}
+
+	if config.Assert.HitCount > elasticsearchQuerySize {
+		return nil, fmt.Errorf("invalid value for assert.hit_count (%d): it must be lower of the maximum query size (%d)", config.Assert.HitCount, elasticsearchQuerySize)
+	}
+
+	if config.Assert.MinCount > elasticsearchQuerySize {
+		return nil, fmt.Errorf("invalid value for assert.min_count (%d): it must be lower of the maximum query size (%d)", config.Assert.MinCount, elasticsearchQuerySize)
+	}
+
+	// (TODO in future) Optionally exercise service to generate load.
+	logger.Debugf("checking for expected data in data stream (%s)...", waitForDataTimeout)
+	var hits *hits
+	oldHits := 0
+	foundFields := map[string]any{}
+	passed, waitErr := wait.UntilTrue(ctx, func(ctx context.Context) (bool, error) {
+		var err error
+		hits, err = r.getDocs(ctx, dataStream)
+		if err != nil {
+			return false, err
+		}
+
+		defer func() {
+			oldHits = hits.size()
+		}()
+
+		if r.checkFailureStore {
+			failureStore, err := r.getFailureStoreDocs(ctx, dataStream)
+			if err != nil {
+				return false, fmt.Errorf("failed to check failure store: %w", err)
+			}
+			if n := len(failureStore); n > 0 {
+				// Interrupt loop earlier if there are failures in the document store.
+				logger.Debugf("Found %d hits in the failure store for %s", len(failureStore), dataStream)
+				return true, nil
+			}
+		}
+
+		assertHitCount := func() bool {
+			if config.Assert.HitCount == 0 {
+				// not enabled
+				return true
+			}
+			if hits.size() < config.Assert.HitCount {
+				return false
+			}
+
+			ret := hits.size() == oldHits
+			if !ret {
+				time.Sleep(4 * time.Second)
+			}
+
+			return ret
+		}()
+
+		assertFieldsPresent := func() bool {
+			if len(config.Assert.FieldsPresent) == 0 {
+				// not enabled
+				return true
+			}
+			if hits.size() == 0 {
+				// At least there should be one document ingested
+				return false
+			}
+			for _, f := range config.Assert.FieldsPresent {
+				if _, found := foundFields[f]; found {
+					continue
+				}
+				found := false
+				for _, d := range hits.Fields {
+					if _, err := d.GetValue(f); err == nil {
+						found = true
+						break
+					}
+				}
+				if !found {
+					return false
+				}
+				logger.Debugf("Found field %q in hits", f)
+				foundFields[f] = struct{}{}
+			}
+			return true
+		}()
+
+		assertMinCount := func() bool {
+			if config.Assert.MinCount > 0 {
+				return hits.size() >= config.Assert.MinCount
+			}
+			// By default at least one document
+			return hits.size() > 0
+		}()
+
+		return assertFieldsPresent && assertMinCount && assertHitCount, nil
+	}, 1*time.Second, waitForDataTimeout)
+
+	if waitErr != nil {
+		return nil, waitErr
+	}
+
+	if !passed {
+		return nil, testrunner.ErrTestCaseFailed{Reason: fmt.Sprintf("could not find the expected hits in %s data stream", dataStream)}
+	}
+
+	return hits, nil
+}
+
 func (r *tester) validateTestScenario(ctx context.Context, result *testrunner.ResultComposer, scenario *scenarioTest, config *testConfig) ([]testrunner.TestResult, error) {
 	if err := validateFailureStore(scenario.failureStore); err != nil {
 		return result.WithError(err)
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+<failure>test case failed: could not find the expected hits in logs-failed_fields_present_assert.test-[[:digit:]]+ data stream</failure>
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+Elastic License 2.0
+
+URL: https://www.elastic.co/licensing/elastic-license
+
+## Acceptance
+
+By using the software, you agree to all of the terms and conditions below.
+
+## Copyright License
+
+The licensor grants you a non-exclusive, royalty-free, worldwide,
+non-sublicensable, non-transferable license to use, copy, distribute, make
+available, and prepare derivative works of the software, in each case subject to
+the limitations and conditions below.
+
+## Limitations
+
+You may not provide the software to third parties as a hosted or managed
+service, where the service provides users with access to any substantial set of
+the features or functionality of the software.
+
+You may not move, change, disable, or circumvent the license key functionality
+in the software, and you may not remove or obscure any functionality in the
+software that is protected by the license key.
+
+You may not alter, remove, or obscure any licensing, copyright, or other notices
+of the licensor in the software. Any use of the licensor’s trademarks is subject
+to applicable law.
+
+## Patents
+
+The licensor grants you a license, under any patent claims the licensor can
+license, or becomes able to license, to make, have made, use, sell, offer for
+sale, import and have imported the software, in each case subject to the
+limitations and conditions in this license. This license does not cover any
+patent claims that you cause to be infringed by modifications or additions to
+the software. If you or your company make any written claim that the software
+infringes or contributes to infringement of any patent, your patent license for
+the software granted under these terms ends immediately. If your company makes
+such a claim, your patent license ends immediately for work on behalf of your
+company.
+
+## Notices
+
+You must ensure that anyone who gets a copy of any part of the software from you
+also gets a copy of these terms.
+
+If you modify the software, you must include in any modified copies of the
+software prominent notices stating that you have modified the software.
+
+## No Other Rights
+
+These terms do not imply any licenses other than those expressly granted in
+these terms.
+
+## Termination
+
+If you use the software in violation of these terms, such use is not licensed,
+and your licenses will automatically terminate. If the licensor provides you
+with a notice of your violation, and you cease all violation of this license no
+later than 30 days after you receive that notice, your licenses will be
+reinstated retroactively. However, if you violate these terms after such
+reinstatement, any additional violation of these terms will cause your licenses
+to terminate automatically and permanently.
+
+## No Liability
+
+*As far as the law allows, the software comes as is, without any warranty or
+condition, and the licensor will not be liable to you for any damages arising
+out of these terms or the use or nature of the software, under any kind of
+legal claim.*
+
+## Definitions
+
+The **licensor** is the entity offering these terms, and the **software** is the
+software the licensor makes available under these terms, including any portion
+of it.
+
+**you** refers to the individual or entity agreeing to these terms.
+
+**your company** is any legal entity, sole proprietorship, or other kind of
+organization that you work for, plus all organizations that have control over,
+are under the control of, or are under common control with that
+organization. **control** means ownership of substantially all the assets of an
+entity, or the power to direct its management and policies by vote, contract, or
+otherwise. Control can be direct or indirect.
+
+**your licenses** are all the licenses granted to you for the software under
+these terms.
+
+**use** means anything you do with the software requiring one of your licenses.
+
+**trademark** means trademarks, service marks, and similar rights.
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+dependencies:
+  ecs:
+    reference: [email protected]
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+# newer versions go on top
+- version: "0.0.1"
+  changes:
+    - description: Initial draft of the package
+      type: enhancement
+      link: https://github.com/elastic/integrations/pull/1 # FIXME Replace with the real PR link
@@ -0,0 +1,33 @@
+vars: ~
+data_stream:
+  vars:
+    paths:
+      - "/custom/paths/logs.json"
+wait_for_data_timeout: 10s
+assert:
+  fields_present:
+    - target.file
+    - target.expected
+    - target.finish # this field is not present in the log file
+agent:
+  provisioning_script:
+    language: bash
+    contents: |
+      mkdir -p /custom/paths
+      cd /custom/paths
+      touch logs.json
+      # elastic-package just retrieves the 500 first documents in the search query
+      for i in $(seq 1 245) ; do
+        echo '{ "contents": "Message from file", "file": "logs.json"}'
+      done >> logs.json
+      echo '{ "contents": "Message from file", "file": "logs.json", "expected": "finish"}' >> logs.json
+      for i in $(seq 1 245); do
+        echo '{ "contents": "Message from file", "file": "logs.json"}'
+      done >> logs.json
+  pre_start_script:
+    language: sh
+    contents: |
+      export PATH=${PATH}:/custom/paths
+      mkdir -p /tmp/other/path
+      cd /tmp/other/path
+      echo "Pre-start: Current directory $(pwd)"
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+paths:
+{{#each paths as |path i|}}
+  - {{path}}
+{{/each}}
+exclude_files: [".gz$"]
+processors:
+  - add_locale: ~
