Commit 7f369ce
more examples
1 parent 384f0e3 commit 7f369ce

19 files changed: +592 -0 lines changed
docs/reference/data-analysis/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md
Lines changed: 10 additions & 0 deletions

@@ -85,6 +85,8 @@ Response:
 }
 ```
 
+% TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
+
 
 ## High-precision requests [geohexgrid-high-precision]
 
@@ -118,6 +120,8 @@ POST /museums/_search?size=0
 }
 ```
 
+% TEST[continued]
+
 Response:
 
 ```console-result
@@ -147,6 +151,8 @@ Response:
 }
 ```
 
+% TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
+
 
 ## Requests with additional bounding box filtering [geohexgrid-addtl-bounding-box-filtering]
 
@@ -172,6 +178,8 @@ POST /museums/_search?size=0
 }
 ```
 
+% TEST[continued]
+
 Response:
 
 ```console-result
@@ -198,6 +206,8 @@ Response:
 }
 ```
 
+% TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
+
 
 ### Aggregating `geo_shape` fields [geohexgrid-aggregating-geo-shape]
 
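The `% TEST[continued]` and `% TESTRESPONSE[...]` annotations added above attach to request and response listings that these hunks do not show. For orientation, a bounding-box-filtered, high-precision `geohex_grid` request of the kind those sections document generally has this shape; the `location` field name and the coordinates are placeholders, not values taken from the file:

```console
POST /museums/_search?size=0
{
  "aggregations": {
    "zoomed_in": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": "POINT (4.9 52.4)",
            "bottom_right": "POINT (5.0 52.3)"
          }
        }
      },
      "aggregations": {
        "high_precision_cells": {
          "geohex_grid": {
            "field": "location",
            "precision": 12
          }
        }
      }
    }
  }
}
```

Restricting the search with a `geo_bounding_box` filter is what keeps a high `precision` value from producing an unmanageable number of hex cells.
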
docs/reference/data-analysis/aggregations/search-aggregations-bucket-significantterms-aggregation.md
Lines changed: 69 additions & 0 deletions

@@ -17,6 +17,45 @@ An aggregation that returns interesting or unusual occurrences of terms in a set
 
 In all these cases the terms being selected are not simply the most popular terms in a set. They are the terms that have undergone a significant change in popularity measured between a *foreground* and *background* set. If the term "H5N1" only exists in 5 documents in a 10 million document index and yet is found in 4 of the 100 documents that make up a user’s search results that is significant and probably very relevant to their search. 5/10,000,000 vs 4/100 is a big swing in frequency.
 
+%
+% [source,console]
+% --------------------------------------------------
+% PUT /reports
+% {
+% "mappings": {
+% "properties": {
+% "force": {
+% "type": "keyword"
+% },
+% "crime_type": {
+% "type": "keyword"
+% }
+% }
+% }
+% }
+%
+% POST /reports/_bulk?refresh
+% {"index":{"_id":0}}
+% {"force": "British Transport Police", "crime_type": "Bicycle theft"}
+% {"index":{"_id":1}}
+% {"force": "British Transport Police", "crime_type": "Bicycle theft"}
+% {"index":{"_id":2}}
+% {"force": "British Transport Police", "crime_type": "Bicycle theft"}
+% {"index":{"_id":3}}
+% {"force": "British Transport Police", "crime_type": "Robbery"}
+% {"index":{"_id":4}}
+% {"force": "Metropolitan Police Service", "crime_type": "Robbery"}
+% {"index":{"_id":5}}
+% {"force": "Metropolitan Police Service", "crime_type": "Bicycle theft"}
+% {"index":{"_id":6}}
+% {"force": "Metropolitan Police Service", "crime_type": "Robbery"}
+% {"index":{"_id":7}}
+% {"force": "Metropolitan Police Service", "crime_type": "Robbery"}
+%
+% -------------------------------------------------
+% // TESTSETUP
+%
+
 ## Single-set analysis [_single_set_analysis]
 
 In the simplest case, the *foreground* set of interest is the search results matched by a query and the *background* set used for statistical comparisons is the index or indices from which the results were gathered.
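The "H5N1" sentence in the context above quantifies the swing only informally. Writing the foreground and background document frequencies as proportions p_fg and p_bg (notation used here for illustration, not taken from the file), the numbers work out to:

$$
p_{fg} = \frac{4}{100} = 4\%,\qquad
p_{bg} = \frac{5}{10{,}000{,}000} = 0.00005\%,\qquad
\frac{p_{fg}}{p_{bg}} = 80{,}000
$$

So the term is roughly 80,000 times more frequent in the foreground set than in the background set, which is why it is surfaced as significant.
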
@@ -39,6 +78,8 @@ GET /_search
 }
 ```
 
+% TEST[s/_search/_search\?filter_path=aggregations/]
+
 Response:
 
 ```console-result
@@ -62,6 +103,10 @@ Response:
 }
 ```
 
+% TESTRESPONSE[s/\.\.\.//]
+
+% TESTRESPONSE[s/: (0\.)?[0-9]+/: $body.$_path/]
+
 When querying an index of all crimes from all police forces, what these results show is that the British Transport Police force stand out as a force dealing with a disproportionately large number of bicycle thefts. Ordinarily, bicycle thefts represent only 1% of crimes (66799/5064554) but for the British Transport Police, who handle crime on railways and stations, 7% of crimes (3640/47347) is a bike theft. This is a significant seven-fold increase in frequency and so this anomaly was highlighted as the top crime type.
 
 The problem with using a query to spot anomalies is it only gives us one subset to use for comparisons. To discover all the other police forces' anomalies we would have to repeat the query for each of the different forces.
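The percentages quoted in the paragraph above follow directly from the counts given in parentheses:

$$
\frac{3640}{47347}\approx 7.7\%
\qquad\text{vs.}\qquad
\frac{66799}{5064554}\approx 1.3\%
$$
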
@@ -93,6 +138,8 @@ GET /_search
 }
 ```
 
+% TEST[s/_search/_search\?filter_path=aggregations/]
+
 Response:
 
 ```console-result
@@ -143,6 +190,12 @@ Response:
 }
 ```
 
+% TESTRESPONSE[s/\.\.\.//]
+
+% TESTRESPONSE[s/: (0\.)?[0-9]+/: $body.$_path/]
+
+% TESTRESPONSE[s/: "[^"]*"/: $body.$_path/]
+
 Now we have anomaly detection for each of the police forces using a single request.
 
 We can use other forms of top-level aggregations to segment our data, for example segmenting by geographic area to identify unusual hot-spots of a particular crime type:
@@ -257,6 +310,8 @@ The JLH score can be used as a significance score by adding the parameter
 }
 ```
 
+% NOTCONSOLE
+
 The scores are derived from the doc frequencies in *foreground* and *background* sets. The *absolute* change in popularity (foregroundPercent - backgroundPercent) would favor common terms whereas the *relative* change in popularity (foregroundPercent/ backgroundPercent) would favor rare terms. Rare vs common is essentially a precision vs recall balance and so the absolute and relative changes are multiplied to provide a sweet spot between precision and recall.
 
 
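The paragraph above describes the JLH score as the product of the absolute and relative change in popularity. With p_fg and p_bg again standing for a term's foreground and background document frequencies (notation used here for illustration), that is:

$$
\text{JLH} = (p_{fg} - p_{bg})\times\frac{p_{fg}}{p_{bg}}
$$

The first factor is small for rare terms and the second is small for common ones, which is the precision/recall trade-off the paragraph refers to.
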
@@ -270,6 +325,8 @@ Mutual information as described in "Information Retrieval", Manning et al., Chap
 }
 ```
 
+% NOTCONSOLE
+
 Mutual information does not differentiate between terms that are descriptive for the subset or for documents outside the subset. The significant terms therefore can contain terms that appear more or less frequent in the subset than outside the subset. To filter out the terms that appear less often in the subset than in documents outside the subset, `include_negatives` can be set to `false`.
 
 Per default, the assumption is that the documents in the bucket are also contained in the background. If instead you defined a custom background filter that represents a different set of documents that you want to compare to, set
@@ -278,6 +335,8 @@ Per default, the assumption is that the documents in the bucket are also contain
 "background_is_superset": false
 ```
 
+% NOTCONSOLE
+
 
 ### Chi square [_chi_square]
 
@@ -288,6 +347,8 @@ Chi square as described in "Information Retrieval", Manning et al., Chapter 13.5
 }
 ```
 
+% NOTCONSOLE
+
 Chi square behaves like mutual information and can be configured with the same parameters `include_negatives` and `background_is_superset`.
 
 
@@ -300,6 +361,8 @@ Google normalized distance as described in ["The Google Similarity Distance", Ci
 }
 ```
 
+% NOTCONSOLE
+
 `gnd` also accepts the `background_is_superset` parameter.
 
 
@@ -383,6 +446,8 @@ GET /_search
 }
 ```
 
+% TEST[s/_search/_search?size=0/]
+
 
 
 ### Percentage [_percentage]
@@ -398,6 +463,8 @@ It would be hard for a seasoned boxer to win a championship if the prize was awa
 }
 ```
 
+% NOTCONSOLE
+
 
 ### Which one is best? [_which_one_is_best]
 
@@ -421,6 +488,8 @@ Customized scores can be implemented via a script:
 }
 ```
 
+% NOTCONSOLE
+
 Scripts can be inline (as in above example), indexed or stored on disk. For details on the options, see [script documentation](docs-content://explore-analyze/scripting.md).
 
 Available parameters in the script are

docs/reference/data-analysis/aggregations/search-aggregations-change-point-aggregation.md
Lines changed: 6 additions & 0 deletions

@@ -37,6 +37,8 @@ A `change_point` aggregation looks like this in isolation:
 }
 ```
 
+% NOTCONSOLE
+
 1. The buckets containing the values to test against.
 
 
@@ -99,6 +101,8 @@ GET kibana_sample_data_logs/_search
 }
 ```
 
+% NOTCONSOLE
+
 1. A date histogram aggregation that creates buckets with one day long interval.
 2. A sibling aggregation of the `date` aggregation that calculates the average value of the `bytes` field within every bucket.
 3. The change point detection aggregation configuration object.
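Putting the three callouts together, the request they describe has roughly this shape. The `bytes` field comes from the callout above, while the `timestamp` field name and the choice of `fixed_interval` follow the Kibana sample web logs data set and are assumptions here, not lines quoted from the file:

```console
GET kibana_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "date": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "1d"
      },
      "aggs": {
        "avg": {
          "avg": {
            "field": "bytes"
          }
        }
      }
    },
    "change_points_avg": {
      "change_point": {
        "buckets_path": "date>avg"
      }
    }
  }
}
```

`change_point` is a pipeline aggregation, so `buckets_path` points at the daily average series (`date>avg`) rather than at a document field.
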
@@ -125,6 +129,8 @@ The request returns a response that is similar to the following:
 }
 ```
 
+% NOTCONSOLE
+
 1. The bucket key that is the change point.
 2. The number of documents in that bucket.
 3. Aggregated values in the bucket.

docs/reference/data-analysis/aggregations/search-aggregations-pipeline-inference-bucket-aggregation.md
Lines changed: 4 additions & 0 deletions

@@ -32,6 +32,8 @@ A `inference` aggregation looks like this in isolation:
 }
 ```
 
+% NOTCONSOLE
+
 1. The unique identifier or alias for the trained model.
 2. The optional inference config which overrides the model’s default settings
 3. Map the value of `avg_agg` to the model’s input field `avg_cost`
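Read together, the three callouts imply an aggregation body roughly like the following; the model id and the `regression` config are placeholders, and only the `avg_agg` to `avg_cost` mapping is taken from the callouts:

```js
"inference_example": {
  "inference": {
    "model_id": "a_trained_model_or_alias",
    "inference_config": {
      "regression": {
        "results_field": "prediction"
      }
    },
    "buckets_path": {
      "avg_cost": "avg_agg"
    }
  }
}
```

`buckets_path` here is a map from the model's input field names to the sibling aggregations that supply them, which is what callout 3 describes.
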
@@ -158,6 +160,8 @@ GET kibana_sample_data_logs/_search
 }
 ```
 
+% TEST[skip:setup kibana sample data]
+
 1. A composite bucket aggregation that aggregates the data by `client_ip`.
 2. A series of metrics and bucket sub-aggregations.
 3. {{infer-cap}} bucket aggregation that specifies the trained model and maps the aggregation names to the model’s input fields.

docs/reference/data-analysis/text-analysis/analysis-cjk-bigram-tokenfilter.md
Lines changed: 78 additions & 0 deletions

@@ -30,6 +30,84 @@ The filter produces the following tokens:
 [ 東京, 京都, 都は, 日本, 本の, の首, 首都, 都で, であ, あり ]
 ```
 
+% [source,console-result]
+% --------------------------------------------------
+% {
+% "tokens" : [
+% {
+% "token" : "東京",
+% "start_offset" : 0,
+% "end_offset" : 2,
+% "type" : "<DOUBLE>",
+% "position" : 0
+% },
+% {
+% "token" : "京都",
+% "start_offset" : 1,
+% "end_offset" : 3,
+% "type" : "<DOUBLE>",
+% "position" : 1
+% },
+% {
+% "token" : "都は",
+% "start_offset" : 2,
+% "end_offset" : 4,
+% "type" : "<DOUBLE>",
+% "position" : 2
+% },
+% {
+% "token" : "日本",
+% "start_offset" : 5,
+% "end_offset" : 7,
+% "type" : "<DOUBLE>",
+% "position" : 3
+% },
+% {
+% "token" : "本の",
+% "start_offset" : 6,
+% "end_offset" : 8,
+% "type" : "<DOUBLE>",
+% "position" : 4
+% },
+% {
+% "token" : "の首",
+% "start_offset" : 7,
+% "end_offset" : 9,
+% "type" : "<DOUBLE>",
+% "position" : 5
+% },
+% {
+% "token" : "首都",
+% "start_offset" : 8,
+% "end_offset" : 10,
+% "type" : "<DOUBLE>",
+% "position" : 6
+% },
+% {
+% "token" : "都で",
+% "start_offset" : 9,
+% "end_offset" : 11,
+% "type" : "<DOUBLE>",
+% "position" : 7
+% },
+% {
+% "token" : "であ",
+% "start_offset" : 10,
+% "end_offset" : 12,
+% "type" : "<DOUBLE>",
+% "position" : 8
+% },
+% {
+% "token" : "あり",
+% "start_offset" : 11,
+% "end_offset" : 13,
+% "type" : "<DOUBLE>",
+% "position" : 9
+% }
+% ]
+% }
+% --------------------------------------------------
+
 
 ## Add to an analyzer [analysis-cjk-bigram-tokenfilter-analyzer-ex]
 
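The expected response added above belongs to an `_analyze` example that this hunk does not include. A request of that general shape, with a placeholder sentence rather than the file's original sample text, looks like:

```console
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [ "cjk_bigram" ],
  "text": "東京都は日本の首都"
}
```

The `cjk_bigram` filter emits overlapping two-character tokens of type `<DOUBLE>`, which is why consecutive tokens in the response above share a character and their offsets overlap by one.
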
docs/reference/data-analysis/text-analysis/analysis-cjk-width-tokenfilter.md
Lines changed: 15 additions & 0 deletions

@@ -36,6 +36,21 @@ The filter produces the following token:
 シーサイドライナー
 ```
 
+% [source,console-result]
+% --------------------------------------------------
+% {
+% "tokens" : [
+% {
+% "token" : "シーサイドライナー",
+% "start_offset" : 0,
+% "end_offset" : 10,
+% "type" : "<KATAKANA>",
+% "position" : 0
+% }
+% ]
+% }
+% --------------------------------------------------
+
 
 ## Add to an analyzer [analysis-cjk-width-tokenfilter-analyzer-ex]
 
docs/reference/data-analysis/text-analysis/analysis-hunspell-tokenfilter.md
Lines changed: 36 additions & 0 deletions

@@ -70,6 +70,42 @@ The filter produces the following tokens:
 [ the, fox, jump, quick ]
 ```
 
+% [source,console-result]
+% ----
+% {
+% "tokens": [
+% {
+% "token": "the",
+% "start_offset": 0,
+% "end_offset": 3,
+% "type": "<ALPHANUM>",
+% "position": 0
+% },
+% {
+% "token": "fox",
+% "start_offset": 4,
+% "end_offset": 9,
+% "type": "<ALPHANUM>",
+% "position": 1
+% },
+% {
+% "token": "jump",
+% "start_offset": 10,
+% "end_offset": 17,
+% "type": "<ALPHANUM>",
+% "position": 2
+% },
+% {
+% "token": "quick",
+% "start_offset": 18,
+% "end_offset": 25,
+% "type": "<ALPHANUM>",
+% "position": 3
+% }
+% ]
+% }
+% ----
+
 
 ## Configurable parameters [analysis-hunspell-tokenfilter-configure-parms]
 