+| Buckets | Query | Order | OS 1.3.18 | OS 2.7 | OS 2.11.1 | OS 2.12.0 | OS 2.13.0 | OS 2.14 | OS 2.15 | OS 2.16 | OS 2.17 | OS 2.18 | OS 2.19 | OS 3.0 |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| Text queries | query-string-on-message | 1 | 332.75 | 280 | 276 | 78.25 | 80 | 77.75 | 77.25 | 77.75 | 78 | 85 | 4 | 4 |
+| | query-string-on-message-filtered | 2 | 67.25 | 47 | 30.25 | 46.5 | 47.5 | 46 | 46.75 | 29.5 | 30 | 27 | 11 | 11 |
+| | query-string-on-message-filtered-sorted-num | 3 | 125.25 | 102 | 85.5 | 41 | 41.25 | 41 | 40.75 | 24 | 24.5 | 27 | 26 | 27 |
+| | term | 4 | 4 | 3.75 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
+| Sorting | asc_sort_timestamp | 5 | 9.75 | 15.75 | 7.5 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
+| | asc_sort_timestamp_can_match_shortcut | 6 | 13.75 | 7 | 7 | 6.75 | 6 | 6.25 | 6.5 | 6 | 6.25 | 7 | 7 | 7 |
+| | asc_sort_timestamp_no_can_match_shortcut | 7 | 13.5 | 7 | 7 | 6.5 | 6 | 6 | 6.5 | 6 | 6.25 | 7 | 7 | 7 |
+| | asc_sort_with_after_timestamp | 8 | 35 | 33.75 | 238 | 212 | 197.5 | 213.5 | 204.25 | 160.5 | 185.25 | 216 | 150 | 168 |
+| | desc_sort_timestamp | 9 | 12.25 | 39.25 | 6 | 7 | 5.75 | 5.75 | 5.75 | 6 | 6 | 8 | 7 | 7 |
+| | desc_sort_timestamp_can_match_shortcut | 10 | 7 | 120.5 | 5 | 5.5 | 5 | 4.75 | 5 | 5 | 5 | 6 | 6 | 5 |
+| | desc_sort_timestamp_no_can_match_shortcut | 11 | 6.75 | 117 | 5 | 5 | 4.75 | 4.5 | 4.75 | 5 | 5 | 6 | 6 | 5 |
+| | desc_sort_with_after_timestamp | 12 | 487 | 33.75 | 325.75 | 358 | 361.5 | 385.25 | 378.25 | 320.25 | 329.5 | 262 | 246 | 93 |
+| | sort_keyword_can_match_shortcut | 13 | 291 | 3 | 3 | 3.25 | 3.5 | 3 | 3 | 3 | 3 | 4 | 4 | 4 |
+| | sort_keyword_no_can_match_shortcut | 14 | 290.75 | 3.25 | 3 | 3.5 | 3.25 | 3 | 3.75 | 3 | 3.25 | 4 | 4 | 4 |
+| | sort_numeric_asc | 15 | 7.5 | 4.5 | 4.5 | 4 | 4 | 4 | 4 | 4 | 4 | 17 | 4 | 3 |
+| | sort_numeric_asc_with_match | 16 | 2 | 1.75 | 2 | 2 | 2 | 2 | 1.75 | 2 | 2 | 2 | 2 | 2 |
+| | sort_numeric_desc | 17 | 8 | 6 | 6 | 5.5 | 4.75 | 5 | 4.75 | 4.25 | 4.5 | 16 | 5 | 4 |
+| | sort_numeric_desc_with_match | 18 | 2 | 2 | 2 | 2 | 2 | 2 | 1.75 | 2 | 2 | 2 | 2 | 2 |
+| Terms aggregations | cardinality-agg-high | 19 | 3075.75 | 2432.25 | 2506.25 | 2246 | 2284.5 | 2202.25 | 2323.75 | 2337.25 | 2408.75 | 2324 | 2235 | 628 |
+| | cardinality-agg-low | 20 | 2925.5 | 2295.5 | 2383 | 2126 | 2245.25 | 2159 | 3 | 3 | 3 | 3 | 3 | 3 |
+| | composite_terms-keyword | 21 | 466.75 | 378.5 | 407.75 | 394.5 | 353.5 | 366 | 350 | 346.5 | 350.25 | 216 | 218 | 202 |
+| | composite-terms | 22 | 290 | 242 | 263 | 252 | 233 | 228.75 | 229 | 223.75 | 226 | 333 | 362 | 328 |
+| | keyword-terms | 23 | 4695.25 | 3478.75 | 3557.5 | 3220 | 29.5 | 26 | 25.75 | 26.25 | 26.25 | 27 | 26 | 19 |
+| | keyword-terms-low-cardinality | 24 | 4699.5 | 3383 | 3477.25 | 3249.75 | 25 | 22 | 21.75 | 21.75 | 21.75 | 22 | 22 | 13 |
+| | multi_terms-keyword | 25 | 0* | 0* | 854.75 | 817.25 | 796.5 | 748 | 768.5 | 746.75 | 770 | 736 | 734 | 657 |
+| Range queries | keyword-in-range | 26 | 101.5 | 100 | 18 | 22 | 23.25 | 26 | 27.25 | 18 | 17.75 | 64 | 68 | 14 |
+| | range | 27 | 85 | 77 | 14.5 | 18.25 | 20.25 | 22.75 | 24.25 | 13.75 | 14.25 | 11 | 14 | 4 |
+| | range_field_conjunction_big_range_big_term_query | 28 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
+| | range_field_conjunction_small_range_big_term_query | 29 | 2 | 1.75 | 2 | 2 | 2 | 2 | 1.5 | 2 | 2 | 2 | 2 | 2 |
+| | range_field_conjunction_small_range_small_term_query | 30 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
+| | range_field_disjunction_big_range_small_term_query | 31 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2.25 | 2 | 2 | 2 |
+| | range-agg-1 | 32 | 4641.25 | 3810.75 | 3745.75 | 3578.75 | 3477.5 | 3328.75 | 3318.75 | 2 | 2.25 | 2 | 2 | 2 |
+| | range-agg-2 | 33 | 4568 | 3717.25 | 3669.75 | 3492.75 | 3403.5 | 3243.5 | 3235 | 2 | 2.25 | 2 | 2 | 2 |
+| | range-numeric | 34 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
+| Date histograms | composite-date_histogram-daily | 35 | 4828.75 | 4055.5 | 4051.25 | 9 | 3 | 2.5 | 3 | 2.75 | 2.75 | 3 | 3 | 3 |
+| | date_histogram_hourly_agg | 36 | 4790.25 | 4361 | 4363.25 | 12.5 | 12.75 | 6.25 | 6 | 6.25 | 6.5 | 7 | 6 | 4 |
+| | date_histogram_minute_agg | 37 | 1404.5 | 1340.25 | 1113.75 | 1001.25 | 923 | 36 | 32.75 | 35.25 | 39.75 | 35 | 36 | 37 |
+| | range-auto-date-histo | 38 | 10373 | 8686.75 | 9940.25 | 8696.75 | 8199.75 | 8214.75 | 8278.75 | 8306 | 8293.75 | 8095 | 7899 | 1871 |
+| | range-auto-date-histo-with-metrics | 39 | 22988.5 | 20438 | 20108.25 | 20392.75 | 20117.25 | 19656.5 | 19959.25 | 20364.75 | 20147.5 | 19686 | 20211 | 5406 |
+
+While results may vary in different environments, we controlled for noise and hardware variability. The relative performance trends are expected to hold across most real-world scenarios. The [OpenSearch Benchmark workload](https://github.com/opensearch-project/opensearch-benchmark-workloads) is open source, and we welcome replication and feedback from the community.
diff --git a/_posts/2025-05-15-optimized-inference-processors.md b/_posts/2025-05-15-optimized-inference-processors.md
new file mode 100644
index 0000000000..51e4c2c508
--- /dev/null
+++ b/_posts/2025-05-15-optimized-inference-processors.md
@@ -0,0 +1,242 @@
+---
+layout: post
+title: "Optimizing inference processors for cost efficiency and performance"
+authors:
+ - will-hwang
+ - heemin-kim
+ - kolchfa
+date: 2025-05-29
+has_science_table: true
+categories:
+ - technical-posts
+meta_keywords: inference processors, vector embeddings, OpenSearch text embedding, text image embedding, sparse encoding, caching mechanism, ingest pipeline, OpenSearch optimization
+meta_description: Learn about a new OpenSearch optimization for inference processors that reduces redundant calls, lowering costs and improving performance in vector embedding generation.
+
+---
+
+Inference processors, such as `text_embedding`, `text_image_embedding`, and `sparse_encoding`, enable the generation of vector embeddings during document ingestion or updates. Today, these processors invoke model inference every time a document is ingested or updated, even if the embedding source fields remain unchanged. This can lead to unnecessary compute usage and increased costs.
+
+This blog post introduces a new inference processor optimization that reduces redundant inference calls, lowering costs and improving overall performance.
+
+## How the optimization works
+
+The optimization adds a caching mechanism that compares the embedding source fields in the updated document against the existing document. If the embedding fields have not changed, the processor directly copies the existing embeddings into the updated document instead of triggering new inference. If the fields differ, the processor proceeds with inference as usual. The following diagram illustrates this workflow.
+
+
+
+This approach minimizes redundant inference calls, significantly improving efficiency without impacting the accuracy or freshness of embeddings.
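+The decision logic can be sketched roughly as follows. This is a simplified Python illustration of the `skip_existing` check, not the actual processor implementation; the `field_map` dictionary and the `run_inference` callback stand in for the processor's field mapping and the model-inference call.

```python
def embed_with_cache(new_doc, existing_doc, field_map, run_inference):
    """Reuse stored embeddings when the source text fields are unchanged.

    field_map maps each text field to its embedding field, mirroring the
    processor's field_map configuration.
    """
    result = dict(new_doc)
    for text_field, embedding_field in field_map.items():
        unchanged = (
            existing_doc is not None
            and existing_doc.get(text_field) == new_doc.get(text_field)
            and embedding_field in existing_doc
        )
        if unchanged:
            # Source text matches the stored document: copy the existing
            # embedding instead of calling the model again.
            result[embedding_field] = existing_doc[embedding_field]
        else:
            # New document or changed text: run inference as usual.
            result[embedding_field] = run_inference(new_doc.get(text_field))
    return result
```

Because the comparison covers every field in `field_map`, a document where only non-embedding fields changed (tags, timestamps, and so on) skips inference entirely, while any edit to an embedded text field still triggers a fresh model call for that field.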
+
+## How to enable the optimization
+
+To enable this optimization, set the `skip_existing` parameter to `true` in your ingest pipeline processor definition. This option is available for [`text_embedding`](#text-embedding-processor), [`text_image_embedding`](#textimage-embedding-processor), and [`sparse_encoding`](#sparse-encoding-processor) processors. By default, `skip_existing` is set to `false`.
+
+### Text embedding processor
+
+The [`text_embedding` processor](https://docs.opensearch.org/docs/latest/ingest-pipelines/processors/text-embedding/) generates vector embeddings for text fields, typically used in semantic search.
+
+* **Optimization behavior**: If `skip_existing` is `true`, the processor checks whether the text fields mapped in `field_map` have changed. If they haven't, inference is skipped and the existing vector is reused.
+
+**Example pipeline**:
+
+```json
+PUT /_ingest/pipeline/optimized-ingest-pipeline
+{
+ "description": "Optimized ingest pipeline",
+ "processors": [
+ {
+ "text_embedding": {
+ "model_id": "