Commit 10d7782

[Pull-based Ingestion] Update pull-based semantics and offset based lag metric (#11352)
* update pull-based semantics and offset based lag metric
  Signed-off-by: Varun Bharadwaj <[email protected]>
* Update _api-reference/document-apis/pull-based-ingestion.md
  Co-authored-by: kolchfa-aws <[email protected]>
  Signed-off-by: Varun Bharadwaj <[email protected]>
* Update _api-reference/document-apis/pull-based-ingestion.md
  Co-authored-by: kolchfa-aws <[email protected]>
  Signed-off-by: Varun Bharadwaj <[email protected]>

---------

Signed-off-by: Varun Bharadwaj <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
1 parent 2b12cfb commit 10d7782

1 file changed: +2 -2 lines changed

_api-reference/document-apis/pull-based-ingestion.md

Lines changed: 2 additions & 2 deletions
@@ -13,7 +13,7 @@ nav_order: 90
 This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, join the discussion on the [OpenSearch forum](https://forum.opensearch.org/).
 {: .warning}

-Pull-based ingestion enables OpenSearch to ingest data from streaming sources such as Apache Kafka or Amazon Kinesis. Unlike traditional ingestion methods where clients actively push data to OpenSearch through REST APIs, pull-based ingestion allows OpenSearch to control the data flow by retrieving data directly from streaming sources. This approach provides exactly-once ingestion semantics and native backpressure handling, helping prevent server overload during traffic spikes.
+Pull-based ingestion enables OpenSearch to ingest data from streaming sources such as Apache Kafka or Amazon Kinesis. Unlike traditional ingestion methods where clients actively push data to OpenSearch through REST APIs, pull-based ingestion allows OpenSearch to control the data flow by retrieving data directly from streaming sources. This approach provides native backpressure handling, helping prevent server overload during traffic spikes. Pull-based ingestion guarantees at-least-once ingestion semantics and uses external versioning to ensure data consistency.

 ## Prerequisites
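
The change from exactly-once to at-least-once means the same message can be delivered more than once, so consistency depends on version checks when a message is applied. The following is a minimal sketch of that idea, not the OpenSearch implementation: it assumes each message carries an externally supplied integer version, and that a write is applied only when its version is strictly greater than the stored one. The names `StoredDoc` and `apply_message` and the in-memory `store` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoredDoc:
    version: int   # externally supplied version of the last applied write
    source: dict   # document body


def apply_message(store: dict, doc_id: str, version: int, source: dict) -> bool:
    """Apply a message only if its version is newer than the stored one.

    Returns True if the write was applied, False if it was skipped as a
    duplicate or stale delivery (assumed rule: strictly greater wins).
    """
    current = store.get(doc_id)
    if current is not None and version <= current.version:
        return False
    store[doc_id] = StoredDoc(version=version, source=source)
    return True


# Simulate at-least-once delivery: the last two messages are redeliveries.
store = {}
messages = [
    ("1", 1, {"name": "alice"}),
    ("1", 2, {"name": "alice b"}),
    ("1", 1, {"name": "alice"}),    # stale replay
    ("1", 2, {"name": "alice b"}),  # duplicate delivery
]
results = [apply_message(store, *m) for m in messages]
```

Under these assumptions, replayed or duplicated messages leave the stored document unchanged, which is why at-least-once delivery combined with version checks can still yield a consistent final state.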

@@ -199,8 +199,8 @@ The following table lists the available `polling_ingest_stats` metrics.
 | `consumer_stats.total_consumer_error_count` | The total number of fatal consumer read errors. |
 | `consumer_stats.total_poller_message_failure_count` | The total number of failed messages on the poller. |
 | `consumer_stats.total_poller_message_dropped_count` | The total number of failed messages on the poller that were dropped. |
-| `consumer_stats.total_duplicate_message_skipped_count` | The total number of skipped messages that were previously processed. |
 | `consumer_stats.lag_in_millis` | Lag in milliseconds, computed as the time elapsed since the last processed message timestamp. |
+| `consumer_stats.pointer_based_lag` | The Apache Kafka offset-based lag, calculated as the difference between the latest available offset and the current message offset. This metric applies only when Apache Kafka is used as the streaming source. |

 To retrieve shard-level pull-based ingestion metrics, use the [Nodes Stats API]({{site.url}}{{site.baseurl}}/api-reference/index-apis/update-settings/):
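
The two lag metrics in this hunk measure different things: `lag_in_millis` is time-based, while `pointer_based_lag` counts offsets. A rough sketch of both calculations as the table describes them follows. The function names mirror the metric names for readability only, and clamping negative values to zero is an assumption of this sketch, not documented behavior.

```python
import time
from typing import Optional


def pointer_based_lag(latest_offset: int, current_offset: int) -> int:
    """Offset-based lag: latest available offset minus current message offset."""
    # Clamping at zero is an assumption made for this sketch.
    return max(latest_offset - current_offset, 0)


def lag_in_millis(last_processed_timestamp_ms: int,
                  now_ms: Optional[int] = None) -> int:
    """Time-based lag: time elapsed since the last processed message timestamp."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Clamping at zero is an assumption made for this sketch.
    return max(now_ms - last_processed_timestamp_ms, 0)


offset_lag = pointer_based_lag(latest_offset=1500, current_offset=1420)
time_lag = lag_in_millis(last_processed_timestamp_ms=1_000, now_ms=4_500)
```

Here `offset_lag` is 80 offsets and `time_lag` is 3500 milliseconds. The offset-based figure does not depend on message timestamps, so the two metrics can diverge, for example when the stream is idle and no new messages arrive.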
