Description
Describe the bug
After upgrading from commit qw-airmail-20250522-hotfix (488375a9) to edge (660388a42756a739d0ef0aecd234ca953b85caf5), we are experiencing Kafka source failures.
| Timestamp | Message |
| -- | -- |
| 2025-12-09 16:45:58.757 | PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded)) |
| 2025-12-09 16:45:58.757 | 2025-12-09T07:45:58.757Z ERROR quickwit_actors::actor_context: exit activating-kill-switch actor=SourceActor-proud-RYA5 exit_status=Failure(Message consumption error: PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded) |
| 2025-12-09 16:45:58.757 | PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded)) |
| 2025-12-09 16:45:58.757 | 2025-12-09T07:45:58.757Z ERROR quickwit_actors::spawn_builder: actor-exit actor_id="SourceActor-proud-RYA5" phase=handling(quickwit_indexing::source::Loop) exit_status=Failure(Message consumption error: PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded) |
| 2025-12-09 16:45:58.476 | 2025-12-09T07:45:58.476Z ERROR rdkafka::client: librdkafka: Global error: PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded): Application maximum poll interval (600000ms) exceeded by 16ms |
| 2025-12-09 16:45:50.710 | PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded)) |
| 2025-12-09 16:45:50.710 | 2025-12-09T07:45:50.710Z ERROR quickwit_actors::actor_context: exit activating-kill-switch actor=SourceActor-polished-D8qx exit_status=Failure(Message consumption error: PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded) |
| 2025-12-09 16:45:50.710 | PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded)) |
| 2025-12-09 16:45:50.710 | 2025-12-09T07:45:50.710Z ERROR quickwit_actors::spawn_builder: actor-exit actor_id="SourceActor-polished-D8qx" phase=handling(quickwit_indexing::source::Loop) exit_status=Failure(Message consumption error: PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded) |
| 2025-12-09 16:45:50.710 | 2025-12-09T07:45:50.710Z ERROR rdkafka::client: librdkafka: Global error: PollExceeded (Local: Maximum application poll interval (max.poll.interval.ms) exceeded): Application maximum poll interval (600000ms) exceeded by 185ms |
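For anyone triaging similar logs, here is a minimal sketch that pulls the configured interval and the overshoot out of the librdkafka line quoted above. The function name `parse_poll_exceeded` and the regex are illustrative, not part of Quickwit or rdkafka, and the "elapsed" value assumes the overshoot is measured from the configured interval.

```python
import re

# Matches the tail of the librdkafka message seen in the logs above.
POLL_EXCEEDED_RE = re.compile(
    r"Application maximum poll interval \((\d+)ms\) exceeded by (\d+)ms"
)


def parse_poll_exceeded(line):
    """Return (configured_ms, exceeded_by_ms, elapsed_ms), or None if no match."""
    match = POLL_EXCEEDED_RE.search(line)
    if match is None:
        return None
    configured_ms = int(match.group(1))
    exceeded_by_ms = int(match.group(2))
    # Assumption: time since the last poll is the limit plus the overshoot.
    return configured_ms, exceeded_by_ms, configured_ms + exceeded_by_ms


line = (
    "ERROR rdkafka::client: librdkafka: Global error: PollExceeded "
    "(Local: Maximum application poll interval (max.poll.interval.ms) exceeded): "
    "Application maximum poll interval (600000ms) exceeded by 16ms"
)
print(parse_poll_exceeded(line))  # (600000, 16, 600016)
```

In both failures above the configured limit is 600000 ms, so each source actor appears to have gone roughly ten minutes without polling before it was killed.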
Steps to reproduce (if applicable)
Steps to reproduce the behavior:
1. Run indexing under sustained high load where backpressure occurs
2. Observe `PollExceeded` errors or consumption stalls
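The suspected mechanism behind these steps can be sketched with a toy model: a consumer loop that polls, then blocks while forwarding the batch downstream. This is not Quickwit's source loop. The function `simulate_poll_gaps` and its inputs are hypothetical, and it only illustrates how a long enough stall between two polls would exceed `max.poll.interval.ms`.

```python
def simulate_poll_gaps(send_block_ms, max_poll_interval_ms):
    """Toy model of a poll -> forward -> poll loop on a fake clock.

    send_block_ms: how long forwarding each batch blocks (backpressure), in ms.
    Returns (batch_index, gap_ms) for the first gap over the limit, else None.
    """
    now_ms = 0
    for batch_index, blocked_ms in enumerate(send_block_ms):
        last_poll_ms = now_ms      # poll() returns a batch at this instant
        now_ms += blocked_ms       # forwarding blocks while downstream is saturated
        gap_ms = now_ms - last_poll_ms
        if gap_ms > max_poll_interval_ms:
            # The consumer was not polled in time; librdkafka would flag this.
            return batch_index, gap_ms
    return None


# Light load: every gap stays under the limit.
print(simulate_poll_gaps([50, 200, 1_000], 600_000))        # None
# Heavy backpressure: the third batch blocks for just over ten minutes.
print(simulate_poll_gaps([50, 200, 600_016, 10], 600_000))  # (2, 600016)
```

If this model matches what happens on edge, the regression would be in how long the source can stay blocked between polls, rather than in the Kafka client settings themselves.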
Expected behavior
Kafka source should continue consuming messages under high load without max.poll.interval.ms exceeded errors, as it did in the previous version (qw-airmail-20250522-hotfix).
Configuration:
Please provide:
- Quickwit version: edge (660388a42756a739d0ef0aecd234ca953b85caf5)
- The index_config.yaml:
version: 0.8
index_id: log.common.access_log_v2_quickwit
doc_mapping:
  field_mappings:
    - name: id
      type: text
      tokenizer: raw
      description: "unique identifier for the event"
    - name: specversion
      type: text
      stored: false
      indexed: false
      description: "version information about the CloudEvents specification"
    - name: source
      type: text
      tokenizer: raw
      description: "information about where the event occurred"
    - name: subject
      type: text
      tokenizer: raw
      description: "detailed information about the source where the event occurred"
    - name: time
      type: datetime
      input_formats:
        - unix_timestamp
        - iso8601
      output_format: unix_timestamp_nanos
      fast: true
      description: "timestamp of the event"
    - name: datacontenttype
      type: text
      tokenizer: raw
      description: "content type of the data"
    - name: requestId
      type: text
      tokenizer: raw
      description: "request id"
    - name: ip
      type: ip
      fast: true
      description: "ip address"
    - name: userAgent
      type: text
      tokenizer: default
    - name: xUserAgent
      type: text
      tokenizer: default
    - name: userId
      type: text
      tokenizer: raw
    - name: deviceId
      type: text
      tokenizer: raw
      description: "device's id eg) 6ae1f6a6-107d-3183-a1df-adf7368f9d10"
    - name: latency
      type: text
      indexed: false
      stored: false
      description: "latency of the request, but already in latencyNs, so we don't need to index it"
    - name: latencyNs
      type: i64
      fast: true
      description: "latency of the request in nanoseconds"
    - name: occurredAt
      type: datetime
      input_formats:
        - unix_timestamp
        - iso8601
      output_format: unix_timestamp_nanos
      description: "timestamp of the log entry"
    - name: httpStatusCode
      type: i64
      fast: true
      description: "HTTP status code"
    - name: httpMethod
      type: text
      fast: true
      description: "HTTP method"
    - name: httpPath
      type: text
      fast: true
      description: "HTTP path"
    - name: grpcStatusCode
      type: i64
      fast: true
      description: "gRPC status code"
    - name: grpcMethod
      type: text
      fast: true
      description: "gRPC method"
    - name: extra
      type: json
      expand_dots: true
      description: "extra fields"
    - name: kafkaConsumerGroupId
      type: text
      fast: false
      description: "Kafka Consumer Group Id"
    - name: kafkaConsumerClientId
      type: text
      fast: false
      description: "Kafka Consumer Client Id"
    - name: kafkaConsumerHostName
      type: text
      fast: false
      description: "Kafka Consumer Host Name"
    - name: kafkaTopic
      type: text
      fast: false
      description: "Kafka Topic"
    - name: kafkaPartition
      type: i64
      description: "Kafka Partition"
    - name: kafkaOffset
      type: i64
      description: "Kafka Offset"
    - name: kafkaMessageKey
      type: text
      tokenizer: raw
      description: "Kafka Message Key"
    - name: kafkaConsumingResult
      type: text
      fast: true
      description: "Kafka Consuming Result"
    - name: env
      type: text
      tokenizer: raw
      indexed: true
      description: "represents environmental information as an extension field in the cloudEvents specification"
    - name: region
      type: text
      tokenizer: raw
      fast: true
      indexed: true
      description: "represents region information as an extension field in the cloudEvents specification"
    - name: namespace
      type: text
      tokenizer: raw
      fast: true
      indexed: true
      description: "represents namespace information as an extension field in the cloudEvents specification"
  timestamp_field: time
  tag_fields: [region]
  partition_key: namespace
search_settings:
  default_search_fields: [id, requestId, userId, extra.grpc.headers.x-request-id]
indexing_settings:
  merge_policy:
    type: "stable_log"
    merge_factor: 10
    max_merge_factor: 12
    maturation_period: 48h
  commit_timeout_secs: 30
retention:
  period: 30 days
  schedule: daily