To achieve high performance and availability, here are some rules of thumb.
- When running several `KafkaConsumer`s, put them in one consumer group so that they work together, each one handling different partitions.
- Besides `subscribe(topics)`, users can also choose to explicitly `assign` certain partitions to a `KafkaConsumer`.
- Try a larger `QUEUED_MIN_MESSAGES`, especially for small messages.
- Use multiple `KafkaConsumer`s to distribute the payload.
- A `KafkaManualCommitConsumer` can help to commit the offsets more frequently (e.g., always commit right after finishing processing a message).
- Don't use a very large `MAX_POLL_RECORDS` for a `KafkaAutoCommitConsumer`: the consumer might crash before committing all of these messages, which leads to more duplicates on the next `poll`.
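
The trade-off behind the last two rules can be illustrated with a small model. The sketch below does not use any Kafka client API; `redeliveredAfterCrash` and its parameters are hypothetical names introduced only for this illustration. It assumes a consumer that commits its offset right after every `commitInterval`-th processed message, and it counts how many already-processed messages would be delivered again if the consumer crashed.

```cpp
#include <cstddef>

// Simplified, hypothetical model (no Kafka client library involved).
// A consumer processes messages one by one and commits its offset right
// after every `commitInterval`-th message. If it crashes after `processed`
// messages, everything past the last commit is delivered again on restart.
std::size_t redeliveredAfterCrash(std::size_t processed, std::size_t commitInterval)
{
    if (commitInterval == 0) {
        return processed;               // nothing was ever committed
    }
    // Number of messages covered by the last successful commit.
    const std::size_t committed = (processed / commitInterval) * commitInterval;
    return processed - committed;       // processed but uncommitted messages
}
```

In this model, committing after every message (`commitInterval == 1`) never re-delivers anything, while a crash after 1234 messages with a commit every 500 messages re-delivers 234 of them. The number of duplicates is always smaller than the commit interval, which is why a smaller batch or a more frequent manual commit tends to reduce duplication, at the cost of more commit traffic.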