@hariso (Contributor) commented May 6, 2025

Description

Closes #278.

Quick checks:

  • I have followed the Code Guidelines.
  • There is no other pull request for the same update/change.
  • I have written unit tests.
  • I have made sure that the PR is of reasonable size and can be easily reviewed.

@hariso changed the title from "[WIP] Optimize ReadN" to "Optimize ReadN in CDC by collecting records in batches" May 13, 2025
@hariso marked this pull request as ready for review May 14, 2025 15:20
@hariso requested a review from a team as a code owner May 14, 2025 15:20
@hariso force-pushed the haris/read-n-batches branch from d288eee to c181d90 May 14, 2025 15:22
@hariso enabled auto-merge (squash) May 15, 2025 12:06
@raulb (Contributor) left a comment

@hariso I left some comments, but I'm approving because I don't consider them blockers. I'm particularly interested in how the benchmark results compare to our latest tests, for two reasons:

  1. The implementation is supposedly more performant, since it actually collects records in batches.
  2. To see whether the change has any impact on #282.

Co-authored-by: Raúl Barroso <[email protected]>
@hariso (Contributor, Author) commented May 22, 2025

> @hariso I left some comments, but I'm approving because I don't consider them blockers. I'm particularly interested in how the benchmark results compare to our latest tests, for two reasons:
>
> 1. The implementation is supposedly more performant, since it actually collects records in batches.
>
> 2. To see whether the change has any impact on [Investigate: Why allocating memory is less efficient on `ReadN` #282](https://github.com/ConduitIO/conduit-connector-postgres/issues/282)

@raulb I've just run it on my EC2 instance (c7i.xlarge). The CDC test that we currently have in streaming-benchmarks gives me about 64k msg/s, whereas this version of the connector gives about 74k msg/s. I ran both versions twice.
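
Taking those two approximate figures at face value (they are rounded and come from two runs per version), the relative throughput gain works out to roughly:

$$
\frac{74\,\text{k} - 64\,\text{k}}{64\,\text{k}} = \frac{10}{64} \approx 0.156 \quad (\text{about } 16\%)
$$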

@hariso merged commit 5b39f4c into main May 28, 2025
3 checks passed
@hariso deleted the haris/read-n-batches branch May 28, 2025 11:01
Development

Successfully merging this pull request may close these issues.

Feature: Add support for collecting records in batches
