Skip to content

PYTHON-5215 Add an asyncio.Protocol implementation for KMS #2460

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

blink1073
Copy link
Member

@blink1073 blink1073 commented Aug 6, 2025

See benchmark gist.

Benchmark Results:

Before: 4.93s, 5.26s
After: 4.80s, 4.89s

Depends on mongodb-labs/drivers-evergreen-tools#679

@blink1073 blink1073 requested a review from NoahStapp August 6, 2025 19:30
@blink1073 blink1073 requested a review from a team as a code owner August 6, 2025 19:30
@blink1073
Copy link
Member Author

I'm debugging two failures:

test.asynchronous.test_connection_monitoring.AsyncTestCMAP.test_connection_monitoring_pool_clear_interrupting_pending_connections_clear_with_interruptInUseConnections___true_closes_pending_connections
test.asynchronous.test_connection_monitoring.AsyncTestCMAP.test_connection_monitoring_pool_create_min_size_error_error_during_minPoolSize_population_clears_pool

@blink1073 blink1073 marked this pull request as draft August 7, 2025 00:24
@blink1073 blink1073 removed the request for review from NoahStapp August 7, 2025 00:24
@blink1073
Copy link
Member Author

I realized the benchmark test wasn't actually triggering the protocol -> I'm tweaking things locally

@blink1073
Copy link
Member Author

This has a couple commits from #2467

@blink1073
Copy link
Member Author

Okay this is ready for a look. We might consider switching to the base Protocol since the sizes of the responses are constrained and small, and we typically only have 2-3 reads per socket.

@blink1073 blink1073 requested a review from NoahStapp August 8, 2025 22:07
@blink1073
Copy link
Member Author

I still need to make a PR for the fix to the kms mock server's 404 response.

@blink1073
Copy link
Member Author

CSOT failure is unrelated: PYTHON-5492

# Reuse the active buffer if it has space.
if len(self._buffers):
buffer = self._buffers[-1]
if len(buffer.buffer) - buffer.end_index > sizehint:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If sizehint = -1, which signals that the buffer size can be arbitrary, this check will always succeed, potentially returning an empty buffer, which is an error. We need to check that sizehint is a positive number as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

"""
self.transport = transport # type: ignore[assignment]

async def read(self, bytes_needed: int) -> bytes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure I understand the intended flow here, is this example correct?

  1. We call kms_request and enter the while kms_context.bytes_needed > 0: loop.
  2. The first chunk of data, say 16 bytes worth is written into an existing buffer that still has space.
  3. PyMongoKMSProtocol.read() is called and immediately returns those 10 bytes.
  4. kms_context.bytes_needed updates to need 84 more bytes for a total of 100.
  5. We call PyMongoKMSProtocol.read() again and wait on the _pending_listeners Future we create.
  6. The second chunk of data, the remaining 84 bytes, requires a new buffer since the active buffer is full.
  7. The Future is resolved with those bytes, which we return and feed into kms_context to complete the operation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We call kms_request and enter the while kms_context.bytes_needed > 0: loop.
  • The first chunk of data, say 16 bytes worth is written into an existing buffer that still has space.
  • PyMongoKMSProtocol.read() is called and immediately returns those 10 bytes, pushing the start_index up by 10
  • kms_context.bytes_needed updates to need 84 more bytes
  • We call PyMongoKMSProtocol.read() again and wait on the _pending_listeners Future we created.
  • The second chunk of data, the remaining 84 bytes, may require a new buffer
  • If any bytes are available, we read in up to the newly requested 84 bytes from the active buffer(s), advancing start_index and exhausting buffers as appropriate
  • Otherwise, we wait on the future to be resolved, which will contain up to the requested bytes.

@blink1073 blink1073 marked this pull request as ready for review August 11, 2025 15:53
@blink1073 blink1073 requested a review from NoahStapp August 11, 2025 15:53

async def async_sendall(conn: PyMongoProtocol, buf: bytes) -> None:
bytes_needed = self._pending_reads.popleft()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we need more bytes than we have? We've already popped the waiter and set it's result to data, which can only read up to self._bytes_ready bytes. Are we relying on the kms_context.bytes_needed loop to call the protocol read() method again and create a new waiter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we give the partial result back to the kms context, and let it ask for more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could get better performance by doing more of the looping inside the Protocol, but KMS requests won't be a significant part of runtime anyway so not worth spending more time on it. Can you add a comment to this effect somewhere saying that we rely on the looping behavior for this to function correctly?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants