
Conversation

sebsto
Collaborator

@sebsto sebsto commented Oct 14, 2025

Closing #584

The LocalServer now queues concurrent POST /invoke requests from testing client applications and ensures that the requests are delivered to the Lambda Runtime one by one, just like the AWS Lambda Runtime environment does.

The Pool now has two modes: a pure FIFO mode (one pushed element is consumed by exactly one next() call) and a keyed mode where multiple elements can be pushed and multiple next(for requestId: String) calls can run concurrently.

The two modes are needed because invocations are 1:1 (one POST /invoke is always consumed by exactly one matching GET /next), but responses are n:n (a response can consist of multiple chunks, and concurrent invocations can trigger multiple next(for requestId: String) calls).
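Roughly, the two modes could be sketched like this (a minimal illustration with hypothetical names and signatures; the actual Pool implementation differs):

```swift
// FIFO mode: each pushed element is matched with exactly one next() call.
actor FIFOPool<Element: Sendable> {
    private var buffer: [Element] = []
    private var waiters: [CheckedContinuation<Element, Never>] = []

    func push(_ element: Element) {
        if waiters.isEmpty {
            buffer.append(element)
        } else {
            waiters.removeFirst().resume(returning: element)
        }
    }

    func next() async -> Element {
        if !buffer.isEmpty { return buffer.removeFirst() }
        return await withCheckedContinuation { waiters.append($0) }
    }
}

// Keyed mode: concurrent consumers each wait for the elements that carry
// their requestId, so responses can arrive in any order. This sketch
// assumes at most one waiter per requestId at a time.
actor KeyedPool<Element: Sendable> {
    private var buffered: [String: [Element]] = [:]
    private var waiters: [String: CheckedContinuation<Element, Never>] = [:]

    func push(_ element: Element, for requestID: String) {
        if let waiter = waiters.removeValue(forKey: requestID) {
            waiter.resume(returning: element)
        } else {
            buffered[requestID, default: []].append(element)
        }
    }

    func next(for requestID: String) async -> Element {
        if var queue = buffered[requestID], !queue.isEmpty {
            let element = queue.removeFirst()
            buffered[requestID] = queue.isEmpty ? nil : queue
            return element
        }
        return await withCheckedContinuation { waiters[requestID] = $0 }
    }
}
```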

I made a couple of additional changes while working on this PR

  • I moved the Pool code into a separate file for improved readability

  • I removed an instance of DispatchTime that was hiding in the code, unnoticed until today

  • I removed the async requirement on the Pool.push(_:) function. It was not required (thank you @t089 for reporting this)

  • I removed the fatalError() that was in the Pool implementation. The pool now throws an error when next() is invoked concurrently, making it easier to test.

  • I added extensive unit tests to validate the Pool behavior

  • I added a test to verify that a rapid succession of client invocations is correctly queued and returns no errors

  • I moved a continuation.resume(returning:) call outside of a lock. Generally speaking, it's a bad idea to resume a continuation while holding a lock. I suspect this was causing an error during test execution when we spawned and tore down multiple Tasks very quickly. On rare occasions, the test failed with an assertion in NIO: NIOCore/NIOAsyncWriter.swift:177: Fatal error: Deinited NIOAsyncWriter without calling finish()
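The pattern looks roughly like this (a minimal sketch with hypothetical names, using NSLock for brevity; the point is only where the resume happens relative to the lock):

```swift
import Foundation

// Collect the continuation while holding the lock, resume it only after
// the lock is released. Resuming can run the awaiting task's code
// immediately, and doing that while the lock is held risks re-entrancy
// and deadlocks.
final class Waiters<Element: Sendable>: @unchecked Sendable {
    private let lock = NSLock()  // protects `waiters`
    private var waiters: [CheckedContinuation<Element, Never>] = []

    func wait() async -> Element {
        await withCheckedContinuation { continuation in
            lock.lock()
            waiters.append(continuation)
            lock.unlock()
        }
    }

    func deliver(_ element: Element) {
        lock.lock()
        let continuation = waiters.isEmpty ? nil : waiters.removeFirst()
        lock.unlock()
        // Resumed outside the lock. A real pool would also buffer
        // elements that arrive before a waiter registers.
        continuation?.resume(returning: element)
    }
}
```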

@sebsto sebsto self-assigned this Oct 14, 2025
@sebsto sebsto added kind/bug Feature doesn't work as expected. 🔨 semver/patch No public API change. size/S Small task. (A couple of hours of work.) labels Oct 14, 2025
@sebsto sebsto requested a review from Copilot October 14, 2025 09:30
@sebsto sebsto requested review from 0xTim and adam-fowler October 15, 2025 08:05
@sebsto sebsto marked this pull request as draft October 15, 2025 10:09
@sebsto sebsto changed the title from "Gently decline subsequent POST /invoke requests while the Lambda handl…" to "Accept multiple POST /invoke requests to allow parallel testing" Oct 16, 2025
@sebsto sebsto marked this pull request as ready for review October 16, 2025 12:05
@sebsto
Collaborator Author

sebsto commented Oct 16, 2025

Note that this PR fixes concurrent invocations for non-streaming Lambda functions only.

Local testing of streaming Lambda functions is affected by another issue: #588

@sebsto sebsto requested a review from 0xTim October 16, 2025 12:06
@sebsto
Collaborator Author

sebsto commented Oct 16, 2025

@0xTim I think I have a working solution. I updated the PR title and description to accurately describe the changes. I'll let you review when you have time. This is a big one; apologies for that, but I can't think of a simpler solution to address the issue.

@sebsto sebsto requested a review from Copilot October 16, 2025 17:42

@sebsto sebsto requested a review from Copilot October 16, 2025 18:16
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.



@sebsto sebsto requested review from 0xTim and removed request for 0xTim October 17, 2025 05:44
@adam-fowler
Contributor

As I understand this, you return a failure response if the pool is already being used.

  • I think your failure should be a 5xx, as it is the server that can't deal with the request
  • As an alternative to this solution, you could push requests onto an AsyncStream and have a task that processes them linearly; then you don't have to return an error code
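A rough sketch of this alternative (the Invocation type and the handler body are hypothetical, not the actual server code):

```swift
// Every POST /invoke connection yields its request into an AsyncStream,
// and a single task drains the stream, so invocations reach the pool
// strictly one at a time and no rejection path is needed.
struct Invocation: Sendable {
    let requestID: String
    let body: String
}

let (stream, continuation) = AsyncStream.makeStream(of: Invocation.self)

// Each connection handler just yields and never sees a rejection:
continuation.yield(Invocation(requestID: "req-1", body: "{}"))

// One consumer task processes invocations linearly:
let consumer = Task {
    for await invocation in stream {
        // Hand the invocation to the runtime (GET /next), await the response.
        print("processing \(invocation.requestID)")
    }
}
```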

@sebsto
Collaborator Author

sebsto commented Oct 17, 2025

Thank you @adam-fowler for the suggestions

I think your failure should be a 5xx, as it is the server that can't deal with the request

You're correct. By the way, this error should never be thrown; it reflects a programming error in the Pool. The Pool is now designed to accept multiple responses in parallel and to queue them.

I'll send a commit in a minute

As an alternative to this solution, you could push requests onto an AsyncStream and have a task that processes them linearly; then you don't have to return an error code

I tried this approach, but AsyncStream is designed to hand over one element at a time. When I tried to queue elements, I ended up with the same complexity as we have in the pool now: queue elements, queue continuations, and match continuations with requestIds.

@adam-fowler
Contributor

I tried this approach, but AsyncStream is designed to hand over one element at a time. When I tried to queue elements, I ended up with the same complexity as we have in the pool now: queue elements, queue continuations, and match continuations with requestIds.

Actually, looking at the code, you could possibly do something like this: instead of adding concurrent tasks for each connection to the server, get rid of the discarding task group and just iterate over the inbound connections. This means the pool will only be accessed by one connection at a time:

try await channel.executeThenClose { inbound in
    for try await connectionChannel in inbound {
        logger.trace("Handling a new connection")
        await server.handleConnection(channel: connectionChannel, logger: logger)
        logger.trace("Done handling the connection")
    }
}

@sebsto
Collaborator Author

sebsto commented Oct 17, 2025

@adam-fowler The local HTTP server was designed to act as a single HTTP server. It accepts both the testing client's requests (POST /invoke) and the requests made by the Lambda function (the runtime client's GET /next and POST /response).
Most of the complexity comes from this design decision, which was made for version 0.x.

I think we need to keep the current task group to handle requests from the testing client and requests from the Lambda runtime separately.

Does it make sense? (Or am I missing something bigger here?)

@adam-fowler
Copy link
Contributor

@adam-fowler The local HTTP server was designed to act as a single HTTP server. It accepts both the testing client's requests (POST /invoke) and the requests made by the Lambda function (the runtime client's GET /next and POST /response). Most of the complexity comes from this design decision, which was made for version 0.x.

I think we need to keep the current task group to handle requests from the testing client and requests from the Lambda runtime separately.

Does it make sense? (Or am I missing something bigger here?)

Ok that won't work then.



Development

Successfully merging this pull request may close these issues.

[localserver] concurrent invocations of POST /invoke crashes the local test server
