Skip to content

Conversation

@davereinhart
Copy link
Collaborator

Improves overall performance by adding gzip middleware to fastapi to compress large responses.

Adds new optional boolean parameter to ScoreSetsSearch and updates score set search endpoint to make retrieval of parent experiment score set URNs and count optional.

…t scoreset URNs and count from response

Including the scoreset URNs and count for all score sets for the parent experiment is computationally expensive and not necessary in all contexts (e.g. the main score set search page). Adding an optional boolean parameter to score set search paramaters to indicate whether or not to include this in the response, which defaults to true to maintain existing behavior.
Copy link
Collaborator

@bencap bencap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, should be a big improvement.

I was thinking about the settings and realizing that we don't really have a good way to decide what is appropriate for our application. I think the 1000 minimum size and compression level 5 are reasonable and there's a good chance we won't ever have to change them, but I think it might also be worthwhile to add a log entry to the log_request logging function that can give us some better insight into the sizes of the responses we are dealing with. It would've been nice to have been able to pull up the last few months of traffic and seen how large responses actually were in practive. There is probably a way to do this in the elastic load balancer logs too, but a snippet like the below might be easier and more centralized in the existing logging than doing that.

async def log_request(request: Request, response: Response, end: int) -> None:
...

        size = None
        content_length = response.headers.get("content-length")
        if content_length:
            try:
                size = int(content_length)
            except ValueError:
                size = None
        if size is None:
            body = getattr(response, "body", None)
            if isinstance(body, (bytes, bytearray)):
                size = len(body)
        if size is None:
            size = 0  # streaming or unknown
        save_to_logging_context({"response_size_bytes": size})

...

I can open a new issue for this if you'd rather just merge the existing code, it looks good to me.

@davereinhart
Copy link
Collaborator Author

davereinhart commented Oct 24, 2025

@bencap Once this change is deployed, wouldn't the content-length header just show the size of the compressed response? If that's the case, this type of logging may not be as informative as if were put in place beforehand.

@bencap
Copy link
Collaborator

bencap commented Oct 27, 2025

@bencap Once this change is deployed, wouldn't the content-length header just show the size of the compressed response? If that's the case, this type of logging may not be as informative as if were put in place beforehand.

Hmm yeah that's true, I guess for it to work properly with these changes we'd need to inject a pre-compressed content length into the logging context via a middleware that runs prior to the gzip compression.

@bencap bencap mentioned this pull request Oct 27, 2025
@davereinhart davereinhart merged commit a150ebd into release-2025.5.0 Oct 27, 2025
6 checks passed
@davereinhart davereinhart deleted the davereinhart/router-performance-improvements branch October 27, 2025 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants