
Conversation

sdangol
Contributor

@sdangol sdangol commented Sep 15, 2025

Summary

This PR adds a built-in compress middleware for the event handler. It provides an easy way to compress response bodies, which reduces payload size and improves performance.

Changes

Please provide a summary of what's being changed

  • Added a middleware directory in the rest directory to contain the built-in middlewares
  • Exported the middleware path to be used
  • Created the compress middleware, which takes an encoding and a threshold and compresses the response when needed
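
As a rough illustration of the encoding and threshold behaviour described in the last bullet, here is a minimal standalone sketch. It is not the PR's implementation: `maybeCompress`, `CompressOptions`, and the default values (`gzip`, 1024 bytes) are hypothetical names and numbers chosen for the example, and it only uses Node's built-in `node:zlib`.

```typescript
import { deflateSync, gzipSync } from 'node:zlib';

type Encoding = 'gzip' | 'deflate';

// Hypothetical option names; the real middleware's options may differ.
interface CompressOptions {
  encoding?: Encoding;
  threshold?: number; // minimum body size in bytes before compressing
}

interface CompressResult {
  body: Uint8Array;
  contentEncoding?: Encoding; // unset when the body was left untouched
}

const maybeCompress = (
  body: string,
  options: CompressOptions = {}
): CompressResult => {
  // Assumed defaults, for illustration only
  const { encoding = 'gzip', threshold = 1024 } = options;
  const raw = Buffer.from(body, 'utf8');

  // Small payloads are returned as-is: compression overhead can outweigh the gain
  if (raw.byteLength < threshold) return { body: raw };

  const compressed = encoding === 'gzip' ? gzipSync(raw) : deflateSync(raw);
  return { body: compressed, contentEncoding: encoding };
};
```

Calling `maybeCompress('pong')` leaves the body alone because it is below the assumed threshold, while a body of several kilobytes comes back compressed with `contentEncoding` set.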

Please add the issue number below, if no issue is present the PR might get blocked and not be reviewed

Issue number: closes #4474


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

@sdangol sdangol self-assigned this Sep 15, 2025
@boring-cyborg boring-cyborg bot added dependencies Changes that touch dependencies, e.g. Dependabot, etc. event-handler This item relates to the Event Handler Utility tests PRs that add or change tests labels Sep 15, 2025
@sdangol sdangol requested a review from svozza September 15, 2025 15:44
@pull-request-size pull-request-size bot added the size/L PRs between 100-499 LOC label Sep 15, 2025
@sdangol sdangol requested a review from dreamorosi September 15, 2025 17:26
@sdangol
Contributor Author

sdangol commented Sep 15, 2025

@dreamorosi I've addressed the feedback and accepted the regex issue in Sonar

@dreamorosi
Contributor

Code duplication is quite high, I suspect because of tests. Can we do anything to bring it down?

Also, and I suspect the answer is no, but is there any way to break down that regex into smaller chunks? Not just because Sonar complains (which I'd like to avoid if possible) but also because of complexity.

Can we, for example, make an educated guess about a subset of content types that are more common and put these in their own smaller (and faster?) regex, and then fall back to the second (or third) if the previous fail? Does this make sense?
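
One way the tiered idea could look is sketched below. The two patterns and the `isCompressible` helper are hypothetical and are not the regex from the PR; which content types count as "common" is an assumption made for the example.

```typescript
// Tier 1: a small pattern for content types assumed to be the most common
const COMMON_TYPES = /^(?:application\/json|text\/(?:html|plain|css))\b/i;

// Tier 2: a broader fallback, only evaluated when tier 1 does not match
const LESS_COMMON_TYPES =
  /^(?:application\/(?:javascript|xml|[\w.+-]+\+(?:json|xml))|image\/svg\+xml|text\/[\w.+-]+)\b/i;

const isCompressible = (contentType: string): boolean => {
  const value = contentType.trim();
  // Short-circuit evaluation means the second regex runs only on a tier 1 miss
  return COMMON_TYPES.test(value) || LESS_COMMON_TYPES.test(value);
};
```

Splitting the pattern keeps each regex short enough to read and review on its own, at the cost of maintaining two lists.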

@sdangol
Contributor Author

sdangol commented Sep 16, 2025

Code duplication is quite high, I suspect because of tests. Can we do anything to bring it down?

Yes, I'll try to optimize the tests

Also, and I suspect the answer is no, but is there any way to break down that regex into smaller chunks? Not just because Sonar complains (which I'd like to avoid if possible) but also because of complexity.
Can we, for example, make an educated guess about a subset of content types that are more common and put these in their own smaller (and faster?) regex, and then fall back to the second (or third) if the previous fail? Does this make sense?

Yeah, makes sense. I'll see if I can break it down

Contributor

@dreamorosi dreamorosi left a comment

I've built this branch (npm run build -w packages/event-handler), packed it (npm pack -w packages/event-handler && git restore packages/event-handler/package.json), and installed the tarball in another dummy project created using the triaging template.

My function looks like this:

import { Router } from '@aws-lambda-powertools/event-handler/experimental-rest';
import { compress } from '@aws-lambda-powertools/event-handler/experimental-rest/middleware';
import { Logger } from '@aws-lambda-powertools/logger';
import type { Context } from 'aws-lambda';

const logger = new Logger({ serviceName: 'pong-service' });
const app = new Router();

app.use(async (_, reqCtx, next) => {
  logger.info('Request received', {
    headers: reqCtx.event.headers,
  });
  await next();
});
app.use(compress());

app.get('/ping', async () => {
  return { message: 'pong' };
});

export const handler = async (event: unknown, context: Context) =>
  app.resolve(event, context);

CDK stack used for the test:

import { CfnOutput, RemovalPolicy, Stack, type StackProps } from 'aws-cdk-lib';
import { RestApi, LambdaIntegration } from 'aws-cdk-lib/aws-apigateway';
import { Runtime } from 'aws-cdk-lib/aws-lambda';
import { NodejsFunction, OutputFormat } from 'aws-cdk-lib/aws-lambda-nodejs';
import { LogGroup, RetentionDays } from 'aws-cdk-lib/aws-logs';
import type { Construct } from 'constructs';

export class TriageStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const fnName = 'PongFn';
    const logGroup = new LogGroup(this, 'MyLogGroup', {
      logGroupName: `/aws/lambda/${fnName}`,
      removalPolicy: RemovalPolicy.DESTROY,
      retention: RetentionDays.ONE_DAY,
    });
    const fn = new NodejsFunction(this, 'MyFunction', {
      functionName: fnName,
      logGroup,
      runtime: Runtime.NODEJS_22_X,
      entry: './src/index.ts',
      handler: 'handler',
      bundling: {
        minify: true,
        mainFields: ['module', 'main'],
        sourceMap: true,
        format: OutputFormat.ESM,
      },
      environment: {
        NODE_OPTIONS: '--enable-source-maps',
      },
    });

    const api = new RestApi(this, 'TriageApi', {
      restApiName: 'TriageApi',
      deployOptions: {
        stageName: 'prod',
      },
    });
    const proxyIntegration = new LambdaIntegration(fn, {
      proxy: true,
    });
    // Catch-all ANY method on root and proxy resource
    api.root.addMethod('ANY', proxyIntegration);
    const proxyResource = api.root.addResource('{proxy+}');
    proxyResource.addMethod('ANY', proxyIntegration);

    new CfnOutput(this, 'ApiUrl', {
      value: api.url,
    });
    new CfnOutput(this, 'FunctionArn', {
      value: fn.functionArn,
    });
  }
}

If I make a request, the response that I get back looks like this: �����V�M-.NLOU�R*��KW���bk� and contains the content-encoding: gzip header.

I tried with curl, httpie, and Insomnia, with raw mode both on and off in each of them.

I'm not sure if I'm doing something wrong or I'm missing something on the infra side. Can you look into it?

@sdangol
Contributor Author

sdangol commented Sep 17, 2025

@dreamorosi Thanks for looking into this. I'll check whether there are any other edge cases we need to be aware of. I also realized that the tests never exercise the actual compressed response, which is why I missed this. I'll add tests for that as well.

@svozza
Contributor

svozza commented Sep 17, 2025

I think I found the issue: at the point where we convert the Response object to an API Gateway Proxy response, we hardcode isBase64Encoded to false and always read the body as text.

I applied a quick & dirty patch and it works:

// Check if response contains compressed/binary content
const isCompressed = response.headers.has('content-encoding');
let body: string;
let isBase64Encoded = false;

if (isCompressed) {
  // For compressed content, get as buffer and encode to base64
  const buffer = await response.arrayBuffer();
  body = Buffer.from(buffer).toString('base64');
  isBase64Encoded = true;
} else {
  // For text content, use text()
  body = await response.text();
  isBase64Encoded = false;
}

const result: APIGatewayProxyResult = {
  statusCode: response.status,
  headers,
  body,
  isBase64Encoded,
};

I don't know if there's a better way to do this, or if I'm missing any edge cases, so please triple check the code above if you decide to put it in the repo.

Also I'm not sure if compression is the only reason why the isBase64Encoded field should be set to true (probably one of the main ones) - but worth checking.

@svozza - keen to hear your take on this as well.

Yes, the base64 logic is something I knew we'd have to come back to. I'm not sure if the presence of the content-encoding header is enough, I think we need to check for specific values in the header such as deflate and gzip. We also need to check the content-type headers for binary MIME types like image/png, image/jpeg etc.
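
A minimal sketch of the stricter check described above, assuming the decision is made from the response headers alone. `shouldBase64Encode` and both lists are hypothetical, and the binary MIME prefixes are only examples, not a complete set.

```typescript
// Content codings that produce binary output (list assumed for illustration)
const COMPRESSED_ENCODINGS = new Set(['gzip', 'deflate']);

// Example prefixes for binary MIME types; a real list would need more care
const BINARY_TYPE_PREFIXES = [
  'image/',
  'audio/',
  'video/',
  'application/octet-stream',
];

const shouldBase64Encode = (headers: Headers): boolean => {
  // Look at the header value, not just its presence
  const encoding = headers.get('content-encoding')?.trim().toLowerCase() ?? '';
  if (COMPRESSED_ENCODINGS.has(encoding)) return true;

  const contentType = headers.get('content-type')?.trim().toLowerCase() ?? '';
  return BINARY_TYPE_PREFIXES.some((prefix) => contentType.startsWith(prefix));
};
```

Note that a prefix check like this also flags text-based types such as image/svg+xml as binary, which is one of the edge cases a dedicated spec would need to settle.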

@dreamorosi
Contributor

Ok, thanks.

I think we should address that in a separate issue that possibly has its own spec with requirements and all.

For the sake of this PR, let's just make sure that when the compress middleware is enabled, isBase64Encoded is also set to true in the final response object.

@sdangol
Contributor Author

sdangol commented Sep 17, 2025

@dreamorosi @svozza I've refactored the code a bit now.

It now uses the Accept-Encoding header value to find the preferred encoding, and compression is disabled when the value is identity.
I've removed the content-type checking to simplify things for now. I think if there are any requests to support additional types, we can add them later.
Also, the isBase64Encoded flag is set to true based on the value of the content-encoding header.
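
For readers unfamiliar with Accept-Encoding negotiation, the selection step could be sketched roughly as follows. This is an assumption about the general approach, not the merged code: `pickEncoding` and `SUPPORTED` are hypothetical, and the sketch ignores the `*` wildcard.

```typescript
const SUPPORTED = ['gzip', 'deflate'] as const;
type Choice = (typeof SUPPORTED)[number] | 'identity';

const pickEncoding = (acceptEncoding?: string): Choice => {
  let best: Choice = 'identity'; // identity means "do not compress"
  let bestQ = 0;

  for (const part of (acceptEncoding ?? '').split(',')) {
    const [rawName, ...params] = part.split(';');
    const name = rawName.trim().toLowerCase();

    // A missing q parameter means a weight of 1
    const qParam = params
      .map((p) => p.trim().toLowerCase())
      .find((p) => p.startsWith('q='));
    const q = qParam === undefined ? 1 : Number.parseFloat(qParam.slice(2));

    const known =
      name === 'identity' || (SUPPORTED as readonly string[]).includes(name);
    // Strictly greater keeps the first entry on ties; NaN and q=0 never win
    if (known && q > bestQ) {
      best = name as Choice;
      bestQ = q;
    }
  }
  return best;
};
```

With this sketch, a missing header, an unsupported coding such as br, or an explicit identity all result in no compression.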

I've tried testing it out. I think I get the response already decompressed, but the content-length seems to be different for the same request when enabling/disabling compression.

[Two screenshots attached, taken 2025-09-17 at 6:42:18 pm and 6:43:30 pm]

@dreamorosi
Contributor

I think but the content-length seems to be different for the same request when enabling/disabling compression

I think that's expected and the result of the content being compressed or not - I guess the only check would be to make sure it matches the expected value in the two cases.
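
A check along those lines could be sketched as a local round trip: compress the payload, base64-encode it the way a proxy response would carry it, then decode it and compare with the uncompressed body. The `decode` helper and the sample payload are hypothetical, and the sketch assumes gzip via `node:zlib`.

```typescript
import { gunzipSync, gzipSync } from 'node:zlib';

const payload = JSON.stringify({ message: 'pong' });

// Body as it would travel when compression is on: gzip bytes, base64 encoded
const compressedBase64 = gzipSync(Buffer.from(payload, 'utf8')).toString(
  'base64'
);

// Reverse both steps when the flag is set, otherwise pass the body through
const decode = (body: string, isBase64Encoded: boolean): string =>
  isBase64Encoded
    ? gunzipSync(Buffer.from(body, 'base64')).toString('utf8')
    : body;

const withCompression = decode(compressedBase64, true);
const withoutCompression = decode(payload, false);

// The two bodies differ on the wire but must decode to the same content
console.log(withCompression === withoutCompression);
```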

@sdangol
Contributor Author

sdangol commented Sep 17, 2025

I think that's expected and the result of the content being compressed or not - I guess the only check would be to make sure it matches the expected value in the two cases.

The responses did match in both cases.

dreamorosi previously approved these changes Sep 17, 2025
svozza previously approved these changes Sep 18, 2025
@pull-request-size pull-request-size bot added size/XL PRs between 500-999 LOC, often PRs that grown with feedback and removed size/L PRs between 100-499 LOC labels Sep 18, 2025

@sdangol sdangol merged commit 320e0dc into main Sep 18, 2025
40 checks passed
@sdangol sdangol deleted the feat/compress-middleware branch September 18, 2025 08:37

Labels

dependencies Changes that touch dependencies, e.g. Dependabot, etc. event-handler This item relates to the Event Handler Utility size/XL PRs between 500-999 LOC, often PRs that grown with feedback tests PRs that add or change tests

Development

Successfully merging this pull request may close these issues.

Feature request: Add compress middleware for Event Handler REST API

3 participants