-
Notifications
You must be signed in to change notification settings - Fork 3.2k
feat: Add CloudWatch logs subscription handling support to awslambdareceiver #44562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Add CloudWatch logs subscription handling support to awslambdareceiver #44562
Conversation
Signed-off-by: Kavindu Dodanduwa <[email protected]>
fced53f to
2ae6e5d
Compare
Signed-off-by: Kavindu Dodanduwa <[email protected]>
2ae6e5d to
e9aa82a
Compare
axw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks fine, but I'd like to revisit the config. Implicitly making it CloudWatch-based because the S3 encoding isn't configured is a bit unintuitive.
| | `s3_encoding` | Name of the encoding extension to use for S3 objects | "awslogs_encoding" | Optional | | ||
| | Name | Description | Required | | ||
| |:--------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------| | ||
| | `s3_encoding` | Name of the encoding extension to use for S3 objects. For example, `awslogs_encoding` to use AWS logs encoding extensions. When unspecified, falls back to CloudWatch subscription filter handling | Optional | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit confusing to me. CloudWatch subscription filter handling isn't relevant when decoding S3 objects, right? So maybe reword to:
Name of the encoding extension to use for S3 objects. This is only used for handling S3 events, and not for CloudWatch Log events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @axw yes this is something I thought about but I upstreamed the existing mechanism as is. However, we now have a chance to improve this usage.
As next step, I will come up with a proposal to improve this
| // If S3Encoding is unspecified, the receiver will return an error for any S3 event notifications. | ||
| // If S3 data is in multiple formats (ex:- VPC flow logs, CloudTrail logs), you should deploy | ||
| // separate Lambda functions with specific extension configurations. | ||
| // | ||
| // If you have objects with multiple different encodings to handle, you should deploy | ||
| // separate Lambda functions with different configurations. | ||
| // If unspecified, the receiver falls back to work with CloudWatch Log subscription encoding extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we went back and forth on this internally, but I'd really like to keep these things independent. Can we maintain the existing doc comment please? It's confusing to talk about CloudWatch logs in the same breath as S3 encoding - they're totally separate events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am exploring the option described here - #44562 (comment) . Let me know your view on this.
| // If S3Encoding is not set, then we consider this to be CloudWatch subscription mode | ||
| if cfg.S3Encoding == "" { | ||
| return nil, errors.New("s3_encoding is required for logs stored in S3") | ||
| unmarshaler, err := loadSubFilterLogUnmarshaler(ctx, awslogsencodingextension.NewFactory()) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
|
|
||
| return newCWLogsSubscriptionHandler(set.Logger, unmarshaler.UnmarshalLogs, next.ConsumeLogs), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have two separate handlers in the receiver - one for CloudWatch, and an optional one for S3? If no s3_encoding is specified, then the latter handler will be nil. Then if an S3 event comes in, an error will be returned.
Thanks @axw for this remark. Personally, I also saw this to be misleading (detecting CW logs operation mode when the S3 encoder is missing felt wrong). I am exploring a new strategy for this but it will be a considerable refactoring. And I think this is the right time to fix this. What I am considering is the following,
I am actively working on this improvement and will update this as a single commit to have focus. But let me know if you think of a different approach. |
How would that encoding extension be used for CloudWatch Logs? They have a specific structure -- they must be decoded in the CloudWatch log subscription filter format. I think it would make sense to use it for decoding the message field, so e.g. you could use awslogs_encoding to decode CloudTrail logs sent to CloudWatch. You could support both S3 and CloudWatch events in the one Lambda, but they would have to have the same encoding, which still adds a little bit of cognitive overhead. Another option would be to have two separate attribute groups, along these lines: receivers/
awslambda:
s3:
encoding: ... # required for handling S3 events
cloudwatch_logs:
message_encoding: ... # optional; defaults to using the message field as the log record bodyI personally think that's easier to reason about, since it keeps the two event types completely separate. Configuring the encoding for one event type has no impact on the other. |
Description
This PR adds CloudWatch logs Lambda subscription handling support to
awslambdareceiver.With this change, the receiver can now be deployed as a Lambda and support handling either of the signals: S3 triggers or CloudWatch subscription logs.
Note - Receiver can only handle one signal type per deployment.
Link to tracking issue
Completes part of #43504
Testing
I have added unit tests to cover most of the new functionality. One of them (
TestHandleCloudwatchLogEvent) provides end to end validation of a CloudWatch event to plogs conversion.Documentation
Existing documentation already covered some aspects. However, I have attempted to uplift them along with code level comments.