Conversation

@astuyve astuyve commented Feb 21, 2025

This removes roughly 30ms of latency for customers with very busy functions by no longer blocking on the platform.runtimeDone event, which Lambda provides to us via the Telemetry API. That call has a minimum 25ms buffer time, which is why the delay is present.

  • Splits the loop into 3 distinct branches (see the sketch after this list):
    • Flushing at the end of an invocation, used for the first few invocations
    • Flushing at the beginning of an invocation, used on the flush cycle for busy functions that have switched to periodic flushing
    • Skipping the flush, used on non-flush cycles
  • We still flush with the race timeout in order to handle long-running functions where customers want telemetry data while the function is running. This fixes an issue where, after flushing once, we would not flush again until the end of a long-running invocation.
  • Includes a fix to read all of the logs flush response
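
A minimal sketch of the branching described above, using illustrative names rather than the actual identifiers in this PR:

```rust
// Hypothetical names; the real loop in this PR differs.
enum FlushDecision {
    /// Flush at the end of the invocation (first few invocations).
    End,
    /// Flush at the beginning of the invocation (flush cycle for busy
    /// functions that have switched to periodic flushing).
    Start,
    /// Skip flushing (non-flush cycle).
    Skip,
}

fn decide_flush(invocations_seen: u64, switch_threshold: u64, interval_elapsed: bool) -> FlushDecision {
    if invocations_seen < switch_threshold {
        // Early invocations: flush at the end of each one.
        FlushDecision::End
    } else if interval_elapsed {
        // Periodic strategy, and the flush interval has passed: flush up front.
        FlushDecision::Start
    } else {
        // Periodic strategy, but not time to flush yet: skip.
        FlushDecision::Skip
    }
}
```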

@astuyve astuyve requested a review from a team as a code owner February 21, 2025 15:12
Ok(resp) => {
    if resp.status() != 202 {
        let status = resp.status();
        _ = resp.text().await;

astuyve (Contributor, Author):

Fixes an issue where keepalive can't be used if we don't fully read/exhaust the response buffer
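
For context, a hedged sketch of the pattern, assuming a reqwest client (this is not the extension's actual code): the response body has to be drained even when only the status matters, otherwise the connection cannot go back to the keep-alive pool.

```rust
use reqwest::{Client, StatusCode};

// Sketch only: drain the body so the underlying connection can be reused.
async fn flush_logs(client: &Client, url: &str, payload: Vec<u8>) -> Result<(), reqwest::Error> {
    let resp = client.post(url).body(payload).send().await?;
    let status = resp.status();
    // Fully read (and discard) the body; skipping this keeps the connection
    // from being returned to the keep-alive pool.
    let _ = resp.text().await;
    if status != StatusCode::ACCEPTED {
        eprintln!("logs flush returned unexpected status: {status}");
    }
    Ok(())
}
```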

should_periodic_flush
);
return should_periodic_flush;
return false;

Contributor (reviewer):

Not sure I understand this


astuyve (Contributor, Author):

Previously this method controlled whether or not to switch over to flushing periodically, as well as determining whether we've already switched to periodic and it's time to flush. That's partially why the main loop was so confusing.

Instead, we've lifted that logic into the main loop, so this method can now be simplified. It only determines whether the user specified that they want to flush at the end, or, if we're on the default strategy, whether we've seen enough invocations to flip over to periodic flushing. If the answer is yes, then we should not flush at the end.
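
A rough sketch of that simplified decision; the strategy variants and parameter names are assumptions, not the real ones in the codebase:

```rust
// Illustrative only; variant and parameter names are assumptions.
enum FlushStrategy {
    /// User explicitly asked to flush at the end of every invocation.
    End,
    /// Default: flush at the end until enough invocations have been seen,
    /// then switch to periodic flushing.
    Default,
    /// User explicitly asked for periodic flushing.
    Periodically,
}

fn should_flush_end(strategy: &FlushStrategy, invocations_seen: u64, switch_threshold: u64) -> bool {
    match strategy {
        FlushStrategy::End => true,
        // On the default strategy, stop flushing at the end once we've seen
        // enough invocations to flip over to periodic flushing.
        FlushStrategy::Default => invocations_seen < switch_threshold,
        FlushStrategy::Periodically => false,
    }
}
```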

);

if let Some(metrics) = metrics {
    return Some(RuntimeDoneMeta {

Contributor (reviewer):

Can't we just send the event back?


astuyve (Contributor, Author):

I kinda like having the runtime done meta because right now that's the only thing we care about, and the data we need is buried in an Option. If we just returned the event, we'd push this matching/conditional handling into the main loop, in an already fairly complex case.

thoughts?


Contributor (reviewer):

I don't like it 100%, but we can change it in the future. My idea was to still send a clone of the event as an optional; that way we ensure we don't have to push the conditional back into the loop.

It just feels unnecessary when an object that has the same information already exists for this.


astuyve (Contributor, Author):

Yeah, it's a little weirder with the enum because we strip a bunch of the event info away and just return what we need. It's another alloc, but it should be stack-allocated and fast anyway. We can change it later.
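
For illustration, a sketch of the trade-off being discussed, with assumed types and fields (not the real telemetry types): the handler strips the platform.runtimeDone event down to the few fields the main loop needs, so the Option matching stays out of the loop.

```rust
// All names here are hypothetical stand-ins for the real telemetry types.
struct RuntimeDoneEvent {
    status: String,
    metrics: Option<RuntimeDoneMetrics>,
}

struct RuntimeDoneMetrics {
    duration_ms: f64,
}

/// The slimmed-down value handed back to the main loop.
struct RuntimeDoneMeta {
    status: String,
    duration_ms: f64,
}

fn extract_meta(event: &RuntimeDoneEvent) -> Option<RuntimeDoneMeta> {
    // Unwrap the nested Option here rather than in the already-busy main loop.
    event.metrics.as_ref().map(|m| RuntimeDoneMeta {
        status: event.status.clone(),
        duration_ms: m.duration_ms,
    })
}
```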

@astuyve astuyve merged commit 3260e2b into main Feb 24, 2025
33 checks passed
@astuyve astuyve deleted the aj/new-loop branch February 24, 2025 17:53