
Conversation

@duncanista (Contributor) commented on Feb 6, 2025

What?

Provides a proposal to get the tags from the incoming top-level span and attach them to the aws.lambda span generated in this project.

Motivation

SVLS-4645

  • Top level span gets custom tags directly set on the span
  • Missing metrics are set on top level span

Also refactored to add a quick constructor with `request_id`.
I don't like this type of state sharing, though.
It's against my will, but this would solve a lot of problems that we'd otherwise have to solve in more obscure ways.
This is so we can pass it to the trace agent.
@duncanista requested a review from a team as a code owner on February 6, 2025 22:52
}
if span.resource == INVOCATION_SPAN_RESOURCE {
    let mut guard = context_buffer.lock().expect("lock poisoned");
    if let Some(request_id) = span.meta.get("request_id") {
Contributor Author (@duncanista) commented:

We assume the top-level span includes the request_id, allowing us to match it to the data in the context buffer.

This is done in Ruby and Go.

.NET would have to send the top-level span first, since that one currently gets ditched and never arrives. It should be a quick fix, but it might need access to the context so we can tag the current span with the request_id.

Not sure how Java handles this.
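For illustration, a minimal sketch of that matching under simplified assumptions; the Span, Context, and ContextBuffer definitions below are illustrative stand-ins, not the crate's real types:

    use std::collections::HashMap;

    const INVOCATION_SPAN_RESOURCE: &str = "aws.lambda";

    // Simplified stand-ins for the real span/context types.
    struct Span {
        resource: String,
        meta: HashMap<String, String>,
    }

    #[derive(Default)]
    struct Context {
        // Tags copied from the tracer's top-level span.
        tags: HashMap<String, String>,
    }

    #[derive(Default)]
    struct ContextBuffer {
        contexts: HashMap<String, Context>,
    }

    impl ContextBuffer {
        // If the incoming span is the top-level invocation span and carries a
        // request_id, merge its tags into the buffered context for that request.
        fn enrich_from_top_level_span(&mut self, span: &Span) {
            if span.resource != INVOCATION_SPAN_RESOURCE {
                return;
            }
            if let Some(request_id) = span.meta.get("request_id") {
                if let Some(context) = self.contexts.get_mut(request_id) {
                    for (key, value) in &span.meta {
                        context.tags.insert(key.clone(), value.clone());
                    }
                }
            }
        }
    }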

@duncanista changed the title from "experimental: get tags from top level span from tracer" to "feat: get tags from top level span from tracer" on Mar 20, 2025
}
}

impl Default for Context {
Contributor commented:

Do we only need this for tests? If so, can we wrap it in a test macro? Defaults can cause subtle bugs.

Contributor Author (@duncanista) replied:

We do use it for the logs processor, where the context always starts out as the default (empty) value.
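A minimal sketch of that usage with illustrative fields (not the real Context definition): the logs processor starts from an empty default before the first invocation, so the impl is needed outside of tests.

    use std::collections::HashMap;

    struct Context {
        tags: HashMap<String, String>,
        runtime_duration_ms: f64,
    }

    // Needed in production: the logs processor starts with an empty context
    // before the first invocation is seen, not only in tests.
    impl Default for Context {
        fn default() -> Self {
            Self {
                tags: HashMap::new(),
                runtime_duration_ms: 0.0,
            }
        }
    }

    // If it were test-only, it could instead be gated:
    // #[cfg(test)]
    // impl Default for Context { ... }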

_ = offsets.process_chan_tx.send(());
}
}
drop(context_buffer);
Contributor commented:

Are we sure we need the manual drop here? Doesn't the closure on 307 help us by 581? Is the drop necessary? If we add new code after, we have to re-lock, right?

Contributor Author (@duncanista) replied:

Yeah, since we only await at the very end, I decided to lock in the last place we modify the context_buffer. In reality, I'm dropping it to be 100% sure that nothing strange happens. I'll add another comment to note that we drop because we await later, on 332.

Contributor commented:

A more idiomatic approach is to create

        let context_for_request = {
            let mut context_buffer = self.context_buffer.lock().expect("lock poisoned");
            context_buffer.add_runtime_duration(request_id, metrics.duration_ms);
            context_buffer.get(request_id)
        };

at line 278 and

        if let Some(context) = context_for_request {

at line 311.

This clearly drops the lock once it's no longer needed.
We have a number of useless drop() calls here and there in the code.

But, as in the previous comment, I would avoid holding the lock altogether.

Contributor Author (@duncanista) replied:

Instead, since I don't need to assign it to anything, I think I can refactor this by moving the first call down near where we actually use the get(...) method, and then use a scope to drop the lock directly 🤔
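A rough sketch of that scoping, with hypothetical names (on_runtime_done and some_async_send stand in for the real functions): the guard drops at the end of the block, before anything is awaited, with no manual drop().

    use std::collections::HashMap;
    use std::sync::{Arc, Mutex};

    #[derive(Default)]
    struct ContextBuffer {
        durations_ms: HashMap<String, f64>,
    }

    impl ContextBuffer {
        fn add_runtime_duration(&mut self, request_id: &str, duration_ms: f64) {
            self.durations_ms.insert(request_id.to_string(), duration_ms);
        }
    }

    async fn some_async_send() {}

    async fn on_runtime_done(
        context_buffer: Arc<Mutex<ContextBuffer>>,
        request_id: &str,
        duration_ms: f64,
    ) {
        // The guard only lives inside this block, so the lock is released when
        // the block ends -- before any .await below.
        {
            let mut buffer = context_buffer.lock().expect("lock poisoned");
            buffer.add_runtime_duration(request_id, duration_ms);
        }

        // Safe to await here: the MutexGuard is already out of scope.
        some_async_send().await;
    }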

Ok(())
}

#[allow(clippy::too_many_arguments)]
Contributor commented:

lol maybe it's builder time

Contributor Author (@duncanista) replied:

For real — when we move the crate to the new repo, it might be a good exercise to re-implement this as a builder and see how much it simplifies things.
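For reference, a rough builder sketch; the fields are made up for illustration and are not the real constructor arguments behind the #[allow(clippy::too_many_arguments)]:

    #[derive(Default)]
    struct ProcessorBuilder {
        function_arn: Option<String>,
        runtime: Option<String>,
        memory_size_mb: Option<u32>,
    }

    struct Processor {
        function_arn: String,
        runtime: String,
        memory_size_mb: u32,
    }

    impl ProcessorBuilder {
        fn function_arn(mut self, arn: impl Into<String>) -> Self {
            self.function_arn = Some(arn.into());
            self
        }

        fn runtime(mut self, runtime: impl Into<String>) -> Self {
            self.runtime = Some(runtime.into());
            self
        }

        fn memory_size_mb(mut self, mb: u32) -> Self {
            self.memory_size_mb = Some(mb);
            self
        }

        // Each setter is chainable, so the call site reads like named arguments:
        // ProcessorBuilder::default().function_arn("arn:...").runtime("rust").build()
        fn build(self) -> Processor {
            Processor {
                function_arn: self.function_arn.unwrap_or_default(),
                runtime: self.runtime.unwrap_or_default(),
                memory_size_mb: self.memory_size_mb.unwrap_or(128),
            }
        }
    }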

Also added a comment on why we drop the lock.

@alexgallotta (Contributor) left a review comment:

Instead of passing the context buffer around under a lock all over the place,
if only the request_id is needed, can't it just be cloned as a String and passed?

If anything else is needed as well, the trace_agent already has the ServerlessTraceProcessor, which has the context buffer too.
It seems a bit convoluted to pass it around everywhere like that.
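Something like this rough sketch (handle_span and send_to_trace_agent are hypothetical names, not functions in this crate): clone the request_id once, and only the owned String travels onward.

    use std::collections::HashMap;
    use std::sync::{Arc, Mutex};

    #[derive(Default)]
    struct ContextBuffer {
        tags_by_request: HashMap<String, HashMap<String, String>>,
    }

    fn send_to_trace_agent(_request_id: String) {}

    fn handle_span(context_buffer: &Arc<Mutex<ContextBuffer>>, request_id_from_span: &str) {
        // Clone the request_id into an owned String up front.
        let request_id = request_id_from_span.to_string();

        // Lock only at the point of use, scoped so the guard drops immediately.
        {
            let mut buffer = context_buffer.lock().expect("lock poisoned");
            buffer.tags_by_request.entry(request_id.clone()).or_default();
        }

        // Downstream code gets the owned String; no lock or buffer reference
        // needs to be passed around.
        send_to_trace_agent(request_id);
    }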

pub struct Processor {
    // Buffer containing context of the previous 5 invocations
-   pub context_buffer: ContextBuffer,
+   pub context_buffer: Arc<Mutex<ContextBuffer>>,
Contributor commented:

Why is this public?
I tried removing the pub and running the tests, and they still pass.

Considering that the Processor is already guarded by an Arc<Mutex>, I'm wondering why we're wrapping this in another mutex and exposing it outside.

Contributor Author (@duncanista) replied:

I can remove the pub, and I'd like to remove the lock altogether, but I can't find a way to get the desired outcome without it.

The lifecycle processor is just another component with access to the context buffer, and the trace processor should have access to it too. They both need access because, if the tracer happens to send a top-level span, we need to make sure we add it to the current context for the given request_id; that way, whenever there's a platform runtime done event, we can attach the tracer's top-level span information onto the span we create. WDYT?
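A small sketch of that sharing with simplified structs (the real processors hold more state than this): both sides clone the same Arc, so tags added by the trace processor are visible when the lifecycle processor handles the platform runtime done event.

    use std::collections::HashMap;
    use std::sync::{Arc, Mutex};

    #[derive(Default)]
    struct ContextBuffer {
        tags_by_request: HashMap<String, HashMap<String, String>>,
    }

    struct LifecycleProcessor {
        context_buffer: Arc<Mutex<ContextBuffer>>,
    }

    struct TraceProcessor {
        context_buffer: Arc<Mutex<ContextBuffer>>,
    }

    fn main() {
        let shared = Arc::new(Mutex::new(ContextBuffer::default()));

        // Both processors hold a clone of the same Arc; a top-level span seen by
        // the trace processor enriches the context the lifecycle processor reads
        // when the platform runtime done event arrives.
        let _lifecycle = LifecycleProcessor { context_buffer: Arc::clone(&shared) };
        let _traces = TraceProcessor { context_buffer: Arc::clone(&shared) };
    }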

@duncanista merged commit 7724ddb into main on Mar 27, 2025
27 of 35 checks passed
@duncanista deleted the jordan.gonzalez/enrich-tags-from-tracer-top-level-span branch on March 27, 2025 19:08