-
Notifications
You must be signed in to change notification settings - Fork 273
feat: Add spans for app start, time to first draw, time on screen #2831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
model/app/spans.yaml
Outdated
requirement_level: opt_in | ||
- ref: app.screen.main_thread_busy_time | ||
requirement_level: opt_in | ||
- id: span.app.screen.visible.client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be an event or even a metric to avoid long running span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will convert to an event
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually reopening this as I don't think it should be an event given it has a duration.
My understanding with long running spans is that the concern comes from the risk of data loss since it can result in an all or none scenario, assuming the span is being used as the root span for many child spans. For example, the session span. Having a long running session span is risky assuming all the spans within the session with have this as its parent span. If this span isn't sent for some reason, all the child spans are lost.
For this span, however, the idea is to create the span only once at the end of the screen's visibility, like how we would if we were to create a log. We can store the timestamps locally and avoid creating a long span by doing span.setStartTime(prev).end(now) once the screen is no longer visible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this contribution, @limhjgrace. I believe it's nice to have a platform-agnostic way of describing screen-related events, as you've proposed here. I've added some comments with some concerns, but generally speaking, I think we should have something like this in the spec.
One note I'd like to add is that, by nature, it seems like some of the events defined here have some sort of relation or overlap (for example the app.start.client
one with one of the app.screen.(first_time|load)
ones) which I think it would be nice to clarify on their descriptions, and also to name the events (if possible) in a way that these connections/roles can become a little more intuitive. For example, if I understand correctly, the difference between app.screen.first_time
and app.screen.load
is that the former measures the time a screen takes to render the UI, whereas the latter also takes into account the time a user might have to wait until they can interact with the screen, even though the UI is rendered (maybe due to some ongoing network request that's needed to add data into the screen). So in that sense, the former might be implicit in the latter, so maybe the names might be something like screen.pre_load
and screen.load
. Or, maybe making them a bit more specific could probably help too, with something like screen.ui_rendered
and screen.ready
. Names aren't my strong suit, so I wouldn't take them literally, but my point is that the current ones don't seem to intuitively describe these connections, and I think it would be great to do so.
model/app/spans.yaml
Outdated
@@ -0,0 +1,97 @@ | |||
groups: | |||
- id: span.app.screen.first_appear.client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me that this could better suit an event, rather than a span, considering that it's mostly about gathering the duration
attribute and that it doesn't seem to matter much when the process started and ended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think there are use cases where process start and end is still important. For example, if I wanted to create a visualization of all the telemetry in chronological order I would need these start and end times.
Maybe this can be achieved by using the duration and the observedTimestamp but with that aside, and maybe I've oversimplified the definition, I thought we should be treating telemetry with a duration as a span and logs to be things that happened in an instance in time? Unless it's a long-running span that has a high risk of not gracefully being ended. Or is the correct definition that we should consider a span to be telemetry with start/end times of significance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fact that we gather a duration
attribute makes it more suitable for an event rather than a span. To me, a span is useful to track the start/end of a process, from which the duration can be computed. Also, to be able to keep track of children spans in a way that they can be displayed together in a waterfall UI. Unless I missed something, the use case we're discussing here doesn't need those features, though please let me know if that's not the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @LikeTheSalad description that spans are useful for tracking a process and seeing all the child processes triggered as a result. Wheras events are useful for knowing x happened at time Y.
I do however see the use of having one span to describe the navigation process which can act as the root but none currently do that for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, a span is useful to track the start/end of a process, from which the duration can be computed.
+1. But for the duration
calculation, I wanted to do this client side instead of server-side or query-time as this adds scalability, performance, and technical concerns. Is there a concern in adding a duration
attribute to spans even though it can be derived from the start/endtime?
Also, to be able to keep track of children spans in a way that they can be displayed together in a waterfall UI. Unless I missed something, the use case we're discussing here doesn't need those features, though please let me know if that's not the case.
Yes, this is a valid use case for services who want to provide a way to visualize an entire session. There was an on going discussion on how we could provide this view by having the specs support a session span but given the concerns of a long running span, the community didn't align on this.
spans are useful for tracking a process and seeing all the child processes triggered as a result
+1. To me, a screen load and time to first draw are both processes and is analogous to a web page load, not vitals to a screen. I know below that they've have been compared to the web vitals but I don't think that is an apples to apples comparison, especially when not all the web vitals have a timing related component to it e.g. CLS. It was also mentioned that we could have a span for the screen but I'm thinking this will have similar long running concerns as a session span and won't be viable. That and I'm assuming the suggestion was to add these as span events? But I thought those are being deprecated?
Maybe this is a question of what is a span and what is a log event given there is some overlap between the two where a log event should be used in place of creating long running spans. Technically, most spans could also just be defined as a log event and the discussion feels like it's heading towards that direction. If so, are we defining spans to be used only for tracking a process for its child processes? Or is the suggestion to combine both of these spans into one and have the timing defined as attributes (e.g. attributes.time_to_first_draw
and attributes.time_to_first_stable_draw
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, a screen load and time to first draw are both processes and is analogous to a web page load, not vitals to a screen.
The way I look at is the navigation is the process that was triggered and items like first draw are events of the navigation.
It was also mentioned that we could have a span for the screen but I'm thinking this will have similar long running concerns as a session span and won't be viable.
Long running shouldn't be an issue as the span would only represent the navigation process.
I'm assuming the suggestion was to add these as span events? But I thought those are being deprecated?
Yes and no, the span events api is being deprecated however events will still be able to be able to be logged via the logger api. These events can also still be associated with the span/trace.
is the suggestion to combine both of these spans into one and have the timing defined as attributes
That is one option but additional attributes would be needed on that span. the focus has been on thinking about it each stage as an event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the details you've shared, @limhjgrace, and for your patience. I've had issues in the past around deprecations of attributes, and renaming of events, and it's no fun, because even though they are deprecated in the spec, if a vendor has delivered them in their tools, chances are that they must continue supporting those for a while after the spec is deprecated. Which is why I think it makes sense to try and be sure about how much value a new spec provides upfront, which can be tricky, of course, because we can't predict the future, but I've found that we can follow a couple of guidelines that help reduce the need to add breaking changes later:
- Starting small. That is, with the least amount of attributes/enums and events as possible, because it's always possible and easier to add stuff later rather than having to remove stuff.
- (if possible) Having a real-world use case that needs these signals. It's probably the most helpful one, because it takes the burden off of ourselves of having to "guess" what real users need and how we can present the data in a way that's helpful to them.
- (if possible) Having a working implementation that has been used for a while. This one might sound like doing things backwards, though it often helps with things such as knowing what's even possible to implement and how, such as what I mentioned earlier about not being practical to define a span for an app launch, as the launch also covers the OTel client initialization, so starting a span before the client is ready is not possible, leaving implementations to resort to probably hacky-ish solutions, which I'm not completely against doing, but I'd like to reduce them as much as possible.
Having said that, I think we're making good progress here, though I figured that the communication might be easier if I explained the background of my decision-making process around these issues, which I'm open to discussing too, in case some of the things I mentioned don't make sense. So, regarding your comments:
But for the duration calculation, I wanted to do this client side instead of server-side or query-time as this adds scalability, performance, and technical concerns. Is there a concern in adding a duration attribute to spans even though it can be derived from the start/endtime?
I'm not against adding a duration
attribute to events, for example, as there's no other way to get that info out of those. With spans, however, it can be calculated as I mentioned earlier, which makes it a bit redundant, and so I think it's not compatible with the "starting small" (point 1) mentioned either. It seems like the reason to add it is about performance, though if that's the reason alone, it sounds to me that we might be falling into a premature optimization situation, unless we have a use-case that both requires this to be a span and also to have its duration precalculated. Otherwise, what prevents us from asking the same for every other span as well? I.e. things like "if this span has a precalculated duration attribute, shouldn't every client span have it too to increase performance everywhere?".
Yes, this is a valid use case for services who want to provide a way to visualize an entire session. There was an on going discussion on how we could provide this view by having the specs support a session span but given the concerns of a long running span, the community didn't align on this.
I think this is a key point, because it sheds some light on what could be a potential real-world use case for these signals. If the idea is to provide a session-wide visualization, I love it and I'm all up for being able to provide something like that, though given that the "standard" way of visualizing data covers spans only, it leaves the idea of using events as not too useful for these cases, as they won't typically show up alongside spans, which I think can make the idea of "making everything a span" interesting, as it will ensure visibility for these events in traditional UI solutions. However, as you mentioned, it isn't easy, or even possible in some cases, to successfully create long-running spans on clients, which is why the session-related talks in the client sig focus on events instead. I think ideally we should have UI solutions that show both spans and events in a single timeline, but for the time being, I guess the only way to fulfill a "session view" use case is by using spans, and I'm not even sure if sessions will stay as events or will evolve to something else in the future (@bidetofevil might have some insights on this), so I'd say that this would be a point in favor of choosing a span for this use-case.
To me, a screen load and time to first draw are both processes and is analogous to a web page load, not vitals to a screen
I agree with what @thompson-tomo replied:
The way I look at is the navigation is the process that was triggered and items like first draw are events of the navigation.
Now, it boils down to what's needed, because in the end, both points of view can be valid depending on what's the problem we want to solve. For example, if we want to know not only how long does it take for a screen to load but also what subprocesses happen within that span of time, then a span makes sense. But if we just want to know what screens are slow to load, then an event should do. Just by the name, it seems like it might make sense to make "screen load" a span and "time to first draw" an event with duration
to make queries easier for the duration part alone. So I don't think it has to be either/or, it may be a combination of things, but we can have an endless discussion about it unless we know what we want to solve with these, so if you have a real-world use case that you need to address I think it would help us better understand what's the minimum amount of signals/attributes needed to start small but to solve at least a basic issue first.
It was also mentioned that we could have a span for the screen but I'm thinking this will have similar long running concerns as a session span and won't be viable.
I'd say is less tricky than having a span per session, but it still can be challenging in case the screen is abruptly closed, causing the span not to end properly and therefore losing the data. Still, despite the challenges, if we wanted to capture not only a screen duration but also all the subprocesses that happen within, I can't think of a better solution than using a span. There will be details to polish, such as apps that can have a single screen open for a long time, such as for kiosks, and how much data can UIs show in a single waterfall view, that I think should be defined. Now, I'm not sure we're conflating different issues here, because my understanding so far of this PR is that it's trying to add signals to track times only (time for app start, for screen load, and for time on screen), which, at first glance, sounds like events should do.
That and I'm assuming the suggestion was to add these as span events? But I thought those are being deprecated?
I haven't kept track of this in detail, but my understanding is that the concept of an event being attached to a span as the only way to create events is what's changing, and now it should be possible to have standalone events, as well as events that are somehow attached to spans, and my guess is that it'd make the span-specific APIs to create events become redundant, maybe causing those to probably get deprecated? I don't have too many details on it though, so don't take my word for it.
Maybe this is a question of what is a span and what is a log event given there is some overlap between the two where a log event should be used in place of creating long running spans.
In my experience, these questions are easier to answer when there's a real-world use case to address, because it makes the end goal clear. For example, if the use case is to just track the duration of these events to query them later to do some calculation, such as the average load time, or things like that, to me for this case the answer is easily an event.
Technically, most spans could also just be defined as a log event and the discussion feels like it's heading towards that direction.
If we're creating a pair of events, one that tracks the start of something, and another that tracks the end of that something, I think we'd be overlapping with spans quite a lot (though leaving outside key span features such as trace context propagation). What I've proposed a couple of times is to have only an event that carries the whole duration of a process, because based on the descriptions, it seems like that's the question we're trying to answer.
If so, are we defining spans to be used only for tracking a process for its child processes? Or is the suggestion to combine both of these spans into one and have the timing defined as attributes (e.g. attributes.time_to_first_draw and attributes.time_to_first_stable_draw)?
What's the use case you're trying to address? Is it just having some sort of metrics that can be aggregated, or is it displaying a hierarchy of processes? And if the answer to that is "for now some metrics, but in the future we might need to display some of these processes as traces", then my questions would be: Is it needed to combine these use cases in single span/event? Also, is it needed to have everything defined upfront in this PR, or could further conventions be added in the future as needed?
model/app/spans.yaml
Outdated
**Span kind** MUST be `CLIENT`. | ||
This span captures the time from user initiation (e.g., tapping the app icon or opening a link) | ||
to the moment the app is ready for interaction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this description somewhat links this event to the "screen load" or "screen first time", in the sense that the duration is gathered up to the point where the "root screen" is either finished rendering, or finished fully loading. Though it's not clear which one of those cases it refers to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the intention was not to link it to either as technically an application can be considered launched/started before the first screen has been loaded. My understanding is that for the Android SDK, the end of an app start is defined to be when the "Activity Resumed" life cycle event occurs for the first activity. Similarly, Apple provides didBecomeActiveNotification
which could be used but doesn't necessarily mean a screen has loaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically an application can be considered launched/started before the first screen has been loaded
The description reads:
"to the moment the app is ready for interaction"
How can it be possible that the app is ready for interaction without the first screen being loaded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, I'll update to remove this specification and make it more broad/applicable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, to completely decouple the app start with the screen load, I've updated the definition so that app start completes the moment before UI rendering begins. This works for both Android (AppStart span ends after the first Activity finishes creating which is before UI rendering) and iOS according to their app launch sequence doc.
582ad9d
to
a0c435c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking due to the time measurement units being used & attributes used on the event.
I still agree with @LikeTheSalad that most if not all the spans would be better suited as events. An argument could be made to have 1 span definition to represent a user requesting a screen as a root span but 3 seems over the top. In terms of capturing vitals of that span, I think referring to prior art (https://opentelemetry.io/docs/specs/semconv/browser/events/#webvital-event) is helpful to ensure a consistent experience, perhaps we call the event app.screen_vital.
Fixes #2830
Changes
This PR proposes the following spans definitions for apps:
span.app.start.internal
: time from user initiation (e.g., tapping the app icon or opening a link)to the moment before UI rendering begins
span.app.screen.time_to_first_draw.internal
: time to the first draw of an application screenspan.app.time_on_screen.internal
: time on application screenRevisions
1. App start span with only
cold
andwarm
values defined2. Time to first appear client span --> time to first draw internal span
4. Screen load client span --> screen load internal span. Definition also changed: Time to main thread idle --> time to first stable frame
5. Screen visible client span --> screen visible event
1. App start client span --> app start internal span. Clarified end time to be the moment before UI rendering begins.
2. Removed screen load span. Will raise separate PR to get other proposals unblocked.
3. Unit fix (ns -> s)
4. Type fix (int -> double)
6. Rename
app.screen.first_draw
-->app.screen.time_to_first_draw
7.
app.visible
event ->app.time_on_screen
internal spanMerge requirement checklist
[chore]