feat: Add spans for app start, time to first draw, time on screen #2831

limhjgrace · 2025-09-30T03:49:05Z

Fixes #2830

Changes

This PR proposes the following spans definitions for apps:

span.app.start.internal: time from user initiation (e.g., tapping the app icon or opening a link)
to the moment before UI rendering begins
span.app.screen.time_to_first_draw.internal: time to the first draw of an application screen
span.app.time_on_screen.internal: time on application screen

Revisions

Rev2: Rev1-Rev2 diff
1. App start span with only cold and warm values defined
2. Time to first appear client span --> time to first draw internal span
4. Screen load client span --> screen load internal span. Definition also changed: Time to main thread idle --> time to first stable frame
5. Screen visible client span --> screen visible event
Rev3: Rev2-Rev3 diff
1. App start client span --> app start internal span. Clarified end time to be the moment before UI rendering begins.
2. Removed screen load span. Will raise separate PR to get other proposals unblocked.
3. Unit fix (ns -> s)
4. Type fix (int -> double)
6. Rename app.screen.first_draw --> app.screen.time_to_first_draw
7. app.visible event -> app.time_on_screen internal span

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
Links to the prototypes or existing instrumentations (when adding or changing conventions)

model/app/registry.yaml

thompson-tomo · 2025-09-30T08:02:43Z

model/app/spans.yaml

+        requirement_level: opt_in
+      - ref: app.screen.main_thread_busy_time
+        requirement_level: opt_in
+  - id: span.app.screen.visible.client


Shouldn't this be an event or even a metric to avoid long running span.

Will convert to an event

Actually reopening this as I don't think it should be an event given it has a duration.

My understanding with long running spans is that the concern comes from the risk of data loss since it can result in an all or none scenario, assuming the span is being used as the root span for many child spans. For example, the session span. Having a long running session span is risky assuming all the spans within the session with have this as its parent span. If this span isn't sent for some reason, all the child spans are lost.

For this span, however, the idea is to create the span only once at the end of the screen's visibility, like how we would if we were to create a log. We can store the timestamps locally and avoid creating a long span by doing span.setStartTime(prev).end(now) once the screen is no longer visible.

model/app/spans.yaml

LikeTheSalad

Thank you for this contribution, @limhjgrace. I believe it's nice to have a platform-agnostic way of describing screen-related events, as you've proposed here. I've added some comments with some concerns, but generally speaking, I think we should have something like this in the spec.

One note I'd like to add is that, by nature, it seems like some of the events defined here have some sort of relation or overlap (for example the app.start.client one with one of the app.screen.(first_time|load) ones) which I think it would be nice to clarify on their descriptions, and also to name the events (if possible) in a way that these connections/roles can become a little more intuitive. For example, if I understand correctly, the difference between app.screen.first_time and app.screen.load is that the former measures the time a screen takes to render the UI, whereas the latter also takes into account the time a user might have to wait until they can interact with the screen, even though the UI is rendered (maybe due to some ongoing network request that's needed to add data into the screen). So in that sense, the former might be implicit in the latter, so maybe the names might be something like screen.pre_load and screen.load. Or, maybe making them a bit more specific could probably help too, with something like screen.ui_rendered and screen.ready. Names aren't my strong suit, so I wouldn't take them literally, but my point is that the current ones don't seem to intuitively describe these connections, and I think it would be great to do so.

model/app/spans.yaml

LikeTheSalad · 2025-10-01T09:07:03Z

model/app/spans.yaml

@@ -0,0 +1,97 @@
+groups:
+  - id: span.app.screen.first_appear.client


It looks to me that this could better suit an event, rather than a span, considering that it's mostly about gathering the duration attribute and that it doesn't seem to matter much when the process started and ended.

No, I think there are use cases where process start and end is still important. For example, if I wanted to create a visualization of all the telemetry in chronological order I would need these start and end times.

Maybe this can be achieved by using the duration and the observedTimestamp but with that aside, and maybe I've oversimplified the definition, I thought we should be treating telemetry with a duration as a span and logs to be things that happened in an instance in time? Unless it's a long-running span that has a high risk of not gracefully being ended. Or is the correct definition that we should consider a span to be telemetry with start/end times of significance?

I think the fact that we gather a duration attribute makes it more suitable for an event rather than a span. To me, a span is useful to track the start/end of a process, from which the duration can be computed. Also, to be able to keep track of children spans in a way that they can be displayed together in a waterfall UI. Unless I missed something, the use case we're discussing here doesn't need those features, though please let me know if that's not the case.

Agree with @LikeTheSalad description that spans are useful for tracking a process and seeing all the child processes triggered as a result. Wheras events are useful for knowing x happened at time Y.

I do however see the use of having one span to describe the navigation process which can act as the root but none currently do that for me.

To me, a span is useful to track the start/end of a process, from which the duration can be computed.

+1. But for the duration calculation, I wanted to do this client side instead of server-side or query-time as this adds scalability, performance, and technical concerns. Is there a concern in adding a duration attribute to spans even though it can be derived from the start/endtime?

Also, to be able to keep track of children spans in a way that they can be displayed together in a waterfall UI. Unless I missed something, the use case we're discussing here doesn't need those features, though please let me know if that's not the case.

Yes, this is a valid use case for services who want to provide a way to visualize an entire session. There was an on going discussion on how we could provide this view by having the specs support a session span but given the concerns of a long running span, the community didn't align on this.

spans are useful for tracking a process and seeing all the child processes triggered as a result

+1. To me, a screen load and time to first draw are both processes and is analogous to a web page load, not vitals to a screen. I know below that they've have been compared to the web vitals but I don't think that is an apples to apples comparison, especially when not all the web vitals have a timing related component to it e.g. CLS. It was also mentioned that we could have a span for the screen but I'm thinking this will have similar long running concerns as a session span and won't be viable. That and I'm assuming the suggestion was to add these as span events? But I thought those are being deprecated?

Maybe this is a question of what is a span and what is a log event given there is some overlap between the two where a log event should be used in place of creating long running spans. Technically, most spans could also just be defined as a log event and the discussion feels like it's heading towards that direction. If so, are we defining spans to be used only for tracking a process for its child processes? Or is the suggestion to combine both of these spans into one and have the timing defined as attributes (e.g. attributes.time_to_first_draw and attributes.time_to_first_stable_draw)?

To me, a screen load and time to first draw are both processes and is analogous to a web page load, not vitals to a screen.

The way I look at is the navigation is the process that was triggered and items like first draw are events of the navigation.

It was also mentioned that we could have a span for the screen but I'm thinking this will have similar long running concerns as a session span and won't be viable.

Long running shouldn't be an issue as the span would only represent the navigation process.

I'm assuming the suggestion was to add these as span events? But I thought those are being deprecated?

Yes and no, the span events api is being deprecated however events will still be able to be able to be logged via the logger api. These events can also still be associated with the span/trace.

is the suggestion to combine both of these spans into one and have the timing defined as attributes

That is one option but additional attributes would be needed on that span. the focus has been on thinking about it each stage as an event.

Thank you for all the details you've shared, @limhjgrace, and for your patience. I've had issues in the past around deprecations of attributes, and renaming of events, and it's no fun, because even though they are deprecated in the spec, if a vendor has delivered them in their tools, chances are that they must continue supporting those for a while after the spec is deprecated. Which is why I think it makes sense to try and be sure about how much value a new spec provides upfront, which can be tricky, of course, because we can't predict the future, but I've found that we can follow a couple of guidelines that help reduce the need to add breaking changes later:

Starting small. That is, with the least amount of attributes/enums and events as possible, because it's always possible and easier to add stuff later rather than having to remove stuff.

(if possible) Having a real-world use case that needs these signals. It's probably the most helpful one, because it takes the burden off of ourselves of having to "guess" what real users need and how we can present the data in a way that's helpful to them.

(if possible) Having a working implementation that has been used for a while. This one might sound like doing things backwards, though it often helps with things such as knowing what's even possible to implement and how, such as what I mentioned earlier about not being practical to define a span for an app launch, as the launch also covers the OTel client initialization, so starting a span before the client is ready is not possible, leaving implementations to resort to probably hacky-ish solutions, which I'm not completely against doing, but I'd like to reduce them as much as possible.

Having said that, I think we're making good progress here, though I figured that the communication might be easier if I explained the background of my decision-making process around these issues, which I'm open to discussing too, in case some of the things I mentioned don't make sense. So, regarding your comments:

But for the duration calculation, I wanted to do this client side instead of server-side or query-time as this adds scalability, performance, and technical concerns. Is there a concern in adding a duration attribute to spans even though it can be derived from the start/endtime?

I'm not against adding a duration attribute to events, for example, as there's no other way to get that info out of those. With spans, however, it can be calculated as I mentioned earlier, which makes it a bit redundant, and so I think it's not compatible with the "starting small" (point 1) mentioned either. It seems like the reason to add it is about performance, though if that's the reason alone, it sounds to me that we might be falling into a premature optimization situation, unless we have a use-case that both requires this to be a span and also to have its duration precalculated. Otherwise, what prevents us from asking the same for every other span as well? I.e. things like "if this span has a precalculated duration attribute, shouldn't every client span have it too to increase performance everywhere?".

Yes, this is a valid use case for services who want to provide a way to visualize an entire session. There was an on going discussion on how we could provide this view by having the specs support a session span but given the concerns of a long running span, the community didn't align on this.

I think this is a key point, because it sheds some light on what could be a potential real-world use case for these signals. If the idea is to provide a session-wide visualization, I love it and I'm all up for being able to provide something like that, though given that the "standard" way of visualizing data covers spans only, it leaves the idea of using events as not too useful for these cases, as they won't typically show up alongside spans, which I think can make the idea of "making everything a span" interesting, as it will ensure visibility for these events in traditional UI solutions. However, as you mentioned, it isn't easy, or even possible in some cases, to successfully create long-running spans on clients, which is why the session-related talks in the client sig focus on events instead. I think ideally we should have UI solutions that show both spans and events in a single timeline, but for the time being, I guess the only way to fulfill a "session view" use case is by using spans, and I'm not even sure if sessions will stay as events or will evolve to something else in the future (@bidetofevil might have some insights on this), so I'd say that this would be a point in favor of choosing a span for this use-case.

To me, a screen load and time to first draw are both processes and is analogous to a web page load, not vitals to a screen

I agree with what @thompson-tomo replied:

The way I look at is the navigation is the process that was triggered and items like first draw are events of the navigation.

Now, it boils down to what's needed, because in the end, both points of view can be valid depending on what's the problem we want to solve. For example, if we want to know not only how long does it take for a screen to load but also what subprocesses happen within that span of time, then a span makes sense. But if we just want to know what screens are slow to load, then an event should do. Just by the name, it seems like it might make sense to make "screen load" a span and "time to first draw" an event with duration to make queries easier for the duration part alone. So I don't think it has to be either/or, it may be a combination of things, but we can have an endless discussion about it unless we know what we want to solve with these, so if you have a real-world use case that you need to address I think it would help us better understand what's the minimum amount of signals/attributes needed to start small but to solve at least a basic issue first.

It was also mentioned that we could have a span for the screen but I'm thinking this will have similar long running concerns as a session span and won't be viable.

I'd say is less tricky than having a span per session, but it still can be challenging in case the screen is abruptly closed, causing the span not to end properly and therefore losing the data. Still, despite the challenges, if we wanted to capture not only a screen duration but also all the subprocesses that happen within, I can't think of a better solution than using a span. There will be details to polish, such as apps that can have a single screen open for a long time, such as for kiosks, and how much data can UIs show in a single waterfall view, that I think should be defined. Now, I'm not sure we're conflating different issues here, because my understanding so far of this PR is that it's trying to add signals to track times only (time for app start, for screen load, and for time on screen), which, at first glance, sounds like events should do.

That and I'm assuming the suggestion was to add these as span events? But I thought those are being deprecated?

I haven't kept track of this in detail, but my understanding is that the concept of an event being attached to a span as the only way to create events is what's changing, and now it should be possible to have standalone events, as well as events that are somehow attached to spans, and my guess is that it'd make the span-specific APIs to create events become redundant, maybe causing those to probably get deprecated? I don't have too many details on it though, so don't take my word for it.

Maybe this is a question of what is a span and what is a log event given there is some overlap between the two where a log event should be used in place of creating long running spans.

In my experience, these questions are easier to answer when there's a real-world use case to address, because it makes the end goal clear. For example, if the use case is to just track the duration of these events to query them later to do some calculation, such as the average load time, or things like that, to me for this case the answer is easily an event.

Technically, most spans could also just be defined as a log event and the discussion feels like it's heading towards that direction.

If we're creating a pair of events, one that tracks the start of something, and another that tracks the end of that something, I think we'd be overlapping with spans quite a lot (though leaving outside key span features such as trace context propagation). What I've proposed a couple of times is to have only an event that carries the whole duration of a process, because based on the descriptions, it seems like that's the question we're trying to answer.

If so, are we defining spans to be used only for tracking a process for its child processes? Or is the suggestion to combine both of these spans into one and have the timing defined as attributes (e.g. attributes.time_to_first_draw and attributes.time_to_first_stable_draw)?

What's the use case you're trying to address? Is it just having some sort of metrics that can be aggregated, or is it displaying a hierarchy of processes? And if the answer to that is "for now some metrics, but in the future we might need to display some of these processes as traces", then my questions would be: Is it needed to combine these use cases in single span/event? Also, is it needed to have everything defined upfront in this PR, or could further conventions be added in the future as needed?

model/app/spans.yaml

LikeTheSalad · 2025-10-01T09:32:08Z

model/app/spans.yaml

+      **Span kind** MUST be `CLIENT`.
+
+      This span captures the time from user initiation (e.g., tapping the app icon or opening a link)
+      to the moment the app is ready for interaction.


I think this description somewhat links this event to the "screen load" or "screen first time", in the sense that the duration is gathered up to the point where the "root screen" is either finished rendering, or finished fully loading. Though it's not clear which one of those cases it refers to.

No, the intention was not to link it to either as technically an application can be considered launched/started before the first screen has been loaded. My understanding is that for the Android SDK, the end of an app start is defined to be when the "Activity Resumed" life cycle event occurs for the first activity. Similarly, Apple provides didBecomeActiveNotification which could be used but doesn't necessarily mean a screen has loaded.

technically an application can be considered launched/started before the first screen has been loaded

The description reads:

"to the moment the app is ready for interaction"

How can it be possible that the app is ready for interaction without the first screen being loaded?

That's fair, I'll update to remove this specification and make it more broad/applicable.

Actually, to completely decouple the app start with the screen load, I've updated the definition so that app start completes the moment before UI rendering begins. This works for both Android (AppStart span ends after the first Activity finishes creating which is before UI rendering) and iOS according to their app launch sequence doc.

model/app/spans.yaml

model/app/registry.yaml

thompson-tomo

Blocking due to the time measurement units being used & attributes used on the event.

I still agree with @LikeTheSalad that most if not all the spans would be better suited as events. An argument could be made to have 1 span definition to represent a user requesting a screen as a root span but 3 seems over the top. In terms of capturing vitals of that span, I think referring to prior art (https://opentelemetry.io/docs/specs/semconv/browser/events/#webvital-event) is helpful to ensure a consistent experience, perhaps we call the event app.screen_vital.

model/app/registry.yaml

model/app/events.yaml

model/app/spans.yaml

…onv units

limhjgrace requested review from a team as code owners September 30, 2025 03:49

github-project-automation bot added this to Semantic Conventions Triage Sep 30, 2025

github-project-automation bot moved this to Untriaged in Semantic Conventions Triage Sep 30, 2025

github-actions bot added enhancement New feature or request area:app labels Sep 30, 2025

thompson-tomo reviewed Sep 30, 2025

View reviewed changes

lmolkova moved this from Untriaged to Awaiting codeowners approval in Semantic Conventions Triage Sep 30, 2025

LikeTheSalad requested changes Oct 1, 2025

View reviewed changes

github-project-automation bot moved this from Awaiting codeowners approval to Blocked in Semantic Conventions Triage Oct 1, 2025

LikeTheSalad mentioned this pull request Oct 3, 2025

feat(instrumentation): Add FirstDraw spans to activity instrumentation open-telemetry/opentelemetry-android#1281

Open

limhjgrace added 5 commits October 8, 2025 15:25

feat: Add spans for app launch, screen load/visible/time to first appear

c40ccc2

fix: Define as app start instead of app launch

ca52bcb

fix: Resolve lint errors

bf828ad

Update TTFA to TTFD, propose new definition for screen loads

a366135

chore: Generate doc updates for latest changes

a0c435c

limhjgrace force-pushed the feat/app-spans branch from 582ad9d to a0c435c Compare October 8, 2025 22:36

limhjgrace added 2 commits October 8, 2025 16:02

chore: Add doc files for new group ids

9588637

fix: Update changlog note with latest

10189ab

limhjgrace changed the title ~~feat: Add spans for app launch, screen load/visible/time to first appear~~ feat: Add spans for app start, screen load/visible/time to first draw Oct 8, 2025

limhjgrace added 3 commits October 8, 2025 16:15

fix: Remove trailing space to resolve lint failure

f17367e

fix: Update docs to sections to indicate singular span

2ca3338

fix: Update brief stability interval example to 150 ms in span def

395004f

thompson-tomo suggested changes Oct 9, 2025

View reviewed changes

williazz mentioned this pull request Oct 14, 2025

REQUEST: New membership for @williazz open-telemetry/community#3070

Closed

limhjgrace added 4 commits October 16, 2025 18:15

fix: Remove screen load span, other fixes to align with standard sevC…

f54572d

…onv units

fix: Remove trailing space to fix lint error

9d52689

fix : Revert back to app.screen.visible

a9b8f78

fix: Update changelog with latest changes

2214ca7

limhjgrace added 2 commits October 16, 2025 18:48

fix: Make time on screen into span instead of event

ea946a0

fix: Add newline to fix yaml lint error

6438bc1

limhjgrace changed the title ~~feat: Add spans for app start, screen load/visible/time to first draw~~ feat: Add spans for app start, time to first draw, time on screen Oct 17, 2025

fix :Update chlog file

c8f9774

		@@ -0,0 +1,97 @@
		groups:
		- id: span.app.screen.first_appear.client

feat: Add spans for app start, time to first draw, time on screen #2831

Are you sure you want to change the base?

feat: Add spans for app start, time to first draw, time on screen #2831

Conversation

limhjgrace commented Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Revisions

Merge requirement checklist

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

LikeTheSalad left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

limhjgrace Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

thompson-tomo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

limhjgrace commented Sep 30, 2025 •

edited

Loading

limhjgrace Oct 8, 2025 •

edited

Loading