-
Notifications
You must be signed in to change notification settings - Fork 21
fix(experimental): NewScopedClient, AddContext, OverwriteContextByKind emit usage events #310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you check with the event team to make sure they weren't considered about the volume of events?
(I don't speak for the event ingestion team - cc @rengawm @karayusuf - but I think the following is correct) I think the incremental volume is not a concern because this doesn't add any event emission to existing code paths, only to this new usage pattern, so only customers who
We should keep an eye on adoption (using this mechanism!) and if this does get serious adoption then we should prioritize those optimizations you mention. I have a few other thoughts not directly related to volume, coming in subsequent comments. |
// | ||
// For more information on how to use the scoped client, read the documentation | ||
// for LDScopedClient. | ||
// | ||
// This function is not stable, and not subject to any backwards compatibility | ||
// guarantees or semantic versioning. It is not suitable for production usage. Do | ||
// not use it. You have been warned. | ||
func NewScopedClient(client *LDClient, contexts ...ldcontext.Context) *LDScopedClient { | ||
func NewScopedClient(client *LDClient, context ldcontext.Context) *LDScopedClient { | ||
_ = client.TrackData("$ld:scoped:usage", context, ldvalue.String("new")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we send this telemetry using Track*
(i.e. as custom
events) then they will be visible to the customer - is that intended and desirable? e.g. they will show up in the customer's Event Explorer and Live Events.
Besides showing our laundry, I'd worry that the proliferation of these "meta" events might clutter those views and get in the way of customers seeing the actual events they are sending.
I don't think that's a deal breaker since it'll only affect customers who start using this, but it might be something we could improve on for v2.
As an alternative did you look at all about extending the "diagnostic events" that the SDK already sends (that I believe already has a summarization mechanism)? We'd need to check with Data Platform (@karayusuf and @rengawm again) but I think I heard they had plans to (or already were?) ingest diagnostic events into Clickhouse, where it would be straightforward to query them for this data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree I'm not a fan of the showing our laundry here. @wchieng mentioned this is reasonable for now and in line with the metrics team's vision, and in the future we can build filters so that the laundry isn't shown, thanks to the hardcoded keys here. This follows a precedent set in a similar AI usage event.
Some other alternatives I considered:
- Diagnostic events - this would be ideal end result because my use case lines up with what those were made for, but it would take a lot of refactoring to plumb this info through to the part of the SDK that sends the diagnostic events
- Summary events - I could fairly easily abuse the part of the SDK that sends
summary
events, to send some summary events about scoped client usage by recording an "evaluation" with a "flag key" that's actually a special key like$ld:scoped:usage
. Main concern with this are that it's pretty ugly & unusual, and I'm not totally sure about our clickhouse capability to use summary events as internal diagnostics like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although the ingest volume is probably not concerning (per other comment), one issue I could see is if a customer could clog up their SDK event queue with these "meta" events. That'd be especially unfortunate since the event queue doesn't have a sense of priority, so we could end up dropping important events like RG/experiment measurements.
It looks like the number of events is roughly one per "scope" (= per request, in most cases?) per context kind, which isn't excessive, but I believe is more than what a server-side SDK would currently be sending - 1 index
event per multi-context rather than per context kind, and per dedupe interval rather than per request. So a server-side SDK handling a bunch of requests concurrently might hit the max event queue size sooner than it would without this.
Again, not a deal breaker since this is limited to usage of this new feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you pointed out, both of the alternatives I mentioned have the advantage of summarizing the data instead of having distinct events to send about this usage, which is all I really need for basic usage tracking. However, the downsides felt significant enough to go either route.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The summary event idea is interesting. I think it would work in Clickhouse if done as you describe (fake flag key).
I think that it would be hard for customers to encounter the fake flag key, since they can't (e.g.) navigate to the flag page for the fake flag in order to see the fake evals chart.
A downside of that way would be you wouldn't get the granular events per added context. If you don't need those anyway, maybe you could do your own mini summarization in the method and just send one event per call to AddContext
(with ctx
being the current state of the context being built?) and a value
of the number of contexts added. That would cut down the event amplification factor a bit, although they could still just call AddContext
a bunch of times.
continue | ||
} | ||
c.addIndividualContext(ctx) | ||
_ = c.client.TrackData("$ld:scoped:usage", ctx, ldvalue.String("add")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are you planning to use the data
value (this string)? If you're just going to query these events in Clickhouse/Athena this is fine. I think our "metric filters" project might only work for map-valued data
(e.g. {"operation":"add"}
rather than just "add"
), but that would only be if we were creating a customer-visible metric for some of these events, which presumably we don't want to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stole the idea of the single string from launchdarkly/python-server-sdk-ai#60. I figured this only needs to be distinct enough that any hand-rolled queries in Clickhouse/Athena can use it, but yeah, as you said, I'm not targeting our actual product ways of diving into this data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be pretty neat to extend diagnostic events to allow some external events into them. The idea with these is just to do them until we have an idea that customers are actually getting value from things. Then we should prune them.
So it would probably be good to make a task to come back and prune this once we are happy with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is a temporary mechanism we don't need to leave in the SDK long term, and this is important functionality to get in customers' hands and get feedback on, I think this is fine to move forward with.
bbd0c91
to
7edbef5
Compare
🤖 I have created a release *beep* *boop* --- ## [7.13.3](v7.13.2...v7.13.3) (2025-08-20) ### Bug Fixes * **experimental:** NewScopedClient can now only take one context argument ([#309](#309)) ([b149429](b149429)) * **experimental:** NewScopedClient, AddContext, OverwriteContextByKind emit usage events ([#310](#310)) ([4a36949](4a36949)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Emit custom events
$ld:scoped:usage
wheneverNewScopedClient
,AddContext
,OverwriteContextByKind
are called.This will give us insight into usage patterns of this tool.
A future optimization should be to either sample these events or use a summarizer, depending on the demand for ScopedClient in super-high-volume applications.