-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add telemetry data catalog collection to wrangler deployments #12708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
c63b75e
e90fc7e
962bd4d
4660983
44121c7
c4c4ca8
2af4c3d
8e4707d
68e476f
03cdc2c
6667928
ad4bb45
2150394
d633274
6cf4b27
8d2f9b5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| "@cloudflare/workers-utils": minor | ||
| --- | ||
|
|
||
| Improve `PackageJSON` type definition to use `Record<string, string>` for `dependencies`, `devDependencies`, and `scripts`. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| "@cloudflare/workers-utils": minor | ||
| --- | ||
|
|
||
| Add `getTelemetryDataCatalogWorkerURL` environment variable factory | ||
|
|
||
| Adds a new environment variable factory for `TELEMETRY_DATA_CATALOG_WORKER_URL` which configures the telemetry data catalog endpoint. Also improves PackageJSON type definitions to use `Record<string, string>` for dependencies, devDependencies, and scripts fields. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| --- | ||
| "wrangler": minor | ||
| --- | ||
|
|
||
| Add telemetry data catalog collection to wrangler deployments | ||
|
|
||
| Wrangler now sends anonymized usage data to a telemetry data catalog during `wrangler deploy`. This includes: | ||
|
|
||
| - Account ID and Worker name | ||
emily-shen marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| - Wrangler version and package manager used | ||
| - Deployment timestamp | ||
| - Binding type counts (how many KV namespaces, R2 buckets, D1 databases, etc.) | ||
| - Public project dependency information (package names and versions from public packages in the package.json `dependencies` field) | ||
|
|
||
| This data helps the Cloudflare team understand Wrangler usage patterns. This telemetry respects the user's existing telemetry settings. | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -131,7 +131,10 @@ type VariableNames = | |||||
| | "DOCKER_HOST" | ||||||
|
|
||||||
| /** Environment variable used to signal that the current process is being run by the open-next deploy command. */ | ||||||
| | "OPEN_NEXT_DEPLOY"; | ||||||
| | "OPEN_NEXT_DEPLOY" | ||||||
|
|
||||||
| /** URL for the telemetry data catalog worker (defaults to `""` in vitest, the production data collector worker otherwise; falsy disables collection). */ | ||||||
| | "TELEMETRY_DATA_CATALOG_WORKER_URL"; | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
? to make it clear its for wrangler i guess
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was actually trying to distinguish this from "standard telemetry"... if we're going to have two different types of telemetry it'd be nice if we could have name them differently.... 🤔 PS:
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. package catalog? I do agree with Emily that we should count all of this stuff as the same currently. If we do break it up in the future we should do that more deliberately.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That's what you sort of initially suggested I think, the issue with that for me is that we also include the bindings count info though 😓 (which seems a bit unrelated to packages to me?)
sorry, when I said "two different types of telemetry" I meant this new telemetry data collection vs amplitude, these are going to be differently right of the bat, right?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. for a user, i don't think we should make any distinction in different types of telemetry but this env var is for internal use, and we might want to use it for stuff outside wrangler in the future, so idm whatever its called 🤷
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't have a super strong preference either 😅 I think it's unlikely for this to be used by other tools and not just wrangler... but it might happen? 🤷 I'm also happy with whatever name here 😄 |
||||||
|
|
||||||
| type DeprecatedNames = | ||||||
| | "CF_ACCOUNT_ID" | ||||||
|
|
||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,107 @@ | ||
| import { afterEach, beforeEach, describe, it, vi } from "vitest"; | ||
| import { getTelemetryDataCatalogWorkerURL } from "../../src/environment-variables/misc-variables"; | ||
|
|
||
| describe("getTelemetryDataCatalogWorkerURL", () => { | ||
| beforeEach(() => { | ||
| vi.unstubAllEnvs(); | ||
| }); | ||
|
|
||
| afterEach(() => { | ||
| vi.unstubAllEnvs(); | ||
| }); | ||
|
|
||
| it("should return the env URL if TELEMETRY_DATA_CATALOG_WORKER_URL is set", ({ | ||
| expect, | ||
| }) => { | ||
| vi.stubEnv( | ||
| "TELEMETRY_DATA_CATALOG_WORKER_URL", | ||
| "http://custom-telemetry.example.com" | ||
| ); | ||
|
|
||
| const url = getTelemetryDataCatalogWorkerURL({ | ||
| compliance_region: "public", | ||
| }); | ||
|
|
||
| expect(url).toBe("http://custom-telemetry.example.com"); | ||
| }); | ||
|
|
||
| it("should return empty string if TELEMETRY_DATA_CATALOG_WORKER_URL is set to empty string", ({ | ||
| expect, | ||
| }) => { | ||
| vi.stubEnv("TELEMETRY_DATA_CATALOG_WORKER_URL", ""); | ||
|
|
||
| const url = getTelemetryDataCatalogWorkerURL({ | ||
| compliance_region: "public", | ||
| }); | ||
|
|
||
| expect(url).toBe(""); | ||
| }); | ||
|
|
||
| it("should return undefined for fedramp_high compliance region when env URL is not set", ({ | ||
| expect, | ||
| }) => { | ||
| const url = getTelemetryDataCatalogWorkerURL({ | ||
| compliance_region: "fedramp_high", | ||
| }); | ||
|
|
||
| expect(url).toBeUndefined(); | ||
| }); | ||
|
|
||
| it("should return undefined for fedramp_high compliance region from env when env URL is not set", ({ | ||
| expect, | ||
| }) => { | ||
| vi.stubEnv("CLOUDFLARE_COMPLIANCE_REGION", "fedramp_high"); | ||
|
|
||
| const url = getTelemetryDataCatalogWorkerURL({}); | ||
|
|
||
| expect(url).toBeUndefined(); | ||
| }); | ||
|
|
||
| it("should return undefined for staging environment when env URL is not set", ({ | ||
| expect, | ||
| }) => { | ||
| vi.stubEnv("WRANGLER_API_ENVIRONMENT", "staging"); | ||
|
|
||
| const url = getTelemetryDataCatalogWorkerURL({ | ||
| compliance_region: "public", | ||
| }); | ||
|
|
||
| expect(url).toBeUndefined(); | ||
| }); | ||
|
|
||
| it("should honor env URL even for compliance regions", ({ expect }) => { | ||
| vi.stubEnv( | ||
| "TELEMETRY_DATA_CATALOG_WORKER_URL", | ||
| "http://custom-telemetry.example.com" | ||
| ); | ||
|
|
||
| const url = getTelemetryDataCatalogWorkerURL({ | ||
| compliance_region: "fedramp_high", | ||
| }); | ||
|
|
||
| // When the env URL is explicitly set, it takes precedence | ||
| expect(url).toBe("http://custom-telemetry.example.com"); | ||
| }); | ||
|
|
||
| it("should honor env URL even for staging environment", ({ expect }) => { | ||
| vi.stubEnv( | ||
| "TELEMETRY_DATA_CATALOG_WORKER_URL", | ||
| "http://custom-telemetry.example.com" | ||
| ); | ||
| vi.stubEnv("WRANGLER_API_ENVIRONMENT", "staging"); | ||
|
|
||
| const url = getTelemetryDataCatalogWorkerURL({ | ||
| compliance_region: "public", | ||
| }); | ||
|
|
||
| // When the env URL is explicitly set, it takes precedence | ||
| expect(url).toBe("http://custom-telemetry.example.com"); | ||
| }); | ||
|
|
||
| // Note: We cannot easily test the "typeof vitest !== 'undefined'" branch | ||
| // from within vitest itself, since vitest is always defined in this context. | ||
| // The production URL return case is effectively tested by the fact that | ||
| // we're running in vitest, so the function returns undefined instead of | ||
| // the production URL. This behavior is tested implicitly in other test suites | ||
| // that mock the TELEMETRY_DATA_CATALOG_WORKER_URL to avoid skipping the function. | ||
| }); |
dario-piotrowicz marked this conversation as resolved.
Show resolved
Hide resolved
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| declare global { | ||
| const vitest: unknown; | ||
| } | ||
| export {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we have a telemetry.md file, can you add this to that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe.... but in regards to
TELEMETRY_DATA_CATALOG_WORKER_URLI thought that maybe we could not surface this too much? it is mainly for us (for manual testing and unit testing). It would be part of the public API but users shouldn't really have a reason to mess with this (and if they want to disable this they could just do so by disabling all the wrangler telemetry). what do you think? 🙂Regarding the other info, I checked and I think
telemetry.mdmore or less generally included this type of information being collected, but I'll double check to see if that's actually the case 👍There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh i don't mean the env var, i meant it would be good to include this list of what we collect in the telemetry.md file for transparency - dependency info (and account ids) is the big new one i guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ok, sure 🙂👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info added: 6b4c759
Please let me if you think this is good 🙂