docs(sdk): add database query params to data-collection spec#18570
docs(sdk): add database query params to data-collection spec#18570ericapisani wants to merge 6 commits into
Conversation
Within the Python SDK, the existing `send_default_pii` gates the inclusion of database parameters in spans and events for integrations such as `pymongo`, `redis`, and `clickhouse_driver`. There were previously no parameters within the spec that allowed for control over the inclusion or filtering of these values, so introducing these changes to close that gap.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
|
| database: { | ||
| queryParams: false, | ||
| }, |
There was a problem hiding this comment.
If we add this, this should also be a required part of the commented out docs/on-boarding snippet. We should add a section to this page for this, currently it's userInfo, httpBodies and now this.
For naming, databaseQueryParams might be better, not sure we need the namespace.
There was a problem hiding this comment.
For naming, databaseQueryParams might be better, not sure we need the namespace.
I considered something like this but opted for the dict to make it easier to extend in the future should we need to for whatever reason - I didn't want to take the chance and require us to introduce an unnecessary breaking change if something needed to e added when we could've done this at the beginning.
There was a problem hiding this comment.
@cleptric could you give me some more clarification on which sections you're referring to in:
If we add this, this should also be a required part of the commented out docs/on-boarding snippet.
The database setting has been added the example code snippets already, and in terms of other potential sections - there's a couple possible ones that I think you're referring to and I want to make sure I'm not missing anything/updating the right spots. The ones that I think you're referring to are:
- Setup Snippet Opt-out Comment (include database option)
- Creating a new section alongside the "Request and Response Bodies", "Cookies and Query Params", etc. that explains what's captured by the
databasesetting.
Is there anything else?
| | `queryParams` | Key-value collection | `{ mode: "denyList" }` | 0.1.0 | Collect URL query parameters. All key names are always included; the SDK scrubs values for keys matching the sensitive denylist or custom allow/deny terms. | | ||
| | `graphql` | `{ document?, variables? }` | Both `true` | 0.5.0 | For `document`: Collect the GraphQL document. <br /><br /> For `variables`: Collect the variables that are passed to GraphQL operations. | | ||
| | `genAI` | `{ inputs?, outputs? }` | Both `true` | 0.1.0 | For `inputs`: Include the content of generative AI inputs (e.g. prompt text, tool call arguments). <br /><br /> For `outputs`: Include the content of generative AI outputs (e.g. completion text, tool call results). Metadata such as model name and token counts is always collected regardless of these settings. | | ||
| | `database` | `{ queryParams? }` | `true` | 0.6.0 | Include parameters/arguments passed to database queries. Setting this to false will either omit the value altogether, or replace the value with '[Filtered]'. | |
There was a problem hiding this comment.
Bug: The default value for the database option is documented as true, which is inconsistent and ambiguous compared to other nested options like graphql.
Severity: LOW
Suggested Fix
Update the default value column for the database option to be consistent with other nested options. For example, change true to queryParams: true to clarify the default applies to the nested field. Also, clarify the description to explicitly mention queryParams.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: develop-docs/sdk/foundations/client/data-collection/index.mdx#L561
Potential issue: In the options table, the `database` option's default value is
documented as `true`. This is ambiguous and inconsistent with how similar nested options
like `graphql` and `genAI` are documented, which use `Both true` to clarify that their
nested properties default to `true`. This could lead SDK implementors to misinterpret
the API. The accompanying description, "Setting this to false...", is also unclear about
whether "this" refers to the parent `database` object or the nested `queryParams` field.
While the pseudocode section is correct, the table should be self-consistent to avoid
confusion.
Did we get this right? 👍 / 👎 to inform future reviews.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Want reviews to match your repository better? Bugbot Learning can learn team-specific rules from PR activity. A team admin can enable Learning in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit d1e4b2b. Configure here.
| | `queryParams` | Key-value collection | `{ mode: "denyList" }` | 0.1.0 | Collect URL query parameters. All key names are always included; the SDK scrubs values for keys matching the sensitive denylist or custom allow/deny terms. | | ||
| | `graphql` | `{ document?, variables? }` | Both `true` | 0.5.0 | For `document`: Collect the GraphQL document. <br /><br /> For `variables`: Collect the variables that are passed to GraphQL operations. | | ||
| | `genAI` | `{ inputs?, outputs? }` | Both `true` | 0.1.0 | For `inputs`: Include the content of generative AI inputs (e.g. prompt text, tool call arguments). <br /><br /> For `outputs`: Include the content of generative AI outputs (e.g. completion text, tool call results). Metadata such as model name and token counts is always collected regardless of these settings. | | ||
| | `database` | `{ queryParams? }` | `true` | 0.6.0 | Include parameters/arguments passed to database queries. Setting this to false will either omit the value altogether, or replace the value with '[Filtered]'. | |
There was a problem hiding this comment.
Boolean false behavior contradicts spec
Medium Severity
The new database option description says queryParams: false may omit values or replace them with [Filtered], but the Boolean Options section defines false as not collecting the category at all. SDKs can implement incompatible opt-out behavior for the same setting.
Reviewed by Cursor Bugbot for commit d1e4b2b. Configure here.
| | `queryParams` | Key-value collection | `{ mode: "denyList" }` | 0.1.0 | Collect URL query parameters. All key names are always included; the SDK scrubs values for keys matching the sensitive denylist or custom allow/deny terms. | | ||
| | `graphql` | `{ document?, variables? }` | Both `true` | 0.5.0 | For `document`: Collect the GraphQL document. <br /><br /> For `variables`: Collect the variables that are passed to GraphQL operations. | | ||
| | `genAI` | `{ inputs?, outputs? }` | Both `true` | 0.1.0 | For `inputs`: Include the content of generative AI inputs (e.g. prompt text, tool call arguments). <br /><br /> For `outputs`: Include the content of generative AI outputs (e.g. completion text, tool call results). Metadata such as model name and token counts is always collected regardless of these settings. | | ||
| | `database` | `{ queryParams? }` | `true` | 0.6.0 | Include parameters/arguments passed to database queries. Setting this to false will either omit the value altogether, or replace the value with '[Filtered]'. | |
There was a problem hiding this comment.
Bug: The description for database.queryParams when false is ambiguous, stating it could either omit the value or replace it with '[Filtered]', contradicting the spec's definition for boolean options.
Severity: MEDIUM
Suggested Fix
Update the description for database.queryParams when set to false to align with the standard behavior for boolean options. It should clearly state that setting it to false means the data category will not be collected, removing the 'either omit... or replace' ambiguity.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: develop-docs/sdk/foundations/client/data-collection/index.mdx#L561
Potential issue: The documentation for `database.queryParams` provides an ambiguous
definition for when the option is set to `false`. It states the value will be 'either
omit the value altogether, or replace the value with '[Filtered]''. This conflicts with
the general specification for boolean options, which dictates that `false` should mean
'Do not collect this data category'. This ambiguity could lead to different SDKs
implementing the feature inconsistently, defeating the purpose of a unified
specification.


Adds
database.queryParamsas a new configuration option to the SDK data-collection spec. This covers the ability for SDKs to include or omit parameters passed to database queries, consistent with howqueryParamsworks for HTTP and howgenAIinputs/outputs are handled.IS YOUR CHANGE URGENT?
Help us prioritize incoming PRs by letting us know when the change needs to go live.
SLA
Thanks in advance for your help!
PRE-MERGE CHECKLIST
Make sure you've checked the following before merging your changes:
Fixes PY-2562