Development Log

This file documents detailed technical changes, internal refactorings, and development notes. For user-facing highlights, see CHANGELOG.md.

Version 0.9.4 — Detailed Changes

New Features — Technical Details

Pipes Monitoring Plugin

Implemented PipesPlugin to monitor Snowpipe status and validation
Uses SYSTEM$PIPE_STATUS function for real-time pipe monitoring
Uses VALIDATE_PIPE_LOAD function for validation checks
Delivers telemetry as logs, metrics, and events

Streams Monitoring Plugin

Implemented StreamsPlugin to monitor Snowflake Streams
Tracks stream staleness using SHOW STREAMS output
Monitors pending changes and stream health
Reports stale streams as warning events

Stage Monitoring Plugin

Implemented StagePlugin to monitor staged data
Tracks internal and external stages
Monitors COPY INTO activities from QUERY_HISTORY and COPY_HISTORY views
Reports on staged file sizes, counts, and load patterns

Data Lineage Plugin

Implemented DataLineagePlugin combining static and dynamic lineage
Static lineage from OBJECT_DEPENDENCIES view (DDL-based relationships)
Dynamic lineage from ACCESS_HISTORY view (runtime data flow)
Column-level lineage tracking with direct and indirect dependencies
Lineage graphs delivered as structured events

SNOWFLAKE.TELEMETRY.EVENTS Support (BDX-1172)

Issue: When a customer account had EVENT_TABLE = snowflake.telemetry.events (the Snowflake-managed shared event table), SETUP_EVENT_TABLE() listed it in a_no_custom_event_t — the "not a real custom table" array — and took the IF branch, creating DSOA's own DTAGENT_DB.STATUS.EVENT_LOG table and ignoring the Snowflake-managed table entirely.
Root cause: 'snowflake.telemetry.events' was excluded from the view-creation path because the original ELSE branch attempted GRANT SELECT ON TABLE snowflake.telemetry.events TO ROLE DTAGENT_VIEWER, which Snowflake rejects — privileges cannot be granted on Snowflake-managed objects.
Fix: Two-part change in src/dtagent/plugins/event_log.sql/init/009_event_log_init.sql:
1. Removed 'snowflake.telemetry.events' from a_no_custom_event_t so it falls through to the ELSE branch
2. Wrapped the GRANT SELECT in a BEGIN/EXCEPTION WHEN OTHER THEN SYSTEM$LOG_WARN() block — attempts the grant and logs warnings, ignoring failures for any read-only or Snowflake-managed table; more robust than a string comparison
Behaviour after fix: When EVENT_TABLE = snowflake.telemetry.events, DSOA creates DTAGENT_DB.STATUS.EVENT_LOG as a view over it, exactly as for any other pre-existing customer event table. All three event_log SQL views continue to query DTAGENT_DB.STATUS.EVENT_LOG unchanged — no Python changes needed.

Configurable Lookback Time

Motivation: Lookback windows were hardcoded across SQL views in every plugin that uses F_LAST_PROCESSED_TS. This could not be tuned per deployment without modifying SQL files.
Approach: Replace each literal with CONFIG.F_GET_CONFIG_VALUE('plugins.<plugin>.lookback_hours', <default>) and add lookback_hours to each plugin's config YAML — consistent with how retention_hours is already handled in P_CLEANUP_EVENT_LOG.
Pattern: timeadd(hour, -1*F_GET_CONFIG_VALUE('plugins.<plugin>.lookback_hours', <N>), current_timestamp) — the -1* multiplier converts the positive config value to a negative offset.
Note: The F_LAST_PROCESSED_TS guard in each view's GREATEST(...) clause ensures normal incremental runs are unaffected; lookback_hours only bounds the fallback window when no prior timestamp exists.
Files changed (SQL views + config YAMLs):

Plugin	SQL view(s)	Default
`event_log`	`051_v_event_log.sql`, `051_v_event_log_metrics_instrumented.sql`, `051_v_event_log_spans_instrumented.sql`	`24`h
`login_history`	`061_v_login_history.sql`, `061_v_sessions.sql`	`24`h
`warehouse_usage`	`070_v_warehouse_event_history.sql`, `071_v_warehouse_load_history.sql`, `072_v_warehouse_metering_history.sql`	`24`h
`tasks`	`061_v_serverless_tasks.sql` → `lookback_hours` (`4`h); `063_v_task_versions.sql` → `lookback_hours_versions` (`720`h = 1 month)	separate keys, original defaults preserved
`event_usage`	`051_v_event_usage.sql`	`6`h
`data_schemas`	`051_v_data_schemas.sql`	`4`h

Bug Fixes — Technical Details

Dynamic Tables Grant — Schema-Level Granularity (BDX-640)

Issue: P_GRANT_MONITOR_DYNAMIC_TABLES() always granted MONITOR at database level, even when the include pattern specified a particular schema (e.g. PROD_DB.ANALYTICS.%). This caused the procedure to over-grant: a user expecting grants only on PROD_DB.ANALYTICS received grants on all schemas in PROD_DB.
Root cause: The CTE extracted only split_part(value, '.', 0) (the database part) and the schema part was never inspected.
Fix: Three-pass approach in 032_p_grant_monitor_dynamic_tables.sql:
1. Database pass — split_part(value, '.', 1) = '%' → GRANT … IN DATABASE.
2. Schema pass — split_part(value, '.', 1) != '%' and split_part(value, '.', 2) = '%' → GRANT … IN SCHEMA db.schema.
3. Table pass — split_part(value, '.', 1) != '%' and split_part(value, '.', 2) != '%' → GRANT … ON DYNAMIC TABLE db.schema.table (no FUTURE grant — not supported by Snowflake at individual table level).
Grant matrix:

Include pattern Grant level

%.%.% All databases

PROD_DB.%.% Database PROD_DB

PROD_DB.ANALYTICS.% Schema PROD_DB.ANALYTICS

PROD_DB.ANALYTICS.ORDERS_DT Table PROD_DB.ANALYTICS.ORDERS_DT
Files changed: 032_p_grant_monitor_dynamic_tables.sql, bom.yml, config.md
Tests added: test/bash/test_grant_monitor_dynamic_tables.bats — structural content checks covering both grant paths

Include pattern	Grant level
`%.%.%`	All databases
`PROD_DB.%.%`	Database `PROD_DB`
`PROD_DB.ANALYTICS.%`	Schema `PROD_DB.ANALYTICS`
`PROD_DB.ANALYTICS.ORDERS_DT`	Table `PROD_DB.ANALYTICS.ORDERS_DT`

Log ObservedTimestamp Unit Correction

Issue: OTel log observed_timestamp field was sent in milliseconds
Root cause: OTLP spec requires nanoseconds for observed_timestamp, but code was converting to milliseconds
Fix: Modified process_timestamps_for_telemetry() to return observed_timestamp_ns in nanoseconds
Impact: Logs now comply with OTLP spec
Note: Dynatrace OTLP Logs API still requires milliseconds for timestamp field (deviation from spec)

Inbound Shares Reporting Flag

Issue: HAS_DB_DELETED flag incorrectly reported for deleted shared databases in TMP_SHARES view
Root cause: Logic error in SQL view predicate
Fix: Corrected SQL logic in shares.sql/ view definition
Impact: Accurate reporting of deleted shared database status

Self-Monitoring Log Filtering

Issue: Database name filtering logic failed to correctly identify DTAGENT_DB references
Root cause: String matching logic didn't account for fully qualified names
Fix: Updated filtering logic in self-monitoring plugin
Impact: Self-monitoring logs now correctly exclude internal agent operations

Improvements — Technical Details

Timestamp Handling Refactoring

Motivation: Eliminate wasteful ns→ms→ns conversions and clarify API requirements
Approach: Unified timestamp handling with smart unit detection
Implementation:
- All SQL views produce nanoseconds via extract(epoch_nanosecond ...)
- Conversion to appropriate unit occurs only at API boundary
- validate_timestamp() works internally in nanoseconds to preserve precision
- Added return_unit parameter ("ms" or "ns") for explicit output control
- Added skip_range_validation parameter for observed_timestamp (no time range check)
- Created process_timestamps_for_telemetry() utility for standard timestamp processing pattern
Changes to validate_timestamp():
- Works internally in nanoseconds throughout validation logic
- Converts to requested unit only at the end
- Raises ValueError if return_unit not in ["ms", "ns"]
- Added skip_range_validation for observed_timestamp (preserves original value without range checks)
Changes to process_timestamps_for_telemetry():
- New utility function implementing standard pattern for logs and events
- Extracts timestamp and observed_timestamp from data dict
- Falls back to timestamp value when observed_timestamp not provided
- Validates timestamp with range checking (returns milliseconds)
- Validates observed_timestamp without range checking (returns nanoseconds)
- Returns (timestamp_ms, observed_timestamp_ns) tuple
- Hardcoded units: always milliseconds for timestamp, nanoseconds for observed_timestamp
Removed obsolete functions:
- get_timestamp_in_ms() — replaced by validate_timestamp(value, return_unit="ms")
- validate_timestamp_ms() — replaced by validate_timestamp(value, return_unit="ms")
Added new functions:
- get_timestamp() — returns nanoseconds from SQL query results
API Documentation:
- Added comprehensive documentation links in all telemetry classes
- Documented Dynatrace OTLP Logs API deviation (milliseconds for timestamp field)
- Documented OTLP standard requirements (nanoseconds for most timestamp fields)
Fallback Logic:
- observed_timestamp now correctly falls back to timestamp value when not provided
- Only event_log plugin provides explicit observed_timestamp values
- All other plugins rely on fallback mechanism

Build System Virtual Environment

Change: All scripts/dev/ scripts now auto-activate .venv/
Implementation: Added source .venv/bin/activate to script preambles
Impact: Eliminates common "wrong Python" errors during development

Documentation — Autogenerated Files

Change: Updated .github/copilot-instructions.md with autogenerated file documentation
Coverage:
- Documentation files: docs/PLUGINS.md, docs/SEMANTICS.md, docs/APPENDIX.md
- Build artifacts: build/_dtagent.py, build/_send_telemetry.py, build/_semantics.py, build/_version.py, build/_metric_semantics.txt
Guidance: Never edit autogenerated files manually; edit source files and regenerate

Budgets Plugin Enhancement

Change: Enhanced budget data collection using SYSTEM$SHOW_BUDGETS_IN_ACCOUNT()
Previous: Manual query construction
New: Leverages Snowflake system function for comprehensive budget data
Impact: More accurate and complete budget information

Error Handling — Two-Phase Commit for Query Telemetry (BDX-694 / BDX-706)

Issue: STATUS.UPDATE_PROCESSED_QUERIES was called regardless of whether the OTLP trace flush succeeded, meaning queries could be silently lost on export failures without being retried on the next cycle.
Root cause: _process_span_rows in src/dtagent/plugins/__init__.py called UPDATE_PROCESSED_QUERIES unconditionally after flush_traces().
Fix: Captured the boolean return value of flush_traces() into flush_succeeded and gated the UPDATE_PROCESSED_QUERIES call behind if report_status and flush_succeeded.
Impact: Queries whose spans fail to export are re-queued on the next agent run, ensuring at-least-once delivery semantics for span telemetry.

Event Log Lookback — Configurable Window (BDX-706)

Issue: V_EVENT_LOG used a hardcoded timeadd(hour, -24, current_timestamp) lower bound, preventing operators from adjusting the lookback window without editing SQL.
Fix:
- src/dtagent/plugins/event_log.sql/051_v_event_log.sql: replaced literal with CONFIG.F_GET_CONFIG_VALUE('plugins.event_log.lookback_hours', 24)::int.
- src/dtagent/plugins/event_log.config/event_log-config.yml: added lookback_hours: 24 (default preserves prior behaviour).
Impact: Operators can increase the window for initial deployments or decrease it for high-volume environments without any SQL change.

Query Hierarchy Validation (BDX-620)

Goal: Confirm that nested stored procedure call chains are correctly represented as OTel parent-child spans.
Validation approach:
- P_REFRESH_RECENT_QUERIES sets IS_ROOT=TRUE for top-level calls (no parent_query_id) and IS_PARENT=TRUE for any query that has at least one child in the same batch. Leaf queries have IS_ROOT=FALSE, IS_PARENT=FALSE.
- _process_span_rows in src/dtagent/plugins/__init__.py iterates only IS_ROOT=TRUE rows as top-level spans; child spans are fetched recursively via Spans._get_sub_rows using PARENT_QUERY_ID.
- ExistingIdGenerator in src/dtagent/otel/spans.py propagates the root's _TRACE_ID and _SPAN_ID down the hierarchy so every sub-span shares the correct trace context.
New test fixture: test/test_data/query_history_nested_sp.ndjson — 3-row synthetic SP chain: outer SP (root) → inner SP (mid) → leaf SELECT.
New test file: test/plugins/test_query_history_span_hierarchy.py
- test_span_hierarchy: integration test verifying 3 entries processed, 3 spans, 3 logs, 27 metrics across all disabled_telemetry combinations.
- test_is_root_only_processes_top_level: unit test confirming only 1 root row and 2 non-root rows in the fixture.
- test_is_parent_flags_intermediate_nodes: unit test asserting correct IS_ROOT/IS_PARENT/PARENT_QUERY_ID values for each level of the hierarchy.
Impact: Span hierarchies for stored procedure chains are confirmed correct and regression-protected.

Test Infrastructure Refactoring

Change: Refactored tests to use synthetic JSON fixtures
Previous: Live Dynatrace API calls for validation
New: Input/output validation against golden JSON files
Impact: Faster, more reliable, deterministic tests

Event Tables Cost Optimization Documentation (BDX-688)

Change: Expanded event_log.config/config.md from a minimal 5-line note to a full configuration reference
Content added:
- Configuration options table covering all 7 plugin settings with types, defaults, and descriptions
- Cost optimization guidance section explaining the cost impact of LOOKBACK_HOURS, MAX_ENTRIES, RETENTION_HOURS, and SCHEDULE
- Key guidance: retention_hours should be >= lookback_hours to prevent cleanup from removing events before they are processed
Files changed:
- src/dtagent/plugins/event_log.config/config.md — full configuration reference + cost guidance
- src/dtagent/plugins/event_log.config/readme.md — updated to mention configurable lookback window

Span Timestamp Handling Fix (BDX-706)

Issue: _process_span_rows() in src/dtagent/plugins/__init__.py called _report_execution() with current_timestamp() (a Snowflake lazy column expression) instead of the actual last-row timestamp.
Root cause: When STATUS.LOG_PROCESSED_MEASUREMENTS stored this value, it received the string 'Column[current_timestamp]' rather than a real timestamp. On the next run, F_LAST_PROCESSED_TS would return a malformed value, causing the GREATEST(...) guard in each SQL view to use the fallback lookback window — potentially re-processing spans already sent.
Fix: Added last_processed_timestamp variable tracking row_dict.get("TIMESTAMP", last_processed_timestamp) within the row iteration loop, mirroring the identical pattern used by _log_entries(). Passed str(last_processed_timestamp) to _report_execution() instead of current_timestamp().
Side effect removed: Dropped the now-unused from snowflake.snowpark.functions import current_timestamp import — pylint flagged this as unused after the fix.
Impact: Spans and traces will no longer be re-processed after an agent restart. The F_LAST_PROCESSED_TS('event_log_spans') guard now advances correctly after each run.
Affects: event_log plugin (_process_span_entries) and any future plugin using _process_span_rows with log_completion=True

Version 0.9.3 — Detailed Changes

Detailed technical changes for prior versions can be added here as needed.

Version 0.9.2 — Detailed Changes

Detailed technical changes for prior versions can be added here as needed.

Notes

This file is not auto-generated. Manual maintenance required.
Focus on technical implementation details, root causes, and internal changes.
For user-facing release notes, see CHANGELOG.md.
Entries should help future developers understand decisions and troubleshoot issues.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Development Log

Version 0.9.4 — Detailed Changes

New Features — Technical Details

Pipes Monitoring Plugin

Streams Monitoring Plugin

Stage Monitoring Plugin

Data Lineage Plugin

SNOWFLAKE.TELEMETRY.EVENTS Support (BDX-1172)

Configurable Lookback Time

Bug Fixes — Technical Details

Dynamic Tables Grant — Schema-Level Granularity (BDX-640)

Log ObservedTimestamp Unit Correction

Inbound Shares Reporting Flag

Self-Monitoring Log Filtering

Improvements — Technical Details

Timestamp Handling Refactoring

Build System Virtual Environment

Documentation — Autogenerated Files

Budgets Plugin Enhancement

Error Handling — Two-Phase Commit for Query Telemetry (BDX-694 / BDX-706)

Event Log Lookback — Configurable Window (BDX-706)

Query Hierarchy Validation (BDX-620)

Test Infrastructure Refactoring

Event Tables Cost Optimization Documentation (BDX-688)

Span Timestamp Handling Fix (BDX-706)

Version 0.9.3 — Detailed Changes

Version 0.9.2 — Detailed Changes

Notes

FilesExpand file tree

DEVLOG.md

Latest commit

History

DEVLOG.md

File metadata and controls

Development Log

Version 0.9.4 — Detailed Changes

New Features — Technical Details

Pipes Monitoring Plugin

Streams Monitoring Plugin

Stage Monitoring Plugin

Data Lineage Plugin

SNOWFLAKE.TELEMETRY.EVENTS Support (BDX-1172)

Configurable Lookback Time

Bug Fixes — Technical Details

Dynamic Tables Grant — Schema-Level Granularity (BDX-640)

Log ObservedTimestamp Unit Correction

Inbound Shares Reporting Flag

Self-Monitoring Log Filtering

Improvements — Technical Details

Timestamp Handling Refactoring

Build System Virtual Environment

Documentation — Autogenerated Files

Budgets Plugin Enhancement

Error Handling — Two-Phase Commit for Query Telemetry (BDX-694 / BDX-706)

Event Log Lookback — Configurable Window (BDX-706)

Query Hierarchy Validation (BDX-620)

Test Infrastructure Refactoring

Event Tables Cost Optimization Documentation (BDX-688)

Span Timestamp Handling Fix (BDX-706)

Version 0.9.3 — Detailed Changes

Version 0.9.2 — Detailed Changes

Notes