
Commit a470127

Dev/skruk/imp event log (#64)
* fix: Correct span timestamp tracking to prevent re-processing after agent restart
  - Track actual last row timestamp in `_process_span_rows()` instead of using `current_timestamp()` column expression
  - Remove unused `current_timestamp` import from snowflake.snowpark.functions
  - Add `lookback_hours` configuration parameter to event_log plugin (default: 24)
  - Update all three event_log SQL views to use configurable lookback window
  - Expand event_log configuration documentation with cost optimization guidance
* feat: Add support for SNOWFLAKE.TELEMETRY.EVENTS as account-level event table
  - Remove 'snowflake.telemetry.events' from excluded event tables array in SETUP_EVENT_TABLE()
  - Wrap GRANT SELECT statement in exception handler to silently ignore failures on read-only or Snowflake-managed tables
  - Create DTAGENT_DB.STATUS.EVENT_LOG as view over snowflake.telemetry.events when configured
  - Update CHANGELOG and DEVLOG with BDX-1172 fix details and behavior explanation
* feat: Add configurable lookback_hours parameter to all plugins with incremental views
  - Add `lookback_hours` configuration parameter to data_schemas (4h), event_usage (6h), login_history (24h), tasks (4h), and warehouse_usage (24h) plugins
  - Replace hardcoded lookback values in SQL views with `F_GET_CONFIG_VALUE('plugins.<plugin>.lookback_hours', <default>)` calls
  - Unify tasks plugin to use single `lookback_hours` parameter for both serverless_tasks and task_versions views (previously 4h and 1
* refactor: Split tasks plugin lookback configuration into separate keys for serverless and version views
  - Add `lookback_hours_versions` parameter (default: 720h = 1 month) to tasks plugin configuration
  - Update `063_v_task_versions.sql` to use `plugins.tasks.lookback_hours_versions` instead of shared `lookback_hours`
  - Preserve original defaults: `lookback_hours` remains 4h for serverless tasks, `lookback_hours_versions` uses 720h for task versions
  - Update documentation to clarify separate look
* chore: Update markdown linting to use markdownlint-cli2 and adjust configuration tables for consistency
* feat: Update documentation and configuration for plugins
  - Enhanced the `INSTALL.md` and `PLUGINS.md` files to clarify the `lookback_hours` parameter and its behavior across various plugins.
  - Added detailed configuration options for the `data_schemas`, `event_log`, `event_usage`, `login_history`, `tasks`, and `warehouse_usage` plugins, ensuring consistency in descriptions and default values.
  - Updated the SQL initialization script for the event log to improve clarity on table handling.
  - Adjusted the `event_log-config.yml` to modify default values for `lookback_hours` and `retention_hours`.
* default lookback was 24h - adjusting to avoid product change
* feat: Update event log plugin to log warnings on grant failures and adjust lookback and retention hours in configuration
1 parent 3757d85 commit a470127

34 files changed: +246 -47 lines changed

.github/copilot-instructions.md

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -148,6 +148,8 @@ scripts/dev/test.sh
148148
## 📖 Documentation (MANDATORY)
149149

150150
Documentation is a first-class deliverable. Update relevant docs with every change.
151+
**Important:** Always update documentation with `./scripts/update_docs.sh` when making changes to the codebase.
152+
**Never** update `docs/PLUGINS.md` or `docs/SEMANTICS.md` directly; use plugin-specific files instead.
151153

152154
### What to Update
153155

@@ -159,7 +161,8 @@ Documentation is a first-class deliverable. Update relevant docs with every chan
159161
| New version / release | `docs/CHANGELOG.md` (user-facing highlights), `docs/DEVLOG.md` (technical details) |
160162
| Config change | `conf/config-template.yml`, plugin's `{name}-config.yml` |
161163

162-
**Note:** Do not update `docs/PLUGINS.md` or `docs/SEMANTICS.md` as those are generated automatically.
164+
**Note:** Do not update `docs/PLUGINS.md` or `docs/SEMANTICS.md` as those are generated automatically;
165+
use `readme.md` and `config.md` files in plugin directories instead.
163166

164167
### CHANGELOG vs DEVLOG
165168

@@ -226,7 +229,7 @@ DEVLOG.md:
226229

227230
### Other Documentation Requirements
228231

229-
- **Docstrings**: Google style, required for all public modules/classes/functions in `src/`
232+
- **Docstrings**: Google style, required for all public modules/classes/functions in `src/`, table columns width aligned
230233
- **BOM**: Each plugin ships `bom.yml` listing delivered/referenced Snowflake objects (validated against `test/src-bom.schema.json`)
231234

232235
## 🔧 Build & CI/CD

Makefile

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,4 +22,7 @@ lint-bom:
2222
find src -name "bom.yml" -exec sh -c 'printf "%-50s " "$$1"; .venv/bin/check-jsonschema --schemafile test/src-bom.schema.json "$$1" || check-jsonschema --schemafile test/src-bom.schema.json "$$1"' _ {} \;
2323

2424
# Run all linting checks (stops on first failure, like CI)
25-
lint: lint-python lint-format lint-pylint lint-sql lint-yaml lint-markdown lint-bom
25+
lint: lint-python lint-format lint-pylint lint-sql lint-yaml lint-markdown lint-bom
26+
27+
docs:
28+
./scripts/dev/build_docs.sh

docs/CHANGELOG.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,11 @@ Released on TBD
1212

1313
- **New Plugins**: Added Pipes, Streams, Stage, and Data Lineage monitoring plugins
1414
- **Configurable Lookback Time**: Per-plugin configuration for historical data catchup window
15+
- **SNOWFLAKE.TELEMETRY.EVENTS Support**: Agent now correctly reads from the Snowflake-managed shared event table when it is configured as the account-level event table
1516

1617
### Fixed in 0.9.4
1718

19+
- **Span Timestamp Handling**: Fixed spans being re-processed after agent restart due to incorrect timestamp being recorded as last-processed marker
1820
- **OTLP Compliance**: Fixed log `observed_timestamp` field to use nanoseconds per OTLP specification
1921

2022
### Changed in 0.9.4

docs/DEVLOG.md

Lines changed: 42 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -35,12 +35,31 @@ This file documents detailed technical changes, internal refactorings, and devel
3535
- Column-level lineage tracking with direct and indirect dependencies
3636
- Lineage graphs delivered as structured events
3737

38+
#### SNOWFLAKE.TELEMETRY.EVENTS Support (BDX-1172)
39+
40+
- **Issue**: When a customer account had `EVENT_TABLE = snowflake.telemetry.events` (the Snowflake-managed shared event table), `SETUP_EVENT_TABLE()` listed it in `a_no_custom_event_t` — the "not a real custom table" array — and took the `IF` branch, creating DSOA's own `DTAGENT_DB.STATUS.EVENT_LOG` table and **ignoring** the Snowflake-managed table entirely.
41+
- **Root cause**: `'snowflake.telemetry.events'` was excluded from the view-creation path because the original `ELSE` branch attempted `GRANT SELECT ON TABLE snowflake.telemetry.events TO ROLE DTAGENT_VIEWER`, which Snowflake rejects — privileges cannot be granted on Snowflake-managed objects.
42+
- **Fix**: Two-part change in `src/dtagent/plugins/event_log.sql/init/009_event_log_init.sql`:
43+
1. Removed `'snowflake.telemetry.events'` from `a_no_custom_event_t` so it falls through to the `ELSE` branch
44+
2. Wrapped the `GRANT SELECT` in a `BEGIN/EXCEPTION WHEN OTHER THEN SYSTEM$LOG_WARN()` block — attempts the grant and logs warnings, ignoring failures for any read-only or Snowflake-managed table; more robust than a string comparison
45+
- **Behaviour after fix**: When `EVENT_TABLE = snowflake.telemetry.events`, DSOA creates `DTAGENT_DB.STATUS.EVENT_LOG` as a **view** over it, exactly as for any other pre-existing customer event table. All three `event_log` SQL views continue to query `DTAGENT_DB.STATUS.EVENT_LOG` unchanged — no Python changes needed.
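
The grant-and-warn behaviour described in the fix can be pictured with a small Python analogue. This is only a sketch of the control flow, not the Snowflake Scripting code in `009_event_log_init.sql`: the function name `grant_select_best_effort`, the `run_sql` callable, and the logger name are all hypothetical and exist only for illustration.

```python
import logging
from typing import Callable

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("event_log_setup")


def grant_select_best_effort(run_sql: Callable[[str], None], table: str, role: str) -> bool:
    """Try to grant SELECT; log a warning and carry on if the grant is rejected.

    `run_sql` is an assumed callable that executes one SQL statement and
    raises an exception on failure.
    """
    statement = f"GRANT SELECT ON TABLE {table} TO ROLE {role}"
    try:
        run_sql(statement)
        return True
    except Exception as exc:  # broad on purpose, like a catch-all handler
        # The failure is recorded as a warning instead of aborting setup.
        logger.warning("Could not grant SELECT on %s to %s: %s", table, role, exc)
        return False
```

Catching the failure rather than comparing table names means the same path would cover any read-only or Snowflake-managed table, which is the robustness argument made in the fix description.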
46+
3847
#### Configurable Lookback Time
3948

40-
- Added per-plugin `lookback_hours` configuration parameter
41-
- Previously hardcoded to 24 hours for all plugins
42-
- Allows fine-tuning historical data catchup per use case
43-
- Default remains 24 hours for backward compatibility
49+
- **Motivation**: Lookback windows were hardcoded across SQL views in every plugin that uses `F_LAST_PROCESSED_TS`. This could not be tuned per deployment without modifying SQL files.
50+
- **Approach**: Replace each literal with `CONFIG.F_GET_CONFIG_VALUE('plugins.<plugin>.lookback_hours', <default>)` and add `lookback_hours` to each plugin's config YAML — consistent with how `retention_hours` is already handled in `P_CLEANUP_EVENT_LOG`.
51+
- **Pattern**: `timeadd(hour, -1*F_GET_CONFIG_VALUE('plugins.<plugin>.lookback_hours', <N>), current_timestamp)` — the `-1*` multiplier converts the positive config value to a negative offset.
52+
- **Note**: Each view takes the later of the lookback bound and `F_LAST_PROCESSED_TS` in its `GREATEST(...)` clause, so normal incremental runs are unaffected; `lookback_hours` only takes effect when no prior timestamp exists or the stored timestamp is older than the lookback window.
53+
- **Files changed** (SQL views + config YAMLs):
54+
55+
| Plugin | SQL view(s) | Default |
56+
| ----------------- | -------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------ |
57+
| `event_log` | `051_v_event_log.sql`, `051_v_event_log_metrics_instrumented.sql`, `051_v_event_log_spans_instrumented.sql` | `24`h |
58+
| `login_history` | `061_v_login_history.sql`, `061_v_sessions.sql` | `24`h |
59+
| `warehouse_usage` | `070_v_warehouse_event_history.sql`, `071_v_warehouse_load_history.sql`, `072_v_warehouse_metering_history.sql` | `24`h |
60+
| `tasks`           | `061_v_serverless_tasks.sql` uses `lookback_hours` (`4`h); `063_v_task_versions.sql` uses `lookback_hours_versions` (`720`h = 1 month) | separate keys, original defaults preserved |
61+
| `event_usage` | `051_v_event_usage.sql` | `6`h |
62+
| `data_schemas` | `051_v_data_schemas.sql` | `4`h |
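
The window arithmetic behind the pattern above can be sketched in plain Python. This is an illustration, not code from the repository: `effective_start` is a hypothetical helper, and treating a missing marker as `None` that falls back to the lookback bound is an assumption about how the views behave when no timestamp has been stored.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_start(now: datetime, lookback_hours: int,
                    last_processed: Optional[datetime]) -> datetime:
    """Return the later of (now - lookback_hours) and the last-processed marker."""
    # Equivalent of timeadd(hour, -1 * lookback_hours, current_timestamp).
    fallback = now - timedelta(hours=lookback_hours)
    if last_processed is None:
        # Assumption: with no stored marker, only the lookback bound applies.
        return fallback
    # Equivalent of GREATEST(fallback, last_processed).
    return max(fallback, last_processed)


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
first_run = effective_start(now, 24, None)
normal_run = effective_start(now, 24, now - timedelta(hours=1))
after_outage = effective_start(now, 24, now - timedelta(hours=48))
```

Here `first_run` and `after_outage` both resolve to 24 hours before `now`, while `normal_run` resumes from the marker one hour back, so a larger `lookback_hours` only widens the scan when the marker is missing or stale.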
4463

4564
### Bug Fixes — Technical Details
4665

@@ -163,12 +182,25 @@ This file documents detailed technical changes, internal refactorings, and devel
163182
- **New**: Input/output validation against golden JSON files
164183
- **Impact**: Faster, more reliable, deterministic tests
165184

166-
#### Event Tables Cost Optimization
167-
168-
- **Change**: Added guidance documentation for Event Table usage
169-
- **Content**: Fine-tuning strategies to manage Snowflake costs
170-
- **Location**: Plugin documentation and configuration examples
171-
- **Impact**: Users can optimize costs based on their use case
185+
#### Event Tables Cost Optimization Documentation (BDX-688)
186+
187+
- **Change**: Expanded `event_log.config/config.md` from a minimal 5-line note to a full configuration reference
188+
- **Content added**:
189+
- Configuration options table covering all 7 plugin settings with types, defaults, and descriptions
190+
- Cost optimization guidance section explaining the cost impact of `LOOKBACK_HOURS`, `MAX_ENTRIES`, `RETENTION_HOURS`, and `SCHEDULE`
191+
- Key guidance: `retention_hours` should be `>= lookback_hours` to prevent cleanup from removing events before they are processed
192+
- **Files changed**:
193+
- `src/dtagent/plugins/event_log.config/config.md` — full configuration reference + cost guidance
194+
- `src/dtagent/plugins/event_log.config/readme.md` — updated to mention configurable lookback window
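
The `retention_hours >= lookback_hours` guidance lends itself to a simple pre-deployment check. The sketch below is hypothetical and not part of the agent: `check_event_log_config` is an illustrative name, and the flat dictionary shape is assumed; only the `lookback_hours` and `retention_hours` keys come from the plugin configuration described here.

```python
from typing import Any, Dict, List


def check_event_log_config(plugin_cfg: Dict[str, Any]) -> List[str]:
    """Return warnings for an event_log plugin config (assumed to be a flat dict)."""
    warnings: List[str] = []
    lookback = plugin_cfg.get("lookback_hours")
    retention = plugin_cfg.get("retention_hours")
    # Cleanup could remove events before they are processed if retention
    # is shorter than the lookback window.
    if lookback is not None and retention is not None and retention < lookback:
        warnings.append(
            f"retention_hours ({retention}) is lower than lookback_hours ({lookback}); "
            "events may be cleaned up before they are processed"
        )
    return warnings
```

A check like this could run wherever the configuration is loaded, so a risky combination is surfaced before any events are lost.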
195+
196+
#### Span Timestamp Handling Fix (BDX-706)
197+
198+
- **Issue**: `_process_span_rows()` in `src/dtagent/plugins/__init__.py` called `_report_execution()` with `current_timestamp()` (a Snowflake lazy column expression) instead of the actual last-row timestamp.
199+
- **Root cause**: When `STATUS.LOG_PROCESSED_MEASUREMENTS` stored this value, it received the string `'Column[current_timestamp]'` rather than a real timestamp. On the next run, `F_LAST_PROCESSED_TS` would return a malformed value, causing the `GREATEST(...)` guard in each SQL view to use the fallback lookback window — potentially re-processing spans already sent.
200+
- **Fix**: Added `last_processed_timestamp` variable tracking `row_dict.get("TIMESTAMP", last_processed_timestamp)` within the row iteration loop, mirroring the identical pattern used by `_log_entries()`. Passed `str(last_processed_timestamp)` to `_report_execution()` instead of `current_timestamp()`.
201+
- **Side effect removed**: Dropped the now-unused `from snowflake.snowpark.functions import current_timestamp` import — pylint flagged this as unused after the fix.
202+
- **Impact**: Spans and traces will no longer be re-processed after an agent restart. The `F_LAST_PROCESSED_TS('event_log_spans')` guard now advances correctly after each run.
203+
- **Affects**: `event_log` plugin (`_process_span_entries`) and any future plugin using `_process_span_rows` with `log_completion=True`
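
A minimal, self-contained sketch of the tracking pattern described in the fix follows. It is not the repository code: `process_span_rows` here is a simplified stand-in that takes plain dictionaries, and returning `None` for an empty batch is an assumption made for the example.

```python
from typing import Any, Dict, Iterable, Optional


def process_span_rows(rows: Iterable[Dict[str, Any]]) -> Optional[str]:
    """Return the marker to record: the TIMESTAMP of the last row that had one."""
    last_processed_timestamp = None
    for row_dict in rows:
        # ... the span for this row would be built and sent here ...
        # Keep the previous value when a row carries no TIMESTAMP.
        last_processed_timestamp = row_dict.get("TIMESTAMP", last_processed_timestamp)
    if last_processed_timestamp is None:
        # Assumption for this sketch: nothing processed means no marker to store.
        return None
    # A concrete value is stringified, never a lazy column expression.
    return str(last_processed_timestamp)


marker = process_span_rows([
    {"TIMESTAMP": "2024-01-01 00:00:00"},
    {"TIMESTAMP": "2024-01-01 00:05:00"},
])
```

Because the marker is taken from the data itself, the next run can resume from the last span actually handled rather than from a value that never was a real timestamp.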
172204

173205
## Version 0.9.3 — Detailed Changes
174206

docs/INSTALL.md

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -779,7 +779,17 @@ The `plugins` section allows you to configure plugin behavior globally and indiv
779779
| `plugins.disabled_by_default` | Boolean | `false` | When set to `true`, all plugins are disabled by default unless explicitly enabled |
780780
| `plugins.deploy_disabled_plugins` | Boolean | `true` | Deploy plugin code even if the plugin is disabled. When `true`, disabled plugins' SQL objects and procedures are deployed but not scheduled to run |
781781
782-
Each individual plugin can be configured with plugin-specific options. See the plugin documentation for available configuration options per plugin.
782+
Each individual plugin supports the following common configuration keys (set under `plugins.<plugin_name>`):
783+
784+
| Configuration Key | Type | Default | Description |
785+
| ----------------- | ------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
786+
| `lookback_hours` | Integer | varies | Maximum lookback window (in hours) the plugin uses when scanning for new data on each run. The effective start time is the later of `current time - lookback_hours` and the stored last-processed timestamp, so this caps how far back the agent will scan when the marker is missing or older than the lookback window (for example, on first run, after a reset, or following a long outage). During normal operation the plugin advances from the last processed timestamp automatically. See each plugin's `config.md` for the default value and any additional per-context lookback keys (e.g., `lookback_hours_versions` for the `tasks` plugin). |
787+
788+
| `schedule` | String | varies | Cron or interval schedule for the plugin's Snowflake task. See [Plugin Scheduling](#plugin-scheduling) for supported formats. |
789+
| `is_disabled` | Boolean | `false` | Set to `true` to disable this plugin. |
790+
| `telemetry` | List | varies | List of telemetry types to emit (`logs`, `metrics`, `spans`, `events`, `biz_events`). Remove items to suppress specific signal types. |
791+
792+
For plugin-specific options (e.g., `max_entries`, `retention_hours`, `include`/`exclude` filters), see the `config.md` file in each plugin's configuration directory.
783793
784794
#### OpenTelemetry Configuration Options
785795
