@himdel (Contributor) commented Dec 18, 2025

Issues: AAP-59317 (storagedb), AAP-59482 (psycopg.sql)

- Drop psycopg2 support (not needed since awx 22.4.0, so 2.4 should be fine).
- Use psycopg.sql and query params instead of string interpolation.
- Add metrics_utility.library.storage helpers for converting dicts/lists to files and back.
- Add StoragePostgres(db, table='name', key_field='key', value_field='value', timestamp_field=None) and a create_storage_table helper.
- Add example workers/6-gather-postgres.py, which runs a collector and saves the results in the DB.
- Fix up workers/ settings handling and add a prepare() call; uv run python workers/6-... works now and collects to the DB.

Assisted by Claude

@MilanPospisil (Contributor) left a comment

LGTM :-) 👍

himdel and others added 10 commits December 19, 2025 15:27

Removes psycopg2 compatibility code and functions:
- Remove try/except import fallback for psycopg.errors.UndefinedTable
- Remove _copy_table_aap_2_4_and_below and _copy_table_aap_2_5_and_above helper functions
- Simplify copy_table to only use psycopg3 cursor.copy() method
- Remove psycopg2 mock setup from tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Migrates all query construction to use psycopg.sql with parameterized
queries instead of string formatting for better security and reliability:
- Update date_where() to return (sql.SQL, params) tuple
- Migrate config.py settings query to use sql.SQL and placeholders
- Migrate job_host_summary_service to use parameterized time ranges
- Migrate main_host_daily to use parameterized date_where
- Migrate main_jobevent_service to use sql.SQL for WHERE clauses
- Update all copy_table calls to pass params separately
- Update tests to verify parameterized queries instead of embedded values

This prevents SQL injection and leaves value quoting and type adaptation to psycopg.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Introduces reusable helper functions for loading and saving data in
common formats (CSV, JSON, Parquet):

- load_csv(), load_json(), load_parquet() - Accept filename or file-like objects
- save_csv(), save_json(), save_parquet() - Support both filename= and fileobj= parameters
- Consistent API following storage put() convention
- Proper UTF-8 encoding support
- Type validation and error handling
- Comprehensive test coverage

These helpers provide a consistent interface for file I/O operations
that can be reused across storage backends and data processing code.
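
As a rough illustration of the filename= / fileobj= convention described above, here is a sketch of a JSON pair. The function bodies and the error message are assumptions, not the helpers added in this PR.

```python
import io
import json


def save_json(data, filename=None, fileobj=None):
    """Write data as JSON to exactly one of filename= or fileobj= (sketch)."""
    if (filename is None) == (fileobj is None):
        raise ValueError("pass exactly one of filename= or fileobj=")
    if filename is not None:
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False)
    else:
        # fileobj is assumed to be a text-mode file-like object.
        json.dump(data, fileobj, ensure_ascii=False)


def load_json(source):
    """Read JSON from a path or from a file-like object."""
    if hasattr(source, "read"):
        return json.load(source)
    with open(source, encoding="utf-8") as f:
        return json.load(f)


# Round trip through an in-memory buffer instead of a temporary file.
buf = io.StringIO()
save_json({"hosts": 3, "name": "café"}, fileobj=buf)
buf.seek(0)
roundtrip = load_json(buf)
```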

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Introduces StoragePostgres class for storing data in PostgreSQL using
a key-value table with JSONB values:

Features:
- get(key) - Retrieve data by key
- put(key, filename=|fileobj=|dict=) - Store data from various sources
- glob(pattern, since=, until=) - List keys with pattern matching and time filtering
- exists(key) - Check key existence
- remove(key) - Delete key
- Automatic CSV/JSON file loading via storage helpers
- Safe query building with psycopg.sql
- Optional timestamp tracking with create_storage_table() helper

The implementation uses JSONB for flexible schema-free storage and
supports the same interface conventions as other storage backends.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Implements a consistent get_data() method across storage backends to retrieve and parse data directly without temporary files. Supports auto-detection and explicit format specification for JSON, CSV, and Parquet files.

Changes:
- Add get_data() to StorageDirectory and StorageS3 with format auto-detection
- Refactor StoragePostgres.get() to context manager, add get_data() for direct access
- Add NotImplementedError for get()/get_data() in write-only StorageCRC and StorageSegment
- Add comprehensive test coverage for new functionality
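
The format auto-detection described above could look roughly like the sketch below. parse_data is a hypothetical helper name, detection by file extension is an assumption, and Parquet is left out because it needs a third-party library.

```python
import csv
import io
import json
import os


def parse_data(name, raw, format=None):
    """Parse raw bytes as JSON or CSV.

    format=None means "detect from the extension of name"; an explicit
    format= overrides whatever the extension says.
    """
    fmt = format or os.path.splitext(name)[1].lstrip(".").lower()
    text = raw.decode("utf-8")
    if fmt == "json":
        return json.loads(text)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"cannot detect format for {name!r}; pass format= explicitly")


detected = parse_data("config.json", b'{"a": 1}')
explicit = parse_data("blob.unknown", b'{"a": 1}', format="json")
rows = parse_data("hosts.csv", b"host,jobs\nweb1,3\n")
```

An object without a recognizable extension can only be parsed with an explicit format=, which is why a test of auto-detection needs a name such as config.json.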

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

The tests were failing because:
1. test_get_data_json used an object without a file extension, preventing auto-detection
2. test_get_data_json_explicit_format tried to read an object deleted by an earlier test

Fixed by having each test create its own test object with proper extension (.json for auto-detection, .unknown for testing the format parameter) and clean up afterward.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Fix ConnectionHandler pattern from ConnectionHandler(settings.foo) to ConnectionHandler({'default': settings.foo})['default']
- Add missing import os to workers/settings.py
- Populate settings with actual controller_db and metrics_db configurations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Add new worker script that collects data from controller database using
config and job_host_summary collectors, and uploads CSV files to
PostgreSQL storage in metrics database.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

uv run workers/6-gather-postgres.py now successfully collects config to the db,
and an empty job_host_summary too
@sonarqubecloud

Quality Gate failed

Failed conditions:
- 79.4% Coverage on New Code (required ≥ 80%)
- 3.0% Duplication on New Code (required ≤ 3%)

'timestamp_field': 'updated_at',
}

library.storage.postgres.create_storage_table(**storage_config)

Incorrect module path access for postgres submodule

The code accesses library.storage.postgres.create_storage_table(), but the postgres submodule is not exported in metrics_utility/library/storage/__init__.py. The __init__.py only imports specific items from the postgres module (from .postgres import StoragePostgres, create_storage_table), not the module itself. This will raise an AttributeError at runtime. The correct call is library.storage.create_storage_table(**storage_config).
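
Whether the AttributeError actually occurs is worth checking before changing the call: in Python's import system, importing a name from a submodule also binds that submodule as an attribute of its parent package. A minimal sketch with a throwaway package (demo_pkg is hypothetical and only mimics the import line quoted above, not the real metrics_utility layout):

```python
import importlib
import os
import sys
import tempfile

# Build a package whose __init__.py imports only a name from its submodule,
# never the submodule itself.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "postgres.py"), "w", encoding="utf-8") as f:
    f.write("def create_storage_table(**kwargs):\n    return sorted(kwargs)\n")
with open(os.path.join(pkg_dir, "__init__.py"), "w", encoding="utf-8") as f:
    f.write("from .postgres import create_storage_table\n")

sys.path.insert(0, root)
demo_pkg = importlib.import_module("demo_pkg")

# Loading the submodule as a side effect of "from .postgres import ..."
# places a "postgres" binding in the package namespace.
has_submodule_attr = hasattr(demo_pkg, "postgres")
same_function = (
    demo_pkg.postgres.create_storage_table is demo_pkg.create_storage_table
)
```

If the real __init__.py matches the quoted import, library.storage.postgres would resolve as well, although calling the re-exported library.storage.create_storage_table remains the tidier spelling.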

@himdel marked this pull request as draft January 6, 2026 14:23