StoragePostgres - support saving csv/json/dict/list in a postgres key-value table #303
base: devel
metrics_utility/library/collectors/controller/job_host_summary_service.py
MilanPospisil
left a comment
LGTM :-) 👍
Removes psycopg2 compatibility code and functions:
- Remove try/except import fallback for psycopg.errors.UndefinedTable
- Remove _copy_table_aap_2_4_and_below and _copy_table_aap_2_5_and_above helper functions
- Simplify copy_table to only use psycopg3 cursor.copy() method
- Remove psycopg2 mock setup from tests
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Migrates all query construction to use psycopg.sql with parameterized queries instead of string formatting for better security and reliability:
- Update date_where() to return (sql.SQL, params) tuple
- Migrate config.py settings query to use sql.SQL and placeholders
- Migrate job_host_summary_service to use parameterized time ranges
- Migrate main_host_daily to use parameterized date_where
- Migrate main_jobevent_service to use sql.SQL for WHERE clauses
- Update all copy_table calls to pass params separately
- Update tests to verify parameterized queries instead of embedded values
This prevents SQL injection and improves PostgreSQL query handling.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
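The "(clause, params) tuple" pattern described in this commit can be sketched as follows. This is a minimal illustration, not the PR's code: it uses the standard-library `sqlite3` driver (with `?` placeholders) as a stand-in so it runs without a PostgreSQL server, whereas the PR builds the clause with `psycopg.sql` and `%s` placeholders. The `date_where` signature, the column allow-list, and the table layout are assumptions made for the example.

```python
import sqlite3
from datetime import datetime

# Hypothetical allow-list: identifiers cannot be bound as parameters, so this
# sketch validates them instead (psycopg.sql offers sql.Identifier for this).
ALLOWED_COLUMNS = {"created", "modified"}


def date_where(column, since, until):
    """Return (clause, params); the timestamp values never enter the SQL text."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column!r}")
    clause = f'"{column}" >= ? AND "{column}" < ?'
    return clause, (since.isoformat(), until.isoformat())


def count_rows(conn, since, until):
    """Count rows in the half-open interval [since, until)."""
    clause, params = date_where("modified", since, until)
    query = f"SELECT COUNT(*) FROM job_host_summary WHERE {clause}"
    # The driver binds params separately from the query string.
    return conn.execute(query, params).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_host_summary (id INTEGER, modified TEXT)")
conn.executemany(
    "INSERT INTO job_host_summary VALUES (?, ?)",
    [
        (1, "2024-01-01T12:00:00"),
        (2, "2024-01-02T12:00:00"),
        (3, "2024-01-03T00:00:00"),
    ],
)
matched = count_rows(conn, datetime(2024, 1, 1), datetime(2024, 1, 3))
```

Returning the parameters alongside the clause lets callers such as `copy_table` pass them through to the driver untouched, which is what "pass params separately" refers to.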
Introduces reusable helper functions for loading and saving data in common formats (CSV, JSON, Parquet):
- load_csv(), load_json(), load_parquet() - Accept filename or file-like objects
- save_csv(), save_json(), save_parquet() - Support both filename= and fileobj= parameters
- Consistent API following storage put() convention
- Proper UTF-8 encoding support
- Type validation and error handling
- Comprehensive test coverage
These helpers provide a consistent interface for file I/O operations that can be reused across storage backends and data processing code.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
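A rough sketch of what helpers with this shape could look like, for JSON and CSV only. The function names come from the commit message; the exact signatures, the "exactly one of filename= or fileobj=" rule, and the error types are assumptions for illustration, and Parquet is left out because it needs a third-party library.

```python
import csv
import io
import json
import os


def _check_target(filename, fileobj):
    # Assumed rule: exactly one destination must be given.
    if (filename is None) == (fileobj is None):
        raise ValueError("pass exactly one of filename= or fileobj=")


def save_json(data, filename=None, fileobj=None):
    _check_target(filename, fileobj)
    if not isinstance(data, (dict, list)):
        raise TypeError("data must be a dict or a list")
    if filename is not None:
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False)
    else:
        json.dump(data, fileobj, ensure_ascii=False)


def load_json(source):
    # Accept a path or an already-open text file object.
    if isinstance(source, (str, os.PathLike)):
        with open(source, encoding="utf-8") as f:
            return json.load(f)
    return json.load(source)


def save_csv(rows, filename=None, fileobj=None):
    _check_target(filename, fileobj)
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        raise TypeError("rows must be a list of dicts")
    fieldnames = list(rows[0]) if rows else []

    def write(f):
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    if filename is not None:
        with open(filename, "w", encoding="utf-8", newline="") as f:
            write(f)
    else:
        write(fileobj)


def load_csv(source):
    if isinstance(source, (str, os.PathLike)):
        with open(source, encoding="utf-8", newline="") as f:
            return list(csv.DictReader(f))
    return list(csv.DictReader(source))


# Round trip through in-memory text buffers.
json_buf = io.StringIO()
save_json({"naïve": [1, 2]}, fileobj=json_buf)
json_buf.seek(0)
json_back = load_json(json_buf)

csv_buf = io.StringIO()
save_csv([{"host": "web1", "count": "2"}], fileobj=csv_buf)
csv_buf.seek(0)
csv_back = load_csv(csv_buf)
```

Note that CSV values come back as strings, so a round trip is only lossless for string data.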
Introduces StoragePostgres class for storing data in PostgreSQL using a key-value table with JSONB values:
Features:
- get(key) - Retrieve data by key
- put(key, filename=|fileobj=|dict=) - Store data from various sources
- glob(pattern, since=, until=) - List keys with pattern matching and time filtering
- exists(key) - Check key existence
- remove(key) - Delete key
- Automatic CSV/JSON file loading via storage helpers
- Safe query building with psycopg.sql
- Optional timestamp tracking with create_storage_table() helper
The implementation uses JSONB for flexible schema-free storage and supports the same interface conventions as other storage backends.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
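To make the key-value interface above concrete, here is a small stand-in, not the PR's `StoragePostgres`. It is a sketch under several assumptions: it uses the standard-library `sqlite3` module with JSON stored as text (instead of PostgreSQL JSONB), the class name `KeyValueStorageSketch` is hypothetical, `put()` accepts only `dict=`, and `glob()` filters keys in Python with `fnmatch` and omits the `since=`/`until=` filtering.

```python
import fnmatch
import json
import sqlite3


class KeyValueStorageSketch:
    """Hypothetical stand-in for a key-value storage backend."""

    def __init__(self, conn, table="storage"):
        # Table names cannot be bound as parameters, so validate them here.
        if not table.isidentifier():
            raise ValueError(f"invalid table name: {table!r}")
        self.conn = conn
        self.table = table
        self.conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{table}" '
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def put(self, key, dict=None):
        # Parameter name follows the dict= keyword from the commit message.
        if dict is None:
            raise ValueError("dict= is required in this sketch")
        self.conn.execute(
            f'INSERT OR REPLACE INTO "{self.table}" (key, value) VALUES (?, ?)',
            (key, json.dumps(dict)),
        )

    def get(self, key):
        row = self.conn.execute(
            f'SELECT value FROM "{self.table}" WHERE key = ?', (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

    def exists(self, key):
        row = self.conn.execute(
            f'SELECT 1 FROM "{self.table}" WHERE key = ?', (key,)
        ).fetchone()
        return row is not None

    def remove(self, key):
        self.conn.execute(f'DELETE FROM "{self.table}" WHERE key = ?', (key,))

    def glob(self, pattern):
        rows = self.conn.execute(f'SELECT key FROM "{self.table}" ORDER BY key')
        return [key for (key,) in rows if fnmatch.fnmatchcase(key, pattern)]


store = KeyValueStorageSketch(sqlite3.connect(":memory:"))
store.put("2024/01/config.json", dict={"version": 1})
store.put("2024/02/config.json", dict={"version": 2})
store.put("other.json", dict={"version": 3})
```

Storing one JSON document per key keeps the table schema fixed while the payload shape varies, which is the "schema-free" property the commit message mentions.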
Implements a consistent get_data() method across storage backends to retrieve and parse data directly without temporary files. Supports auto-detection and explicit format specification for JSON, CSV, and Parquet files.
Changes:
- Add get_data() to StorageDirectory and StorageS3 with format auto-detection
- Refactor StoragePostgres.get() to context manager, add get_data() for direct access
- Add NotImplementedError for get()/get_data() in write-only StorageCRC and StorageSegment
- Add comprehensive test coverage for new functionality
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
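The format auto-detection described here could work roughly as below. This is a sketch, assuming detection is based on the key's file extension and that an explicit `format` argument overrides it; the helper names `detect_format` and `parse_data` are hypothetical, and Parquet parsing is not implemented because it requires a third-party library.

```python
import csv
import io
import json
import os

KNOWN_FORMATS = ("json", "csv", "parquet")


def detect_format(key, format=None):
    """Pick the format from the explicit argument, else from the extension."""
    if format is not None:
        if format not in KNOWN_FORMATS:
            raise ValueError(f"unsupported format: {format!r}")
        return format
    ext = os.path.splitext(key)[1].lower().lstrip(".")
    if ext in KNOWN_FORMATS:
        return ext
    raise ValueError(f"cannot detect format of {key!r}; pass format= explicitly")


def parse_data(key, raw, format=None):
    """Parse raw bytes in memory, without writing a temporary file."""
    fmt = detect_format(key, format)
    if fmt == "json":
        return json.loads(raw.decode("utf-8"))
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))
    raise NotImplementedError("parquet needs a third-party reader")


from_extension = parse_data("report.json", b'{"hosts": 3}')
from_argument = parse_data("report.unknown", b"[1, 2]", format="json")
csv_rows = parse_data("report.csv", b"host,count\nweb1,2\n")
```

An object without a recognizable extension can only be read with an explicit format, which matches the `.json` versus `.unknown` cases exercised by the tests.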
The tests were failing because:
1. test_get_data_json used an object without a file extension, preventing auto-detection
2. test_get_data_json_explicit_format tried to read an object deleted by an earlier test
Fixed by having each test create its own test object with proper extension (.json for auto-detection, .unknown for testing the format parameter) and clean up afterward.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fix ConnectionHandler pattern from ConnectionHandler(settings.foo) to ConnectionHandler({'default': settings.foo})['default']
- Add missing import os to workers/settings.py
- Populate settings with actual controller_db and metrics_db configurations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add new worker script that collects data from controller database using config and job_host_summary collectors, and uploads CSV files to PostgreSQL storage in metrics database.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
uv run workers/6-gather-postgres.py successfully collects config to db, and empty job_host_summary too
        'timestamp_field': 'updated_at',
    }

    library.storage.postgres.create_storage_table(**storage_config)
Incorrect module path access for postgres submodule
The code accesses library.storage.postgres.create_storage_table(), but the postgres submodule is not exported in metrics_utility/library/storage/__init__.py. The __init__.py only imports specific items from the postgres module (from .postgres import StoragePostgres, create_storage_table), not the module itself. This will raise an AttributeError at runtime. The correct call is library.storage.create_storage_table(**storage_config).
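Whether the attribute access fails can be checked against Python's import rules: when a submodule is loaded by any mechanism, including `from .postgres import name` inside `__init__.py`, the import system also binds the submodule on the parent package. The sketch below demonstrates this with a throwaway package; the package name `demo_storage_pkg` and the function body are hypothetical, and it assumes the real `library.storage` package is itself imported before the call.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def submodule_is_bound(package_name="demo_storage_pkg"):
    """Build a temp package whose __init__ only imports names from a submodule."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / package_name
        pkg_dir.mkdir()
        (pkg_dir / "postgres.py").write_text(
            "def create_storage_table(**kwargs):\n    return sorted(kwargs)\n"
        )
        # Mirrors the pattern from the review comment: names only, no module import.
        (pkg_dir / "__init__.py").write_text(
            "from .postgres import create_storage_table\n"
        )
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module(package_name)
            has_attr = hasattr(pkg, "postgres")
            same_func = pkg.postgres.create_storage_table is pkg.create_storage_table
            return has_attr, same_func
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop(package_name, None)
            sys.modules.pop(package_name + ".postgres", None)


result = submodule_is_bound()
```

In this demo both spellings resolve to the same function, so the shorter `library.storage.create_storage_table(**storage_config)` reads as a style preference rather than a required fix, provided the package layout matches the one assumed here.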


Issue: AAP-59317 (storagedb), AAP-59482 (psycopg.sql)
- Drop psycopg2 support - since awx 22.4.0, so 2.4 should be fine.
- Use `psycopg.sql` & query params instead of string interpolations.
- Add `metrics_utility.library.storage` helpers - converting dict/list to files, and back
- Add `StoragePostgres(db, table='name', key_field='key', value_field='value', timestamp_field=None)`, and a `create_storage_table` helper
- Add example `workers/6-gather-postgres.py` which runs a collector and saves results in DB.
- Fixup workers/ settings handling, add `prepare()` call - `uv run python workers/6-...` works now, collects to db.

Assisted by Claude