# [timescaledb] Initial contribution #20412
ulbi wants to merge 9 commits into openhab:main from ulbi:main
Commits (all by ulbi):

- `914d7eb` [timescaledb] Add native TimescaleDB persistence service
- `5bb29cd` Merge pull request #8 from ulbi/timescaledb-implementation
- `fb5420c` [timescaledb] Fix review issues: cache-miss fallback, SQL injection, …
- `eefb22d` [timescaledb] Address review comments: compression warning, runtime s…
- `908d6a0` [timescaledb] Fix feature.xml bundle duplication and external-IT docs
- `0d29df0` [timescaledb] Replace deprecated getDefaultStrategies() with getSugge…
- `b90cc67` [timescaledb] Fix README: cfg filename, code block types, table align…
- `9be57aa` [timescaledb] Revert cfg filename: PID is org.openhab.persistence.tim…
- `ba2009a` Remove PERFORMANCE_TESTS.md and update README.md for configuration pa…
# AGENTS.md - TimescaleDB Persistence Development Guide

## Context

You are working on a **native TimescaleDB persistence service** for openHAB. TimescaleDB is a time-series extension for PostgreSQL — all standard PostgreSQL JDBC drivers work, but the schema and queries use TimescaleDB-specific features.

**openHAB has no built-in downsampling/aggregation framework.** `FilterCriteria`, `PersistenceStrategy` and `PersistenceItemConfiguration` contain no aggregation concepts. Everything must be implemented inside this service.

---
## Architecture

### Key Classes

| Class | Role |
|---|---|
| `TimescaleDBPersistenceService` | Main OSGi service, implements `ModifiablePersistenceService` |
| `TimescaleDBMapper` | `State` ↔ SQL value conversion (all openHAB item types) |
| `TimescaleDBSchema` | Schema creation and migration on startup |
| `TimescaleDBQuery` | SQL query builder for all persistence operations |
| `TimescaleDBMetadataService` | Reads per-item downsampling config from `MetadataRegistry` |
| `TimescaleDBDownsampleJob` | Scheduled daily job: aggregates + deletes raw rows in-place |

### OSGi Service Registration

- Service ID: `timescaledb`
- Implements: `ModifiablePersistenceService` (= `QueryablePersistenceService` + `remove()`)
- Config PID: `org.openhab.persistence.timescaledb`
- `OH-INF/addon/addon.xml` required — registers the addon in the openHAB UI (Settings → Add-ons → TimescaleDB). Without it the bundle runs but is invisible to the UI.
- Config description: `OH-INF/config/timescaledb.xml`
- `ConfigurationPolicy.REQUIRE` — service does not start without configuration
- Scheduler: `ThreadPoolManager.getScheduledPool("timescaledb")` (shared pool — never call `shutdownNow()`)
- Deactivate: `ScheduledFuture.cancel(false)`, then `HikariDataSource.close()`
- State indicator: `dataSource != null` — no `initialized` boolean

### Dependencies

- JDBC Driver: `org.postgresql:postgresql`
- Connection pooling: HikariCP (already used in other openHAB bundles)
- openHAB Core: `org.openhab.core.persistence`, `org.openhab.core.items` (for `MetadataRegistry`)

---
## Database Schema

```sql
CREATE TABLE item_meta (
    id         SERIAL PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    label      TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE items (
    time        TIMESTAMPTZ NOT NULL,
    item_id     INTEGER NOT NULL REFERENCES item_meta(id),
    value       DOUBLE PRECISION,
    string      TEXT,
    unit        TEXT,        -- stored per row, NOT in item_meta
    downsampled BOOLEAN NOT NULL DEFAULT FALSE
);

SELECT create_hypertable('items', 'time');
CREATE INDEX ON items (item_id, time DESC);
```
### Why `unit` is per row, not in `item_meta`

A `QuantityType` unit can change over time (sensor reconfiguration, firmware update, etc.). Storing it in `item_meta` would corrupt historical reads. The unit is stored with each measurement and read back from the row when reconstructing `QuantityType` states.

### Why downsampling is in-place (same hypertable)

openHAB reads persisted data directly from the hypertable via `QueryablePersistenceService`. If aggregated data lived in separate views or tables, openHAB would not see it without query-layer changes. In-place replacement (delete raw rows → insert aggregated rows with `downsampled=TRUE`) keeps a single source of truth that openHAB reads transparently.

---
## State Type Mapping

All openHAB item types are fully supported. `TimescaleDBMapper` handles the conversion in both directions.

### Store direction (`toRow`)

| State type | `value` column | `string` column | `unit` column |
|---|---|---|---|
| `QuantityType` | numeric | null | unit string (e.g. `"°C"`) |
| `DecimalType` | numeric | null | null |
| `OnOffType` | `ON=1.0 / OFF=0.0` | null | null |
| `OpenClosedType` | `OPEN=1.0 / CLOSED=0.0` | null | null |
| `PercentType` | 0.0–100.0 | null | null |
| `UpDownType` | `UP=0.0 / DOWN=1.0` | null | null |
| `HSBType` | null | `"H,S,B"` | null |
| `DateTimeType` | null | ISO-8601 string | null |
| `PointType` | null | `"lat,lon[,alt]"` | null |
| `PlayPauseType` | null | enum name (`"PLAY"`, `"PAUSE"`, …) | null |
| `StringListType` | null | comma-separated values | null |
| `RawType` | null | Base64-encoded bytes | MIME type |
| `StringType` | null | raw string | null |
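The `RawType` row in the table above stores the payload as Base64 in the `string` column with the MIME type in `unit`. A minimal, self-contained sketch of that encoding (the class name `RawValueCodec` is hypothetical, not the actual `TimescaleDBMapper` API):

```java
import java.util.Base64;

// Illustrative only: how RawType bytes could be stored losslessly as Base64
// text in the `string` column, with the MIME type kept in the `unit` column.
public class RawValueCodec {
    // Encode raw bytes (e.g. an image) for the `string` column.
    public static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Decode the `string` column back into the original bytes on read.
    public static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }
}
```

The round trip is exact, so `ImageItem` states survive persistence byte-for-byte.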
### Load direction (`toState`)

`GroupItem` is unwrapped to its base item before dispatch. Item type determines how the row is interpreted:

- `ColorItem` → `HSBType` (parsed from `string`)
- `DateTimeItem` → `DateTimeType` (parsed from `string`)
- `LocationItem` → `PointType` (parsed from `string`)
- `PlayerItem` → `PlayPauseType` (parsed from `string`)
- `CallItem` → `StringListType` (parsed from `string`)
- `ImageItem` → `RawType` (Base64-decoded from `string`, MIME type from `unit`)
- `DimmerItem` / `RollershutterItem` → `PercentType` (**must be checked before `SwitchItem`**)
- `SwitchItem` → `OnOffType`
- `ContactItem` → `OpenClosedType`
- `NumberItem` with `unit != null` → `QuantityType`
- `NumberItem` without unit → `DecimalType`
- anything else with `string` → `StringType`

**Critical instanceof ordering in `toRow()`:** `HSBType` before `PercentType` before `DecimalType` (because `HSBType extends PercentType extends DecimalType`).
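The ordering requirement can be demonstrated with a self-contained mock of the openHAB type hierarchy (the real classes live in `org.openhab.core.library.types`; the stubs below are stand-ins for illustration only):

```java
// Mock classes mimicking the real hierarchy HSBType extends PercentType
// extends DecimalType, to show why toRow() must test the most specific
// subclass first.
class DecimalType {}
class PercentType extends DecimalType {}
class HSBType extends PercentType {}

public class DispatchDemo {
    // Correct ordering: if PercentType were checked before HSBType, every
    // HSBType instance would match the PercentType branch and be misstored.
    static String classify(DecimalType state) {
        if (state instanceof HSBType) {
            return "HSB";
        } else if (state instanceof PercentType) {
            return "PERCENT";
        } else {
            return "DECIMAL";
        }
    }
}
```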
---

## Per-Item Downsampling via Item Metadata

### How to read metadata (same pattern as InfluxDB persistence)

```java
@Reference
private MetadataRegistry metadataRegistry;

private Optional<Metadata> getItemMetadata(String itemName) {
    MetadataKey key = new MetadataKey("timescaledb", itemName);
    return Optional.ofNullable(metadataRegistry.get(key));
}
```

`Metadata` has:
- `getValue()` → main value string, e.g. `"AVG"`, `"MAX"`, `"MIN"`, `"SUM"`, or `""` (no aggregation)
- `getConfiguration()` → `Map<String, Object>` with keys like `"downsampleInterval"`, `"retainRawDays"`, `"retentionDays"`

### Metadata format (configured by users in .items files)

```java
Number:Temperature MySensor {
    timescaledb="AVG" [ downsampleInterval="1h", retainRawDays="5", retentionDays="365" ]
}
```
### Parsing the metadata

```java
public record DownsampleConfig(
    AggregationFunction function, // AVG / MAX / MIN / SUM
    Duration interval,            // e.g. Duration.ofHours(1)
    int retainRawDays,            // default 5
    int retentionDays             // default 0 = disabled
) {}

public enum AggregationFunction { AVG, MAX, MIN, SUM }
```
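A self-contained sketch of how the metadata value and configuration map might be turned into that record, applying the documented defaults (`retainRawDays=5`, `retentionDays=0`). The parser class and its helpers are hypothetical; only the key names and defaults come from the guide above:

```java
import java.time.Duration;
import java.util.Map;

public class DownsampleConfigParser {
    public enum AggregationFunction { AVG, MAX, MIN, SUM }

    public record DownsampleConfig(AggregationFunction function, Duration interval,
                                   int retainRawDays, int retentionDays) {}

    public static DownsampleConfig parse(String value, Map<String, Object> config) {
        // getValue() string selects the aggregation function (case-insensitive).
        AggregationFunction fn = AggregationFunction.valueOf(value.toUpperCase());
        Duration interval = parseInterval(String.valueOf(config.getOrDefault("downsampleInterval", "1h")));
        int retainRaw = parseInt(config.get("retainRawDays"), 5);
        int retention = parseInt(config.get("retentionDays"), 0);
        return new DownsampleConfig(fn, interval, retainRaw, retention);
    }

    // Mirrors the <number><unit> shape of the allowlisted intervals (m/h/d).
    private static Duration parseInterval(String v) {
        long n = Long.parseLong(v.substring(0, v.length() - 1));
        return switch (v.charAt(v.length() - 1)) {
            case 'm' -> Duration.ofMinutes(n);
            case 'h' -> Duration.ofHours(n);
            case 'd' -> Duration.ofDays(n);
            default -> throw new IllegalArgumentException("Bad interval: " + v);
        };
    }

    private static int parseInt(Object raw, int def) {
        return raw == null ? def : Integer.parseInt(String.valueOf(raw));
    }
}
```

Note that `parseInterval` only checks the shape; the actual SQL-side validation still goes through the allowlist described next.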
Interval parsing — **validate against an allowlist** (used in SQL string formatting):

| Metadata value | SQL interval literal |
|---|---|
| `1m` | `1 minute` |
| `5m` | `5 minutes` |
| `15m` | `15 minutes` |
| `30m` | `30 minutes` |
| `1h` | `1 hour` |
| `6h` | `6 hours` |
| `1d` | `1 day` |

Throw `IllegalArgumentException` for any value not in this list to prevent SQL injection.
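The allowlist above can be sketched as a simple lookup (class name illustrative); anything not in the map throws before it can reach SQL string formatting:

```java
import java.util.Map;

public class IntervalAllowlist {
    // Exactly the allowlisted mappings from the table above.
    private static final Map<String, String> INTERVALS = Map.of(
            "1m", "1 minute",
            "5m", "5 minutes",
            "15m", "15 minutes",
            "30m", "30 minutes",
            "1h", "1 hour",
            "6h", "6 hours",
            "1d", "1 day");

    public static String toSqlInterval(String metadataValue) {
        String sql = INTERVALS.get(metadataValue);
        if (sql == null) {
            // Rejecting unknown values here is what prevents SQL injection.
            throw new IllegalArgumentException("Unsupported downsample interval: " + metadataValue);
        }
        return sql;
    }
}
```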
---

## Downsampling Job (`TimescaleDBDownsampleJob`)

Runs daily (e.g. via `@Scheduled` or openHAB's `CronScheduler`). For each item that has `timescaledb` metadata with a non-empty aggregation function:

```sql
-- Step 1: aggregate raw rows older than retainRawDays into buckets
INSERT INTO items (time, item_id, value, unit, downsampled)
SELECT
    time_bucket('<interval>', time) AS time,
    item_id,
    <AGG_FN>(value) AS value,
    last(unit, time) AS unit,  -- keep most recent unit in bucket
    TRUE AS downsampled
FROM items
WHERE item_id = ?
  AND downsampled = FALSE
  AND time < NOW() - INTERVAL '<retainRawDays> days'
GROUP BY time_bucket('<interval>', time), item_id
ON CONFLICT DO NOTHING;

-- Step 2: delete replaced raw rows
DELETE FROM items
WHERE item_id = ?
  AND downsampled = FALSE
  AND time < NOW() - INTERVAL '<retainRawDays> days';

-- Step 3 (if retentionDays > 0): drop everything older than the retention window
DELETE FROM items
WHERE item_id = ?
  AND time < NOW() - INTERVAL '<retentionDays> days';
```

**Important:**
- `<interval>`, `<AGG_FN>`, `<retainRawDays>`, `<retentionDays>` are formatted into the SQL string — **never from user input directly**. Validate the interval against the allowlist and the function against the enum. Use `?` for `item_id`.
- `last(unit, time)` is a TimescaleDB hyperfunction — verify it is available, otherwise use `MAX(unit)` as a fallback.
- Run steps 1+2 in a transaction per item to avoid partial state.
- Log errors per item and continue (don't abort the entire job on a single-item failure).
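Assembling the step-1 statement from validated parts might look like the sketch below (class and method names are illustrative). The interval string must come from the allowlist, the aggregation function from the enum's name, and `item_id` stays a JDBC placeholder:

```java
public class DownsampleSqlBuilder {
    // sqlInterval: an allowlisted literal like "1 hour"; aggFn: an enum name
    // like "AVG"; retainRawDays: a parsed int. Never raw user input.
    public static String aggregateSql(String sqlInterval, String aggFn, int retainRawDays) {
        return "INSERT INTO items (time, item_id, value, unit, downsampled) "
                + "SELECT time_bucket('" + sqlInterval + "', time) AS time, item_id, "
                + aggFn + "(value) AS value, last(unit, time) AS unit, TRUE AS downsampled "
                + "FROM items WHERE item_id = ? AND downsampled = FALSE "
                + "AND time < NOW() - INTERVAL '" + retainRawDays + " days' "
                + "GROUP BY time_bucket('" + sqlInterval + "', time), item_id "
                + "ON CONFLICT DO NOTHING";
    }
}
```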
---

## Query Implementation

- All item name / date / state lookups use JDBC `PreparedStatement` — no string concatenation for user-controlled values.
- The `time_bucket()` interval is formatted as a string but validated against the allowlist above.
- `historicState`: `WHERE item_id=? AND time <= ? ORDER BY time DESC LIMIT 1`
- `getAllStatesBetween`: `WHERE item_id=? AND time BETWEEN ? AND ? ORDER BY time ASC` — returns both raw and downsampled rows.
- Aggregate queries (`averageSince`, `minSince`, etc.): `WHERE item_id=? AND time >= ?` — operate on all rows including downsampled ones, which is correct.
---

## item_id Caching

Cache `name → item_id` in a `ConcurrentHashMap` to avoid a SELECT on every `store()` call. Invalidate on service restart. Auto-insert into `item_meta` on first `store()` if the item is unknown.

```java
private final Map<String, Integer> itemIdCache = new ConcurrentHashMap<>();

private int getOrCreateItemId(String name, @Nullable String label) {
    return itemIdCache.computeIfAbsent(name, n -> fetchOrInsertItemMeta(n, label));
}
```
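A runnable, self-contained version of this pattern with the database call replaced by a counting stub, showing that `computeIfAbsent` performs the (simulated) lookup exactly once per item name:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ItemIdCacheDemo {
    private final Map<String, Integer> cache = new ConcurrentHashMap<>();
    final AtomicInteger dbLookups = new AtomicInteger();

    int getOrCreateItemId(String name) {
        return cache.computeIfAbsent(name, this::fetchOrInsertItemMeta);
    }

    // Stand-in for the real SELECT/INSERT against item_meta.
    private int fetchOrInsertItemMeta(String name) {
        dbLookups.incrementAndGet();
        return name.hashCode(); // deterministic fake id
    }
}
```

One caveat worth noting: `computeIfAbsent` holds a map-bin lock while the mapping function runs, so a slow or failing DB call blocks other writers of the same bin; the real implementation should keep the DB round-trip short and let exceptions propagate (a failed lookup is not cached).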
---

## Testing

### Unit Tests (no DB required)

Location: `src/test/java/org/openhab/persistence/timescaledb/internal/`

- `BundleManifestTest` — OSGi Import-Package allowlist + presence of `OH-INF/addon/addon.xml`
- `TimescaleDBMapperTest` — State ↔ SQL value round-trips for all state types
- `TimescaleDBMetadataServiceTest` — parsing of metadata values and config keys
- `TimescaleDBDownsampleJobTest` — SQL generation for aggregation/delete, interval allowlist validation

Run with `mvn test` — last result: **183 tests, 0 failures** (2026-03-13).

### Integration Tests (requires Docker + TimescaleDB)

Tagged `@Tag("integration")`, run automatically via Testcontainers during `mvn test`:

```java
@Container
static PostgreSQLContainer<?> db = new PostgreSQLContainer<>("timescale/timescaledb:latest-pg16")
        .withDatabaseName("openhab_test")
        .withUsername("openhab")
        .withPassword("openhab");
```

Tests cover: schema creation, store/query round-trips, downsampling job results, compression policy creation.
### Performance Tests (requires external DB)

Tagged `@Tag("performance")` — **excluded from `mvn test`** via `<excludedGroups>performance,external-integration</excludedGroups>` in pom.xml.

Run explicitly against an external TimescaleDB:

```bash
HOST=... PORT=5432 DBNAME=openhab USER=openhab PASSWORD=... \
    mvn test -Dtest=TimescaleDBPerformanceIT \
    -pl bundles/org.openhab.persistence.timescaledb
```

See `PERFORMANCE_TESTS.md` for SLOs, scale constants, and the heavy 18-month scenario.

---
## Common Pitfalls

1. **TimescaleDB extension not installed**: check on startup with `SELECT extname FROM pg_extension WHERE extname='timescaledb'`, fail with a clear error if missing.
2. **`last()` availability**: `last(unit, time)` is a TimescaleDB hyperfunction — check that it is available in the connected database, fall back to `MAX(unit)` otherwise.
3. **Compression + INSERT conflict**: compressed chunks are read-only. The downsampling INSERT must target the uncompressed region (data newer than `compressionAfterDays`). Ensure `retainRawDays < compressionAfterDays`.
4. **Interval allowlist is mandatory**: `time_bucket('1h', time)` is dynamically formatted — any non-allowlisted value must throw before it reaches SQL.
5. **`ON CONFLICT DO NOTHING`** on the aggregation INSERT: the job may run twice if interrupted; duplicate bucket rows must be prevented.
6. **`QuantityType` unit changes**: never update `item_meta` with a unit — the unit lives on each row. On read, take the `unit` value from the row.
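Pitfall 3 suggests a startup guard so the downsampling job never tries to rewrite compressed chunks. A minimal sketch, assuming a `compressionAfterDays` config value where `0` means compression is disabled (class and parameter names are illustrative):

```java
public class CompressionGuard {
    // Reject configurations where raw rows would still exist inside the
    // compressed (read-only) region of the hypertable.
    public static void validate(int retainRawDays, int compressionAfterDays) {
        if (compressionAfterDays > 0 && retainRawDays >= compressionAfterDays) {
            throw new IllegalArgumentException(
                    "retainRawDays (" + retainRawDays + ") must be smaller than "
                    + "compressionAfterDays (" + compressionAfterDays
                    + "): downsampling cannot rewrite compressed chunks");
        }
    }
}
```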
---

## Relevant openHAB Core APIs

- `org.openhab.core.persistence.QueryablePersistenceService` — implement this
- `org.openhab.core.persistence.FilterCriteria` — query parameters passed to `query()`
- `org.openhab.core.items.MetadataRegistry` — OSGi service, inject via `@Reference`
- `org.openhab.core.items.Metadata` — `getValue()` + `getConfiguration()` for per-item config
- `org.openhab.core.items.MetadataKey` — constructed as `new MetadataKey("timescaledb", itemName)`
- `org.openhab.core.library.types.*` — `QuantityType`, `DecimalType`, `OnOffType`, etc.

## References

- [TimescaleDB docs](https://docs.timescale.com/)
- [time_bucket()](https://docs.timescale.com/api/latest/hyperfunctions/time_bucket/)
- [last()](https://docs.timescale.com/api/latest/hyperfunctions/last/)
- [Compression](https://docs.timescale.com/use-timescale/latest/compression/)
- InfluxDB persistence (metadata pattern): `bundles/org.openhab.persistence.influxdb/src/main/java/.../InfluxDBMetadataService.java`
- Existing downsampling logic (Python/MongoDB): `DOWNSAMPLE_IMPLEMENTATION_GUIDE.md` in this bundle
**lsiepel:** Remove this file
**ulbi:** @lsiepel Please be more specific about why this should be removed. This makes life much easier when working with agents on this bundle.
**lsiepel:** We have a strict policy about bundle structure and the files added to source control. This file is not part of it.
**ulbi:** @lsiepel Thanks for your very strict answer. You may want to think about further evolving the policies, but since you insist on this, I will move the file to my own storage and keep it there.
@lsiepel - I think we can allow AI agent instructions at binding level to give agents more context and thereby improve the quality of their output. We already have an agents file here: https://github.com/openhab/openhab-addons/blob/main/AGENTS.md