Fix RRD4j persistence tests for parallel CI execution #20181

Copilot · 2026-02-05T05:33:05Z

RRD4j tests fail in CI with -T1.5C (up to 6 parallel modules) because the hardcoded 1-second sleep is too short when asynchronous storage operations compete for disk I/O and CPU.

Changes

Replace fixed sleep with adaptive polling:

Poll every 250ms for data availability and return as soon as the data is ready
Use a 20-second timeout to absorb CI contention (20× the previous 1s wait)
Log the actual storage time and attempt count for CI diagnostics
Fail with a detailed message that points to contention as the likely cause

Implementation:

Add a waitForStorage(itemName, timeoutMs) method that queries the persistence service until data is available or the timeout expires
Update 4 test methods: storeAndRetrieveNumberValue, storeAndRetrieveSwitchValue, queryWithTimeRange, storeAndRetrieveWithInvalidDBConfig

Before:

service.store(numberItem);
Thread.sleep(STORAGE_TIMEOUT_MS);  // Fixed 1s wait, fails under contention

After:

service.store(numberItem);
waitForStorage(numberItem.getName(), STORAGE_TIMEOUT_MS);  // Polls until ready, up to 20s

Result: Tests complete faster in normal conditions while handling CI parallel execution robustly.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

openhab.jfrog.io
- Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.12/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.12/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.12 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.12/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/openhab-addons/openhab-addons org.codehaus.plexus.classworlds.launcher.Launcher clean test -q (dns block)
- Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.12/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.12/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.12 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.12/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/openhab-addons/openhab-addons org.codehaus.plexus.classworlds.launcher.Launcher spotless:check (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Original prompt

Problem

The RRD4j persistence tests are failing in CI but passing locally. The CI build uses -T1.5C which runs up to 6 modules in parallel, causing resource contention.

Failing CI Job: https://github.com/openhab/openhab-addons/actions/runs/21686302162/job/62534160911

Root Cause:

Tests use Thread.sleep(1000) to wait for async storage

In CI with parallel execution, disk I/O and CPU are heavily contested

1 second is insufficient when 6+ modules are testing simultaneously

Tests pass locally because there's no resource contention
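
The race described above can be reproduced without RRD4j. The following is a minimal sketch, assuming a hypothetical stand-in for the asynchronous store (a scheduled task that flips a flag after a delay); the class and method names are illustrative and not part of the test suite. The delays are scaled down so the sketch runs quickly.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for an asynchronous store; this is not the RRD4j service. */
final class FixedSleepRaceSketch {

    /**
     * Simulates a write that becomes visible after writeDelayMs,
     * then checks visibility once after a fixed sleep of sleepMs.
     */
    static boolean visibleAfterFixedSleep(long writeDelayMs, long sleepMs) throws InterruptedException {
        ScheduledExecutorService writer = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean stored = new AtomicBoolean(false);
        try {
            // The "async storage": the flag flips only after the simulated write delay.
            writer.schedule(() -> stored.set(true), writeDelayMs, TimeUnit.MILLISECONDS);
            Thread.sleep(sleepMs); // the fixed wait the tests relied on
            return stored.get();
        } finally {
            writer.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Uncontended case: the write lands well inside the sleep window.
        System.out.println("fast write visible: " + visibleAfterFixedSleep(10, 300));
        // Contended case: the write takes longer than the fixed sleep allows.
        System.out.println("slow write visible: " + visibleAfterFixedSleep(600, 100));
    }
}
```

The fixed sleep only works while the write delay stays below the sleep duration, which is the condition that parallel CI load breaks.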

Solution

Replace the hardcoded Thread.sleep(1000) with a robust polling mechanism that:

Polls for data availability every 250ms

Has a 20-second timeout for CI environments

Logs timing information to debug CI performance

Provides clear failure messages explaining contention issues

Changes Required

Update bundles/org.openhab.persistence.rrd4j/src/test/java/org/openhab/persistence/rrd4j/internal/RRD4jPersistenceServiceTest.java:

Add Constants

private static final long STORAGE_TIMEOUT_MS = 20000; // 20 seconds for CI
private static final long POLL_INTERVAL_MS = 250; // Check every 250ms

Add Polling Method

/**
 * Waits for data to be persisted by polling the database.
 * This is more robust than Thread.sleep() in CI environments with resource contention.
 * 
 * @param itemName the name of the item to check
 * @param timeoutMs maximum time to wait in milliseconds
 * @throws InterruptedException if interrupted while waiting
 */
private void waitForStorage(String itemName, long timeoutMs) throws InterruptedException {
    long startTime = System.currentTimeMillis();
    int attempts = 0;
    
    while (System.currentTimeMillis() - startTime < timeoutMs) {
        attempts++;
        
        FilterCriteria criteria = new FilterCriteria();
        criteria.setItemName(itemName);
        criteria.setPageSize(1);
        
        try {
            Iterable<HistoricItem> results = service.query(criteria);
            if (results.iterator().hasNext()) {
                long elapsed = System.currentTimeMillis() - startTime;
                logger.info("Storage completed for '{}' after {}ms ({} attempts)", 
                           itemName, elapsed, attempts);
                return; // Success!
            }
        } catch (Exception e) {
            // Query might fail if data not ready yet, continue polling
            logger.debug("Query attempt {} failed: {}", attempts, e.getMessage());
        }
        
        Thread.sleep(POLL_INTERVAL_MS);
    }
    
    long elapsed = System.currentTimeMillis() - startTime;
    fail(String.format(
        "Data for item '%s' was not persisted within %dms (%d polling attempts). " +
        "This likely indicates CI resource contention with parallel test execution (-T1.5C). " +
        "Consider increasing STORAGE_TIMEOUT_MS or running tests sequentially.",
        itemName, elapsed, attempts
    ));
}
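
The method above depends on the openHAB service and test fixtures. To exercise the same polling loop in isolation, here is a standalone sketch under a few assumptions: the helper name pollUntil, its BooleanSupplier parameter, and the -1 return value on timeout are all hypothetical and do not appear in the PR. It differs from waitForStorage() in two ways: it measures the deadline with System.nanoTime(), which is not affected by wall-clock adjustments, and it always makes at least one attempt.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Standalone sketch of the polling loop; names are illustrative, not taken from the PR. */
final class PollingSketch {

    /**
     * Polls condition until it returns true or timeoutMs has elapsed.
     * Returns the number of attempts used on success, or -1 on timeout.
     */
    static int pollUntil(BooleanSupplier condition, long timeoutMs, long pollIntervalMs)
            throws InterruptedException {
        // nanoTime is monotonic, so clock adjustments cannot shorten or stretch the wait.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        int attempts = 0;
        do {
            attempts++;
            if (condition.getAsBoolean()) {
                return attempts; // ready: return immediately instead of sleeping further
            }
            Thread.sleep(pollIntervalMs);
        } while (System.nanoTime() - deadline < 0);
        return -1; // timed out; a test would turn this into fail(...)
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated storage that becomes readable about 120 ms from now.
        long readyAt = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(120);
        int attempts = pollUntil(() -> System.nanoTime() - readyAt >= 0, 2_000, 25);
        System.out.println("ready after " + attempts + " attempts");
        // A condition that never holds exhausts the timeout.
        System.out.println("never ready: " + pollUntil(() -> false, 100, 25));
    }
}
```

With the PR's constants (20000 ms timeout, 250 ms interval) a loop of this shape makes on the order of 80 attempts before giving up, and in the common case it returns within one poll interval of the data becoming available.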
Replace All Thread.sleep(STORAGE_TIMEOUT_MS) Calls

In all test methods (e.g., storeAndRetrieveNumberValue(), storeAndRetrieveStringValue(), etc.), replace:

service.store(numberItem);
Thread.sleep(STORAGE_TIMEOUT_MS);

With:

service.store(numberItem);
waitForStorage(numberItem.getName(), STORAGE_TIMEOUT_MS);

Add Logger Field

private final Logger logger = LoggerFactory.getLogger(RRD4jPersistenceServiceTest.class);

Add imports:

import static org.junit.jupiter.api.Assertions.fail; // used by waitForStorage(); assumes the test class uses JUnit 5
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

Important Constraints

DO NOT modify pom.xml - keep build configuration as-is

This is a test-only change - no production code modifications

Maintain all existing test assertions and behavior

The solution must work with -T1.5C parallel builds

Expected Outcome

Tests pass reliably in CI with parallel execution

Helpful logging shows actual storage timing

Clear failure messages if timeout is still insufficient

No changes to production code or build configuration

This pull request was created from Copilot chat.


… tests
- Add STORAGE_TIMEOUT_MS constant (20 seconds for CI)
- Add POLL_INTERVAL_MS constant (250ms polling interval)
- Add Logger field and imports (SLF4J Logger/LoggerFactory)
- Add fail import for better assertions
- Implement waitForStorage() method that polls for data availability
- Replace all Thread.sleep calls with waitForStorage in 4 test methods
- Add detailed logging and failure messages for CI diagnostics
Co-authored-by: holgerfriedrich <[email protected]>

Refs: openhab#20181 Signed-off-by: Holger Friedrich <[email protected]> Co-authored-by: Copilot <[email protected]>

Initial plan

a334fcd

Copilot AI assigned Copilot and holgerfriedrich Feb 5, 2026

Copilot started work on behalf of holgerfriedrich February 5, 2026 05:33 View session

Copilot AI changed the title ~~[WIP] Fix RRD4j persistence tests for CI environment~~ Fix RRD4j persistence tests for parallel CI execution Feb 5, 2026

Copilot AI requested a review from holgerfriedrich February 5, 2026 05:39

Copilot finished work on behalf of holgerfriedrich February 5, 2026 05:39

holgerfriedrich added a commit to holgerfriedrich/openhab-addons that referenced this pull request Feb 6, 2026

Carry over adaptive polling from rrd4j persistence.

1d61a1e

Refs: openhab#20181 Signed-off-by: Holger Friedrich <[email protected]> Co-authored-by: Copilot <[email protected]>
