Skip to content

Fix daily energy zero-bounce spikes during reconnect#314

Open
Q-Claw wants to merge 4 commits intoTypQxQ:mainfrom
Q-Claw:fix-313-zero-bounce
Open

Fix daily energy zero-bounce spikes during reconnect#314
Q-Claw wants to merge 4 commits intoTypQxQ:mainfrom
Q-Claw:fix-313-zero-bounce

Conversation

@Q-Claw
Copy link
Contributor

@Q-Claw Q-Claw commented Feb 28, 2026

Summary

This PR hardens daily total_increasing energy sensors against transient reconnect glitches that can cause phantom energy in the HA Energy Dashboard.

Changes

  • custom_components/sigen/calculated_sensor.py

    • In _calculate_total_inverter_energy, track valid inverter samples per poll.
    • If all values are missing/invalid, return None (unavailable) instead of 0.0.
  • custom_components/sigen/sensor.py

    • Add a guard for daily energy keys:
      • plant_daily_pv_energy
      • plant_daily_battery_charge_energy
      • plant_daily_battery_discharge_energy
      • inverter_daily_pv_energy
      • inverter_ess_daily_charge_energy
      • inverter_ess_daily_discharge_energy
    • If value drops from >0 to 0 outside a midnight reset window, publish None for that cycle.
    • Keep legitimate midnight reset behavior by allowing a reset window around 00:00.

Why

Under transient Modbus/network interruptions, some daily counters can briefly report zero, which HA interprets as new production when values recover. This inflates statistics.

Result

Transient reconnect glitches become unavailable instead of false zero resets, preventing phantom energy spikes while preserving real midnight resets.

Closes #313

- Return unavailable when all inverter daily-energy samples are missing/invalid in a poll
- Guard daily total_increasing energy sensors against transient daytime 0 drops
- Keep legitimate midnight resets by allowing a reset time window
@greptile-apps
Copy link

greptile-apps bot commented Feb 28, 2026

Greptile Summary

This PR adds two complementary guards to prevent daily total_increasing energy sensors from producing phantom spikes in the HA Energy Dashboard when Modbus/network reconnects cause inverters to briefly report zero. calculated_sensor.py returns None instead of Decimal(0) when every inverter datum is absent, and sensor.py adds an in-memory per-sensor guard that suppresses a drop to exactly 0 outside a symmetric ±20-minute midnight window, preserving legitimate end-of-day resets.

Key observations:

  • calculated_sensor.py change is clean — the valid_sample_count guard correctly handles the all-missing case with no regressions.
  • sensor.py guard has an edge case: if a Modbus/network outage spans the entire midnight window (e.g. connection drops before 23:40 and recovers after 00:20), the first post-reconnect 0 (a legitimate midnight reset) is incorrectly classified as a transient glitch and suppressed. The sensor becomes unavailable until production resumes. This does not cause phantom energy but does delay acknowledgment of the midnight reset and may confuse users or automations.
  • _last_valid_daily_energy_value is in-memory only — a HA restart clears it; the first post-restart poll is intentionally unguarded (documented in code).
  • The symmetric midnight window (_is_near_daily_reset) correctly handles both pre- and post-midnight polling, improving on an earlier asymmetric implementation.

Confidence Score: 3/5

  • Safe to merge for the primary use-case; a niche extended-outage-across-midnight scenario can temporarily misclassify a legitimate reset as a glitch, suppressing the zero without causing phantom energy.
  • The core zero-bounce protection logic is correct and well-structured. The symmetric midnight window is a sound improvement. The main concern is the extended-outage edge case in _apply_daily_energy_zero_guard where an outage spanning the full ±20-minute midnight window causes a post-reconnect legitimate 0 to be suppressed — this doesn't produce phantom energy but does leave sensors unavailable until first real activity, which could surprise users or break automations relying on a clean daily reset. The calculated_sensor.py change is independently solid.
  • Pay close attention to custom_components/sigen/sensor.py — specifically the _apply_daily_energy_zero_guard method's behaviour when a reconnect occurs outside the midnight window after a prolonged outage.

Important Files Changed

Filename Overview
custom_components/sigen/sensor.py Adds in-memory zero-bounce guard for six daily energy keys. Logic is mostly sound but an extended Modbus outage that spans the entire ±20-minute midnight window can cause the post-reconnect legitimate midnight reset (value=0) to be incorrectly suppressed.
custom_components/sigen/calculated_sensor.py Correctly adds valid_sample_count tracking and returns None instead of Decimal(0) when every inverter datum is missing/invalid, preventing false zero totals from propagating to the Energy Dashboard on reconnect.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[native_value called] --> B{value_fn present?}
    B -- yes --> C[Run value_fn / calculated_sensor]
    C --> D{All inverter samples missing?}
    D -- yes --> E[Return None\ncalculated_sensor.py]
    D -- no --> F[Sum valid samples → Decimal total]
    F --> G[_apply_daily_energy_zero_guard]
    B -- no --> H[Read raw_value from coordinator]
    H --> I[_apply_daily_energy_zero_guard]
    G --> J{key in PROTECTED_DAILY_ENERGY_KEYS?}
    I --> J
    J -- no --> K[Return value unchanged]
    J -- yes --> L{value is None?}
    L -- yes --> M[Return None\nunavailable]
    L -- no --> N{value_dec == 0\nAND last > 0\nAND NOT near midnight?}
    N -- yes --> O[Suppress: Return None\nunavailable\nDo NOT update last_valid]
    N -- no --> P[Update _last_valid_daily_energy_value\nReturn value]
    style O fill:#f99,stroke:#c00
    style E fill:#f99,stroke:#c00
    style M fill:#fdd,stroke:#c00
Loading

Last reviewed commit: 5dbf28d

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 7 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +247 to +251
def _is_near_daily_reset(self) -> bool:
"""Return True around midnight to allow legitimate daily counter resets."""
now = dt_util.now()
midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
return abs(now - midnight) <= DAILY_RESET_GUARD_WINDOW
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic only handles 20 minutes AFTER midnight, not before. at 23:50 (10 min before midnight), midnight = today 00:00, now - midnight = 23h50m, which is > 20 minutes, so it returns False. this will incorrectly suppress legitimate early resets.

Suggested change
def _is_near_daily_reset(self) -> bool:
"""Return True around midnight to allow legitimate daily counter resets."""
now = dt_util.now()
midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
return abs(now - midnight) <= DAILY_RESET_GUARD_WINDOW
def _is_near_daily_reset(self) -> bool:
"""Return True around midnight to allow legitimate daily counter resets."""
now = dt_util.now()
time_of_day = now.hour * 60 + now.minute
# Within 20 minutes before or after midnight (23:40-00:20)
return time_of_day <= 20 or time_of_day >= (24 * 60 - 20)

@greptile-apps
Copy link

greptile-apps bot commented Feb 28, 2026

Additional Comments (1)

custom_components/sigen/calculated_sensor.py
safe_decimal(None) returns Decimal(0.0), not None (see common.py:269), so when inverter_data.get(energy_key) returns None (missing key), it's converted to 0 and counted as a valid sample. this means valid_sample_count will never be 0 if inverters exist but have missing data, defeating the purpose of the check on line 439.

            raw_value = inverter_data.get(energy_key)
            if raw_value is None:
                _LOGGER.debug(
                    "[%s] Missing '%s' for inverter %s",
                    log_prefix,
                    energy_key,
                    inverter_name
                )
                continue
            
            energy_value = safe_decimal(raw_value)
            if energy_value is not None:
                try:
                    total_energy += energy_value
                    valid_sample_count += 1

- Fix symmetric midnight reset window (before + after 00:00)
- Skip missing inverter daily values before safe_decimal conversion
- Keep missing/invalid poll behavior as unavailable when no valid samples
@Q-Claw
Copy link
Contributor Author

Q-Claw commented Feb 28, 2026

Thanks for the review, addressed in commit 2ec6f00.

Changes made:

  • Fixed midnight guard to be symmetric around 00:00 (covers both 23:40-00:00 and 00:00-00:20).
  • Fixed missing-value handling in _calculate_total_inverter_energy: now skips None raw values before safe_decimal, so missing samples are not counted as valid zeroes.
  • Kept valid_sample_count == 0 -> None behavior so reconnect gaps become unavailable instead of false zeroes.

Please re-check when convenient.

@Q-Claw
Copy link
Contributor Author

Q-Claw commented Feb 28, 2026

@greptile-apps please re-review after commit 2ec6f00

Comment on lines +275 to +280
_LOGGER.warning(
"[%s] Ignoring transient daily energy drop to 0 outside reset window (last=%s)",
self.entity_id,
self._last_valid_daily_energy_value,
)
return None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING level causes log spam during extended outages

_LOGGER.warning(...) is called on every poll cycle while the guard is active. If polling runs every 30 seconds and an inverter is offline for an hour, this produces ~120 consecutive warning-level entries — all identical — which will visibly flood HA's logs.

Consider logging the first suppression at warning and subsequent consecutive suppressions at debug, or switching entirely to debug/info since this is expected operational behaviour (the sensor is already publishing unavailable, which provides the user-facing signal):

Suggested change
_LOGGER.warning(
"[%s] Ignoring transient daily energy drop to 0 outside reset window (last=%s)",
self.entity_id,
self._last_valid_daily_energy_value,
)
return None
_LOGGER.debug(
"[%s] Ignoring transient daily energy drop to 0 outside reset window (last=%s)",
self.entity_id,
self._last_valid_daily_energy_value,
)
return None

Comment on lines +450 to +452
if valid_sample_count == 0:
_LOGGER.debug("[%s] No valid '%s' samples in this poll", log_prefix, energy_key)
return None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None return only covers the all-invalid case; partial-offline totals still decrease

The new valid_sample_count == 0 guard correctly prevents publishing 0 when every inverter is offline. However, consider a two-inverter setup where Inverter A goes offline (None, skipped) while Inverter B legitimately continues reporting. The function returns Inverter B's value alone — potentially a large drop from the previous combined total (e.g. 10 kWh → 5 kWh). This is less than zero (not caught by the sensor.py guard which only triggers on a drop to exactly 0), and HA's total_increasing logic may record it as a reset.

This is a pre-existing issue and this PR's scope is zero-bounce, but it's worth a comment here so the partial-offline decrease case isn't overlooked in a follow-up.

@greptile-apps
Copy link

greptile-apps bot commented Feb 28, 2026

Additional Comments (1)

custom_components/sigen/sensor.py
Guard state is lost on HA restart

_last_valid_daily_energy_value is initialised to None on every HA restart. The first poll after restart has no prior reference point, so if the device happens to return 0 during that first read (e.g. the glitch coincides with startup), the guard cannot detect it — the zero is published and _last_valid_daily_energy_value is set to 0. The guard then remains dormant until a legitimate non-zero reading is recorded, leaving a window of vulnerability identical to the pre-PR state.

For most users this will be rare, but it is worth documenting (e.g. a code comment) so future maintainers understand the deliberate trade-off and are not tempted to remove or weaken the "first-poll pass-through" path.

@Q-Claw
Copy link
Contributor Author

Q-Claw commented Mar 1, 2026

Addressed the latest review note in commit 264b1f8.

What I changed:

  • Added an explicit code comment on _last_valid_daily_energy_value in custom_components/sigen/sensor.py documenting that the guard state is in-memory only and resets on HA restart.
  • Documented the intended trade-off: first post-restart poll has no prior reference and is intentionally passed through.

Validation:

  • python3 -m compileall custom_components/sigen/sensor.py custom_components/sigen/calculated_sensor.py

@greptile-apps could you please take another look?

@Q-Claw
Copy link
Contributor Author

Q-Claw commented Mar 1, 2026

Follow-up fix pushed in commit \ to address log-noise feedback:\n\n- Changed transient zero-guard suppression log from -> in .\n- This avoids repeated warning-level spam during extended reconnect/offline periods while preserving traceability at debug level.\n\nValidation:\n- Listing 'custom_components/sigen/sensor.py'...
Can't list 'custom_components/sigen/sensor.py'\n\n@greptile-apps please re-review.

@Q-Claw
Copy link
Contributor Author

Q-Claw commented Mar 1, 2026

Follow-up fix pushed in commit 5dbf28d to address log-noise feedback:

  • Changed transient zero-guard suppression log from warning -> debug in custom_components/sigen/sensor.py.
  • This avoids repeated warning-level spam during extended reconnect/offline periods while preserving traceability at debug level.

Validation:

  • python3 -m compileall custom_components/sigen/sensor.py

@greptile-apps please re-review.

Comment on lines +271 to +285
if (
value_dec == 0
and self._last_valid_daily_energy_value is not None
and self._last_valid_daily_energy_value > 0
and not self._is_near_daily_reset()
):
_LOGGER.debug(
"[%s] Ignoring transient daily energy drop to 0 outside reset window (last=%s)",
self.entity_id,
self._last_valid_daily_energy_value,
)
return None

self._last_valid_daily_energy_value = value_dec
return value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extended outage crossing midnight suppresses legitimate reset

If HA stays running but the Modbus/network connection drops before the pre-midnight window opens (e.g. connection goes offline at 23:39, last poll left _last_valid_daily_energy_value = 14.0), and doesn't recover until after the post-midnight window closes (e.g. 00:21), the inverter correctly reports 0 (daily counter has reset) but the guard sees:

  • value_dec == 0
  • _last_valid_daily_energy_value = 14.0 > 0
  • _is_near_daily_reset()False (00:21 is outside the ±20 min window) ✓

The 0 is suppressed and unavailable is published instead.

This does not cause phantom energy (HA's total_increasing will handle the eventual reset correctly once production provides a non-zero reading), but it delays acknowledgment of the midnight reset and leaves daily sensors as unavailable until first real post-sunrise activity — which can look alarming in the Energy Dashboard and may interfere with automations that rely on a clean 0 at start of day.

One approach is to track the wall-clock timestamp of the last non-None reading and, if the gap exceeds a threshold (e.g. 30 min), allow the 0 through unconditionally as a probable legitimate counter reset rather than a transient glitch:

# In __init__
self._last_successful_daily_energy_ts: datetime | None = None

# In _apply_daily_energy_zero_guard, replace the suppression block:
STALE_THRESHOLD = timedelta(minutes=30)
now = dt_util.now()
stale = (
    self._last_successful_daily_energy_ts is None
    or (now - self._last_successful_daily_energy_ts) > STALE_THRESHOLD
)
if (
    value_dec == 0
    and self._last_valid_daily_energy_value is not None
    and self._last_valid_daily_energy_value > 0
    and not self._is_near_daily_reset()
    and not stale
):
    ...  # suppress

This keeps the glitch protection for short outages while allowing long-gap reconnects to pass a 0 through regardless.

Comment on lines +284 to +285
self._last_valid_daily_energy_value = value_dec
return value
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_last_valid_daily_energy_value updates on value_dec == 0 during midnight window

When a midnight-window 0 passes the guard (intentionally allowed), _last_valid_daily_energy_value is set to Decimal(0). This is actually correct — once the daily counter legitimately resets, future 0 polls should not fire the guard (> 0 check prevents it). However, consider adding a short comment here to make the intent explicit, since a reader could easily assume the update should only happen for non-zero values:

Suggested change
self._last_valid_daily_energy_value = value_dec
return value
# Update reference only for non-suppressed values (includes legitimate midnight 0 resets).
self._last_valid_daily_energy_value = value_dec
return value

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Daily energy sensors occasionally report 0.0 instead of unavailable on reconnection

1 participant