fix: missing production and consumption column #177

Mohammad-Tayyab-Frequenz · 2025-10-29T14:21:02Z

No description provided.

Copilot

Pull Request Overview

This PR enhances the add_energy_flows() function to handle more flexible energy flow calculations by:

- Adding support for grid column input to enable consumption inference
- Transforming production columns using the asset_production() function before aggregation
- Introducing an optional clip_non_negative parameter to helper functions for controlled clipping behavior

Comments suppressed due to low confidence (1)

src/frequenz/lib/notebooks/reporting/utils/helpers.py:1

The new consumption inference logic with three branches (existing column, inferred from grid, fallback to zeros) lacks test coverage. Tests should verify each branch executes correctly and produces expected results.

# License: MIT


src/frequenz/lib/notebooks/reporting/utils/helpers.py

cwasicki · 2025-10-29T14:33:08Z

This needs a rebase

cwasicki · 2025-10-29T15:14:26Z

src/frequenz/lib/notebooks/reporting/utils/helpers.py


-def _sum_cols(df: pd.DataFrame, cols: list[str] | None) -> pd.Series:
+def _sum_cols(
+    df: pd.DataFrame, cols: list[str] | None, *, clip_non_negative: bool = False


The new argument is not used anywhere, so why introduce it?

cwasicki · 2025-10-29T15:17:43Z

src/frequenz/lib/notebooks/reporting/utils/helpers.py

+        series = df.reindex(columns=[col], fill_value=0)[col].astype("float64")
+
+    if clip_non_negative:
+        series = series.clip(lower=0)


IIUC you only use it in one place, and I also don't see the point of having this flag (which is just a boolean for lower=0) when you could simply do _get_numeric_series(...).clip(lower=0), which is more flexible and readable.

Yes, makes sense. I will do that.
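
A minimal sketch of the suggestion above: clip at the call site instead of passing a flag. The body of `_get_numeric_series` is only partly visible in this thread, so `get_numeric_series` below is a hypothetical stand-in, and the column name and values are made up for illustration.

```python
from __future__ import annotations

import pandas as pd


def get_numeric_series(df: pd.DataFrame, col: str | None) -> pd.Series:
    """Hypothetical stand-in for the PR's private helper (not its real body)."""
    if col is None or col not in df.columns:
        # Missing column: return zeros aligned to the frame's index.
        return pd.Series(0.0, index=df.index, dtype="float64")
    return pd.to_numeric(df[col], errors="coerce").astype("float64")


df = pd.DataFrame({"battery": [-2.0, 0.0, 3.5]})

# Clipping is applied by the caller, so the helper needs no boolean flag.
battery_charge = get_numeric_series(df, "battery").clip(lower=0)
print(battery_charge.tolist())
```

Keeping the helper flag-free also lets a caller choose a different bound, such as `.clip(upper=0)` for the discharge side, without touching the helper signature.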

cwasicki · 2025-10-29T15:18:22Z

src/frequenz/lib/notebooks/reporting/utils/helpers.py

-    # Surplus vs. consumption
+    # Normalize production, grid and consumption columns by removing None entries
+    resolved_production_cols = [
+        col for col in (production_cols or []) if col is not None


Why are there Nones in the columns?

This only handles the cases where users do not provide any production, consumption, or grid columns.


Mohammad-Tayyab-Frequenz · 2025-10-31T14:17:45Z

src/frequenz/lib/notebooks/reporting/metrics/reporting_metrics.py

+        .reindex(production_excess_series.index, fill_value=0.0)
+        .clip(lower=0)
+    )
    return pd.concat([production_excess_series, battery], axis=1).min(axis=1)


Added this to ensure that battery is properly aligned with production_excess_series.

Why might they not be aligned?

Yeah, I wasn’t sure whether reindexing would be the best approach going forward.
Consider cases where users provide input data in which production and consumption have matching indices and are aligned correctly, but the battery data contains extra rows (for some unknown reason). In such cases, the downstream functions would end up with more rows than the original production and consumption data.

Should we handle such scenarios within this Python codebase, or can we assume that the input data is already clean and pre-processed?

Which values would the rows have? If it's NaNs I wouldn't be concerned.
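
To make the question concrete, here is a small sketch with made-up values and a plain integer index. It assumes the battery series has one extra row, as in the scenario described above. Because `DataFrame.min(axis=1)` skips NaN by default, the extra row ends up carrying the battery value rather than NaN.

```python
import pandas as pd

production_excess_series = pd.Series([4.0, 0.0, 2.0], index=[0, 1, 2])
# Illustrative battery data with one extra row (index 3).
battery_raw = pd.Series([1.0, -3.0, 5.0, 7.0], index=[0, 1, 2, 3])

# Without reindexing, concat performs an outer join on the index.
unaligned = pd.concat(
    [production_excess_series, battery_raw.clip(lower=0)], axis=1
).min(axis=1)

# With reindexing, the result keeps exactly the production index.
battery = battery_raw.reindex(
    production_excess_series.index, fill_value=0.0
).clip(lower=0)
aligned = pd.concat([production_excess_series, battery], axis=1).min(axis=1)

print(len(unaligned), unaligned.iloc[-1])
print(aligned.tolist())
```

Note that `fill_value=0.0` only applies to index labels missing from the battery series; NaN values already present at existing labels are left as they are.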

Mohammad-Tayyab-Frequenz · 2025-10-31T14:19:16Z

src/frequenz/lib/notebooks/reporting/metrics/reporting_metrics.py

    return share


 def consumption(


Added the battery series separately and changed the function to work on pd.Series rather than pd.DataFrame.

Mohammad-Tayyab-Frequenz · 2025-10-31T14:23:10Z

src/frequenz/lib/notebooks/reporting/utils/helpers.py

        df_flows["production_total"],
        df_flows["consumption_total"],
-        production_is_positive=production_is_positive,
+        production_is_positive=True,


The initial input production is negative, but asset_production inverts the sign. So we have to set this argument to True, since production is positive at this point.

Mohammad-Tayyab-Frequenz · 2025-10-31T14:24:12Z

src/frequenz/lib/notebooks/reporting/utils/helpers.py

    )

    # Battery charging power (optional)
-    bat_in = _get_numeric_series(df_flows, battery_charge_col)


Created a separate battery series containing only positive values to represent the battery charge.

cwasicki · 2025-11-03T13:24:38Z

src/frequenz/lib/notebooks/reporting/metrics/reporting_metrics.py

+        .reindex(production_excess_series.index, fill_value=0.0)
+        .clip(lower=0)
+    )
    return pd.concat([production_excess_series, battery], axis=1).min(axis=1)


Why might they not be aligned?

cwasicki · 2025-11-03T13:27:00Z

src/frequenz/lib/notebooks/reporting/metrics/reporting_metrics.py

-        production_cols: List of production column names (e.g., "pv", "chp", "battery" or "ev").
-            Can be None or empty if no on-site generation is present.
-        grid_cols: List of one or more grid column names.
+        grid: Series of grid import values (e.g., kW or MW).


Why only grid import and not grid power?

Yeah, it should be grid power. I will update the docstring.

cwasicki · 2025-11-03T13:29:45Z

src/frequenz/lib/notebooks/reporting/metrics/reporting_metrics.py


-    if not grid_cols:
-        raise ValueError("At least one grid column must be specified in grid_cols.")
+    grid_s = grid.astype("float64").fillna(0)


I think we should not fill with zeroes. There can be various reasons that the grid measurements are missing and we cannot safely know if they were zero.

We need to fill it with 0 to avoid consumption also becoming NaN in those cases.
For example:
Production: 15, battery: 5, grid: NaN
Then consumption will also be NaN.

So, in this case, should consumption be NaN or 20?

Consumption should be NaN, because you don't know what the grid power is.
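
The two options can be compared with the numbers from the example above. The formula below (grid + production + battery) is an assumption chosen only so that the filled case reproduces the value 20 mentioned in the thread; it is not taken from the PR's consumption() implementation.

```python
import numpy as np
import pandas as pd

production = pd.Series([15.0, 15.0])
battery = pd.Series([5.0, 5.0])
grid = pd.Series([10.0, np.nan])  # second sample: grid measurement missing

# Option A: let NaN propagate, so the unknown grid value stays visible.
consumption_nan = grid + production + battery

# Option B: fill with zeros, which silently treats "unknown" as "no grid power".
consumption_filled = grid.fillna(0) + production + battery

print(consumption_nan.tolist())
print(consumption_filled.tolist())
```

With option A, a consumer of the result can still decide how to treat gaps (drop, interpolate, flag), whereas option B removes that information.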

cwasicki · 2025-11-03T13:32:52Z

src/frequenz/lib/notebooks/reporting/metrics/reporting_metrics.py

-    return consumption
+    if (result < 0).any():
+        warnings.warn(
+            "Negative inferred consumption detected. This can occur during net export "


What do you mean by net export and why can this cause negatives here? I think the most likely case to get negative values here is unmeasured energy generation sources or short-term spikes.

I think the most likely cause of negative consumption is a positive production sign, so that subtracting production from grid yields a negative value.

Net export refers to periods where on-site consumption was lower than production and hence, the consumption had a negative sign. I added this warning so that users know they have such periods and can check that there was no calculation error.

and hence, the consumption had a negative sign.

Do you mean "grid power had a negative sign"?

And when you say net export, does it refer to the negative part of grid power?
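
Since the cause of negative values is still under discussion, one option is a warning that reports the finding without asserting a cause. This is a sketch only: `warn_on_negative` and the message wording are hypothetical and not part of the PR.

```python
import warnings

import pandas as pd


def warn_on_negative(inferred: pd.Series) -> pd.Series:
    """Hypothetical helper: warn if inferred consumption has negative samples."""
    negative = inferred < 0
    if negative.any():
        warnings.warn(
            f"Negative inferred consumption in {int(negative.sum())} sample(s); "
            "check sign conventions and whether all sources are measured.",
            UserWarning,
            stacklevel=2,
        )
    return inferred


# Capture the warning so the example can inspect it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = warn_on_negative(pd.Series([3.0, -1.5, 2.0]))

print(len(caught), caught[0].category.__name__)
```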

cwasicki · 2025-11-03T13:38:27Z

src/frequenz/lib/notebooks/reporting/utils/helpers.py

+        series = pd.Series(0.0, index=df.index, dtype="float64")
+    else:
+        raw = df.reindex(columns=[col], fill_value=0)[col]
+        series = pd.to_numeric(raw, errors="coerce").fillna(0.0).astype("float64")


Not introduced in this PR, but here too I'm not sure we should fill with 0s by default.

This is also to make sure we don't get NaNs during downstream calculations.
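
A short sketch of the trade-off, using made-up values: after `pd.to_numeric(..., errors="coerce")`, missing and unparsable entries are NaN, and `fillna(0.0)` makes them indistinguishable from genuine zero readings.

```python
import pandas as pd

raw = pd.Series([1.5, None, "n/a", 0.0], dtype="object")

# Coercion alone keeps missing or unparsable entries visible as NaN.
coerced = pd.to_numeric(raw, errors="coerce").astype("float64")

# Filling afterwards merges them with the one real zero reading.
filled = coerced.fillna(0.0)

print(int(coerced.isna().sum()))
print(filled.tolist())
```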

cwasicki · 2025-11-03T13:41:42Z

src/frequenz/lib/notebooks/reporting/utils/helpers.py



-# pylint: disable=too-many-arguments, too-many-locals
+def _column_has_data(df: pd.DataFrame, col: str | None) -> bool:


Maybe _has_nonzero_values(df, *, column: str | None)? And why would you call it with None at all?

This can occur when the user does not provide one of the column inputs (production, battery, consumption), in which case it defaults to None.
So we need to handle those cases safely in our function.
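
One possible shape for the renamed helper, following the signature suggested above. The body is an assumption, since the thread only shows the signature of `_column_has_data`; here a None or absent column simply counts as having no non-zero values.

```python
from __future__ import annotations

import pandas as pd


def has_nonzero_values(df: pd.DataFrame, *, column: str | None) -> bool:
    """Sketch: True if `column` exists and holds at least one non-zero number."""
    if column is None or column not in df.columns:
        return False
    values = pd.to_numeric(df[column], errors="coerce")
    # NaN is treated as "no data" here, which is an illustrative choice.
    return bool((values.fillna(0) != 0).any())


df = pd.DataFrame({"pv": [0.0, 2.0], "chp": [0.0, None]})
print(has_nonzero_values(df, column="pv"))
print(has_nonzero_values(df, column="chp"))
print(has_nonzero_values(df, column=None))
```

An alternative that avoids passing None at all would be to filter the optional column names at the call site and require `column: str` in the helper.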

Mohammad-Tayyab-Frequenz requested a review from cwasicki October 29, 2025 14:21

Mohammad-Tayyab-Frequenz self-assigned this Oct 29, 2025

Copilot AI review requested due to automatic review settings October 29, 2025 14:21

Mohammad-Tayyab-Frequenz requested a review from a team as a code owner October 29, 2025 14:21

Copilot AI reviewed Oct 29, 2025


Mohammad-Tayyab-Frequenz force-pushed the fix-energy-flows-df branch from a1fbe82 to 26afd0b Compare October 29, 2025 14:28

github-actions bot added the part:docs and part:tooling labels Oct 29, 2025

Mohammad-Tayyab-Frequenz force-pushed the fix-energy-flows-df branch 3 times, most recently from 4d4bdb9 to e559cb3 Compare October 29, 2025 14:57

Mohammad-Tayyab-Frequenz enabled auto-merge October 29, 2025 15:08

cwasicki reviewed Oct 29, 2025


Mohammad-Tayyab-Frequenz force-pushed the fix-energy-flows-df branch from e559cb3 to 908a341 Compare October 31, 2025 14:12

fix: update consumption calculation, update energy_flows dataframe creation

0ea5ed5

Signed-off-by: Mohammad Tayyab <[email protected]>

Mohammad-Tayyab-Frequenz force-pushed the fix-energy-flows-df branch from 908a341 to 0ea5ed5 Compare October 31, 2025 14:15

Mohammad-Tayyab-Frequenz commented Oct 31, 2025


Mohammad-Tayyab-Frequenz requested a review from cwasicki October 31, 2025 14:24

cwasicki reviewed Nov 3, 2025


2 participants
