Treat NaN as gap in ring buffer #641

cwasicki · 2023-09-05T16:17:06Z

Fixes #601

Signed-off-by: cwasicki <[email protected]>

llucax

Just a couple of bike shedding naming comments, but LGTM in general!

llucax · 2023-09-06T08:02:18Z

src/frequenz/sdk/timeseries/_ringbuffer/buffer.py

        """
        return self._gaps

+    def is_missing_value(self, value: QuantityT | None) -> bool:


Nitpick:

Suggested change

def is_missing_value(self, value: QuantityT | None) -> bool:

def is_value_missing(self, value: QuantityT | None) -> bool:

Or simply is_missing(). If you take a Sample instead of a QuantityT, I think is_value_missing() makes more sense.

Also, using a negative as part of a function name is usually not recommended, because you can end up with a double negation, which is harder to reason about. What about:

Suggested change

def is_missing_value(self, value: QuantityT | None) -> bool:

def has_value(self, sample: Sample) -> bool:

But I'm not 100% sure in this case, because I guess here missing is important, as you are talking about gaps. Another alternative could be is_value_present(sample). So just food for thought, I'm not convinced either way :P

Also, infinite is considered as not missing, right?

I can make it taking a sample instead. Do you think it can be moved to a method of Sample at some point?

Will change it to has_value. There is already a is_missing, which is about timestamps. I guess another candidate to improve naming.

Also, infinite is considered as not missing, right?

So far yes, on the typical microgrid data we work with infinities will already be turned into NaNs and for other use-cases I did not want to restrict it too much.

I can make it taking a sample instead. Do you think it can be moved to a method of Sample at some point?

I'm not sure, because again what's a present value could depend on the application, for example is infinite a value or not? I would be in favor of adding more common methods from math that would also check for None, like is_finite(), but that is not isnan() and not isinf().

There is already a is_missing, which is about timestamps. I guess another candidate to improve naming.

Oh! Yes :)

Another alternative could be is_gap() which ties it up pretty well with the concept.

But I was thinking if maybe is_missing(timestamp) can be unified into a is_gap(sample), then update() could already pass if a timestamp is missing to self._update_gaps(). But maybe this is too much of a refactor to be on the scope of this PR.

Not sure about is_gap(sample): is_missing checks whether a sample is already in our gap list, has_value checks whether a sample can be considered a gap.

cwasicki · 2023-09-06T09:32:38Z

@llucax Updated

llucax

All optional comments, LGTM as it is too.

llucax · 2023-09-06T12:25:31Z

src/frequenz/sdk/timeseries/_ringbuffer/buffer.py

+        Returns:
+            True if the sample has a value and it's not NaN.
+        """
+        return not ((sample.value is None) or sample.value.isnan())


If you want to save some parenthesis (which might make it easier to read):

Suggested change

return not ((sample.value is None) or sample.value.isnan())

return sample.value is not None and not sample.value.isnan()

Was thinking about that and found it even less readable. But removed one level.

llucax · 2023-09-06T12:38:55Z

src/frequenz/sdk/timeseries/_ringbuffer/buffer.py

        # Update data
-        value: float = np.nan if sample.value is None else sample.value.base_value
+        if self.has_value(sample):
+            assert sample.value is not None


I just noticed this, too bad that this is needed, because it was just checked by has_value().

You could avoid this by making has_value a function instead and using TypeGuard, but that would also need to take it back to operate on float | None:

from typing import TypeGuard def has_value(sample: float | None) -> TypeGuard[float]: """Check if a sample has a value and it's not NaN. Args: sample: sample to check. Returns: True if the sample has a value and it's not NaN. """ return value is not None and not value.isnan() if has_value(sample): # If this returns True, then the type checker narrows the type to `float` value = sample.value.base_value # this is now fine without the assert else: value = np.nan

I thought about alternatives and all seem awful, so I guess we'll have to live with the assert 🤷

llucax · 2023-09-06T12:41:01Z

src/frequenz/sdk/timeseries/_ringbuffer/buffer.py

-        value: float = np.nan if sample.value is None else sample.value.base_value
+        if self.has_value(sample):
+            assert sample.value is not None
+            value = sample.value.base_value
+        else:
+            value = np.nan


Or.... ❗❗❗ just not use the function in this context... The original code is pretty simple and understandable (and works the same, as if it is NaN we'll use np.nan anyway, or is it np.nan different from float('nan')?) without it too:

valueL = np.nan if sample.value is None else sample.value self._buffer[self.datetime_to_index(timestamp)] = value self._update_gaps(timestamp, prev_newest, not self.has_value(sample))

I would like to leave it as it is required in both these places. So you have one location where you define what a missing value is and what not (e.g. if we want to change it to include infs at some point.)

Actually I find it not so bad, even more readable. But yea, the assertion is annoying.

Signed-off-by: cwasicki <[email protected]>

cwasicki · 2023-09-06T12:58:59Z

@llucax @matthias-wende-frequenz Updated, but I am not authorized to merge.

cwasicki requested a review from a team as a code owner September 5, 2023 16:17

cwasicki requested a review from daniel-zullo-frequenz September 5, 2023 16:17

github-actions bot added part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests part:data-pipeline Affects the data pipeline labels Sep 5, 2023

cwasicki added this to the v1.0.0-rc milestone Sep 5, 2023

cwasicki self-assigned this Sep 5, 2023

matthias-wende-frequenz previously approved these changes Sep 5, 2023

View reviewed changes

Add more unit tests for gaps

0a41680

Signed-off-by: cwasicki <[email protected]>

cwasicki dismissed matthias-wende-frequenz’s stale review via 9eb0bb9 September 5, 2023 20:50

cwasicki force-pushed the gapdef branch from b910a14 to 9eb0bb9 Compare September 5, 2023 20:50

llucax previously approved these changes Sep 6, 2023

View reviewed changes

matthias-wende-frequenz previously approved these changes Sep 6, 2023

View reviewed changes

cwasicki dismissed stale reviews from matthias-wende-frequenz and llucax via aaf4294 September 6, 2023 09:31

cwasicki force-pushed the gapdef branch from 9eb0bb9 to aaf4294 Compare September 6, 2023 09:31

llucax previously approved these changes Sep 6, 2023

View reviewed changes

cwasicki mentioned this pull request Sep 6, 2023

Add capacity, oldest and newest timestamp to moving window #598

Merged

Treat NaNs as missing values for gaps in ring buffer

c1c8828

Signed-off-by: cwasicki <[email protected]>

cwasicki dismissed llucax’s stale review via c1c8828 September 6, 2023 12:55

cwasicki force-pushed the gapdef branch from aaf4294 to c1c8828 Compare September 6, 2023 12:55

matthias-wende-frequenz approved these changes Sep 6, 2023

View reviewed changes

matthias-wende-frequenz added this pull request to the merge queue Sep 6, 2023

Merged via the queue into frequenz-floss:v0.x.x with commit 01b8d9d Sep 6, 2023

cwasicki deleted the gapdef branch September 6, 2023 14:13

cwasicki mentioned this pull request Sep 13, 2023

MovingWindow: Create Interface for requesting data and handling the data copy behaviour #214

Closed

	def is_missing_value(self, value: QuantityT \| None) -> bool:
	def is_value_missing(self, value: QuantityT \| None) -> bool:

	def is_missing_value(self, value: QuantityT \| None) -> bool:
	def has_value(self, sample: Sample) -> bool:

	return not ((sample.value is None) or sample.value.isnan())
	return sample.value is not None and not sample.value.isnan()

Treat NaN as gap in ring buffer #641

Treat NaN as gap in ring buffer #641

Uh oh!

Conversation

cwasicki commented Sep 5, 2023

Uh oh!

llucax left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cwasicki commented Sep 6, 2023

Uh oh!

llucax left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cwasicki commented Sep 6, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants