Add monitor normalization #252

jl-wynen · 2025-07-07T15:34:19Z

I moved the code here because it is rather involved and it will be needed by spectroscopy and reflectometry as well. Once this is merged, I will prepare a PR in ESSdiffraction to use these new implementations.

Note that I changed the behaviour. The code now favours bin coordinates over event coordinates when selecting the monitor range. Here is an example why: Let's say we have detector data where the last event is at 1.9Å but the last bin ends at 2Å. And we have a monitor with data up to 1.91Å. If we check ranges based on events, we would be allowed to normalize. But if we histogrammed after normalization, we would get a detector bin that extends to 2Å even though we only have normalization data for part of this bin. The chosen implementation avoids this problem. In practice this probably does not matter because bin widths should be small enough to limit detector bins to be within the monitor range.
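The 1.9Å / 2Å / 1.91Å example can be checked numerically. This is a minimal plain-Python sketch, not the code in this PR: `monitor_covers` is a hypothetical helper that mirrors the range condition, and the intermediate event and edge values are made up for illustration.

```python
def monitor_covers(det_lo, det_hi, mon_lo, mon_hi):
    """Return True if the monitor range fully contains the detector range."""
    return mon_lo <= det_lo and det_hi <= mon_hi


# Illustrative values: last event at 1.9 Å, last detector bin edge at 2 Å,
# monitor data up to 1.91 Å.
event_wavelengths = [0.4, 1.1, 1.9]
detector_bin_edges = [0.0, 1.0, 2.0]
monitor_edges = [0.0, 0.955, 1.91]

# Range taken from the event coordinates: normalization would be allowed.
ok_by_events = monitor_covers(
    min(event_wavelengths), max(event_wavelengths),
    monitor_edges[0], monitor_edges[-1],
)

# Range taken from the bin coordinates: the last detector bin sticks out
# beyond the monitor, so normalization is rejected.
ok_by_bins = monitor_covers(
    detector_bin_edges[0], detector_bin_edges[-1],
    monitor_edges[0], monitor_edges[-1],
)
```

With these numbers the event-based check passes and the bin-based check fails, which is the situation the bin-coordinate preference guards against.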

Further, for histogrammed detectors, the monitor is now rebinned to match the detector instead of using lookup. This assigns more accurate weights to each bin. And, AFAIK, this matches Mantid.
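To illustrate the difference between rebinning and lookup, here is a plain-Python sketch. It assumes that rebinning splits counts in proportion to bin overlap and that lookup picks the single monitor value at a given coordinate; `rebin` and `lookup` are hypothetical stand-ins for the scipp operations, not their implementation.

```python
from bisect import bisect_right


def rebin(values, old_edges, new_edges):
    """Redistribute counts onto new bins in proportion to the bin overlap."""
    out = [0.0] * (len(new_edges) - 1)
    for i, value in enumerate(values):
        lo, hi = old_edges[i], old_edges[i + 1]
        for j in range(len(out)):
            overlap = min(hi, new_edges[j + 1]) - max(lo, new_edges[j])
            if overlap > 0:
                out[j] += value * overlap / (hi - lo)
    return out


def lookup(values, edges, x):
    """Return the value of the bin that contains x."""
    return values[bisect_right(edges, x) - 1]


monitor_counts = [4.0, 6.0, 10.0]
monitor_edges = [0.0, 1.0, 2.0, 3.0]
detector_edges = [0.0, 1.5, 3.0]

# Rebinning accounts for every monitor bin that overlaps a detector bin.
rebinned = rebin(monitor_counts, monitor_edges, detector_edges)

# Lookup at the detector bin centres sees only one monitor bin each.
centres = [(lo + hi) / 2 for lo, hi in zip(detector_edges, detector_edges[1:])]
looked_up = [lookup(monitor_counts, monitor_edges, c) for c in centres]
```

Here the middle monitor bin contributes half of its counts to each detector bin under rebinning, but is ignored entirely by lookup at the bin centres.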

src/ess/reduce/correction.py

nvaytet · 2025-07-09T13:09:56Z

src/ess/reduce/normalization.py

+        # each detector bin and ignore other values lying in the same detector bin.
+        # But integration would pick up all monitor bins.
+        return monitor.rebin({dim: det_coord})
+    return monitor[dim, lo:hi]


Changes look good but can you briefly outline which code was simply copied over and which code is new? Thanks :-)

Nothing is copied without change. But normalize_by_monitor_integrated is unchanged apart from extracting _clip_monitor_to_detector_range. However, the latter has changed as I described in my initial comment.

normalize_by_monitor_histogram now also clips the data. And the actual normalisation has changed. It used to be https://github.com/scipp/essdiffraction/blob/c9c29dd6155cdb0f9fd201e6c9d6923a754b1e37/src/ess/powder/correction.py#L57-L67, i.e., only det / mon without extra scaling factors.

SimonHeybrock · 2025-07-11T05:45:02Z

src/ess/reduce/correction.py

+        # Mask zero-count bins, which are an artifact from the rectangular 2-D binning.
+        # The wavelength of those bins must be excluded when determining the range.


This makes the assumption that we are not in detector space any more, but something like two_theta, as we did in essdiffraction?

The comment does. But the code does not. Do you think we shouldn't mask in all cases?

I think we should not. Detector pixels can have true zeros (maybe unlikely with significant background?), but that should not limit the wavelength range?

SimonHeybrock · 2025-07-11T05:51:14Z

src/ess/reduce/normalization.py

+            f"Missing '{dim}' coordinate in detector for monitor normalization."
+        )
+
+    if monitor.coords[dim].min() > lo or monitor.coords[dim].max() < hi:


hi is the largest event, don't we need hi < mon.max for lookup to do its job?

Yes. That is what this condition checks.

My point is, shouldn't it raise if hi is not less than the max? Currently it checks if max is less than hi, which is not the same, or is it?

Are you asking whether the error condition should be monitor.coords[dim].max() <= hi?

This would not be correct for histogrammed detectors:

essreduce/tests/correction_test.py

Line 61 in 52763fc

def test_normalize_by_monitor_histogram_aligned_bins_hist() -> None:

For binned detectors, it is also fine as long as there is a bin-coord:

essreduce/tests/correction_test.py

Line 15 in 52763fc

def test_normalize_by_monitor_histogram_aligned_bins_w_event_coord() -> None:

If there is no bin-coord, then it gets tricky. I am not sure the code does the right thing in this case. E.g., consider

def test_normalize_by_monitor_histogram_aligned_bins_wo_bin_coord() -> None:
    detector = (
        sc.DataArray(
            sc.array(dims=['w'], values=[0, 10, 30], unit='counts'),
            coords={'w': sc.arange('w', 3.0, unit='Å')},
        )
        .bin(w=sc.array(dims=['w'], values=[0.0, 2, 3], unit='Å'))
        .drop_coords('w')
    )
    monitor = sc.DataArray(
        sc.array(dims=['w'], values=[5.0, 6.0], unit='counts'),
        coords={'w': sc.array(dims=['w'], values=[0.0, 2, 3], unit='Å')},
    )
    normalized = normalize_by_monitor_histogram(
        detector,
        monitor=monitor,
        uncertainty_broadcast_mode=UncertaintyBroadcastMode.fail,
    )
    expected = (
        sc.DataArray(
            sc.array(dims=['w'], values=[0.0, 44 / 3, 55 / 3], unit='counts'),
            coords={'w': sc.arange('w', 3.0, unit='Å')},
        )
        .bin(w=sc.array(dims=['w'], values=[0.0, 2, 3], unit='Å'))
        .drop_coords('w')
    )
    sc.testing.assert_identical(normalized, expected)

I would expect this to work based on the event coords. But hi is computed to be 2Å in this case which removes the upper monitor bin. This is one of the issues I was trying to avoid with this implementation but apparently failed. I am not sure how to best handle this case.
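The expected weights 44/3 and 55/3 can be reproduced by hand under one reading of the normalization at this revision. The sketch assumes that each event weight is divided by m_i / Δw_i, scaled by the `total_monitor_weight` term (sum of monitor counts over sum of bin widths) that appears in the diff, and that the full monitor range is used. This is an inference from the test values, not a copy of the implementation.

```python
from fractions import Fraction

# Monitor histogram from the test: counts [5, 6] on edges [0, 2, 3] Å.
monitor_counts = [Fraction(5), Fraction(6)]
edges = [Fraction(0), Fraction(2), Fraction(3)]

# Detector events from the test as (wavelength, weight) pairs.
events = [(Fraction(0), 0), (Fraction(1), 10), (Fraction(2), 30)]

widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

# Assumed scaling: sum of monitor counts over sum of bin widths = 11/3.
total_monitor_weight = sum(monitor_counts) / sum(widths)

# Assumed per-bin norm: monitor density relative to the mean density.
norm = [m / w / total_monitor_weight for m, w in zip(monitor_counts, widths)]


def monitor_bin(wavelength):
    """Index of the monitor bin with edges[i] <= wavelength < edges[i + 1]."""
    for i in range(len(widths)):
        if edges[i] <= wavelength < edges[i + 1]:
            return i
    raise ValueError(f"{wavelength} is outside the monitor range")


normalized = [weight / norm[monitor_bin(w)] for w, weight in events]
```

The event at 2Å falls into the upper monitor bin, so dropping that bin (as happens when hi is computed to be 2Å) changes the result for exactly that event.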

SimonHeybrock · 2025-07-11T05:51:25Z

src/ess/reduce/normalization.py

+        raise ValueError(
+            f"Cannot normalize by monitor: The {dim} range of the monitor "
+            f"({monitor.coords[dim].min().value} to {monitor.coords[dim].max().value}) "
+            f"is smaller than the range of the detector ({lo.value} to {hi.value})."
+        )


How do we deal with cases where the range is different, like I encounter in essdiffraction/beamlime? Above it seems not even a wavelength mask is taken into account, that is, is this just a dead end?

We can extend this to take masks into account. But in general, you should slice your data to be within the monitor range.

Note that this has not changed compared to the implementation in ESSdiffraction.

Really? The only way I could make essdiffraction work for me was by masking a wavelength range, or else I would get exceptions in the monitor normalization.

> But in general, you should slice your data to be within the monitor range.

You who? Where do the workflows do this?

In a separate step. It should be clearly visible that we automatically limit the wavelength range of the data.

SimonHeybrock · 2025-07-11T05:53:11Z

src/ess/reduce/normalization.py

+    """
+    clipped = _clip_monitor_to_detector_range(monitor=monitor, detector=detector)
+    coord = clipped.coords[clipped.dim]
+    norm = (clipped * (coord[1:] - coord[:-1])).data.sum()


I was going to ask:

Why ignore masks?

Do we need nansum?
But then I wondered if both of those are invalid anyway in the integrated case (unless the detector is masked in the same range)? So maybe the real question should be:

If there are masks (or nans) and we integrate, we should:

Apply the mask to the detector.

Mask the nan ranges (of the monitor) in the detector

Do not ignore the mask in the integration.

Use nansum.

Or am I missing something?

I think you are right. We need the union of the detector and monitor masks.
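A plain-Python sketch of what "union of the detector and monitor masks" could mean for the integrated case. It assumes detector and monitor share the same bins and treats non-finite monitor values as masked; `integrated_norm` is a hypothetical helper and the data are made up.

```python
import math


def integrated_norm(detector, monitor, detector_mask, monitor_mask):
    """Normalize by the monitor sum over bins that are unmasked in both inputs."""
    # Union of both masks; non-finite monitor values count as masked.
    combined = [
        d or m or not math.isfinite(v)
        for d, m, v in zip(detector_mask, monitor_mask, monitor)
    ]
    # Sum the monitor only where the combined mask is not set.
    norm = sum(v for v, masked in zip(monitor, combined) if not masked)
    # Masked detector bins are represented as None in this sketch.
    normalized = [
        None if masked else counts / norm
        for counts, masked in zip(detector, combined)
    ]
    return normalized, combined


detector = [10.0, 20.0, 30.0, 40.0]
monitor = [2.0, float("nan"), 4.0, 6.0]
detector_mask = [False, False, True, False]
monitor_mask = [False, False, False, False]

normalized, combined = integrated_norm(
    detector, monitor, detector_mask, monitor_mask
)
```

Masking the NaN bin explicitly has the same effect on the sum as `nansum`, but it also removes the corresponding detector bin, so numerator and denominator cover the same range.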

src/ess/reduce/normalization.py

SimonHeybrock · 2025-07-11T06:01:26Z

src/ess/reduce/normalization.py

+    coord = clipped.coords[dim]
+    delta_w = coord[1:] - coord[:-1]
+    total_monitor_weight = broadcast_uncertainties(
+        clipped.sum() / delta_w.sum(),


nansum, maybe?

Where would NaN values come from? And if there are any, should we instead mask them? See discussion above.

SimonHeybrock · 2025-10-20T04:50:10Z

@jl-wynen Should we get this ready?

jl-wynen · 2025-10-20T13:30:08Z

~~The implementation for monitor masks is incomplete. I need to figure out how to handle mismatched shapes...~~

jl-wynen · 2025-10-24T07:28:42Z

Ready now. @SimonHeybrock can you take another look?

SimonHeybrock · 2025-10-24T08:00:30Z

For now, here is Claude's assessment. Will have a look myself now!

Review of PR #252: Add monitor normalization

I've reviewed the PR with particular focus on the recent changes (commits 489c65e "Respect masks in range calculation" and 8d45af3 "Apply monitor masks"). Here's my detailed analysis:

Overall Assessment

The PR adds well-designed monitor normalization functionality with two approaches:

Histogram normalization - bin-by-bin normalization preserving spectral information
Integrated normalization - simple scalar normalization factor

The implementation is mathematically sound and well-documented. The recent changes improve mask handling significantly.

Recent Changes Analysis

Commit `489c65e`: "Respect masks in range calculation"

Strengths:

Replaced special-case zero-count handling with proper mask respect - much cleaner approach
Simplified code using walrus operators (:=)
Correctly uses mask.all() to reduce along non-monitor dimensions, ensuring bins with ANY unmasked data are included
Improved error messages using :c format specifier

Design rationale (lines 172-176):
The use of mask.all() is correct: it reduces masks along dimensions perpendicular to the monitor dimension, marking a bin as masked only if ALL its elements are masked. This ensures bins with any valid data contribute to range calculations.

Commit `8d45af3`: "Apply monitor masks"

Strengths:

Adds proper monitor mask handling via _mask_detector_for_norm() and _monitor_mask()
Automatically masks non-finite monitor values (NaN, Inf)
Excellent test coverage (aligned/unaligned bins, multiple masks, edge cases, non-finite values)
Clear documentation warnings about mask impact on normalization factors

Implementation details:

The mask reshaping via rebin() (lines 221-226) is clever:

mask = sc.DataArray(monitor_mask, coords={dim: monitor.coords[dim]}).rebin(
    {dim: detector.coords[dim]}
).data != sc.scalar(0, unit=None)

The != 0 comparison handles fractional mask values from rebinning - any bin with partial masking gets fully masked (conservative approach).
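The conservative behaviour can be sketched in plain Python. This assumes rebinning splits values in proportion to bin overlap; `rebin` and `reshape_mask` are hypothetical stand-ins, not the scipp functions.

```python
def rebin(values, old_edges, new_edges):
    """Redistribute values onto new bins in proportion to the bin overlap."""
    out = [0.0] * (len(new_edges) - 1)
    for i, value in enumerate(values):
        lo, hi = old_edges[i], old_edges[i + 1]
        for j in range(len(out)):
            overlap = min(hi, new_edges[j + 1]) - max(lo, new_edges[j])
            if overlap > 0:
                out[j] += value * overlap / (hi - lo)
    return out


def reshape_mask(monitor_mask, monitor_edges, detector_edges):
    """Mask every detector bin that overlaps a masked monitor bin."""
    as_float = [1.0 if masked else 0.0 for masked in monitor_mask]
    return [v != 0 for v in rebin(as_float, monitor_edges, detector_edges)]


monitor_mask = [False, True, False, False]
monitor_edges = [0.0, 1.0, 2.0, 3.0, 4.0]

# Several detector bins: only the bin touching the masked range is masked.
fine = reshape_mask(monitor_mask, monitor_edges, [0.0, 0.5, 3.0, 4.0])

# A single wide detector bin: one masked monitor bin masks everything.
coarse = reshape_mask(monitor_mask, monitor_edges, [0.0, 4.0])
```

The second call shows the cost of the approach: with very few detector bins along the monitor dimension, a small masked monitor range can mask all of the detector data.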

Issues Found

1. Typo (Line 215)

# This can lead to masking more vents than strictly necessary

Should be "events" not "vents".

2. Potential Issue: Mask combination could be more idiomatic

Lines 237-243 manually combine masks:

mask = None
for m in masks:
    if mask is None:
        mask = m
    else:
        mask |= m

This works but could use functools.reduce(operator.or_, masks) if you wanted to be more Pythonic. However, the current approach handles the empty case more explicitly, so this is fine as-is.
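One caveat with the loop: `mask = m` aliases the first input, so a subsequent `mask |= m` mutates that input in place (the commit list contains "Do not use in-place op" for this reason). The sketch below uses Python sets as stand-ins for mask variables, since sets also implement an in-place `|=`; `combine_in_place` and `combine_copy` are hypothetical names.

```python
import functools
import operator


def combine_in_place(masks):
    """Loop from the review: aliases and then mutates the first mask."""
    mask = None
    for m in masks:
        if mask is None:
            mask = m  # no copy: `mask` and the first input are the same object
        else:
            mask |= m  # in-place or: also changes the first input
    return mask


def combine_copy(masks):
    """Non-mutating variant; returns None if there are no masks."""
    masks = list(masks)
    if not masks:
        return None
    # operator.or_ builds a new object whenever two masks are combined.
    return functools.reduce(operator.or_, masks)


a, b = {0, 1}, {3}
combined_safe = combine_copy([a, b])
a_after_safe = set(a)  # snapshot: still {0, 1}

combined_unsafe = combine_in_place([a, b])
a_after_unsafe = set(a)  # snapshot: now {0, 1, 3}
```

Note that with a single input mask, `functools.reduce` returns that same object, so callers should still not modify the result in place.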

Design Questions

Question 1: Line 91 wraps delta_w in a DataArray with masks, but this seems necessary only for the sum. Is there a reason not to compute the mask-aware sum directly?

Question 2: The behavior when detector has masked bins is well-documented, but it means masked detector bins still affect the normalization factor for unmasked bins. Is this the intended behavior? (The documentation suggests yes, with the warning to "only drop masks after normalization if you know what you are doing.")

Code Quality

Positive aspects:

Complete type annotations ✓
Excellent documentation with mathematical formulas ✓
Clear attention blocks warning about mask behavior ✓
Comprehensive test suite (633 lines of tests for 244 lines of code!) ✓
Good error messages with context ✓

Minor style observations:

Consistent with existing codebase style
Good use of helper functions for separation of concerns
Clear variable names

Test Coverage

The test suite is exemplary:

Tests for binned and histogrammed detectors
Aligned and unaligned binning scenarios
Monitor finer/coarser than detector
Edge cases (monitor range too narrow, boundary conditions)
Mask handling (detector masks, monitor masks, multiple masks, non-finite values)
Both normalization methods thoroughly tested

Physics/Science Correctness

The approach of:

Preferring bin coordinates over event coordinates for range determination (lines 178-181)
Rebinning monitor to detector for histogrammed data vs using lookup for binned data
Weighting by bin widths and renormalizing

...all appears physically sound and matches the Mantid reference implementation as stated.

Recommendations

Fix typo on line 215: "vents" → "events"
Consider adding a sentence to the docstring explaining why _monitor_mask is set on the output (so users understand where it comes from)
Optional: Add a simple usage example in the module docstring showing typical workflow

Conclusion

This is high-quality code with excellent test coverage and documentation. The recent mask-handling changes are well-thought-out improvements. The only required change is fixing the typo. The PR is ready to merge after that minor fix.

SimonHeybrock · 2025-10-24T08:06:48Z

src/ess/reduce/normalization.py

+    mask = None
+    for m in masks:
+        if mask is None:
+            mask = m
+        else:
+            mask |= m
+
+    return mask


Reminds me of https://github.com/scipp/scipp/blob/d1e45c4d3476ebf6ffb41088696a99a53b6217d9/src/scipp/core/concepts.py#L76 but I don't know if that is any easier.

I could use that function. But we don't need the multi-dim handling here.

SimonHeybrock · 2025-10-24T08:09:05Z

src/ess/reduce/normalization.py

+    mask = sc.DataArray(monitor_mask, coords={dim: monitor.coords[dim]}).rebin(
+        {dim: detector.coords[dim]}
+    ).data != sc.scalar(0, unit=None)


Won't this go badly wrong (masking everything) if we have an event-mode detector with just 1 or few bins along wavelength?

Yes. But why would we? I think a more realistic problem is that we don't have a wavelength bin coord because we never binned in wavelength. I will update for that.

If you want to support arrays like you describe, then we pretty much have to ignore the existing binning and always operate on the events. Meaning we need to create an event mask and use the events in the range calculations above.

Well, if we require a "reasonable" wavelength dim length for the detector then we need to make that very clear. I think it also implies that data must NOT be in detector-space any more, as otherwise we get too many bins?

Sorry, was this addressed somehow?

jokasimr · 2025-10-28T18:17:27Z

src/ess/reduce/normalization.py

+
+    where :math:`m_i` is the monitor intensity in bin :math:`i`,
+    :math:`x_i` is the lower bin edge of bin :math:`i`, and
+    :math:`I(x_i, x_{i+1})` selects bins that are within the range of the detector.


Is I(x_i, x_{i+1}) an indicator function that is 1 if the interval (x_i, x_{i+1}) is in the wavelength range of the detector and 0 otherwise?

jl-wynen · 2025-11-10T13:57:47Z

It should be ready now. Please take a fresh look at the equations! I made some changes:

The monitor is no longer clipped to the detector. This must now be done by the caller. The reason is that the detector range may be too narrow and clipping would remove important parts of the monitor histogram (especially in spectroscopy).
The histogram norm no longer has the sum terms. They are likely needed for spectroscopy but not for diffraction. We will build something for bifrost once we know exactly what is needed.
The integrated norm is no longer an actual integral but a plain sum to satisfy test_independent_of_monitor_binning.
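The difference between the integral and the plain sum can be seen with a small plain-Python example. `test_independent_of_monitor_binning` is not shown in this thread; the sketch assumes it checks that rebinning the monitor counts does not change the normalization factor.

```python
def plain_sum(counts):
    """Sum of monitor counts."""
    return sum(counts)


def integral(counts, edges):
    """Sum of counts times bin width, as in the earlier integrated norm."""
    return sum(c * (hi - lo) for c, lo, hi in zip(counts, edges, edges[1:]))


# The same 10 monitor counts, once in two bins and once merged into one bin.
fine_counts, fine_edges = [4.0, 6.0], [0.0, 1.0, 2.0]
coarse_counts, coarse_edges = [10.0], [0.0, 2.0]

sum_fine = plain_sum(fine_counts)
sum_coarse = plain_sum(coarse_counts)

integral_fine = integral(fine_counts, fine_edges)
integral_coarse = integral(coarse_counts, coarse_edges)
```

For counts (as opposed to a density), the plain sum is unchanged by merging bins, while the width-weighted integral doubles in this example.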

jokasimr

LGTM 👍

jokasimr · 2025-11-13T14:30:40Z

src/ess/reduce/normalization.py

+    *, detector: sc.DataArray, monitor: sc.DataArray
+) -> sc.DataArray:
+    """Mask the detector where the monitor is masked.
+
+    For performance, this applies the monitor mask to the detector bins.
+    This can lead to masking more events than strictly necessary if we
+    used an event mask.
+    """
+    if (monitor_mask := _monitor_mask(monitor)) is None:
+        return detector
+
+    # Use rebin to reshape the mask to the detector:
+    dim = monitor.dim
+    mask = sc.DataArray(monitor_mask, coords={dim: monitor.coords[dim]}).rebin(
+        {dim: detector.coords[dim]}
+    ).data != sc.scalar(0, unit=None)


I missed this part when I reviewed earlier.

Looks to me like we assume the detector shares a dimension with the monitor. Don't we need to check detector.bins first in that case? If detector.bins is not None then the detector dimensions probably represent the detector geometry, while the monitor always has dimension wavelength. Is that not the case?

This seems to mask all regions of the monitor that are either "not finite" or masked. But does it mask the regions where the monitor has 0 counts? If we don't mask those regions, we will divide by zero when we do the normalization.

nvaytet reviewed Jul 9, 2025

SimonHeybrock reviewed Jul 11, 2025

jl-wynen force-pushed the monitor-normalization branch from 52763fc to 93eff4b Compare October 20, 2025 13:29

jl-wynen force-pushed the monitor-normalization branch from 93eff4b to 8d45af3 Compare October 24, 2025 07:28

SimonHeybrock reviewed Oct 24, 2025

jl-wynen mentioned this pull request Oct 24, 2025

Normalization by integrated monitor seems to be broken scipp/essdiffraction#146

Open

jokasimr reviewed Oct 28, 2025

jl-wynen marked this pull request as draft November 10, 2025 10:10

jl-wynen added 12 commits November 10, 2025 14:52

Add monitor normalization

02682c7

Rename correction to normalization

fcb95f9

Fix typo

341ccd8

Add normalization to docs

214f645

Improve docs

ff6f184

Respect masks in range calculation

f229294

Apply monitor masks

b4987e0

Fix typo

eeafdcb

Do not use in-place op

7cba862

The in-place or modified an input mask.

Add explanation of _monitor_mask

2397d65

Do not clip and no extra integral

53b861f

Use simple sum, not integral

99fa7f9

jl-wynen force-pushed the monitor-normalization branch from c1621cb to 99fa7f9 Compare November 10, 2025 13:56

jl-wynen marked this pull request as ready for review November 10, 2025 13:56

jokasimr approved these changes Nov 10, 2025

jokasimr reviewed Nov 13, 2025
