-
Notifications
You must be signed in to change notification settings - Fork 1
Add monitor normalization #252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/ess/reduce/normalization.py
Outdated
| # each detector bin and ignore other values lying in the same detector bin. | ||
| # But integration would pick up all monitor bins. | ||
| return monitor.rebin({dim: det_coord}) | ||
| return monitor[dim, lo:hi] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good but can you briefly outline which code was simply copied over and which code is new? Thanks :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing is copied without change. But normalize_by_monitor_integrated is unchanged apart from extracting _clip_monitor_to_detector_range. However, the latter has changed as I described in my initial comment.
normalize_by_monitor_histogram now also clips the data. And the actual normalisation has changed. It used to be https://github.com/scipp/essdiffraction/blob/c9c29dd6155cdb0f9fd201e6c9d6923a754b1e37/src/ess/powder/correction.py#L57-L67, i.e., only det / mon without extra scaling factors.
src/ess/reduce/correction.py
Outdated
| # Mask zero-count bins, which are an artifact from the rectangular 2-D binning. | ||
| # The wavelength of those bins must be excluded when determining the range. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the assumption that we are not in detector space any more, but something like two_theta, as we did in essdiffraction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment does. But the code does not. Do you think we shouldn't mask in all cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not. Detector pixels can have true zeros (maybe unlikely with significant background?), but that should not limit the wavelength range?
| f"Missing '{dim}' coordinate in detector for monitor normalization." | ||
| ) | ||
|
|
||
| if monitor.coords[dim].min() > lo or monitor.coords[dim].max() < hi: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi is the largest event, don't we need hi < mon.max for lookup to do its job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That is what this condition checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is, shouldn't it raise if hi is not less then the max? Currently it checks of max is less than hi, which is not the same, or is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you asking whether the error condition should be monitor.coords[dim].max() <= hi?
This would not be correct for histogrammed detectors:
essreduce/tests/correction_test.py
Line 61 in 52763fc
| def test_normalize_by_monitor_histogram_aligned_bins_hist() -> None: |
For binned detectors, it is also fine as long as there is a bin-coord:
essreduce/tests/correction_test.py
Line 15 in 52763fc
| def test_normalize_by_monitor_histogram_aligned_bins_w_event_coord() -> None: |
If there is no bin-coord, then it gets tricky. I am not sure the code does the right thing in this case. E.g., consider
def test_normalize_by_monitor_histogram_aligned_bins_wo_bin_coord() -> None:
detector = (
sc.DataArray(
sc.array(dims=['w'], values=[0, 10, 30], unit='counts'),
coords={'w': sc.arange('w', 3.0, unit='Å')},
)
.bin(w=sc.array(dims=['w'], values=[0.0, 2, 3], unit='Å'))
.drop_coords('w')
)
monitor = sc.DataArray(
sc.array(dims=['w'], values=[5.0, 6.0], unit='counts'),
coords={'w': sc.array(dims=['w'], values=[0.0, 2, 3], unit='Å')},
)
normalized = normalize_by_monitor_histogram(
detector,
monitor=monitor,
uncertainty_broadcast_mode=UncertaintyBroadcastMode.fail,
)
expected = (
sc.DataArray(
sc.array(dims=['w'], values=[0.0, 44 / 3, 55 / 3], unit='counts'),
coords={'w': sc.arange('w', 3.0, unit='Å')},
)
.bin(w=sc.array(dims=['w'], values=[0.0, 2, 3], unit='Å'))
.drop_coords('w')
)
sc.testing.assert_identical(normalized, expected)I would expect this to work based on the event coords. But hi is computed to be 2Å in this case which removes the upper monitor bin. This is one of the issues I was trying to avoid with this implementation but apparently failed. I am not sure how to best handle this case.
| raise ValueError( | ||
| f"Cannot normalize by monitor: The {dim} range of the monitor " | ||
| f"({monitor.coords[dim].min().value} to {monitor.coords[dim].max().value}) " | ||
| f"is smaller than the range of the detector ({lo.value} to {hi.value})." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we deal with cases where the range is different, like I encounter in essdiffraction/beamlime? Above it seems not even a wavelength mask is taken into account, that is I this is just a dead end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can extend this to take masks into account. But in general, you should slice your data to be within the monitor range.
Note that this has not changed compared to the implementation in ESSdiffraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really? The only way I could make essdiffraction work for me was by masking a wavelength range, or else I would get exceptions in the monitor normalization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in general, you should slice your data to be within the monitor range.
You who? Where do the workflows do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a separate step. It should be clearly visible that we automatically limit the wavelength range of the data.
src/ess/reduce/normalization.py
Outdated
| """ | ||
| clipped = _clip_monitor_to_detector_range(monitor=monitor, detector=detector) | ||
| coord = clipped.coords[clipped.dim] | ||
| norm = (clipped * (coord[1:] - coord[:-1])).data.sum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to ask:
- Why ignore masks?
- Do we need
nansum?
But then I wondered if both of that is anyway invalid in the integrated case (unless the detector is masked in the same range)? So maybe the real question should be:
If there are masks (or nans) and we integrate, we should:
- Apply the mask to the detector.
- Mask the nan ranges (of the monitor) in the detector
- Do not ignore the the mask in the integration.
- Use
nansum.
Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right. We need the union of the detector and monitor masks.
src/ess/reduce/normalization.py
Outdated
| coord = clipped.coords[dim] | ||
| delta_w = coord[1:] - coord[:-1] | ||
| total_monitor_weight = broadcast_uncertainties( | ||
| clipped.sum() / delta_w.sum(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nansum, maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where would NaN values come from? And if there are any, should we instead mask them? See discussion above.
|
@jl-wynen Should we get this ready? |
52763fc to
93eff4b
Compare
|
|
93eff4b to
8d45af3
Compare
|
Ready now. @SimonHeybrock can you take another look? |
|
For now, here is Claude's assessment. Will have a look myself now! Review of PR #252: Add monitor normalizationI've reviewed the PR with particular focus on the recent changes (commits 489c65e "Respect masks in range calculation" and 8d45af3 "Apply monitor masks"). Here's my detailed analysis: Overall AssessmentThe PR adds well-designed monitor normalization functionality with two approaches:
The implementation is mathematically sound and well-documented. The recent changes improve mask handling significantly. Recent Changes AnalysisCommit 489c65e: "Respect masks in range calculation"Strengths:
Design rationale (lines 172-176): Commit 8d45af3: "Apply monitor masks"Strengths:
Implementation details: The mask reshaping via mask = sc.DataArray(monitor_mask, coords={dim: monitor.coords[dim]}).rebin(
{dim: detector.coords[dim]}
).data != sc.scalar(0, unit=None)The Issues Found1. Typo (Line 215)# This can lead to masking more vents than strictly necessaryShould be "events" not "vents". 2. Potential Issue: Mask combination could be more idiomaticLines 237-243 manually combine masks: mask = None
for m in masks:
if mask is None:
mask = m
else:
mask |= mThis works but could use Design QuestionsQuestion 1: Line 91 wraps Question 2: The behavior when detector has masked bins is well-documented, but it means masked detector bins still affect the normalization factor for unmasked bins. Is this the intended behavior? (The documentation suggests yes, with the warning to "only drop masks after normalization if you know what you are doing.") Code QualityPositive aspects:
Minor style observations:
Test CoverageThe test suite is exemplary:
Physics/Science CorrectnessThe approach of:
...all appears physically sound and matches the Mantid reference implementation as stated. Recommendations
ConclusionThis is high-quality code with excellent test coverage and documentation. The recent mask-handling changes are well-thought-out improvements. The only required change is fixing the typo. The PR is ready to merge after that minor fix. |
src/ess/reduce/normalization.py
Outdated
| mask = None | ||
| for m in masks: | ||
| if mask is None: | ||
| mask = m | ||
| else: | ||
| mask |= m | ||
|
|
||
| return mask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminds my of https://github.com/scipp/scipp/blob/d1e45c4d3476ebf6ffb41088696a99a53b6217d9/src/scipp/core/concepts.py#L76 but I don't know if that is any easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could use that function. But we don't need the multi-dim handling here.
| mask = sc.DataArray(monitor_mask, coords={dim: monitor.coords[dim]}).rebin( | ||
| {dim: detector.coords[dim]} | ||
| ).data != sc.scalar(0, unit=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this go badly wrong (masking everything) if we have an event-mode detector with just 1 or few bins along wavelength?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But why would we? I think a more realistic problem is that we don't have a wavelength bin coord because we never binned in wavelength. I will update for that.
If you want to support arrays like you describe, then we pretty much have to ignore the existing binning and always operate on the events. Meaning we need to create an event mask and use the events in the range calculations above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, if we require a "reasonable" wavelength dim length for the detector then we need to make that very clear. I think it also implies that data must NOT be in detector-space any more, as otherwise we get too many bins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, was this addressed somehow?
src/ess/reduce/normalization.py
Outdated
|
|
||
| where :math:`m_i` is the monitor intensity in bin :math:`i`, | ||
| :math:`x_i` is the lower bin edge of bin :math:`i`, and | ||
| :math:`I(x_i, x_{i+1})` selects bins that are within the range of the detector. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is I(x_i, x_{i+1}) an indicator function that is 1 if the interval (x_i, x_{i+1}) is in the wavelength range of the detector and 0 otherwise?
The in-place or modified an input mask.
c1621cb to
99fa7f9
Compare
|
It should be ready now. Please take a fresh look at the equations! I made some changes:
|
jokasimr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
| *, detector: sc.DataArray, monitor: sc.DataArray | ||
| ) -> sc.DataArray: | ||
| """Mask the detector where the monitor is masked. | ||
|
|
||
| For performance, this applies the monitor mask to the detector bins. | ||
| This can lead to masking more events than strictly necessary if we | ||
| used an event mask. | ||
| """ | ||
| if (monitor_mask := _monitor_mask(monitor)) is None: | ||
| return detector | ||
|
|
||
| # Use rebin to reshape the mask to the detector: | ||
| dim = monitor.dim | ||
| mask = sc.DataArray(monitor_mask, coords={dim: monitor.coords[dim]}).rebin( | ||
| {dim: detector.coords[dim]} | ||
| ).data != sc.scalar(0, unit=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this part when I reviewed earlier.
- Looks to me like we assume the detector shares a dimension with the monitor. Don't we need to check
detector.binsfirst in that case? Ifdetector.binsis notNonethen the detector dimensions probably represent the detector geometry, while the monitor always has dimensionwavelength. Is that not the case? - This seems to masks all regions of the monitor hat are either "not finite" or are masked. But does it mask the regions where the monitor has
0counts? If we don't mask those regions that will divide by zero when we do the normalization.
This addresses scipp/essdiffraction#167
I moved the code here because it is rather involved and it will be needed by spectroscopy and reflectometry as well. Once this is merged, I will prepare a PR in ESSdiffraction to use these new implementations.
Note that I changed the behaviour. The code now favours bin coordinates over event coordinates when selecting the monitor range. Here is an example why: Let's say we have detector data where the last event is at 1.9Å but the last bin ends at 2Å. And we have a monitor with data up to 1.91Å. If we check ranges based on events, we would be allowed to normalize. But if we histogrammed after normalization, we would get a detector bin that extends to 2Å even though we only have normalization data for part of this bin. The chosen implementation avoids this problem. In practice this probably does not matter because bin widths should be small enough to limit detector bins to be within the monitor range.
Further, for histogrammed detectors, the monitor is now rebinned to match the detector instead of using
lookup. This assigns more accurate weights to each bin. And, AFAIK, this matches Mantid.