Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
setup.py
Outdated
| "pathos", | ||
| "pybind11>=2.6.0" | ||
| "pybind11>=2.6.0", | ||
| "pymc3>=3.11.2,<4.0" |
There was a problem hiding this comment.
What do you think about making pymc3, pymulintest (...) optional later on? -> So one could have a leightweight version of ompy reading and plotting spectra.
There was a problem hiding this comment.
I've thought about the same. I'm not sure exactly how this is best solved. Maybe that could be the goal for next major version of the OMpy project?
| error_estimator (ErrorFinder): The algorithm used to estimate the | ||
| relative errors of the extracted NLD and γSF. |
There was a problem hiding this comment.
Consistent naming: error_estimator? -> ErrorEstimator class...?
| rel_err_missing (float): Relative error used for points that cannot be | ||
| estimated by error_estimator object. |
There was a problem hiding this comment.
when would that be the case?
There was a problem hiding this comment.
Sometimes if the statistics are low some of the realizations may have different number of points in the extracted NLD and GSF (values that are not nan).
|
|
||
| If the error_estimator attribute or argument is set then relative | ||
| errors will be estimated using the provided algorithm. Points in the | ||
| nld and gsf that the algorithm are unable to estimate will be set to |
There was a problem hiding this comment.
Suggestion: /Relative error/ of points (...) will be set to self.rel_err_missing.
| nld.save(self.path / f'nld_{i}.npy') # Overwrite with errors! | ||
| gsf.save(self.path / f'gsf_{i}.npy') # Overwrite with errors! |
There was a problem hiding this comment.
what is the meaning of the inline comment? is this supposed to happen here (but not yet...)?
There was a problem hiding this comment.
Regardless of attribute error_estimator is set or not, the extracted NLD/GSF will be written to disk here: https://github.com/vetlewi/ompy/blob/1ab2618ec0c2b2e7b8c5935242183587e3451f00/ompy/extractor.py#L169-L170
If error_estimator is set then these two lines will overwrite the generated files on the disk. Could change the comment to Update files on disk with errors or something similar
| This class uses pyMC3 to calculate the relative errors of the data points | ||
| in the extracted NLDs and γSFs. The class implements two different models, | ||
| one logarithmic and one linear. The logarithmic model will usually be more | ||
| stable and should be used if the linear doesn't converge. |
There was a problem hiding this comment.
Add something about the article (to be submitted)
ompy/error_finder.py
Outdated
| Tuple of nld energy points, gsf energy points and the observed | ||
| matrix for the NLD and γSF. |
There was a problem hiding this comment.
What is the observed matrix for the nld and gsf? Outdated? The return in definitively different now
ompy/error_finder.py
Outdated
| return vec_all_common | ||
|
|
||
| def condition_data(self, _nlds: List[Vector], _gsfs: List[Vector], | ||
| ) -> Tuple[ndarray, ndarray, ndarray, ndarray]: |
There was a problem hiding this comment.
Return signature inconsistent (see below)?
ompy/error_finder.py
Outdated
| else: | ||
| return nld_rel_err, gsf_rel_err | ||
|
|
||
| def keep_only(self, vecs: List[Vector]) -> List[Vector]: |
There was a problem hiding this comment.
Move to below condition_data?
ompy/error_finder.py
Outdated
| # Make a copy of the values arrays | ||
| nld_array = [nld.values.copy() for nld in nlds] | ||
| gsf_array = [gsf.values.copy() for gsf in gsfs] | ||
|
|
||
| del nld_array[n] | ||
| del gsf_array[n] | ||
|
|
||
| nld_array = np.array(nld_array) | ||
| gsf_array = np.array(gsf_array) | ||
|
|
||
| # Set the observed data | ||
| nld_obs.append(nld.values[idx_nld]/nld_array) | ||
| gsf_obs.append(gsf.values[idx_gsf]/gsf_array) |
There was a problem hiding this comment.
- This is somewhat funky. Why do you need to manually delete the array -- but after setting it by a copy?
I if I understand this correctly, it is because you want to get rid of the reference dataset. Maybe some comments would be appreciated here. - I'm very confused about this section:
Why/ what does it do? Why is it a devision (...)? Is it
# Set the observed data nld_obs.append(nld.values[idx_nld]/nld_array) gsf_obs.append(gsf.values[idx_gsf]/gsf_array)
qin the article, and not the nld (gsf)?
There was a problem hiding this comment.
I've refactored this entire part of the code. Should be much easier to follow now. Please see the format_data function (https://github.com/vetlewi/ompy/blob/621c95c4797bf11fefd6a95a032c9296fc0c7054/ompy/error_finder.py#L282-L322)
ompy/error_finder.py
Outdated
| nld_obs (ndarray): The observation tensor of the NLD | ||
| gsf_obs (ndarray): The observation tensor of the γSF |
There was a problem hiding this comment.
Where and everywhere else: Would it be better to say _ref and the reference tensor, instead of observed?
There was a problem hiding this comment.
I've refactored. Should be more understandable.
ompy/error_finder.py
Outdated
| nld_obs = np.array(nld_obs) | ||
| gsf_obs = np.array(gsf_obs) | ||
|
|
||
| idx_coef_nld = np.repeat(np.arange(N-1), M_nld).reshape(N-1, M_nld) |
There was a problem hiding this comment.
I have got no idea what idx_coef_nld is
There was a problem hiding this comment.
The idx_coef_nld is just a masking/indexing that are used in the pyMC3 model to broadcast the coef_mask_nld, values_mask_nld, coef_mask_gsf and values_mask_gsf.
| median (bool, optional): If the resulting relative errors should be | ||
| the mean or the median. |
There was a problem hiding this comment.
I'm falling out here: What do you mean by mean or median? This does not seem to be described in the article?
There was a problem hiding this comment.
What to adopt as the value for the relative uncertainty. The mean (average) or the median (50% percentile). This can probably be removed. If the model works as expected these should coincide, I think. If anyone wants to dive deeper into the results they should probably look at the samples directly.
ompy/error_finder.py
Outdated
| mid = np.median if median else np.mean | ||
| nld_rel_err = Vector(E=E_nld, values=mid(trace['σ_ρ'], axis=0), | ||
| std=np.std(trace['σ_ρ'], axis=0), units='MeV') | ||
| gsf_rel_err = Vector(E=E_gsf, values=mid(trace['σ_f'], axis=0), | ||
| std=np.std(trace['σ_f'], axis=0), units='MeV') |
There was a problem hiding this comment.
As mentioned above, not quite sure about this step, yet.
Co-authored-by: Fabio Zeiser <fzeiser@users.noreply.github.com>
Co-authored-by: Fabio Zeiser <fzeiser@users.noreply.github.com>
Co-authored-by: Fabio Zeiser <fzeiser@users.noreply.github.com>
The prior parameters for σ_ρ and σ_f (`lam` & `mu`) can now be modified by the user.
I've added an attribute to make it simple to suppress warnings
A slightly different expression is used as it gives slightly better numerical accuracy at the extremes (q=0 and q=1).
6651bd0 to
3a72c2f
Compare
I've removed the linear version for now. I don't think I want to spend time on making sure everything works correctly just yet. Might come back at a later time when I'm confident that everything else are as it should be.
I've refactored the `ErrorFinder` class significantly to make it much less complex. It should be easier to follow the code and the paper.
The normalizers will automatically remove any nan's in the input GSF or NLD. This will ensure that the normalizers actually gives useful results. It should be considered if the user should be informed if any nan's were removed. Maybe add best solved by adding an optional `logger` argument to the `cut_nan()` method that will be used to inform the user if there is nans in the vector?
This PR introduces a new method for estimating the relative errors of the individual points of the NLD and gSF. A more detailed description on how it works, etc. will be made available once ready for merge.
The PR is still work in progress