Use sciform as a formatting backend for uncertainties
#192
-
|
The subject is interesting. As a matter of principle, I'm for avoiding duplication of effort (in this case, maintaining and possibly extending the formatting of numbers with uncertainty). Formatting obviously belongs in a library that manipulates numbers with uncertainty, because users will often want to print them at some point. How formatting is implemented, however, can indeed be delegated to another library (with the advantage cited above). Back around 2009, I did check best scientific practices in terms of number printing (including NIST's and the Particle Data Group's) and took them into account while writing the formatting code, so the situation shouldn't be too bad. Still, I fully appreciate that some users might want additional options, and again, being able to maintain the formatting of numbers with uncertainty in a single place will help with that. At this stage, it looks to me like a better option is to offer
And for reference: the formatting code was actually much harder to write than the rest! There are in fact many corner cases that come from the interaction between formatting options (LaTeX, uncertainties in parentheses, the handling of NaN and inf, all within the constraints of padding, etc.). This also probably makes the code the hardest part of the package to read. This is where avoiding effort duplication can pay off, if it can be put in place. Again, I feel that this is not so hard to do, for the reasons above. At this stage, offering |
-
|
@lebigot, thanks for your thoughtful responses. I've been working out how to respond. I'll try to mainly focus on proposed paths forward. Note, I'm assuming these changes would be implemented on a 3.x -> 4.x changeover so that breaking changes are considered acceptable. But first, yes, Trying to figure out exactly what you mean by:
By this do you mean there would be some package-wide global flag that selects either existing Here's what I consider to be my favorite idea after brainstorming for a little. Rather than importing the One big advantage of this is that the The other big advantage is that That said, this approach wouldn't be fully backwards compatible without a lot of custom formatting still happening in
One more point I'll make. It looks like For existing incompatibilities see: For the latter I would have expected Since the existing FSML isn't fully compatible with the built-in FSML, and because I consider the built-in FSML to not have made the best choices for scientific formatting, I suggest or If you want to include left padding by |
-
|
Some other comments:
I'm hoping the answer to "can we make breaking changes" is "yes". If that's the case then maybe a more productive direction for this conversation is "How should formatting be updated/redesigned for I would be interested in the latter discussion I mentioned and I could list out some ideas I have. |
-
|
Yes, I was suggesting that users could set a global option in I still like your favorite idea ( As for some of the formatting behavior choices:
Good point, about the I personally don't mind breaking changes, but only as long as they can be mostly painless to users. This is a relevant point, for Concretely, one can, in this case:
How does this sound? |
-
This will be the trickiest to support with But, about getting the
This would also be challenging to support with
It would be very easy to support this the way I'm imagining it.
This seems fine for a transition involving breaking changes. The only thing I would add is to have a deprecation schedule even for the painless option to use the old formatting. The only reason we'd make a breaking change to the current formatting is we think we've designed something that is improved. If it really is improved I think the goal should be for all users to move to it eventually with enough warning. |
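For what it's worth, here is a rough sketch of how an opt-in to the old formatting could carry its own deprecation warning. Everything in it is hypothetical: `set_legacy_formatting` and the module-level flag are names made up for illustration, not anything that exists in `uncertainties` today.

```python
import warnings

# Hypothetical package-wide flag; False means "use the new formatting".
_USE_LEGACY_FORMATTING = False


def set_legacy_formatting(enabled):
    """Hypothetical switch that lets users keep the old formatting for a while."""
    global _USE_LEGACY_FORMATTING
    if enabled:
        # Warn at opt-in time so the removal schedule is visible to the user.
        warnings.warn(
            "Legacy formatting is deprecated and is scheduled for removal "
            "in a future major release.",
            DeprecationWarning,
            stacklevel=2,
        )
    _USE_LEGACY_FORMATTING = bool(enabled)


def uses_legacy_formatting():
    """Report which formatting path would be taken."""
    return _USE_LEGACY_FORMATTING
```

The point is only that the painless option and the warning can live in the same place, so nobody opts in without seeing the schedule.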
-
|
However, with everything said, I would like to shift this discussion from "how can So for my next post I'll try to make a list of improvements I would like to see for |
-
|
How can Note that this conversation is irrelevant if probably also exposing something like But, in the likely event First some features I consider to be relatively important
Next some features I could take or leave
Then there's a question that concerns a few possible changes (don't know if they're improvements or not).
If users really want alignment then If uncertainties were to take these points of view then I would suggest:
|
-
|
@jagerber48 @lebigot Sorry that I have not had time to keep up with this detailed conversation. I agree with what @jagerber48 says in the original post:
There is sort of a lot of code in Aside from how to best express precision and the number of significant digits in the printed form, there is a related question of what should
That suggests that

>>> from uncertainties import ufloat
>>> x = ufloat(3, 0.2)
>>> repr(x)
'3.0+/-0.2'

could return I don't think anyone is complaining about the current rendering, but it is worth re-deciding whether With |
-
This is something I'd like to do too, unrelated to sciform, as it reduces file size somewhat |
-
|
I like how Here's a link to the pint docs with an example using sciform |
-
I use the uncertainties package for two things.

The former strikes me as the core functionality of the package. From the short description of the package:
The former functionality is also a mathematically and computationally challenging problem to get done well.
The latter, regarding the display of value/uncertainty pairs, strikes me as a useful nice-to-have feature while working with value/uncertainty pairs. The code to realize value/uncertainty formatting is not as algorithmically demanding as the core error propagation code. Rather, the formatting code is sprinkled with parsing many user options, doing various rounding, string manipulations, etc., adherence to published standards, and watching out for various edge cases like nan, or rounding changing the number of significant figures, etc.

In short, I view the error propagation code and the formatting code as being very different types of code. My general suggestion is that the formatting code should be moved out of uncertainties and delegated to a separate package which can be dedicated to implementing scientific formatting according to published standards and uncertainties. I suggest that this is a better separation of responsibilities.

Heavily motivated/inspired by uncertainties formatting, I wrote the sciform package. sciform is a package dedicated to formatting values and value/uncertainty pairs according to published scientific standards. It provides many relevant formatting options and a few workflows for implementing those options to format numbers into strings. Some relevant features for uncertainties:

- sciform supports ± formatting like 123.45 ± 0.67 and parentheses uncertainty like 123.45(0.67) or 123.45(67).
- sciform supports explicit control over the number of significant figures of uncertainty to display and formats the corresponding value accordingly. sciform can do this both for value/uncertainty pairs (as can be done with the ufloat now) but also for values alone.
- sciform supports engineering notation in which the exponent in scientific notation is coerced to be a multiple of 3. This is valuable for presenting data more compatibly with SI standards and can improve readability.
- sciform exposes a format specification mini-language (FSML) similar to the built-in FSML and similar to the uncertainties FSML.
- 0.000_012_345 ± 0.000_000_012 can be formatted as (12.345 ± 0.012)e-06, but sciform can also present it as (12.345 ± 0.012) μ. Users can then easily append relevant units for simple dimensional quantities to make μm or μs etc.
- sciform will behave as desired.

Many more features can be found in the documentation.
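To make the significant-figure point concrete, here is a minimal standard-library sketch of the idea: round the uncertainty to a chosen number of significant figures, round the value to the same decimal place, then render either the ± or the parentheses form. The helper names (`round_pair`, `format_plus_minus`, `format_parentheses`) are made up for this illustration and are not sciform's API; the sketch also assumes the rounded digits sit at or to the right of the decimal point.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_pair(value, uncertainty, sig_figs=2):
    """Round value and uncertainty to the same least-significant place.

    Hypothetical helper for illustration only.
    """
    val = Decimal(str(value))
    unc = Decimal(str(uncertainty))
    if unc <= 0:
        raise ValueError("uncertainty must be positive")
    # adjusted() is the power of ten of the most significant digit.
    last_place = unc.adjusted() - (sig_figs - 1)
    quantum = Decimal(1).scaleb(last_place)
    rounded = unc.quantize(quantum, rounding=ROUND_HALF_EVEN)
    # Rounding can carry into a new leading digit (0.0996 -> 0.100),
    # which would leave one significant figure too many, so redo it once.
    if rounded.adjusted() != unc.adjusted():
        last_place = rounded.adjusted() - (sig_figs - 1)
        quantum = Decimal(1).scaleb(last_place)
        rounded = rounded.quantize(quantum, rounding=ROUND_HALF_EVEN)
    return val.quantize(quantum, rounding=ROUND_HALF_EVEN), rounded


def format_plus_minus(value, uncertainty, sig_figs=2):
    val, unc = round_pair(value, uncertainty, sig_figs)
    return f"{val:f} ± {unc:f}"


def format_parentheses(value, uncertainty, sig_figs=2):
    val, unc = round_pair(value, uncertainty, sig_figs)
    if unc.as_tuple().exponent > 0:
        raise ValueError("this sketch only handles digits at or right of the decimal point")
    # Decimal stores no leading zeros, so these are the significant digits.
    digits = "".join(str(d) for d in unc.as_tuple().digits)
    return f"{val:f}({digits})"


print(format_plus_minus(123.456, 0.6789))   # 123.46 ± 0.68
print(format_parentheses(123.456, 0.6789))  # 123.46(68)
print(format_plus_minus(1.0, 0.0996))       # 1.00 ± 0.10
```

The last call is the "rounding changes the number of significant figures" edge case mentioned earlier: 0.0996 first rounds up to 0.100, so the rounding place has to be recomputed to end up with two significant figures.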
I would like to explore whether uncertainties should migrate to using sciform as a formatting backend. Right now the ufloat object, at format time, calls formatting functions that appear in core.py in uncertainties. See AffineScalarFunc.__format__(). The suggestion would be that instead of calling internal uncertainties code, the ufloat or AffineScalarFunc would import and call sciform code. A simple example implementation taking advantage of the sciform FSML would look like

where format_spec is a valid sciform format specification string. A quick reference example to demonstrate how this might work is

In addition, sciform allows global configuration of formatting settings to allow users to modify settings not accessible through the FSML (like upper/lower/decimal separator characters). uncertainties could expose something like set_uncertainties_formatting_options to allow users even more flexible formatting control if desired. Under the hood, uncertainties would use a local sciform context or Formatter object to implement the user-selected global options. Note that the global options apply for any options not explicitly specified by the user's format specification string. Any options selected by the format specification string override the corresponding option in the global configuration.

The main caveat I will point out is the following. I imagine the main way users will access formatting will be via f-string formatting on the ufloat object. If uncertainties were to migrate to the sciform FSML I have to point out that there are some backwards incompatible differences. See this section for a comparison between the sciform and python built-in FSMLs. The main points I will highlight:

- sciform always opts towards positive explicit control of all formatting parameters. For this reason sciform opted to not support e.g. the g formatting option, which does some automatic work in the background to select exactly what options will be used when formatting numbers.
- sciform ONLY supports presenting value/uncertainty pairs by specifying the number of significant digits to display on the uncertainty. The value and uncertainty are always rounded to the same least-significant decimal place. Specifying digits-past-the-decimal-point is not supported because that sort of formatting is not recommended by official standards such as NIST or BIPM. sciform does support formatting individual numbers according to a number of digits-past-the-decimal-point but that is less relevant for uncertainties.
- sciform supports left padding either zeros or spaces between the most significant digit and the sign symbol so that a number is left-padded up to a certain user-specified decimal place (e.g. the millions place). sciform does not support padding by arbitrary symbols, and sciform does not support left, right, or center padding. sciform takes the view that these types of formatting can be applied to sciform output strings afterwards if the user has some need for the string to be a certain length or justified a certain way. sciform only takes responsibility for the numerical and scientific parts of the formatting, whereas sciform considers left/right/center padding to be a task generic to all python string objects and not specific to scientific/numeric string objects.
- nan or inf inputs.

Adoption of the sciform FSML would likely mean breaking changes for uncertainties formatting, so it probably would not be suitable for an uncertainties 3.x release and would need to wait for a 4.x release.

That said, there is another way uncertainties could use sciform that is not just directly inheriting the sciform FSML. uncertainties could retain its same FSML and, instead of using the format string to fully format the string, the format string would be parsed and used to construct a sciform Formatter object with the appropriate options, which could in the end be used to format the string using sciform. In this case there would still be breaking changes, or uncertainties would still need to do a lot of the formatting leg work to cover cases which are impossible for sciform to handle.

All of this said, I was intentional about excluding features from the sciform FSML. If a feature isn't included it is probably because it goes against scientific formatting best practices and should probably be avoided anyways. So, despite the differences, if uncertainties can bear backwards incompatible changes to formatting, I think it would be a net benefit to uncertainties to outsource formatting to sciform.

Two final notes:

- sciform only supports python >= 3.9 right now because it internally uses some modern typing features. If it is important for sciform to support lower python versions this can likely be done pretty easily.
- uncertainties, I am perfectly happy/open to discussing modifications to sciform for better compatibility.

Please let me know what you think or if you have any questions about sciform or how it could be used with uncertainties.
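As a rough illustration of the delegation pattern discussed in this post, here is a small self-contained sketch. It is not the `uncertainties` or sciform API: `Measurement` is a toy stand-in for a ufloat-like object, and `default_backend` is a placeholder that only reuses the built-in float format spec. The idea is just that `__format__` hands the numbers and the spec to a swappable callable, and that alignment is applied afterwards with ordinary string methods.

```python
def default_backend(value, uncertainty, format_spec):
    """Placeholder backend: applies the built-in float spec to both numbers."""
    spec = format_spec or ".2f"
    return f"{value:{spec}} ± {uncertainty:{spec}}"


class Measurement:
    """Toy stand-in for a ufloat-like object (hypothetical)."""

    # Any callable (value, uncertainty, format_spec) -> str can be plugged in.
    formatter = None

    def __init__(self, nominal_value, std_dev):
        self.nominal_value = nominal_value
        self.std_dev = std_dev

    def __format__(self, format_spec):
        # Look the backend up on the class so a plain function is not bound
        # as a method, then delegate all numeric formatting to it.
        backend = type(self).formatter or default_backend
        return backend(self.nominal_value, self.std_dev, format_spec)


x = Measurement(123.456, 0.678)
text = f"{x:.2f}"
print(text)            # 123.46 ± 0.68
# Padding/justification is done afterwards, as a generic string operation.
print(text.rjust(20))
```

Swapping the backend is then a one-line change (`Measurement.formatter = some_callable`), which is roughly the shape a sciform-backed `__format__` could take, with the details depending on which FSML is exposed.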