-
Notifications
You must be signed in to change notification settings - Fork 197
fix #905: use the correct mad() normalization #949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91" | ||
| authors = ["JuliaStats"] | ||
| version = "0.34.4" | ||
| version = "0.35.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should really go to 1.0 next time we make a breaking release.
| version = "0.35.0" | |
| version = "1.0" |
It would be good to check whether we want to make other breaking changes though (I don't have ideas off the top of my head).
|
It's been a year since opening this PR, and two years since the linked issue was opened. Anything else should be done here? I'm personally fine with whatever (breaking) version number it becomes, they don't carry real meaning anyways and generally treated the same (unlike the actual behavior!). |
|
The difficulty is that we don't want to tag a major release each time we find a breaking change to make. This change alone doesn't justify a major release IMO. And it doesn't seem we have lots of issues tagged as breaking. But there are probably other changes that would be justified that have not been marked as such. |
devmotion
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just came across this PR again.
I agree that it is a bit annoying to tag too many breaking releases, on the other hand we have done so in the past and it was not a major problem for the ecosystem. I don't think we should wait until all other breaking changes are included - there are > 150 open issues and hence likely too many other potentially breaking changes that we'd like to do. I also think given that we anticipate more breaking changes, it's not really important to tag a 1.0 release and maybe not even appropriate.
Regarding this PR, I wonder:
- Are we certain we want to change the default behaviour? The current behaviour is identical to
madin R: https://web.mit.edu/~r/current/lib/R/library/stats/html/mad.html Maybe it's rather a documentation issue? - In case we want to change the behaviour (I'm leaning towards this alternative), I think we should reconsider the
normalisekeyword argument: 1) The name seems very non-descriptive and somewhat inappropriate; 2) I assume the "normalization" could lead to type instabilities for some argument types; 3) Do we need the keyword argument at all? Maybe we should just document the factor (possibly also define (and export?) the constant to make it easier to use) as done in the documentation of https://web.mit.edu/~r/current/lib/R/library/stats/html/IQR.html
|
|
||
| # Median absolute deviation | ||
| @irrational mad_constant 1.4826022185056018 BigFloat("1.482602218505601860547076529360423431326703202590312896536266275245674447622701") | ||
| @irrational mad_to_std_multiplier 1.4826022185056018 BigFloat("1.482602218505601860547076529360423431326703202590312896536266275245674447622701") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is type piracy. One must not use @irrational outside of Base.Math. The proper way to define irrational constants in packages is to use IrrationalConstants.jl
There will always be breaking changes that we imagine we could make in the future. I think there's a strong demand in the community for tagging 1.0 versions of packages. This is necessary in particular to allow distinguishing bugfix 1.y.z releases from minor 1.y releases which introduce new features. BTW, given that deprecations are silent by default, most end users (as opposed to package authors) will be unlikely to have noticed it. It would be safer to throw an error by default, requiring to pass |
IMO one should just enforce deprecation warnings using |
I kind of wish we had done that when we introduced the deprecation. But what do you suggest we do now? Force printing the deprecation in the current release series, before we change the behavior in the next one? |
|
Yes, I think that's the best option if you want to ensure that users and downstream packages see the deprecation warning (they might still miss it if they don't inspect CI logs or run tests with |
|
Yeah that's a possibility. We could do that immediately and tag a breaking release with the change in a while. |
Make
mad()actually follow the definition of median absolute deviation as it is generally defined (eg https://en.wikipedia.org/wiki/Median_absolute_deviation, scipy, matlab, mathematica, ...).Obviously breaking. Are there any other breaking fixes that could go into the next StatsBase version?
See #905, #97, #347 for more context.