Adding distributions and log scores for K-Normal-Mixture #265
soumyasahu wants to merge 3 commits into master
Conversation
The math and implementation look good to me. @ryan-wolbeck do you know what's going on with the failing checks?
@soumyasahu can you take a look at fixing the following:

************* Module ngboost.distns.mixture_normal
I think I have fixed the other issues apart from the following:

ngboost/distns/mixture_normal.py:3:0: C0414: Import alias does not rename original package (useless-import-alias)

Actually, I don't understand this issue. One problem may be that 'RegressionDistn' is inside the function k_normal_mixture.
I have coded the log score and the derivatives based on the attached derivations.
Implementation_of_Mixture_Normal_Density_in_NGBoost.pdf
To map the mixture proportions I have used the multivariate logit transformation. The inverse of the Jacobian of this transformation is required to find 'd_score', and it can be calculated in closed form as follows:
Inv_jaccobian.pdf
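To make the closed form concrete, here is a NumPy sketch (my own reconstruction, not the PR's code): the K proportions come from K-1 free logits via a multinomial-logit (softmax) map with the last logit fixed at 0, and the inverse Jacobian of p[0:K-1] with respect to the logits has entries delta_ij / p_i + 1 / p_K:

```python
import numpy as np


def proportions_from_logits(z):
    """Map K-1 free logits to K mixture proportions (last logit fixed at 0)."""
    e = np.exp(np.append(z, 0.0))
    return e / e.sum()


def inv_jacobian(p):
    """Closed-form inverse Jacobian: delta_ij / p_i + 1 / p_K."""
    return np.diag(1.0 / p[:-1]) + 1.0 / p[-1]
```

This can be checked against the forward Jacobian J_ij = p_i (delta_ij - p_j): their product should be the identity.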
The exact Fisher information matrix can be calculated, but the expressions for the second derivatives will be ugly. I shall give it a try later.
For initial values, K-means clustering has been used, where the sample proportions, means, and variances from each cluster are taken as the mixture proportions and the mean and variance of each component normal distribution.
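That initialization can be sketched as follows (a minimal 1-D Lloyd's k-means in NumPy; the PR may well use a library implementation instead, so treat this as illustrative):

```python
import numpy as np


def kmeans_init(y, K, n_iter=25):
    """Initialize mixture params from 1-D k-means clusters (sketch).

    Returns per-cluster sample proportions, means, and variances,
    as described: one (proportion, mean, variance) triple per component.
    """
    y = np.asarray(y, dtype=float)
    # Spread initial centers over the sample quantiles (deterministic).
    centers = np.quantile(y, (np.arange(K) + 0.5) / K)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = np.argmin(np.abs(y[:, None] - centers[None, :]), axis=1)
        # Recompute centers, skipping empty clusters.
        for k in range(K):
            mask = labels == k
            if mask.any():
                centers[k] = y[mask].mean()
    props = np.array([(labels == k).mean() for k in range(K)])
    variances = np.array(
        [y[labels == k].var() if (labels == k).any() else 1.0 for k in range(K)]
    )
    return props, centers, variances
```

On well-separated data this recovers the component proportions, means, and variances closely enough to serve as starting values for the gradient updates.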