When I get a df.corr() matrix, I wish there were a function that returned only the top n pairs of features with the strongest correlation. This is doable with a few lines of code, but a one-stop-shop function might be warranted given how frequently it would be used.
Feature Description
Add a new method, callable on the result of df.corr(), that returns the top N pairs of features sorted by correlation strength. Something like:
df.corr().top_correlated_features(top=5)
Thanks to this post on Stack Overflow, here is an easy solution that could be implemented as a method with a top_N argument:
import numpy as np
corr_matrix = df.corr().abs()
upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1))  # keep each pair once, drop the diagonal
upper.stack().sort_values(ascending=False).head(N)
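To make the request concrete, the snippet above could be wrapped in a standalone helper along these lines. This is only a sketch: `top_correlated_pairs`, its `n` and `method` parameters, and the sample data are hypothetical names chosen for illustration, not existing pandas API. It differs from the snippet in two ways: it ranks by absolute correlation but returns the signed values, and it drops NaN correlations (which constant columns produce) explicitly. It assumes pandas 1.1 or later for the `key` argument of `sort_values`.

```python
import numpy as np
import pandas as pd


def top_correlated_pairs(df, n=5, method="pearson"):
    """Hypothetical helper (not part of pandas): top-n column pairs by |correlation|."""
    corr = df.select_dtypes(include="number").corr(method=method)
    cols = corr.columns
    # Positions of the strict upper triangle (k=1 skips the diagonal), so each
    # unordered pair appears once and self-correlations are excluded.
    i, j = np.triu_indices(len(cols), k=1)
    pairs = pd.Series(
        corr.to_numpy()[i, j],
        index=pd.MultiIndex.from_arrays([cols[i], cols[j]]),
    )
    # Constant columns yield NaN correlations; drop them before ranking.
    pairs = pairs.dropna()
    # Rank by absolute value but keep the sign in the returned values.
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Illustrative data only.
df = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5],
        "b": [2, 4, 6, 8, 11],
        "c": [5, 4, 3, 2, 1],
        "d": [1, 3, 2, 5, 4],
    }
)
print(top_correlated_pairs(df, n=3))
```

The result is a Series indexed by (feature, feature) pairs. Keeping the sign lets a caller tell strong negative relationships from strong positive ones, and a caller who only wants magnitudes can call `.abs()` on the result.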