Open
Description
Dear Miles,
I have run gap-stat several times on the same dataset, but the optimal number of clusters it returns is not always the same. I guess this happens because the reference distribution is randomly generated (the code uses numpy for that). So, for reproducibility, it seems reasonable for the optimalK function to accept a random_state argument.
If you agree, I could change the code accordingly, with your guidance and help.
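To illustrate the idea, here is a minimal sketch of how seeding the reference distribution could work. It is not the library's actual code: the helper name make_reference_sets and its parameters are hypothetical, and I am assuming the reference data is drawn uniformly over the bounding box of the input, which may differ from what gap-stat does internally.

```python
import numpy as np


def make_reference_sets(X, n_refs=3, random_state=None):
    """Draw uniform reference datasets over the bounding box of X.

    Hypothetical helper, not part of gap-stat. `random_state` is an
    int seed, or None to get a different draw on every call.
    """
    # A dedicated generator keeps the draws independent of numpy's global state.
    rng = np.random.default_rng(random_state)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Each reference set has the same shape as X.
    return [rng.uniform(lo, hi, size=X.shape) for _ in range(n_refs)]


# Toy data, only for the demonstration.
X = np.random.default_rng(0).normal(size=(50, 2))

first = make_reference_sets(X, random_state=42)
second = make_reference_sets(X, random_state=42)

# With the same seed, both calls produce identical reference sets.
same = all(np.array_equal(a, b) for a, b in zip(first, second))
```

If the clustering step has its own random initialization, it would probably need to receive the same seed as well, otherwise the result could still vary between runs.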
Thanks!