Move transition matrix computation outside of main function #15
base: master
Conversation
metric=self.metric)
return self.affinity_matrix_
self.affinity_matrix_ = self.affinity(X)
return self.affinity_matrix_
The function description should be modified if this function is not going to return the affinity matrix.
I re-added the return of affinity_matrix_, but not as a field in self. The reason is that, in the code I am currently writing, I want to be able to bind the results of several affinity calculations to different fields of the object.
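A minimal sketch of the pattern described above (the helper name and the calling convention are assumptions for illustration, not the PR's exact code): returning the matrix rather than assigning it to a fixed field lets the caller decide which attribute each result is bound to.

```python
def compute_affinity(X, affinity_func):
    """Return the affinity matrix instead of storing it on the estimator.

    `affinity_func` stands in for whatever kernel callable is configured;
    the name is hypothetical.
    """
    return affinity_func(X)

# The caller can then bind several results to different attributes:
# est.affinity_A_ = compute_affinity(X_A, est.affinity)
# est.affinity_B_ = compute_affinity(X_B, est.affinity)
```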
raise ImportError('Checks require scikit-learn, but not found')
ndim = L.shape[0]
if overwrite:
overwrite is no longer supported in the rewrite. I think it should still be offered as a memory-saving option.
@mfalkiewicz - this is not supported in the refactor
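A minimal sketch of how such an overwrite flag could save memory (the helper below is hypothetical, not the refactored code): when overwrite is True the input matrix is normalized in place, otherwise a copy is made first so the caller's array is preserved.

```python
import numpy as np

def _normalize_rows(L, overwrite=False):
    # Work directly on the caller's array only when explicitly allowed;
    # otherwise copy, so the original matrix is left untouched.
    A = L if overwrite else L.copy()
    A /= A.sum(axis=1)[:, np.newaxis]   # in-place row normalization
    return A
```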
k = int(max(2, np.round(D.shape[0] * 0.01)))
eps = 2 * np.median(np.sort(D, axis=0)[k+1, :])**2
k = int(max(2, np.round(D.shape[0] * f_neighbors)))
eps = np.median(np.sort(D, axis=0)[0:k+1, :]**2)
If I remember this correctly, eps is based on the (k+1)-th entry rather than on everything up to that point. It basically provides a scaling factor given the sorted distances.
The approach currently implemented is more liberal than the one I propose, and both of them are correct. All manifold learning approaches are based on local similarities, and these should be emphasized - I think the median of all the values up to the k-th neighbor captures this property more accurately. Besides, I think that parametrizing the kernel in terms of the average distance to a certain fraction of nearest neighbors is more natural, because it doesn't depend on the scale of the affinities that one uses.
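For comparison, the two bandwidth heuristics under discussion, restated from the diff (assuming D is a square matrix of pairwise distances):

```python
import numpy as np

def eps_kth_neighbor(D, k):
    # Original heuristic: scale set by the median (k+1)-th sorted distance per column.
    D_sorted = np.sort(D, axis=0)
    return 2 * np.median(D_sorted[k + 1, :]) ** 2

def eps_up_to_kth_neighbor(D, f_neighbors):
    # Proposed heuristic: median of the squared distances to all neighbors up to the
    # k-th, with k expressed as a fraction of the number of samples.
    k = int(max(2, np.round(D.shape[0] * f_neighbors)))
    D_sorted = np.sort(D, axis=0)
    return np.median(D_sorted[0:k + 1, :] ** 2)
```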
L_alpha = d_alpha[:, np.newaxis] * L_alpha
M = L_alpha
M = _compute_markov_matrix(L, use_sparse = use_sparse, alpha = alpha)
M = _compute_markov_matrix(L if overwrite else L.copy(), use_sparse = use_sparse, alpha = alpha)
Regarding this comment and overwrite - I agree that it would be nice to have a memory-saving option, but I am not sure this refactor allows for it. I don't know the details of Python's scoping and argument-passing rules, so maybe you could enlighten me. _compute_markov_matrix creates a local binding A. If we pass an alias as the argument (L) rather than a fresh object (L.copy()), is A also going to be an alias? In other words, if we pass an alias to a variable that is in the scope of compute_diffusion_map as an argument to _compute_markov_matrix, will _compute_markov_matrix mutate the object the alias refers to, or create its own local copy and mutate that instead?
Sorry for the slightly confusing description, but I don't know how to phrase it more simply.
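For reference, a short illustration of Python's argument-passing semantics: the parameter is just a new name bound to the same object, so in-place operations inside the function are visible to the caller, while rebinding the local name is not.

```python
import numpy as np

def normalize_in_place(A):
    A /= A.sum(axis=1)[:, np.newaxis]      # mutates the object A refers to
    return A

def normalize_with_rebinding(A):
    A = A / A.sum(axis=1)[:, np.newaxis]   # rebinds the local name; caller's array untouched
    return A

L = np.array([[1.0, 3.0], [2.0, 2.0]])
normalize_in_place(L)            # L itself is now row-normalized
M = normalize_with_rebinding(L)  # returns a new array; L is left as-is
```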
@mfalkiewicz - should #13 still be a PR?
Also, tests are not passing - since the Markov computation has changed. I would recommend creating a second ...
Refactoring the kernel normalization and transition matrix computation into independent functions will allow for more code reuse.
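A rough sketch of the kind of refactor described here, under the assumption that _compute_markov_matrix performs the standard diffusion-map alpha-normalization followed by row normalization (dense case only; the sparse path and the exact body in the PR may differ):

```python
import numpy as np

def _compute_markov_matrix(L, alpha=0.5, use_sparse=False):
    """Standalone kernel normalization + transition matrix computation (dense sketch)."""
    # use_sparse is accepted for signature parity; this sketch handles the dense case only.
    d = np.asarray(L.sum(axis=1)).ravel()
    d_alpha = d ** (-alpha)
    # anisotropic (alpha) normalization of the kernel
    L_alpha = d_alpha[:, np.newaxis] * L * d_alpha[np.newaxis, :]
    # row-normalize to obtain the Markov transition matrix
    return L_alpha / L_alpha.sum(axis=1)[:, np.newaxis]

# compute_diffusion_map (and any other routine) can then reuse the helper:
# M = _compute_markov_matrix(L, alpha=alpha, use_sparse=use_sparse)
```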