Skip to content

[FIX] Nomogram: Purge class_var values#5847

Merged
ajdapretnar merged 1 commit intobiolab:masterfrom
VesnaT:fix_nomogram_class_var
Feb 18, 2022
Merged

[FIX] Nomogram: Purge class_var values#5847
ajdapretnar merged 1 commit intobiolab:masterfrom
VesnaT:fix_nomogram_class_var

Conversation

@VesnaT
Copy link
Contributor

@VesnaT VesnaT commented Feb 17, 2022

Issue

Fixes #5846

Description of changes
  • remove unused values from class_var.values
Includes
  • Code changes
  • Tests
  • Documentation

orig_clv = original.class_var
orig_data = classifier.original_data
values = (orig_clv.values[int(i)] for i in
np.unique(orig_data.get_column_view(orig_clv)[0]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.unique is $n \log n$. I think it would be better to have something like

mask = np.zeros(len(orig_clv.values), dtype=bool)
column = orig_data.get_column_view(orig_clv)[0]
mask[column[np.isfinite(column)].astype(int)] = True
values = tuple(np.array(orig_clv.values)[mask])

(I haven't tested this code, it's just an idea.)

Copy link
Contributor Author

@VesnaT VesnaT Feb 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the unique. For me, it is easier to read the code.

I'd leave it as is, unless you think it's too time consuming.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like what Janez did, but perhaps we could pack it into a function. Looking for values of discrete variables that are used in some column seems like something common.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @VesnaT finds the original better, we can keep it.

However, @markotoplak's idea to have a function make sense. Where to put it, how to name it? Hm, let's put it to Orange.prepocess.remove and name it remove_unused_values. Oh, no, wait, it's already there.

It is understandable that @VesnaT and I did not know about this function. Oh, no, wait, @VesnaT wrote it and I did some work on it two years ago.

@ajdapretnar ajdapretnar merged commit 2c13d0a into biolab:master Feb 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nomogram: IndexError when changing target class

4 participants