[FIX] OWColor: Fix propagating changes to the output#2379
[FIX] OWColor: Fix propagating changes to the output#2379astaric merged 2 commits intobiolab:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2379 +/- ##
==========================================
- Coverage 74.33% 74.32% -0.01%
==========================================
Files 321 321
Lines 55988 55990 +2
==========================================
Hits 41616 41616
- Misses 14372 14374 +2 |
Orange/widgets/data/owcolor.py
Outdated
| self._create_proxies(domain.metas)) | ||
| self.openContext(data) | ||
| self.data = data.transform(domain) | ||
| self.data.domain = domain |
There was a problem hiding this comment.
I guess we need a comment here, at the very least.
But there may be a more general problem here. One would expect that after self.data = data.transform(domain), we would have self.data.domain is domain, while transform only ensures self.data.domain == domain. We should either change transform to (always) compare domains with is, not ==, or we can add a flag to specify how to compare domains.
There was a problem hiding this comment.
I would change transform to compare with is.
There was a problem hiding this comment.
CMIIW, but changing Table.from_table to compare domains with is instead of == doesn't look like a trivial change. Also it by itself doesn't make OWColor work — the problem then arises in DomainConversion, which doesn't seem to know how to do the transformation. As a result, we get a data table with all values missing.
Before, as we checked with == if domains matched we called from_table_rows which just makes a selection of rows and keeps the provided domain. Now, if domain1 == domain2 but domain1 is not domain2 we need to make a domain conversion.
It seems like the core of the problem with all values missing is in how DomainConversion checks whether some variable is inside a domain. The relevant part of the code from domain.py:
def __contains__(self, item):
"""
Return `True` if the item (`str`, `int`, :class:`Variable`) is
in the domain.
"""
return item in self._indices
It seems like item in self._indices == False while item in list(self._indices.keys()) == True. Is this a result of how we compare variables?
Should we maybe just revert to self.data.domain = domain as it was before @janezd refactoring? If so, @janezd can you provide an explanation, since I'm not quite sure how variable's proxies work?
There was a problem hiding this comment.
The first problem sounds like a broken hash on Variable.
As for the fix, I would rather move the data.domain = domain to transformmethod than have every caller do this explicitly.
There was a problem hiding this comment.
@astaric but shouldn't then we fix Table.from_table as well to prevent from_table and transform to behave differently? Isn't data.transform currently just a syntactic sugar?
There was a problem hiding this comment.
That is also possible. As we have changed all occurrences of Table.from_table with data.transform, I kind of consider it deprecated, but yeah, we can put the change there.
There was a problem hiding this comment.
... and risk @janezd changing it back to data.transform the next time he sees this code? If data.transform is syntactic sugar it has to have the same behaviour as the thing that this sugar is coating.
There was a problem hiding this comment.
... and risk @janezd changing it back to
data.transformthe next time he sees this code?
@janezd change to data.transform didn't break anything (transform and to_table behave the same way). Removing of the extra self.data.domain = domain is the change that broke things.
If data.transform is syntactic sugar it has to have the same behaviour as the thing that this sugar is coating.
Agree. Hence, if we are going to fix this in the core we need to do it in from_table. And while I'm all for changing to_table to compare domains with is instead of ==, I'm afraid this might cause other issues all over the place. As it has revealed some problems in domain's __contains__ which I'm not sure how to solve. Any help appreciated 🙈
There was a problem hiding this comment.
And if anyone wishes to play around, here is the minimal example of the problem, after the change introduced in this PR:
import Orange
from Orange.data import Table
def _create_proxies(variables):
part_vars = []
for var in variables:
if var.is_discrete or var.is_continuous:
var = var.make_proxy()
if var.is_discrete:
var.values = var.values[:]
part_vars.append(var)
return part_vars
data = Table('iris')
domain = data.domain
domain = Orange.data.Domain(_create_proxies(domain.attributes),
_create_proxies(domain.class_vars),
_create_proxies(domain.metas))
new_data = data.transform(domain)
print(new_data)The new_data contains all missing values since variables from the destination domain are not in the source domain and compute values are not set.
This ensures that after one calls data.transform(domain) the domain of the resulting data is set to the instance that was passed to transform.
|
@astaric this should work now, please check. |
Issue
Fixes #2378
Description of changes
Assure that when domains are equal,
from_tablessets the resulting table's domain to the instance that was passed on the input.Includes