Fix incorrect keying after merge of keyed, non-alphabetic factor and character columns
#5362
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff            @@
##           master    #5362    +/-   ##
========================================
  Coverage   98.47%   98.47%
========================================
  Files          81       81
  Lines       15005    15019    +14
========================================
+ Hits        14776    14790    +14
  Misses        229      229
Thanks!
No obvious timing issues in HEAD=merge_factor_char_key.
Generated via commit a1cbe53.
I think this is ready to merge, WDYT @ben-schwen?
R/data.table.R (outdated)
      return(NULL)

    ## check key on i as well!
    if (is.logical(i))
This is.logical(i) check has been there since the initial check-in in 2008 (2ec50ec; it was is.logical(irows) then). I'm not sure it's possible to reach it.
For queries like DT[<logical subset>], we return here:
Lines 681 to 684 in 6029f2f
    if (!length(leftcols)) {
      # basic x[i] subset, #2951
      if (is.null(irows)) return(shallow(x)) # e.g. DT[TRUE] (#3214); otherwise CsubsetDT would materialize a deep copy
      else return(.Call(CsubsetDT, x, irows, seq_along(x)) )
For queries like DT[<logical subset>, .(key, other)], the key is retained and the value is returned here:
Lines 1459 to 1467 in 6029f2f
      if (is.null(irows) && !is.null(shared_keys)) {
        setattr(jval, 'sorted', shared_keys)
      # potentially inefficient backup -- check if jval is sorted by key(x)
      } else if (haskey(x) && all(key(x) %chin% names(jval)) && is.sorted(jval, by=key(x))) {
        setattr(jval, 'sorted', key(x))
      }
      if (any(vapply_1b(jval, is.null))) internal_error("j has created a data.table result containing a NULL column") # nocov
    }
    return(jval)
So I'm pretty sure it's not possible to reach this. We can see if revdeps turn anything up.
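For anyone following along, here is a minimal sketch of the two query shapes discussed above, using a made-up three-row table (the names `DT`, `id`, `v`, `sub1` and `sub2` are illustrative only, not taken from the test suite). It just shows that both shapes go through the public `[` interface and lets you inspect the key that comes back.

```r
library(data.table)

# Hypothetical toy table keyed on a character column.
DT = data.table(id = c("a", "b", "c"), v = 1:3, key = "id")

# DT[<logical subset>]: plain row subset, no j.
sub1 = DT[v > 1]

# DT[<logical subset>, .(key, other)]: row subset plus a j that keeps the key column.
sub2 = DT[v > 1, .(id, v)]

# Inspect the key carried by each result.
print(key(sub1))
print(key(sub2))
```

Both subsets pick rows in increasing order, so the results stay sorted by `id` and keeping the key is safe in either path.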
@ben-schwen feel free to merge once you've reviewed my own edits.

Closes #5361
Implements option 1 of #5361 (comment). (The mentioned problems 2 and 3 still exist but need an additional is.sorted check.)
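
To make the scenario concrete, here is a rough sketch of the kind of merge this PR is about: a keyed factor column whose levels are not in alphabetical order, merged with a keyed character column. The tables and the helper `key_matches_order()` are hypothetical and written only for illustration (they are not the reproducer from #5361 and not part of data.table). The helper uses exported functions only: it strips the key from a copy, re-sorts with `setkeyv()`, and checks that the key columns did not move.

```r
library(data.table)

# Hypothetical helper: TRUE if dt has no key, or if its rows really are
# in the order its key claims.
key_matches_order = function(dt) {
  k = key(dt)
  if (is.null(k)) return(TRUE)
  chk = copy(dt)
  setattr(chk, "sorted", NULL)  # drop the claimed key on the copy
  setkeyv(chk, k)               # physically re-sort by the key columns
  all(vapply(k, function(col) identical(dt[[col]], chk[[col]]), logical(1L)))
}

# Factor keyed by level order (c, a, b), which is not alphabetical.
x = data.table(id = factor(c("c", "a", "b"), levels = c("c", "a", "b")),
               vx = 1:3, key = "id")
# Character column keyed alphabetically.
y = data.table(id = c("a", "b", "c"), vy = 4:6, key = "id")

res = merge(x, y, by = "id")
print(key(res))
print(key_matches_order(res))

# Sanity check of the helper on a table with a deliberately wrong key.
bad = data.table(id = c("b", "a"))
setattr(bad, "sorted", "id")
print(key_matches_order(bad))
```

The idea is that a result should either carry no key or carry one that matches its actual row order; the helper is a cheap way to check that from user code.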