Skip to content

Conversation

@ben-schwen
Copy link
Member

@ben-schwen ben-schwen commented Apr 6, 2022

Closes #5361

Implements option 1 of #5361 (comment) (Mentioned problems 2+3 still exist but need additional is.sorted check)

@codecov
Copy link

codecov bot commented Apr 6, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.47%. Comparing base (c16f320) to head (a1cbe53).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5362   +/-   ##
=======================================
  Coverage   98.47%   98.47%           
=======================================
  Files          81       81           
  Lines       15005    15019   +14     
=======================================
+ Hits        14776    14790   +14     
  Misses        229      229           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@github-actions
Copy link

github-actions bot commented Sep 9, 2024

No obvious timing issues in HEAD=merge_factor_char_key
Comparison Plot

Generated via commit a1cbe53

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 2 minutes and 53 seconds
Installing different package versions 38 seconds
Running and plotting the test cases 2 minutes and 26 seconds

@MichaelChirico MichaelChirico changed the title Merge on factor and character returns wrongly keyed data.table Fix incorrect keying after merge of keyed, non-alphabetic factor and character columns Jun 30, 2025
@MichaelChirico
Copy link
Member

I think this is ready to merge, WDYT @ben-schwen?

R/data.table.R Outdated
return(NULL)

## check key on i as well!
if (is.logical(i))
Copy link
Member

@MichaelChirico MichaelChirico Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is.logical(i) check has been there since initial check-in in 2008 (2ec50ec; was is.logical(irows) then), not sure it's possible to reach it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For queries like DT[<logical subset>], we return here:

data.table/R/data.table.R

Lines 681 to 684 in 6029f2f

if (!length(leftcols)) {
# basic x[i] subset, #2951
if (is.null(irows)) return(shallow(x)) # e.g. DT[TRUE] (#3214); otherwise CsubsetDT would materialize a deep copy
else return(.Call(CsubsetDT, x, irows, seq_along(x)) )

For queries like DT[<logical subset>, .(key, other)], the key is retained & the value returned here:

data.table/R/data.table.R

Lines 1459 to 1467 in 6029f2f

if (is.null(irows) && !is.null(shared_keys)) {
setattr(jval, 'sorted', shared_keys)
# potentially inefficient backup -- check if jval is sorted by key(x)
} else if (haskey(x) && all(key(x) %chin% names(jval)) && is.sorted(jval, by=key(x))) {
setattr(jval, 'sorted', key(x))
}
if (any(vapply_1b(jval, is.null))) internal_error("j has created a data.table result containing a NULL column") # nocov
}
return(jval)

So I'm pretty sure it's not possible to reach this. We can see if revdeps turn anything up.

Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ben-schwen feel free to merge once you've reviewed my own edits.

@ben-schwen ben-schwen merged commit c806849 into master Jul 2, 2025
12 checks passed
@MichaelChirico MichaelChirico deleted the merge_factor_char_key branch July 8, 2025 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug related to merging on character vs factor column when factor is sorted

4 participants