Skip to content

Conversation

@jangorecki
Copy link
Member

closes #4837

@codecov
Copy link

codecov bot commented Dec 10, 2020

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.70%. Comparing base (d2b3ea7) to head (bbba929).
⚠️ Report is 107 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4843   +/-   ##
=======================================
  Coverage   98.69%   98.70%           
=======================================
  Files          79       79           
  Lines       14688    14695    +7     
=======================================
+ Hits        14497    14504    +7     
  Misses        191      191           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jangorecki
Copy link
Member Author

jangorecki commented Dec 11, 2020

In fabafed I added extra argument, rev=FALSE.
When doing plots with lattice in a grid layout, plots are being ordered by levels starting from bottom left corner. This is good for leff-to-right order of plots horizontally, yet for a vertical order of plots bottom-to-top might not be expected, and then levels needs to be reversed for plots to be produced in top-to-bottom order. This new argument will make that more convenient. This can be useful for dcast as well, depending on the expected order of transformed rows into columns.
An example of such plotting can be found in this report and generated plot below. Having a non reversed original order of levels "basic" would be at the bottom row in layout, while "advanced" at the top.
datatable groupby 1e9
In this plot I actually used 3 different factors for proper ordering. Order of plots vertically (basic/advanced), horizontally (1e9_1e2_0_0, ...) and order of question in legend. This nicely presents how useful is the ordering of levels to be in original order rather than alphabetical (as by default in base R).

@jangorecki jangorecki linked an issue Feb 8, 2022 that may be closed by this pull request
@jangorecki jangorecki requested a review from mattdowle as a code owner December 9, 2023 10:56
@jangorecki
Copy link
Member Author

Needs unit test for #5327 so it can close that one as well

Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Searching around on cran, I do see calls like factor(x, levels=unique(x)) pretty commonly:

https://github.com/search?q=org%3Acran%20%2Ffactor%5C(.*unique%5C(%2F&type=code

As well as more generally in lang:R:

https://github.com/search?q=lang%3AR+%2Ffactor%5C%28.*unique%5C%28%2F&type=code

I think that justifies a new function even though it's just a base wrapper.

We also see ample rev=TRUE cases.

I only quibble about the name, there's a tension between "its good to have something concise" and "readers should be able to infer something about it from its name".

Maybe u_factor / factor_u (unsorted factor() / unique factor())? uns_factor / unq_factor?

Alternatively, maybe this is something to propose to r-devel first?

@jangorecki
Copy link
Member Author

I only quibble about the name, there's a tension between "its good to have something concise" and "readers should be able to infer something about it from its name".

fctr is just like factor but without vowels, therefore should be very easy to infer

Alternatively, maybe this is something to propose to r-devel first?

I doubt they would consider adding that

@MichaelChirico
Copy link
Member

fctr is just like factor but without vowels, therefore should be very easy to infer

we can guess it's related to factors, but not anything else. is it fast alternative (as is most common for our helpers)? is it a related-but-similar S3 class?

I doubt they would consider adding that

not as a separate function, surely, but possibly as new argument(s) to factor()

@jangorecki
Copy link
Member Author

is it fast alternative (as is most common for our helpers)?

not really, therefore no f prefix :)

is it a related-but-similar S3 class?

factor() is not S3 class, so less likely users will think that way

@MichaelChirico
Copy link
Member

factor() is not S3 class, so less likely users will think that way

really?

is.object(factor())
# [1] TRUE

methods(class = 'factor')
#  [1] all.equal     as.character  as.data.frame as.Date       as.list       as.logical    as.POSIXlt    as.vector     c             coerce        droplevels    [<-          
# [13] [             [[<-          [[            format        initialize    is.na<-       length<-      levels<-      Math          Ops           plot          print        
# [25] relevel       relist        rep           show          slotsFromS3   summary       Summary       xtfrm        
# see '?methods' for accessing help and source code

@MichaelChirico
Copy link
Member

I would like more feedback from other @Rdatatable/committers here on whether (1) we want to export a simple {base} wrapper (2) the proposed naming

The closest I can see in terms of "nearness" to base R functionality would be like() which is typically invoked via %like% and friends; even then it builds in the faster implementation for factor input that's not present in older versions of R (for now: #7104). We could consider unexporting like() but we probably wouldn't.

@TysonStanley
Copy link
Member

Just my opinion: 1) it would have nearly no need for maintenance going forward so if it is useful for Jan and others, seems like a low lift to add. 2) I'm fine with the name. fctr is similar to the fct_ prefix in forcats so there's obviously similar notation. Don't think the name is informative but also don't know how important that is given it's not doing a lot of special things.

@MichaelChirico
Copy link
Member

Jan WDYT about extending this to support the "normal" factor() default of sorted levels? that way users can have a one-stop shop instead of needing to switch between the proposed fctr() and factor()

@jangorecki
Copy link
Member Author

factor() is not S3 class, so less likely users will think that way

really?

is.object(factor())
# [1] TRUE

methods(class = 'factor')
#  [1] all.equal     as.character  as.data.frame as.Date       as.list       as.logical    as.POSIXlt    as.vector     c             coerce        droplevels    [<-          
# [13] [             [[<-          [[            format        initialize    is.na<-       length<-      levels<-      Math          Ops           plot          print        
# [25] relevel       relist        rep           show          slotsFromS3   summary       Summary       xtfrm        
# see '?methods' for accessing help and source code

uh sorry, I meant factor is not a S3 method... you are absolutely right

@jangorecki
Copy link
Member Author

jangorecki commented Jun 28, 2025

by

the "normal" factor() default of sorted levels?

you mean

levels=sort(unique(x)) instead of levels=unique(x)

?

If so then I would add another arg sort=FALSE, because otherewise retaining original order, as was the purpose of this helper, will no longer be easier with fctr() than with factor().

@MichaelChirico
Copy link
Member

by

the "normal" factor() default of sorted levels?

you mean

levels=sort(unique(x)) instead of levels=unique(x)

?

If so then I would add another arg sort=FALSE, because otherewise retaining original order, as was the purpose of this helper, will no longer be easier with fctr() than with factor().

yes that's what I had in mind. basically fctr() becomes extended factor() with a different default.

@github-actions
Copy link

github-actions bot commented Jun 29, 2025

No obvious timing issues in HEAD=fctr
Comparison Plot

Generated via commit bbba929

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 2 minutes and 58 seconds
Installing different package versions 19 seconds
Running and plotting the test cases 2 minutes and 30 seconds

Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great now, thanks!

PTAL at the updated NEWS/Rd descriptions & merge if you're happy :)

@jangorecki jangorecki merged commit 7f1b3bb into master Jun 29, 2025
12 checks passed
@jangorecki jangorecki deleted the fctr branch June 29, 2025 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

helper function: convert char to factor retaining order

3 participants