implement comment.char argument for fread #7375

Merged
MichaelChirico merged 33 commits into master from fread_commentChar
Oct 20, 2025

Conversation

@ben-schwen
Member

Closes #856
Supersedes #4486

I needed this on a machine without grep.
It should have no performance impact, since we always check for comment.char before entering the costly checking branch.

@codecov

codecov bot commented Oct 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.12%. Comparing base (67129f0) to head (ba0c68a).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7375   +/-   ##
=======================================
  Coverage   99.12%   99.12%           
=======================================
  Files          85       85           
  Lines       16589    16637   +48     
=======================================
+ Hits        16444    16492   +48     
  Misses        145      145           

5,6', nrows=2, comment.char='#'), data.table(a=c(1L,3L), b=c(2L,4L)))

# sep and comment char same
test(2341.16, fread('a#b

Hm, that's interesting. I don't know what I'd expect from reading this file. E.g. data.table(a=1L, b=2L) also seems like a possible answer, and even data.table(a=c(1L, NA), b=c("2", "only comment")) could be justified too. Maybe let's just error if comment.char==sep?

@MichaelChirico
Member

Looks basically GTG. Main remaining thing I think is what to do about blank.lines.skip.

@eddelbuettel if you have any "real" files lying around that you'd like to test on, that would be most helpful :)

@eddelbuettel
Contributor

eddelbuettel commented Oct 20, 2025

LOL @MichaelChirico -- while I wrote eight years ago

Bump. Needing this right now.

I am sorry to say that I have not been sitting here with bated breath and no, I do not have that file around. I do not even remember what the comment character may have been. I am sure the PR will be fine.

@MichaelChirico
Member

@Anirban166 it looks like some {pak} caching issue is failing the atime GHA, any idea?

https://github.com/Rdatatable/data.table/actions/runs/18658671941/job/53193860953

@github-actions

  • HEAD=fread_commentChar stopped early for fread disk overhead improved in #6925

Generated via commit ba0c68a

Download link for the artifact containing the test results: ↓ atime-results.zip

Task                                     Duration
R setup and installing dependencies      6 minutes and 33 seconds
Installing different package versions    10 minutes and 2 seconds
Running and plotting the test cases      2 minutes and 40 seconds

@Anirban166
Member

Haven't seen that one before, but I think it's a flaky error having to do with network/resource contention, as it worked now that you re-ran it. (The r-lib/actions/setup-r-dependencies@v2 action that I'm using internally uses pak, so it would usually be an upstream error if there is one, but in cases of corrupted caches or missing files we can use a different number for cache-version.)

@MichaelChirico
Member

Thanks, I'll try to notice if it recurs; it does indeed seem flaky. There were other failures related to missing branches, which clouded things. It seems to be working again now.

@MichaelChirico left a comment

Great work! :shipit:

@MichaelChirico merged commit 59f966c into master on Oct 20, 2025
19 of 22 checks passed
@MichaelChirico deleted the fread_commentChar branch on October 20, 2025 17:20

Development

Successfully merging this pull request may close these issues.

Implement comment.char argument in fread

4 participants