@ben-schwen
Member

Closes #469

Not exactly what Arun suggested, but it seems like the best option since we encode to UTF-8 in forderv. Is a warning too much here?

@codecov

codecov bot commented Oct 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.12%. Comparing base (59f966c) to head (2f49a9a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7379   +/-   ##
=======================================
  Coverage   99.12%   99.12%           
=======================================
  Files          85       85           
  Lines       16637    16640    +3     
=======================================
+ Hits        16492    16495    +3     
  Misses        145      145           

@jangorecki
Member

Let's see revdeps. If none are affected, then I would keep it like this.

@aitap
Contributor

aitap commented Oct 20, 2025 via email

@MichaelChirico
Member

Could someone please share the original context from R-Forge #5758?

Basically, it links to https://stackoverflow.com/questions/24085906/unique-data-table-do-not-handle-keys-properly

@jangorecki
Member

A late warning about unexpected consequences is better than no warning, so despite that I still see value in it.
And what about changing the default of fread as well?

@aitap
Contributor

aitap commented Oct 20, 2025 via email

@ben-schwen
Member Author

Good points! I've made encoding="UTF-8" the default in fread. Given how much support the addition of encoding="UTF-8" got in #563, this also sounds like what the community wants and needs.

I have now also filed this as a breaking change.

@aitap
Contributor

aitap commented Oct 22, 2025 via email

@github-actions

github-actions bot commented Oct 22, 2025

No obvious timing issues in HEAD=warn_encodings
Comparison Plot

Generated via commit 2f49a9a

Download link for the artifact containing the test results: ↓ atime-results.zip

Task | Duration
R setup and installing dependencies | 2 minutes and 43 seconds
Installing different package versions | 21 seconds
Running and plotting the test cases | 2 minutes and 37 seconds

@ben-schwen
Member Author

I guess what Jan meant by unexpected consequences is that forderv can simply take longer because users are unaware of encodings.

@jangorecki
Member

Simply taking longer is not a problem. I was thinking that something that used to give TRUE for == could now start returning FALSE.

Development

Successfully merging this pull request may close these issues.

[R-Forge #5758] unique and duplicated should also warn on columns with mixed encodings.

5 participants