Column naming for empty string and duplicate NA label #6795
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6795      +/-   ##
==========================================
- Coverage   98.64%   98.63%   -0.02%
==========================================
  Files          79       79
  Lines       14642    14646       +4
==========================================
+ Hits        14444    14446       +2
- Misses        198      200       +2
tdhock
left a comment
Before asking for review, please click the "Files changed" tab and make sure that only a minimal set of changes relevant to the PR appears.
Here there are many irrelevant changes that should be reverted before review (added empty lines and removed comments).
R/fcast.R
Outdated
    if (is.function(dat[[i]]))
      stopf("Column [%s] not found or of unknown type.", deparse(x))
    }
please undo addition of empty lines
    subset = m[["subset"]][[2L]]
    if (!is.null(subset)) {
      if (is.name(subset)) subset = as.call(list(quote(`(`), subset))
      idx = which(eval(subset, data, parent.frame())) # any advantage thro' secondary keys?
please undo
@tdhock, please let me know if any changes or improvements are needed.
tdhock
left a comment
please fix
R/fcast.R
Outdated
    value.var = names(data)[ncol(data)]
    lvals = value_vars(value.var, names(data))
    valnames = unique(unlist(lvals))
    valnames = handle_empty_strings(valnames)
Please avoid re-assigning the same variable (valnames), which can be confusing.
Either use unique names or don't use multiple lines/variables.
Also, are these helper functions used only once? If so, please delete them and put the code here directly instead of in a separate function (helper functions should only be introduced if the same code is used in more than one place, to avoid repetition).
Yes, these helper functions were used only once, so I removed them and put the code inline instead of in separate functions.
      lhs = lhs_; rhs = rhs_
    }
    maplen = lengths(mapunique)
    idx = do.call(CJ, mapunique)[map, 'I' := .I][["I"]] # TO DO: move this to C and avoid materialising the Cross Join.
Please undo all deletions that are not relevant to your PR.
Click the "Files changed" tab in GitHub and make sure there are only changes relevant to your PR.
I undid all the deletions and kept only the minimal changes to the code that are relevant to the PR.
@tdhock, I made all the suggested changes. Is any further improvement needed in my code?
The original issue #5605 asked for making
I think @aitap is right, we should probably not be encouraging column names that are the empty string. The empty string is not allowed as a variable name when constructing a list or a data.table:
> list(""="foo")
Error: attempt to use zero-length variable name
> data.table(""=1)
Error: attempt to use zero-length variable name
You can create a column whose name is the empty string, but you can't extract it using [[:
> setnames(data.table(x=1),"")[[""]]
NULL
Please close the PR if you agree.
Fixes #5605
This PR replaces empty-string column names with "empty_string" and ensures that column names are unique.
Handling Empty Strings in Column Names
Added a function handle_empty_strings to replace empty strings in column names with "empty_string".
Ensuring Unique Column Names
Added a function ensure_unique_names to ensure column names are unique by appending a suffix if duplicates are found.
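To illustrate the two descriptions above, here is a minimal base-R sketch of what helpers like these might look like. The function bodies, the `replacement` argument, and the "_" suffix separator are assumptions made for illustration and are not the code in this PR; base R's `make.unique` is used here as one possible way to append a suffix to duplicates.

```r
# Hypothetical sketch, not the implementation from this PR.

# Replace empty-string names with a placeholder (the default is an assumption).
handle_empty_strings = function(x, replacement = "empty_string") {
  x[!is.na(x) & x == ""] = replacement
  x
}

# Make names unique by appending a numeric suffix to later duplicates.
# make.unique() is base R; sep = "_" is an arbitrary choice for this sketch.
ensure_unique_names = function(x) make.unique(x, sep = "_")

valnames = c("a", "", "a", "b")
clean_names = ensure_unique_names(handle_empty_strings(valnames))
print(clean_names)
```

With this input, the empty name becomes "empty_string" and the second "a" gets a suffix, so the four resulting names are distinct.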
Improved fill.default Handling
Explicitly handled fill.default when fun.aggregate is used and fill is NULL.
When fun.aggregate is used and fill is NULL, missing values in the reshaped data need to be filled with a default value. This change ensures that fill.default is computed correctly and used to fill missing values, improving consistency and preventing errors.
Added a check to ensure the length of names matches the length of the vector before setting the names attribute.
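The length check described above can be sketched as follows. The helper name `set_names_if_match` is hypothetical and exists only for this illustration; in the PR the check may be written inline rather than as a function.

```r
# Hypothetical helper: assign names only when their length matches the vector.
set_names_if_match = function(x, nm) {
  if (length(nm) == length(x)) names(x) = nm
  x
}

ok = set_names_if_match(1:2, c("a", "b"))       # lengths agree, names are set
skipped = set_names_if_match(1:3, c("a", "b"))  # lengths differ, names left unset
print(names(ok))
print(is.null(names(skipped)))
```

Guarding the assignment this way avoids silently padding the names attribute with NA when fewer names than elements are supplied.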
If any improvement is needed in the code, please let me know.