Replies: 2 comments 1 reply
-
I guess this may not be high priority, as there are many options in contributed packages. However, the vroom package uses ALTREP to do a quick initial read, and I wonder whether this might make sense to bring into base R. This might widen the pool of potentially interested mentors to include e.g. @ltierney and @kalibera. It’s difficult to include in benchmarks because the full read is delayed, but maybe someone could do an initial investigation to see how vroom scales with the number of columns.
-
I'm almost sure that we (R-core) got suggestions previously... about improving [...]. Still, I think everyone would agree that it would be desirable to get to linear O(p) instead of quadratic O(p^2) time complexity for read.*()
-
Hi! I will not be attending the sprint, but I had a couple of ideas related to improving the efficiency of read.csv and write.csv.
Probably the more important issue to address would be read.csv, which had time complexity quadratic in the number of columns. See this issue for some empirical analysis:
tdhock/atime#8
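For anyone who wants a quick look without the atime package, here is a minimal base-R sketch of that kind of measurement (not the analysis from the linked issue). The helper name `time_read_csv`, the column counts, and the row count are all arbitrary assumptions for illustration. It times read.csv on increasingly wide files and estimates the exponent k in time ≈ c * p^k from the slope of a log-log fit; a slope near 1 would suggest linear scaling and a slope near 2 quadratic scaling, though the result will depend on the R version and machine.

```r
# Hypothetical helper: time read.csv() on a temporary CSV with p numeric columns.
time_read_csv <- function(p, n_rows = 10, reps = 3) {
  csv_file <- tempfile(fileext = ".csv")
  on.exit(unlink(csv_file))
  wide <- as.data.frame(matrix(1, nrow = n_rows, ncol = p))
  write.csv(wide, csv_file, row.names = FALSE)
  # Keep the fastest of a few repetitions to reduce timing noise.
  min(replicate(reps, system.time(read.csv(csv_file))[["elapsed"]]))
}

p_values <- c(500, 1000, 2000, 4000)  # column counts (arbitrary choices)
seconds <- vapply(p_values, time_read_csv, numeric(1))

# If time ~ c * p^k, then log(time) is linear in log(p) with slope k.
measurable <- seconds > 0  # drop timings below the clock resolution
slope <- if (sum(measurable) >= 2) {
  unname(coef(lm(log(seconds[measurable]) ~ log(p_values[measurable])))[2])
} else {
  NA_real_
}

print(data.frame(p = p_values, seconds = seconds))
print(slope)
```

With only four sizes and a handful of repetitions this is noisy, so treat the slope as a rough indication rather than a benchmark result.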
Another issue was that write.csv uses linear memory, whereas other CSV writers use only constant memory (this is not that big of an issue though, because you need linear memory anyway to store the data in R before writing to CSV):
tdhock/atime#10
@gmbecker @bastistician may be able to help mentor? They worked on fixing a similar efficiency issue: tdhock/atime#9