- Internal changes requested by CRAN around format specification (#524).
- It is now possible (again?) to read from a list of connections (@bairdj, #514).
- Internal change for compatibility with cpp11 >= 0.4.6 (@DavisVaughan, #512).
- No user-facing changes.
- There was no CRAN release with this version number.
- `str()` now works in a colorized context in the presence of a column of class `integer64`, i.e. parsed with `col_big_integer()` (@bart1, #477).
- The embedded implementation of the Grisu algorithm for printing floating point numbers now uses `snprintf()` instead of `sprintf()` and likewise for vroom's own code (@jeroen, #480).
- `vroom(col_select=)` now handles column selection by numeric position when an `id` column is provided (#455).
- `vroom(id = "path", col_select = a:c)` is treated like `vroom(id = "path", col_select = c(path, a:c))`. If an `id` column is provided, it is automatically included in the output (#416).
- `vroom_write(append = TRUE)` does not modify an existing file when appending an empty data frame. In particular, it does not overwrite (delete) the existing contents of that file (tidyverse/readr#1408, #451).
- `vroom::problems()` now defaults to `.Last.value` for its primary input, similar to how `readr::problems()` works (#443).
- The warning that indicates the existence of parsing problems has been improved, which should make it easier for the user to follow up (tidyverse/readr#1322).
- `vroom()` reads more reliably from filepaths containing non-ascii characters, in a non-UTF-8 locale (#394, #438).
- `vroom_format()` and `vroom_write()` only quote values that contain a delimiter, quote, or newline. Specifically, values that are equal to the `na` string (or that start with it) are no longer quoted (#426).
- Fixed segfault when reading in multiple files and the first file has only a header row of column names, but subsequent files have at least one row (#430).
- Fixed segfault when `vroom_format()` is given an empty data frame (#425)
- Fixed a segfault that could occur when the final field of the final line is missing and the file also does not end in a newline (#429).
- Fixed recursive garbage collection error that could occur during `vroom_write()` when `output_column()` generates an ALTREP vector (#389).
- `vroom_progress()` uses `rlang::is_interactive()` instead of `base::interactive()`.
- `col_factor(levels = NULL)` honors the `na` strings of `vroom()` and its own `include_na` argument, as described in the docs, and now reproduces the behaviour of readr's first edition parser (#396).
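
The `vroom_write(append = TRUE)` note above can be checked with a small round trip. This is a minimal sketch, assuming a vroom version that includes that fix; the data frame `df` and the temporary path `tmp` are made up for illustration.

```r
library(vroom)

# Hypothetical example data and path, used only for illustration.
df <- data.frame(a = 1:2, b = c("x", "y"))
tmp <- tempfile(fileext = ".csv")

# Write the initial file, then record its contents.
vroom_write(df, tmp, delim = ",")
before <- readLines(tmp)

# Appending a zero-row data frame should leave the existing file untouched.
vroom_write(df[0, ], tmp, delim = ",", append = TRUE)
after <- readLines(tmp)

# Compare the file contents before and after the empty append.
identical(before, after)
```

Comparing the file contents before and after the call is a simple way to confirm that nothing was overwritten.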
- Jenny Bryan is now the official maintainer.
- Fix uninitialized bool detected by CRAN's UBSAN check (tidyverse#386)
- Fix buffer overflow when trying to parse an integer field that is over 64 characters long (tidyverse/readr#1326)
- Fix subset indexing when indexes span a file boundary multiple times (#383)
- `vroom(col_select=)` now works if `col_names = FALSE` as intended (#381)
- `vroom(n_max=)` now correctly handles cases when reading from a connection and the file does not end with a newline (tidyverse/readr#1321)
- `vroom()` no longer issues a spurious warning when the parsing needs to be restarted due to the presence of embedded newlines (tidyverse/readr#1313)
- Fix performance issue when materializing subsetted vectors (#378)
- `vroom_format()` now uses the same internal multi-threaded code as `vroom_write()`, improving its performance in most cases (#377)
- `vroom_fwf()` no longer omits the last line if it does not end with a newline (tidyverse/readr#1293)
- Empty files or files with only a header line and no data no longer cause a crash if read with multiple files (tidyverse/readr#1297)
- Files with a header but no contents, or an empty file if `col_names = FALSE`, no longer cause a hang when `progress = TRUE` (tidyverse/readr#1297)
- Commented lines with comments at the end of lines no longer hang R (tidyverse/readr#1309)
- Comment lines containing unpaired quotes are no longer treated as unterminated quotations (tidyverse/readr#1307)
- Values with only an `Inf` or `NaN` prefix but additional data afterwards, like `Inform`, are no longer inappropriately guessed as doubles (tidyverse/readr#1319)
- Time types now support `%h` format to denote hour durations greater than 24, like readr (tidyverse/readr#1312)
- Fix performance issue when materializing subsetted vectors (#378)
- `vroom()` now supports files with only carriage return newlines (`\r`). (#360, tidyverse/readr#1236)
- `vroom()` now parses single digit datetimes more consistently as readr has done (tidyverse/readr#1276)
- `vroom()` now parses `Inf` values as doubles (tidyverse/readr#1283)
- `vroom()` now parses `NaN` values as doubles (tidyverse/readr#1277)
- `VROOM_CONNECTION_SIZE` is now parsed as a double, which supports scientific notation (#364)
- `vroom()` now works around specifying a `\n` as the delimiter (#365, tidyverse/dplyr#5977)
- `vroom()` no longer crashes if given a `col_name` and `col_type` both less than the number of columns (tidyverse/readr#1271)
- `vroom()` no longer hangs if given an empty value for `locale(grouping_mark=)` (tidyverse/readr#1241)
- Fix performance regression when guessing with large numbers of rows (tidyverse/readr#1267)
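
The `Inf` and `NaN` entries above can be illustrated with a tiny literal input. This is a sketch only, assuming a vroom version that accepts literal data wrapped in `I()`; the column name `x` is made up for the example.

```r
library(vroom)

# Literal CSV data with a single hypothetical column `x`.
x <- vroom(I("x\nInf\nNaN\n1.5\n"), delim = ",", show_col_types = FALSE)

# The column should be guessed as a double rather than a character vector.
is.double(x$x)

# `Inf` and `NaN` should come through as the corresponding special values.
is.infinite(x$x[1])
is.nan(x$x[2])
```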
- `vroom(col_types=)` now accepts column type names like those accepted by `utils::read.table`, e.g. `vroom::vroom(col_types = list(a = "integer", b = "double", c = "skip"))`
- `vroom()` now respects the `quote` parameter properly in the first two lines of the file (tidyverse/readr#1262)
- `vroom_write()` now always correctly writes its output including column names in UTF-8 (tidyverse/readr#1242)
- `vroom_write()` now creates an empty file when given an input without any columns (tidyverse/readr#1234)
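
As a rough illustration of the `read.table`-style type names mentioned above, the sketch below reads a small literal input. The data and column names are invented for the example, and it assumes a vroom version that accepts literal data via `I()`.

```r
library(vroom)

# Three hypothetical columns; `c` is dropped via the "skip" type name.
x <- vroom(
  I("a,b,c\n1,2.5,x\n3,4.5,y\n"),
  delim = ",",
  col_types = list(a = "integer", b = "double", c = "skip")
)

# Inspect which columns were kept and how they were typed.
names(x)
is.integer(x$a)
is.double(x$b)
```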
- `vroom(col_types=)` now truncates the column types if the user passes too many types. (#355)
- `vroom()` now always includes the last row when guessing (#352)
- `vroom(trim_ws = TRUE)` now trims field content within quotes as well as without (#354). Previously vroom explicitly retained field content inside quotes regardless of the value of `trim_ws`.
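
The `trim_ws` change above can be sketched with a quoted field that carries padding. The input is invented for illustration, and the example assumes a vroom version that includes this change and accepts literal data via `I()`.

```r
library(vroom)

# The first field is quoted and padded with spaces inside the quotes.
x <- vroom(
  I('a,b\n" x ",y\n'),
  delim = ",",
  col_types = "cc",
  trim_ws = TRUE
)

# With trim_ws = TRUE the padding inside the quotes should be removed.
x$a
```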
- `vroom()` now supports inputs with unnamed column types that are less than the number of columns (#296)
- `vroom()` now outputs the correct column names even in the presence of skipped columns (#293, tidyverse/readr#1215)
- `vroom_fwf(n_max=)` now works as intended when the input is a connection.
- `vroom()` and `vroom_write()` now automatically detect the compression format regardless of the file extension for bzip2, xzip, gzip and zip files (#348)
- `vroom()` and `vroom_write()` now automatically support many more archive formats thanks to the archive package. These include new support for writing zip files, reading and writing 7zip, tar and ISO files.
- `vroom(num_threads = 1)` will now not spawn any threads. This can be used as a workaround on systems without full thread support.
- Threads are now automatically disabled on non-macOS systems compiling against clang's libc++. Most non-macOS systems use the more common gcc libstdc++, so this should not affect most users.
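
The compression detection described above might be exercised as follows. This is a sketch under assumptions: the data frame, the `.dat` extension, and the temporary paths are all invented for illustration, and it relies on a gzip file being recognised from its contents rather than its name.

```r
library(vroom)

# Hypothetical example data.
df <- data.frame(id = 1:3, value = c("a", "b", "c"))

# Write a gzip-compressed file using a conventional extension.
gz <- tempfile(fileext = ".csv.gz")
vroom_write(df, gz, delim = ",")

# Copy it to a name whose extension says nothing about compression.
renamed <- tempfile(fileext = ".dat")
file.copy(gz, renamed)

# Reading should still work if the format is detected from the file contents.
# num_threads = 1 avoids spawning any threads, as noted above.
x <- vroom(renamed, delim = ",", num_threads = 1, show_col_types = FALSE)
x
```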
- Parsers now treat NA values as NA even if they are valid values for the types (#342)
- Element-wise indexing into lazy (ALTREP) vectors now has much less overhead (#344)
- New `vroom(show_col_types=)` argument to more simply control when column types are shown.
- `vroom()`, `vroom_fwf()` and `vroom_lines()` now support multi-byte encodings such as UTF-16 and UTF-32 by converting these files to UTF-8 under the hood (#138)
- `vroom()` now supports skipping comments and blank lines within data, not just at the start of the file (#294, #302)
- `vroom()` now uses the tzdb package when parsing date-times (@DavisVaughan, #273)
- `vroom()` now emits a warning of class `vroom_parse_issue` if there are non-fatal parsing issues.
- `vroom()` now emits a warning of class `vroom_mismatched_column_name` if the user supplies a column type that does not match the name of a read column (#317).
- The vroom package now uses the MIT license, as part of systematic relicensing throughout the r-lib and tidyverse packages (#323)
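
To make the parse-issue entries above more concrete, here is a small sketch that forces a non-fatal parsing problem and then inspects it with `problems()`. The input is invented for illustration; the warning is suppressed here only to keep the output quiet.

```r
library(vroom)

# Column `a` is declared as integer, but its second value is not a number.
x <- suppressWarnings(
  vroom(I("a,b\n1,2\nx,4\n"), delim = ",", col_types = "ii")
)

# The unparseable value becomes NA.
x$a

# problems() returns a tibble describing each parsing issue that was found.
p <- problems(x)
p
```

In interactive use you would normally leave the warning visible, since it is the prompt to call `problems()`.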
- `vroom()` correctly reads double values with comma as decimal separator (@kent37, #313)
- `vroom()` now correctly skips lines with only one quote if the format doesn't use quoting (tidyverse/readr#991)
- `vroom()` and `vroom_lines()` now handle files with mixed windows and POSIX line endings (tidyverse/readr#1210)
- `vroom()` now outputs a tibble with the expected number of columns and types based on `col_types` and `col_names` even if the file is empty (#297).
- `vroom()` no longer mis-indexes files read from connections with windows line endings when the two line endings fall on separate sides of the read buffer (#331)
- `vroom()` no longer crashes if `n_max = 0` and `col_names` is a character (#316)
- `vroom()` now preserves the spec attribute when vroom and readr are both loaded (#303)
- `vroom()` now allows specifying column names in `col_types` that have been repaired (#311)
- `vroom()` no longer inadvertently calls `.name_repair` functions twice (#310).
- `vroom()` is now more robust to quoting issues when tracking the CSV state (#301)
- `vroom()` now registers the S3 class with `methods::setOldClass()` (r-dbi/DBI#345)
- `col_datetime()` now supports '%s' format, which represents decimal seconds since the Unix epoch.
- `col_numeric()` now supports `grouping_mark` and `decimal_mark` that are unicode characters, such as U+00A0 which is commonly used as the grouping mark for numbers in France (tidyverse/readr#796).
- `vroom_fwf()` gains a `skip_empty_rows` argument to skip empty lines (tidyverse/readr#1211)
- `vroom_fwf()` now respects `n_max`, as intended (#334)
- `vroom_lines()` gains a `na` argument.
- `vroom_write_lines()` no longer escapes or quotes lines.
- `vroom_write_lines()` now works as intended (#291).
- `vroom_write(path=)` has been deprecated, in favor of `file`, to match readr.
- `vroom_write_lines()` now exposes the `num_threads` argument.
- `problems()` now prints the correct row number of parse errors (#326)
- `problems()` now throws a more informative error if called on a readr object (#308).
- `problems()` now de-duplicates identical problems (#318)
- Fix an inadvertent performance regression when reading values (#309)
- `n_max` argument is correctly respected in edge cases (#306)
- factors with implicit levels now work when fields are quoted, as intended (#330)
- Guessing double types no longer unconditionally ignores leading whitespace. Now whitespace is only ignored when `trim_ws` is set.
- vroom now tracks indexing and parsing errors like readr. The first time an issue is encountered a warning will be signaled. A tibble of all found problems can be retrieved with `vroom::problems()`. (#247)
- Data with newlines within quoted fields will now automatically revert to using a single thread and be properly read (#282)
- NUL values in character data are now permitted, with a warning.
- New `vroom_write_lines()` function to write a character vector to a file (#291)
- `vroom_write()` gains an `eol=` parameter to specify the end of line character(s) to use. Use `vroom_write(eol = "\r\n")` to write a file with Windows style newlines (#263).
- Datetime formats used when guessing now match those used when parsing (#240)
- Quotes are now only valid next to newlines or delimiters (#224)
- `vroom()` now signals an R error for invalid date and datetime formats, instead of crashing the session (#220).
- `vroom(comment = )` now accepts multi-character comments (#286)
- `vroom_lines()` now works with empty files (#285)
- Vectors are now subset properly when given invalid subscripts (#283)
- `vroom_write()` now works when the delimiter is empty, e.g. `delim = ""` (#287).
- `vroom_write()` now works with all ALTREP vectors, including string vectors (#270)
- An internal call to `new.env()` now correctly uses the `parent` argument (#281)
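
The `eol=` parameter of `vroom_write()` mentioned above can be checked by reading the raw bytes back. This is a minimal sketch; the data frame and temporary path are invented for illustration.

```r
library(vroom)

# Hypothetical path and data, used only for illustration.
tmp <- tempfile(fileext = ".csv")
vroom_write(data.frame(x = 1:2), tmp, delim = ",", eol = "\r\n")

# Read the file back as raw bytes so the line endings are not normalised.
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
contents <- rawToChar(bytes)

# Every line, including the header, should end in a carriage return + newline.
contents
```

Reading raw bytes rather than using `readLines()` matters here, because `readLines()` strips line endings and would hide the difference.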
- Test failures on R 4.1 related to factors with NA values fixed (#262)
- `vroom()` now works without error with readr versions of col specs (#256, #264, #266)
- Test failures on R 4.1 related to POSIXct classes fixed (#260)
- Column subsetting with double indexes now works again (#257)
- `vroom(n_max=)` now only partially downloads files from connections, as intended (#259)
- The Rcpp dependency has been removed in favor of cpp11.
- `vroom()` now handles cases when `id` is set and a column is skipped (#237)
- `vroom()` now supports column selections when there are some empty column names (#238)
- `vroom()` argument `n_max` now works properly for files with windows newlines and no final newline (#244)
- Subsetting vectors now works with `View()` in RStudio if there are no rows to subset (#253).
- Subsetting datetime columns now works with `NA` indices (#236).
- `vroom()` now writes the column names if given an input with no rows (#213)
- `vroom()` columns now support indexing with NA values (#201)
- `vroom()` no longer truncates the last value in a file if the file contains windows newlines but no final newline (#219).
- `vroom()` now works when the `na` argument is encoded in non ASCII or UTF-8 locales and the file encoding is not the same as the native encoding (#233).
- `vroom_fwf()` now verifies that the positions are valid, namely that the begin value is always less than the previous end (#217).
- `vroom_lines()` gains a `locale` argument so you can control the encoding of the file (#218)
- `vroom_write()` now supports the `append` argument with R connections (#232)
- `vroom_altrep_opts()` and the argument `vroom(altrep_opts =)` have been renamed to `vroom_altrep()` and `altrep` respectively. The prior names have been deprecated.
- `vroom()` now supports reading Big Integer values with the `bit64` package. Use `col_big_integer()` or the "I" shortcut to read a column as big integers. (#198)
- `cols()` gains a `.delim` argument and `vroom()` now uses it as the delimiter if it is provided (#192)
- `vroom()` now supports reading from `stdin()` directly, interpreted as the C-level standard input (#106).
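
As an illustration of the big integer support above, the sketch below reads a value that is too large to be represented exactly as a double. The column name `id` and the value are invented for the example, and it assumes a vroom version that accepts literal data via `I()`.

```r
library(vroom)

# 2^53 + 1 cannot be stored exactly in a double, so read it as a big integer.
x <- vroom(I("id\n9007199254740993\n"), delim = ",", col_types = "I")

# The column is backed by bit64's integer64 class.
class(x$id)

# Converting to character shows that the exact digits were preserved.
as.character(x$id)
```

The same column could be declared with `col_types = list(id = col_big_integer())` instead of the "I" shortcut.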
- `col_date` now parses single digit month and day (@edzer, #123, #170)
- `fwf_empty()` now uses the `skip` parameter, as intended.
- `vroom()` can now read single line files without a terminal newline (#173).
- `vroom()` can now select the id column if provided (#110).
- `vroom()` now correctly copies string data for factor levels (#184)
- `vroom()` no longer crashes when files have trailing fields, windows newlines and the file is not newline or null terminated.
- `vroom()` now includes a spec object with the `col_types` class, as intended.
- `vroom()` now better handles floating point values with very large exponents (#164).
- `vroom()` now uses better heuristics to guess the delimiter and now throws an error if a delimiter cannot be guessed (#126, #141, #167).
- `vroom()` now has an improved error message when a file does not exist (#169).
- `vroom()` no longer leaks file handles (#177, #180)
- `vroom()` now outputs its messages on `stdout()` rather than `stderr()`, which avoids the text being red in RStudio and in the Windows GUI.
- `vroom()` no longer overflows when reading files with more than 2B entries (@wlattner, #183).
- `vroom_fwf()` is now more robust if not all lines are the expected length (#78)
- `vroom_fwf()` and `fwf_empty()` now support passing `Inf` to `guess_max()`.
- `vroom_str()` now works with S4 objects.
- `vroom_fwf()` now handles files with dos newlines properly.
- `vroom_write()` now does not try to write anything when given empty inputs (#172).
- Dates, times, and datetimes now properly consider the locale when parsing.
- Added benchmarks with wide data for both numeric and character data (#87, @R3myG)
- The delimiter used for parsing is now shown in the message output (#95 @R3myG)
- The column created by `id` is now stored as a run length encoded Altrep vector, which uses less memory and is much faster for large inputs. (#111)
- `vroom_lines()` now properly respects the `n_max` parameter (#142)
- `vroom()` and `vroom_lines()` now support reading files which do not end in newlines by using a file connection (#40).
- `vroom_write()` now works with the standard output connection `stdout()` (#106).
- `vroom_write()` no longer crashes non-deterministically when used on Altrep vectors.
- The integer parser now returns NA values for invalid inputs (#135)
- Fix additional UBSAN issue in the mio project reported by CRAN (#97)
- Fix indexing into connections with quoted fields (#119)
- Move example files for `vroom()` out of `\dontshow{}`.
- Fix integer overflow with very large files (#116, #119)
- Fix missing columns and windows newlines (#114)
- Fix encoding of column names (#113, #115)
- Throw an error message when writing a zip file, which is not supported (@metaOO, #145)
- Default message output from `vroom()` now uses `Rows` and `Cols` (@meta00, #140)
- `vroom_lines()` function added, to (lazily) read lines from a file into a character vector (#90).
- Fix for a hang on Windows caused by a race condition in the progress bar (#98)
- Remove accidental runtime dependency on testthat (#104)
- Fix to actually return non-Altrep character columns on R 3.2, 3.3 and 3.4.
- Disable colors in the progress bar when running in RStudio, to work around an issue where the progress bar would be garbled (rstudio/rstudio#4777)
- Fix for UBSAN issues reported by CRAN (#97)
- Fix for rchk issues reported by CRAN (#94)
- The progress bar now only updates every 10 milliseconds.
- Getting started vignette index entry is now more informative (#92)
- Initial release
- Added a `NEWS.md` file to track changes to the package.