Name repair for duplicated columns inconsistent between read_csv and spec_csv #1387

@blaidd4drwg

When creating column specifications for a file that contains duplicated variable names, spec_csv() renames the duplicates differently (e.g., "error_1") than read_csv() does (e.g., "error...4" for the same column).

Let test.csv be a simple CSV in which the variable name "error" appears three times, with one observation:

printf 'a,error,b,error,c,error\n1,string1,2,string2,3,string3\n' > test.csv
readr::spec_csv("test.csv")
cols(
  a = col_double(),
  error = col_character(),
  b = col_double(),
  error_1 = col_character(),
  c = col_double(),
  error_2 = col_character()
)

Warning message:
Duplicated column names deduplicated: 'error' => 'error_1' [4], 'error' => 'error_2' [6]

readr::read_csv("test.csv")
New names:
* error -> error...2
* error -> error...4
* error -> error...6
Rows: 1 Columns: 6
── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (3): error...2, error...4, error...6
dbl (3): a, b, c

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 1 × 6
      a error...2     b error...4     c error...6
  <dbl> <chr>     <dbl> <chr>     <dbl> <chr>    
1     1 string1       2 string2       3 string3
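
For reference, the two naming schemes seen above can be reproduced in base R. This is only a sketch that mimics the names printed in the two outputs; it is not readr's actual implementation, and `nms`, `suffix_style` and `position_style` are names made up for illustration.

```r
# Header of test.csv, as in the example above
nms <- c("a", "error", "b", "error", "c", "error")

# Scheme matching the spec_csv() output: the first occurrence keeps its
# name, later duplicates get a running suffix (_1, _2, ...)
suffix_style <- make.unique(nms, sep = "_")

# Scheme matching the read_csv() output: every occurrence of a duplicated
# name gets "..." followed by its column position
is_dup <- nms %in% nms[duplicated(nms)]
position_style <- ifelse(is_dup, paste0(nms, "...", seq_along(nms)), nms)

print(suffix_style)
print(position_style)
```

Because the two schemes disagree on every duplicated column, a specification taken from spec_csv() presumably cannot be matched by name against the columns that read_csv() produces for the same file.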
