[R][Parquet] Error: Invalid: Invalid number of indices: 0 with read_parquet #48066

@anashen

Describe the bug, including details regarding any error messages, version, and platform.

Hi,

It looks like the team may already be aware of this (#47981), but I wanted to create a separate issue for the R package.

Our R package Seurat depends on read_parquet. With arrow v22.0.0, read_parquet fails with Error: Invalid: Invalid number of indices: 0 on some files but not others. The same files read without issue under arrow v21.0.0.1. An example of a file that triggers the error is included below.

Any fix or workaround would be much appreciated. Note that we do not generate the data in question, so we have no control over the file format.

# Example data is publicly available at
# https://www.10xgenomics.com/support/software/xenium-onboard-analysis/latest/resources/xenium-example-data#test-data-v4-0
# Download it from a shell:
#   curl -O https://cf.10xgenomics.com/samples/xenium/4.0.0/Xenium_V1_Protein_Human_Kidney_tiny/Xenium_V1_Protein_Human_Kidney_tiny_outs.zip
# `transcripts.parquet` is located inside the unzipped folder
library(arrow)
read_parquet("/fakepath/Xenium_V1_Protein_Human_Kidney_tiny_outs/transcripts.parquet")
# Error: Invalid: Invalid number of indices: 0
traceback()
# 6: stop(e)
# 5: value[[3L]](cond)
# 4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
# 3: tryCatchList(expr, classes, parentenv, handlers)
# 2: tryCatch(reader$ReadTable(), error = read_compressed_error)
# 1: read_parquet("/fakepath/Xenium_V1_Protein_Human_Kidney_tiny_outs/transcripts.parquet")
sessionInfo
R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_22.0.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1 bit_4.6.0        compiler_4.5.1   magrittr_2.0.3  
 [5] assertthat_0.2.1 R6_2.6.1         cli_3.6.5        glue_1.8.0      
 [9] bit64_4.6.0-1    vctrs_0.6.5      lifecycle_1.0.4  rlang_1.1.6     
[13] purrr_1.1.0  

Thanks!

Component(s)

R
