archs4r: a4.meta.series() and a4.data.series() bug for GEO SuperSeries and SubSeries #31

@Barmozavr

Dear members of the Ma’ayan Lab,

First of all, thank you very much for providing the archs4r package and the ARCHS4 resource. It has been a game changer for my project, which involves collecting and analyzing a large number of transcriptomic datasets. Having access to preprocessed data has saved me a lot of time.

However, I have encountered a bug when using the archs4r R package.

Specifically, the functions a4.meta.series() and a4.data.series() return no samples when called on GEO series that are either:

  1. Part of a SuperSeries (e.g., GSE108254, whose GEO page states "This SubSeries is part of SuperSeries"), or
  2. SuperSeries themselves (e.g., GSE73508).

In contrast, for standard series not associated with SuperSeries, these functions work as expected.

Example code:

> a4.data.series(h5file, 'GSE108254')
NULL
> a4.meta.series(h5file, 'GSE108254', meta_fields = c('characteristics_ch1', 'title'))
Extracting field 'characteristics_ch1' for 0 samples
Extracting field 'title' for 0 samples

I am using the file mouse_gene_v2.5.h5.

Importantly, these GEO series are listed on the ARCHS4 data portal, where both their metadata and counts can be downloaded. The attached screenshot shows some of the GEO IDs that are present on the portal but inaccessible via the R package functions.

[Image: screenshot of GEO IDs listed on the ARCHS4 data portal]

Could you please advise whether this is a known limitation, or possibly a bug in the way SuperSeries are indexed in the .h5 file?
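
In case it helps with diagnosis, here is a minimal sketch of one hypothesis I have not verified: if the .h5 file stores several comma-separated accessions in a single series field for samples that belong to both a SubSeries and its SuperSeries, an exact string comparison would find 0 samples, while a per-token comparison would find them. The helper match_series() and the toy values are hypothetical and are not part of archs4r. The field path mentioned in the comment is my assumption about the file layout.

```r
# Hypothetical helper (not part of archs4r): return the indices of samples
# whose series field mentions `gse`, allowing for several comma-separated
# accessions in one entry.
match_series <- function(series_ids, gse) {
  parts <- strsplit(series_ids, ",", fixed = TRUE)
  which(vapply(parts, function(p) gse %in% trimws(p), logical(1)))
}

# Toy values standing in for the series field of the .h5 file. With the real
# file, the vector might be read with something like
#   rhdf5::h5read(h5file, "meta/samples/series_id")
# where the dataset path is an assumption about the file layout.
series_ids <- c("GSE000001", "GSE108254,GSE000002", "GSE000002")

exact_hits <- which(series_ids == "GSE108254")        # exact match: integer(0)
token_hits <- match_series(series_ids, "GSE108254")   # per-token match: 2

print(exact_hits)
print(token_hits)
```

If the real field looks like the toy vector, that would explain the "0 samples" messages above. If it does not, the cause is presumably elsewhere.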

Thank you again for your time and work — it's a fantastic resource!
