Dear members of the Ma’ayan Lab,
First of all, thank you very much for providing the archs4r package and the ARCHS4 resource. It has been a game changer for my project, which involves collecting and analyzing a large number of transcriptomic datasets. Having access to preprocessed data has saved me a lot of time.
However, I have encountered a bug when using the archs4r R package.
Specifically, the functions a4.meta.series() and a4.data.series() return no samples when called on GEO series that are either:
- Part of a SuperSeries (e.g., GSE108254, whose GEO page states "This SubSeries is part of SuperSeries"), or
- SuperSeries themselves (e.g., GSE73508).
In contrast, for standard series not associated with SuperSeries, these functions work as expected.
Example code:
> a4.data.series(h5file, 'GSE108254')
NULL
> a4.meta.series(h5file, 'GSE108254', meta_fields = c('characteristics_ch1', 'title'))
Extracting field 'characteristics_ch1' for 0 samples
Extracting field 'title' for 0 samples
I am using the file mouse_gene_v2.5.h5.
Importantly, these GEO series are available on the ARCHS4 data portal, where both their metadata and counts can be downloaded (please see the attached screenshot, which shows some of the GEO IDs that are present on the portal but inaccessible via the R package functions).
Could you please advise whether this is a known limitation, or possibly a bug in the way SuperSeries are indexed in the .h5 file?
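One possible explanation, which I have not confirmed, is that samples belonging to both a SubSeries and a SuperSeries carry several accessions in a single series annotation string, so an exact string comparison against one GSE ID finds nothing. The sketch below illustrates this with a toy vector in base R. The helper match_series(), the delimiters, the sample values, and the placeholder accession GSE000000 are all made up for illustration and are not taken from the archs4r source or the .h5 file.

```r
# Hypothetical helper (not part of archs4r): return the indices of samples
# whose series annotation lists `gse`, even if several accessions are packed
# into one string. The delimiter set is an assumption.
match_series <- function(series_field, gse) {
  parts <- strsplit(series_field, "[,;[:space:]]+")
  which(vapply(parts, function(p) gse %in% p, logical(1)))
}

# Toy stand-in for the per-sample series annotation (made-up values;
# GSE000000 is a placeholder for a SuperSeries accession).
series_field <- c("GSE108254,GSE000000", "GSE11111",
                  "GSE108254,GSE000000", "GSE22222")

# Exact comparison misses samples annotated with more than one series.
exact_hits <- which(series_field == "GSE108254")

# Splitting each annotation into individual accessions recovers them.
split_hits <- match_series(series_field, "GSE108254")

print(length(exact_hits))
print(split_hits)

# To check the real file, the annotation could be read with rhdf5, e.g.
#   series_field <- rhdf5::h5read(h5file, "meta/samples/series_id")
# The dataset path here is an assumption about the file layout.
```

If the real annotation looks like the toy vector, the exact comparison would select 0 samples, which would match the output shown above.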
Thank you again for your time and work. It's a fantastic resource!
