Commit 622634b

Merge branch 'main' into main
2 parents d5126a6 + 6db9d1c commit 622634b

321 files changed, +6545 -1586 lines changed

datasets/1kg-ont-vienna.yaml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+Name: 1KG-ONT-VIENNA panel
+Description: The 1KG-ONT-VIENNA panel comprises medium coverage ONT sequencing data for 1.019 samples from the 1000 Genomes Project collection, structural variants, and their haplotype context.
+Documentation: https://github.com/1kg-ont-vienna/sv-analysis
+
+ManagedBy: Institute of Molecular Pathology
+UpdateFrequency: Irregular
+Tags:
+  - genetic
+  - genomic
+  - life sciences
+  - whole genome sequencing
+  - fastq
+  - fast5
+License: There are no restrictions on the use of this data. Use of the data should be cited in the usual way, with current details available at https://github.com/1kg-ont-vienna/sv-analysis/
+Resources:
+  - Description: Primary and derived data
+    ARN: arn:aws:s3:::1kg-ont-vienna
+    Region: eu-west-1
+    Type: S3 Bucket
+DataAtWork:
+  Tutorials:
+  Tools & Applications:
+  Publications:
+    - Title: Long-read sequencing and structural variant characterization in 1,019 samples from the 1000 Genomes Project
+      URL: https://doi.org/10.1101/2024.04.18.590093
+      AuthorName: Siegfried Schloissnig, Samarendra Pani, Bernardo Rodriguez-Martin, Jana Ebler, Carsten Hain, Vasiliki Tsapalou, Arda Söylev, Patrick Hüther, Hufsah Ashraf, Timofey Prodanov, Mila Asparuhova, Sarah Hunt, Tobias Rausch, Tobias Marschall, Jan O Korbel
+
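
The Resources entries in these files identify data by S3 ARN, sometimes with a key prefix (for example `arn:aws:s3:::africa-field-boundary-labels/extra` further down in this commit). As a minimal sketch, not part of the commit, the helper below splits such an ARN into bucket and prefix and builds an `s3://` URI. The names `S3Location`, `parse_s3_arn`, and `to_s3_uri` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple


class S3Location(NamedTuple):
    """Bucket name and optional key prefix taken from an S3 ARN."""
    bucket: str
    prefix: str


def parse_s3_arn(arn: str) -> S3Location:
    """Split an ARN such as 'arn:aws:s3:::bucket/prefix' into its parts."""
    marker = "arn:aws:s3:::"
    if not arn.startswith(marker):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    # Everything before the first '/' is the bucket; the rest is a key prefix.
    bucket, _, prefix = arn[len(marker):].partition("/")
    if not bucket:
        raise ValueError(f"S3 ARN has no bucket name: {arn!r}")
    return S3Location(bucket, prefix)


def to_s3_uri(location: S3Location) -> str:
    """Build an s3:// URI usable with S3 clients and command-line tools."""
    return f"s3://{location.bucket}/{location.prefix}"


if __name__ == "__main__":
    print(to_s3_uri(parse_s3_arn("arn:aws:s3:::1kg-ont-vienna")))
```

If a bucket permits anonymous reads (an assumption; the diff itself does not say so), the resulting URI could be listed with `aws s3 ls --no-sign-request <uri>`.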

datasets/3kricegenome.yaml

Lines changed: 4 additions & 0 deletions
@@ -4,6 +4,10 @@ Documentation: https://github.com/awslabs/open-data-docs/tree/main/docs/3kricege
 Contact: http://iric.irri.org/contact-us
 ManagedBy: '[International Rice Research Institute](https://www.irri.org/)'
 UpdateFrequency: Not updated
+Collabs:
+  ASDI:
+    Tags:
+      - agriculture
 Tags:
   - agriculture
   - food security

datasets/africa-field-boundary-labels.yaml

Lines changed: 27 additions & 7 deletions
@@ -1,15 +1,22 @@
 Name: A region-wide, multi-year set of crop field boundary labels for Africa
 Description: >
   Crop field boundaries digitized in Planet imagery collected across Africa
-  between 2017 and 2023, developed by [Farmerline](https://farmerline.co/),
-  [Spatial Collective](https://spatialcollective.com/), and the
-  [Agricultural Impacts Research Group](https://agroimpacts.info/) at
-  [Clark University](https://www.clarku.edu/), with support from the
-  [Lacuna Fund](https://lacunafund.org/)
-Documentation: "https://github.com/agroimpacts/lacunalabels/"
+  between 2017 and 2023, developed by [Farmerline](https://farmerline.co/), [Spatial Collective](https://spatialcollective.com/),
+  and the [Agricultural Impacts Research Group](https://agroimpacts.info/) at [Clark University](https://www.clarku.edu/), with support from the
+  [Lacuna Fund](https://lacunafund.org/) ([Estes et al, 2024](https://arxiv.org/abs/2412.18483); [Wussah et al. (2023)](https://zenodo.org/records/11060871)). This dataset has been
+  further supplemented by additional labels collected primarily for
+  for 2018 over a subset of countries, which provide an example of their
+  application in training and validating a CNN-based cropland mapping model
+  [(Khallaghi et al. 2025)](https://www.mdpi.com/2072-4292/17/3/474).
+Documentation: Information on the primary dataset can be found [here](https://github.com/agroimpacts/lacunalabels/).
+  Documentation for added labels is available [here](https://github.com/agroimpacts/cnn-generalization-enhancement).
 
 ManagedBy: "[The Agricultural Impacts Research Group](https://agroimpacts.info/)"
 UpdateFrequency: "Updated versions of the dataset are added as they are developed"
+Collabs:
+  ASDI:
+    Tags:
+      - agriculture
 Tags:
   - agriculture
   - machine learning
@@ -19,10 +26,14 @@ Tags:
   - labeled
 License: "[Planet NICFI participant license agreement](https://assets.planet.com/docs/Planet_ParticipantLicenseAgreement_NICFI.pdf)"
 Resources:
-  - Description: Field boundary labels and corresponding Planet images
+  - Description: Field boundaries and corresponding Planet images
     ARN: arn:aws:s3:::africa-field-boundary-labels
     Region: us-west-2
     Type: S3 Bucket
+  - Description: '[Additional rasterized field labels and corresponding Planet images](https://www.mdpi.com/2072-4292/17/3/474)'
+    ARN: arn:aws:s3:::africa-field-boundary-labels/extra
+    Region: us-west-2
+    Type: S3 Bucket
 DataAtWork:
   Tutorials:
     - Title: Instructions on data access and label-making demonstration notebook
@@ -31,6 +42,12 @@ DataAtWork:
       AuthorName: Lyndon Estes
       AuthorURL: https://github.com/ldemaz
   Publications:
+    - Title: Generalization enhancement strategies to enable cross-year cropland mapping with convolutional neural networks trained using historical samples
+      URL: https://www.mdpi.com/2072-4292/17/3/474
+      AuthorName: Khallaghi et al. (2025)
+    - Title: A region-wide, multi-year set of crop field boundary labels for Africa
+      URL: https://arxiv.org/abs/2412.18483
+      AuthorName: Estes et al. (2024)
     - Title: A region-wide, multi-year set of crop field boundary labels for Africa
       URL: https://zenodo.org/records/11060871
       AuthorName: Wussah et al. (2023)
@@ -43,3 +60,6 @@ DataAtWork:
     - Title: A platform for crowdsourcing the creation of representative, accurate landcover maps
       URL: http://www.sciencedirect.com/science/article/pii/S136481521630010X
       AuthorName: Estes et al. (2016)
+Citation: >
+  Primary dataset: Estes et al. (2024). A region-wide, multi-year set of crop field boundary labels for Africa. arXiv:2412.18483.
+  Additional labels: Khallaghi et al. (2025). Generalization enhancement strategies to enable cross-year cropland mapping with convolutional neural networks trained using historical samples. Remote Sensing, 17(3), 474.

datasets/ag-loam.yaml

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,10 @@ Documentation: https://github.com/UCR-Robotics/AG-LOAM
 Contact: Hanzhe Teng ([email protected]), Konstantinos Karydis ([email protected])
 ManagedBy: "[Autonomous Robots and Control Systems Lab](https://sites.google.com/view/arcs-lab)"
 UpdateFrequency: NA
+Collabs:
+  ASDI:
+    Tags:
+      - agriculture
 Tags:
   - aws-pds
   - robotics

datasets/allen-sea-ad-atlas.yaml

Lines changed: 12 additions & 0 deletions
@@ -30,18 +30,30 @@ Resources:
     Type: S3 Bucket
     Explore:
       - '[Browse Bucket](https://sea-ad-single-cell-profiling.s3.amazonaws.com/index.html)'
+  - Description: "Update notifications for s3://sea-ad-single-cell-profiling. Users can subscribe to this SNS topic with [AWS Lambda](https://aws.amazon.com/lambda/) or [AWS Simple Queue Service](https://aws.amazon.com/sqs/)."
+    ARN: arn:aws:sns:us-west-2:208217671510:sea-ad-single-cell-profiling-object_created
+    Region: us-west-2
+    Type: SNS Topic
   - Description: Quantitative neuropathology (full resolution images, processed images, and quantifications) in a public bucket
     ARN: arn:aws:s3:::sea-ad-quantitative-neuropathology
     Region: us-west-2
     Type: S3 Bucket
     Explore:
       - '[Browse Bucket](https://sea-ad-quantitative-neuropathology.s3.amazonaws.com/index.html)'
+  - Description: "Update notifications for s3://sea-ad-quantitative-neuropathology. Users can subscribe to this SNS topic with [AWS Lambda](https://aws.amazon.com/lambda/) or [AWS Simple Queue Service](https://aws.amazon.com/sqs/)."
+    ARN: arn:aws:sns:us-west-2:208217671510:sea-ad-quantitative-neuropathology-object_created
+    Region: us-west-2
+    Type: SNS Topic
   - Description: Spatial transcriptomics data files in a public bucket
     ARN: arn:aws:s3:::sea-ad-spatial-transcriptomics
     Region: us-west-2
     Type: S3 Bucket
     Explore:
       - '[Browse Bucket](https://sea-ad-spatial-transcriptomics.s3.amazonaws.com/index.html)'
+  - Description: "Update notifications for s3://sea-ad-spatial-transcriptomics. Users can subscribe to this SNS topic with [AWS Lambda](https://aws.amazon.com/lambda/) or [AWS Simple Queue Service](https://aws.amazon.com/sqs/)."
+    ARN: arn:aws:sns:us-west-2:208217671510:sea-ad-spatial-transcriptomics-object_created
+    Region: us-west-2
+    Type: SNS Topic
 DataAtWork:
   Tools & Applications:
     - Title: Seattle Alzheimer’s Disease Brain Cell Atlas
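
The added SNS Topic entries say users can subscribe with AWS Lambda or SQS. For the SQS route, the queue needs an access policy that lets SNS deliver messages from the topic. The sketch below builds such a policy document with the standard library only. The queue ARN and account ID `123456789012` are hypothetical placeholders, and the function name `sqs_policy_for_topic` is introduced here for illustration; the commit itself contains no such code.

```python
import json


def sqs_policy_for_topic(queue_arn: str, topic_arn: str) -> dict:
    """Return an SQS access policy allowing one SNS topic to send messages."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # SNS delivers on behalf of the topic, so the service is the principal.
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Restrict delivery to messages originating from this topic only.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


if __name__ == "__main__":
    # Topic ARN taken from the diff above; the queue ARN is a made-up example.
    topic = "arn:aws:sns:us-west-2:208217671510:sea-ad-single-cell-profiling-object_created"
    queue = "arn:aws:sqs:us-west-2:123456789012:sea-ad-updates"
    print(json.dumps(sqs_policy_for_topic(queue, topic), indent=2))
```

The printed JSON would be set as the queue's `Policy` attribute before subscribing the queue to the topic. The condition on `aws:SourceArn` keeps other topics from writing to the same queue.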

datasets/allthebacteria.yaml

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+Name: AllTheBacteria
+Description: All bacterial isolate whole-genome sequencing data from INSDC, uniformly assembled, quality-controlled, annotated, and searchable.
+Documentation: https://allthebacteria.org
+Contact: https://github.com/AllTheBacteria/AllTheBacteria/issues
+ManagedBy: "[European Bioinformatics Institute](https://www.ebi.ac.uk/)"
+UpdateFrequency: |
+  The current release is for all SRA bacterial isolate data up to August 2024. The
+  colllection will be updated occasionally, with no fixed schedule.
+Tags:
+  - assembly
+  - bacteria
+  - bioinformatics
+  - fasta
+  - genomic
+  - life sciences
+  - microbial genomics
+  - short read sequencing
+  - whole genome sequencing
+License: "[MIT License](https://opensource.org/license/mit)"
+Resources:
+  - Description: Individual, compressed genome assemblies in .fasta format in a public S3 bucket.
+    ARN: arn:aws:s3:::allthebacteria-assemblies
+    Region: eu-west-2
+    Type: S3 Bucket
+    Explore:
+  - Description: Phylogenetically-compressed, batched xz archives of all genome assemblies in .fasta format in a public S3 bucket.
+    ARN: arn:aws:s3:::allthebacteria-phylogeneticbatches
+    Region: eu-west-2
+    Type: S3 Bucket
+    Explore:
+  - Description: Metadata for each genome assembly, including taxonomic information, in a public S3 bucket.
+    ARN: arn:aws:s3:::allthebacteria-metadata
+    Region: eu-west-2
+    Type: S3 Bucket
+    Explore:
+  - Description: "A [LexicMap](https://github.com/shenwei356/LexicMap) index of all genome assemblies. This can be used for efficient sequence alignment against all genomes."
+    ARN: arn:aws:s3:::allthebacteria-lexicmap
+    Region: eu-west-2
+    Type: S3 Bucket
+    Explore:
+DataAtWork:
+  Publications:
+    - Title: AllTheBacteria - all bacterial genomes assembled, available and searchable
+      URL: https://doi.org/10.1101/2024.03.08.584059
+      AuthorName: Hunt M, Lima L, Anderson D, Hawkey J, Shen W, Lees J, Iqbal I
+      AuthorURL: https://researchportal.bath.ac.uk/en/persons/zamin-iqbal
+ADXCategories:
+  - Healthcare & Life Sciences Data

datasets/amazon-last-mile-challenges.yaml

Lines changed: 4 additions & 0 deletions
@@ -7,6 +7,10 @@ Contact: [email protected]
 ManagedBy: "[Amazon](https://www.amazon.com/)"
 UpdateFrequency: None
 
+Collabs:
+  ASDI:
+    Tags:
+      - infrastructure
 Tags:
   - transportation
   - machine learning

datasets/aodn_animal_acoustic_tracking_delayed_qc.yaml

Lines changed: 4 additions & 0 deletions
@@ -23,6 +23,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea
 
 ManagedBy: AODN
 UpdateFrequency: As Needed
+Collabs:
+  ASDI:
+    Tags:
+      - biodiversity
 Tags:
   - oceans
   - marine mammals

datasets/aodn_animal_ctd_satellite_relay_tagging_delayed_qc.yaml

Lines changed: 4 additions & 0 deletions
@@ -23,6 +23,10 @@ Documentation: https://catalogue-imos.aodn.org.au/geonetwork/srv/eng/catalog.sea
 
 ManagedBy: AODN
 UpdateFrequency: As Needed
+Collabs:
+  ASDI:
+    Tags:
+      - biodiversity
 Tags:
   - oceans
   - marine mammals

datasets/aodn_model_sea_level_anomaly_gridded_realtime.yaml

Lines changed: 4 additions & 0 deletions
@@ -37,6 +37,10 @@ Resources:
       anomaly - Near real time
     Region: ap-southeast-2
     Type: S3 Bucket
+Collabs:
+  ASDI:
+    Tags:
+      - oceans
 Tags:
   - oceans
   - ocean velocity
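
Most hunks in this commit add the same nested block: `Collabs`, then `ASDI`, then a `Tags` list. As a rough sketch of how that structure could be consumed once the YAML is parsed, the code below groups dataset names by their ASDI collab tag. The records are hand-written dicts mirroring the values visible in the diffs above; the function name `group_by_collab_tag` is hypothetical, and how the registry itself uses these tags is not stated in the commit.

```python
from collections import defaultdict

# Hand-copied from the hunks above; a real pipeline would parse the YAML files.
RECORDS = {
    "1kg-ont-vienna": {},  # new file without a Collabs block
    "3kricegenome": {"Collabs": {"ASDI": {"Tags": ["agriculture"]}}},
    "africa-field-boundary-labels": {"Collabs": {"ASDI": {"Tags": ["agriculture"]}}},
    "ag-loam": {"Collabs": {"ASDI": {"Tags": ["agriculture"]}}},
    "amazon-last-mile-challenges": {"Collabs": {"ASDI": {"Tags": ["infrastructure"]}}},
    "aodn_animal_acoustic_tracking_delayed_qc": {"Collabs": {"ASDI": {"Tags": ["biodiversity"]}}},
    "aodn_model_sea_level_anomaly_gridded_realtime": {"Collabs": {"ASDI": {"Tags": ["oceans"]}}},
}


def group_by_collab_tag(records: dict, collab: str = "ASDI") -> dict:
    """Map each collab tag to the sorted dataset names that carry it."""
    groups = defaultdict(list)
    for name, record in records.items():
        # Tolerate missing or empty keys at every level of the nesting.
        collabs = record.get("Collabs") or {}
        tags = (collabs.get(collab) or {}).get("Tags") or []
        for tag in tags:
            groups[tag].append(name)
    return {tag: sorted(names) for tag, names in groups.items()}


if __name__ == "__main__":
    for tag, names in sorted(group_by_collab_tag(RECORDS).items()):
        print(tag, names)
```

Datasets without a `Collabs` block, such as the newly added `1kg-ont-vienna`, simply do not appear in any group.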
