Commit 5245636

Merge branch 'main' into main
2 parents b72493f + 0d57273 commit 5245636

7 files changed, +145 -13 lines changed

datasets/askap.yaml

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
Name: ASKAP Radio Telescope
2+
Description: |
3+
4+
ASKAP is the CSIRO’s newest radio telescope. It is situated at the Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory on Wajarri Yamaji Country in the Murchison region of Western Australia, about 800 km north of Perth.
5+
6+
ASKAP consists of 36 dishes, each 12 m in diameter, spread out as far as 6 km apart. It uses a new technology called Phased Array Feeds (PAFs), which allows it to see more of the sky at once. This novel technology allows ASKAP to achieve extremely high survey speed, making it one of the best instruments in the world for mapping the sky at radio wavelengths.
7+
8+
Initial dataset available - The Rapid ASKAP Continuum Survey (RACS)
9+
10+
RACS is the first large-area survey completed with ASKAP. This survey is revolutionary as the entire sky was observed in a matter of weeks, doing what previously took telescopes years to do. RACS initially covered the whole sky at 890 MHz (RACS-Low), and has since expanded to ASKAP’s other bands (1.4 and 1.7 GHz). RACS also covers the sky in multiple epochs, with a second epoch of RACS-Low and RACS-Mid obtained and processed.
11+
12+
RACS provides astronomers with a unique opportunity to study the radio sky and radio populations, in particular supermassive black holes (active galactic nuclei) and their role in galaxy evolution. The multi-epoch approach also allows a study of the transient sky and testing and verification of calibration methods. The large area allows for cosmological studies, such as a search for anisotropy in the galaxy population, or the cosmic dipole.
13+
14+
Documentation: https://www.atnf.csiro.au/facilities/askap-radio-telescope/
15+
16+
ManagedBy: "[Australia Telescope National Facility, CSIRO](http://www.atnf.csiro.au/)"
17+
Citation: Please see the [ATNF acknowledgement page](https://www.atnf.csiro.au/resources/publications/atnf-publication-acknowledgement-statements/) for full citation instructions.
18+
UpdateFrequency: Roughly quarterly
19+
Tags:
20+
- aws-pds
21+
- astronomy
22+
- archives
23+
License: CC-BY-4.0. Attribution required for refereed scientific papers.
24+
Resources:
25+
- Description: The Rapid ASKAP Continuum Survey (RACS) Public Data Releases
26+
ARN: arn:aws:s3:::askap/racs
27+
Region: ap-southeast-2
28+
Type: S3 Bucket
29+
RequesterPays: False
30+
- Description: Notifications for new Rapid ASKAP Continuum Survey (RACS) data
31+
ARN: arn:aws:sns:ap-southeast-2:336305517014:racs-low1-object_created
32+
Region: ap-southeast-2
33+
Type: SNS Topic
34+
DataAtWork:
35+
Tutorials:
36+
- Title: CSIRO ASKAP Science Data Archive User Guide
37+
URL: https://research.csiro.au/casda/casda-user-guide/
38+
AuthorName: CSIRO, ATNF
39+
- Title: Rapid ASKAP Continuum Survey (RACS) Home Page
40+
URL: https://research.csiro.au/racs/
41+
AuthorName: CSIRO, ATNF
42+
Tools & Applications:
43+
Publications:
44+
- Title: ASKAP Publication List
45+
URL: https://www.atnf.csiro.au/facilities/askap-radio-telescope/publications/
46+
AuthorName: various, list maintained by CSIRO, ATNF
47+
- Title: ASKAP System Description paper
48+
URL: https://doi.org/10.1017/pasa.2021.1
49+
AuthorName: Hotan, A. et al.
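
Each resource entry pairs an ARN with a separate Region field, and an SNS topic ARN also embeds a region, so the two can be cross-checked. The following is a minimal Python sketch of such a check; it is not part of the registry's own tooling. The function names `arn_region` and `region_mismatches` and the third sample entry are hypothetical, and the resources are assumed to be already parsed from YAML into dictionaries.

```python
def arn_region(arn: str) -> str:
    """Return the region field of an ARN; it is empty for S3 bucket ARNs."""
    # General ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[3]


def region_mismatches(resources):
    """Return (arn, declared_region, arn_region) for entries that disagree."""
    mismatches = []
    for res in resources:
        embedded = arn_region(res["ARN"])
        declared = res.get("Region", "")
        # S3 ARNs carry no region, so only compare when one is embedded.
        if embedded and embedded != declared:
            mismatches.append((res["ARN"], declared, embedded))
    return mismatches


# Sample entries as plain dicts (assumed already loaded from the YAML).
resources = [
    {"ARN": "arn:aws:s3:::askap/racs", "Region": "ap-southeast-2"},
    {"ARN": "arn:aws:sns:ap-southeast-2:336305517014:racs-low1-object_created",
     "Region": "ap-southeast-2"},
    # Hypothetical entry with a deliberately wrong Region, for illustration.
    {"ARN": "arn:aws:sns:us-east-1:000000000000:example-topic",
     "Region": "us-west-2"},
]

for arn, declared, embedded in region_mismatches(resources):
    print(f"{arn}: Region says {declared!r}, ARN says {embedded!r}")
```

Run as written, only the hypothetical third entry is reported, because its Region field disagrees with the region embedded in its ARN.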

datasets/depmap-omics-ccle.yaml

Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
Name: The Cancer Dependency Map (DepMap) Cancer Cell Line Encyclopedia (CCLE) Dataset
2+
Description: This dataset consists of whole genome sequencing (WGS), whole exome sequencing (WES), and RNA sequencing files generated from ~1000 cancer cell lines described in Ghandi et al., 2019.
3+
Documentation: https://github.com/broadinstitute/depmap-omics-ccle
4+
Contact: https://forum.depmap.org
5+
ManagedBy: "[Cancer Data Science](https://cancerdatascience.org/), [Broad Institute](https://www.broadinstitute.org/)"
6+
UpdateFrequency: occasionally (as additional sequencing data are generated for publicly releasable CCLE models)
7+
Tags:
8+
- aws-pds
9+
- bam
10+
- biology
11+
- bioinformatics
12+
- cancer
13+
- genetic
14+
- genomic
15+
- Homo sapiens
16+
- life sciences
17+
- short read sequencing
18+
- transcriptomics
19+
- whole exome sequencing
20+
- whole genome sequencing
21+
License: https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/accessing-data/using-genomic-data
22+
Citation: Ghandi, Huang, Jané-Valbuena et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019). https://doi.org/10.1038/s41586-019-1186-3
23+
Resources:
24+
- Description: CRAM/BAM files (and their corresponding CRAI/BAI indexes) for RNA, WES, and WGS samples released by The Cancer Dependency Map (DepMap) as part of the Cancer Cell Line Encyclopedia (CCLE) project
25+
ARN: arn:aws:s3:::depmap-omics-ccle
26+
Region: us-east-1
27+
Type: S3 Bucket
28+
- Description: Notifications for new depmap-omics-ccle data
29+
ARN: arn:aws:sns:us-east-1:019511184952:depmap-omics-ccle-object_created
30+
Region: us-east-1
31+
Type: SNS Topic
32+
DataAtWork:
33+
Tutorials:
34+
- Title: DepMap Omics CCLE data on the AWS Open Data Registry
35+
URL: https://github.com/broadinstitute/depmap-omics-ccle
36+
AuthorName: Devin McCabe
37+
Tools & Applications:
38+
- Title: The Cancer Dependency Map (DepMap)
39+
URL: https://depmap.org
40+
AuthorName: Arafeh, Shibue, Dempster et al.
41+
- Title: Cancer Cell Line Encyclopedia (CCLE)
42+
URL: https://sites.broadinstitute.org/ccle
43+
AuthorName: Ghandi, Huang, Jané-Valbuena et al.
44+
Publications:
45+
- Title: Next-generation characterization of the Cancer Cell Line Encyclopedia
46+
URL: https://www.nature.com/articles/s41586-019-1186-3
47+
AuthorName: Ghandi, Huang, Jané-Valbuena et al.
48+
- Title: The present and future of the Cancer Dependency Map
49+
URL: https://www.nature.com/articles/s41568-024-00763-x
50+
AuthorName: Arafeh, Shibue, Dempster et al.
51+
AuthorURL: https://depmap.org
52+
- Title: Partial gene suppression improves identification of cancer vulnerabilities when CRISPR-Cas9 knockout is pan-lethal
53+
URL: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-03020-w
54+
AuthorName: Krill-Burger, Dempster, Borah et al.
55+
- Title: Genetic dependencies associated with transcription factor activities in human cancer cell lines
56+
URL: https://www.sciencedirect.com/science/article/pii/S2211124724005035
57+
AuthorName: Thatikonda, Supper, Wachter et al.
58+
- Title: Bridging the gap between cancer cell line models and tumours using gene expression data
59+
URL: https://www.nature.com/articles/s41416-021-01359-0
60+
AuthorName: Noorbakhsh, Vazquez & McFarland
61+
- Title: Integrated cross-study datasets of genetic dependencies in cancer
62+
URL: https://www.nature.com/articles/s41467-021-21898-7
63+
AuthorName: Pacini, Dempster, Boyle et al.
64+
- Title: Machine learning multi-omics analysis reveals cancer driver dysregulation in pan-cancer cell lines compared to primary tumors
65+
URL: https://www.nature.com/articles/s42003-022-04075-4
66+
AuthorName: Sanders, Chandra, Zebarjadi et al.
67+
- Title: "The Network Zoo: a multilingual package for the inference and analysis of gene regulatory networks"
68+
URL: https://link.springer.com/article/10.1186/s13059-023-02877-1
69+
AuthorName: Ben Guebila, Wang, Lopes-Ramos et al.
70+
ADXCategories:
71+
- Healthcare & Life Sciences Data
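
The S3 resource in this entry is identified only by its ARN, while most S3 tools expect a bucket name or an `s3://` URI. As a rough illustration, the Python sketch below splits an S3 ARN into bucket and key prefix and builds the URI. The helper names `split_s3_arn` and `s3_uri` and the second sample ARN are hypothetical and do not come from the registry or the dataset documentation.

```python
S3_ARN_PREFIX = "arn:aws:s3:::"


def split_s3_arn(arn: str):
    """Split an S3 ARN into (bucket, key_prefix); key_prefix may be empty."""
    if not arn.startswith(S3_ARN_PREFIX):
        raise ValueError(f"not an S3 ARN: {arn!r}")
    # Everything after the fixed prefix is "bucket" or "bucket/key/prefix".
    bucket, _, key_prefix = arn[len(S3_ARN_PREFIX):].partition("/")
    if not bucket:
        raise ValueError(f"S3 ARN has no bucket name: {arn!r}")
    return bucket, key_prefix


def s3_uri(arn: str) -> str:
    """Build an s3:// URI from an S3 ARN."""
    bucket, key_prefix = split_s3_arn(arn)
    return f"s3://{bucket}/{key_prefix}"


# ARN taken from the resource entry of this dataset.
print(s3_uri("arn:aws:s3:::depmap-omics-ccle"))
# Hypothetical ARN with a key prefix, for illustration only.
print(s3_uri("arn:aws:s3:::example-bucket/some/prefix"))
```

The resulting URI could then be passed to a command such as `aws s3 ls --no-sign-request <uri>`, assuming the bucket permits anonymous listing, which this entry does not state explicitly.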

datasets/mosaic.yaml

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
Name: Meta-Organized Stimuli And fMRI Imaging data for Computational modeling (MOSAIC)
22
Description: This extensible dataset, MOSAIC, aggregates individual functional magnetic resonance imaging (fMRI) datasets by leveraging a shared preprocessing pipeline and stimulus curation procedure. This dataset aggregation procedure achieves the scale necessary for neural network training and the diversity needed for generalizable results.
3-
Documentation: https://github.com/blahner/mosaic-preprocessing
3+
Documentation: https://blahner.github.io/MOSAICfmri/
44
55
ManagedBy: Massachusetts Institute of Technology, Georgia Tech
66
UpdateFrequency: New data is uploaded as researchers preprocess their fMRI data according to MOSAIC format and submit.
@@ -23,6 +23,20 @@ Resources:
2323
- '[Browse Bucket](https://mosaicfmri.s3.amazonaws.com/index.html)'
2424
DataAtWork:
2525
Tutorials:
26+
- Title: Preprocess fMRI datasets with MOSAIC shared pipeline
27+
URL: https://github.com/blahner/mosaic-preprocessing
28+
AuthorName: Benjamin Lahner
29+
- Title: MOSAIC Python package (mosaic-dataset)
30+
URL: https://pypi.org/project/mosaic-dataset/
31+
AuthorName: Mayukh Deb
32+
- Title: Download MOSAIC data, visualize fMRI responses, load and run brain-optimized models (Jupyter notebook)
33+
URL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic-starter.ipynb
34+
NotebookURL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic-starter.ipynb
35+
AuthorName: Mayukh Deb
36+
- Title: Run a synthetic localizer experiment using MOSAIC's brain-optimized models (Jupyter notebook)
37+
URL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic_synthetic_localizer.ipynb
38+
NotebookURL: https://github.com/murtylab/mosaic-dataset/blob/master/examples/mosaic_synthetic_localizer.ipynb
39+
AuthorName: Benjamin Lahner
2640
- Title: Load HDF5 file (Jupyter notebook)
2741
URL: https://github.com/blahner/mosaic-preprocessing/blob/main/src/fmriDatasetPreparation/create_hdf5/load_hdf5.ipynb
2842
NotebookURL: https://github.com/blahner/mosaic-preprocessing/blob/main/src/fmriDatasetPreparation/create_hdf5/load_hdf5.ipynb

datasets/noaa-nexrad.yaml

Lines changed: 0 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -52,14 +52,6 @@ Resources:
5252
ARN: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel3Object
5353
Region: us-east-1
5454
Type: SNS Topic
55-
- Description: "*OLD NEXRAD Level II archive bucket* which is now <b>Deprecated</b>. It is recommended to move to the new bucket: unidata-nexrad-level2 and SNS topic: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive"
56-
ARN: arn:aws:s3:::noaa-nexrad-level2
57-
Region: us-east-1
58-
Type: S3 Bucket
59-
- Description: "Notifications for the *OLD Level II archival bucket* which is now <b>Deprecated</b>. It is recommended to move to the new bucket: unidata-nexrad-level2 and SNS topic: arn:aws:sns:us-east-1:684042711724:NewNEXRADLevel2Archive"
60-
ARN: arn:aws:sns:us-east-1:811054952067:NewNEXRADLevel2Archive
61-
Region: us-east-1
62-
Type: SNS Topic
6355
DataAtWork:
6456
Tutorials:
6557
- Title: Using Python to Access NCEI Archived NEXRAD Level 2 Data (Jupyter notebook)

datasets/open-ceda.yaml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,8 @@
11
Name: Open CEDA by Watershed
22
Description: |
33
CEDA is a multi-regional Environmentally-Extended Input-Output (EEIO) model developed to support a wide range of environmental systems analyses—including corporate carbon accounting and sustainable spend analysis. CEDA provides unparalleled global coverage and granularity, representing 95% of the world's GDP across 148 countries and 400 sectors, enabling robust and geographically comprehensive Scope 3 greenhouse gas (GHG) measurement.
4-
Open CEDA is the publicly avaialable version of CEDA, now easy to download and available for free for all use cases. For more information please visit our website at openceda.org
5-
CEDA 2024, the latest version of CEDA, uses 2022 as its base year, ensuring that emissions factors and economic data reflect the most recent global economic landscape available. To maintain accuracy and relevance, CEDA is updated annually with the latest data releases.
4+
Open CEDA is the publicly available version of CEDA, now easy to download and available for free for all use cases. For more information please visit our website at openceda.org.
5+
This data registry entry contains CEDA 2025 and CEDA 2024 in two separate files. CEDA 2025, the latest version of CEDA, uses 2023 as its base year, ensuring that emissions factors and economic data reflect the most recent global economic landscape available. To maintain accuracy and relevance, CEDA is updated annually with the latest data releases.
66
At its core, CEDA connects economic exchanges to GHG emissions by quantifying the life-cycle emissions of products and services. This is achieved through the integration of input-output tables, which represent the full supply-chain network of the global economy, with GHG emissions data. As a result, CEDA provides users with a powerful tool to assess the environmental impacts embedded in corporate value chains.
77
Documentation: https://openceda.org/
88

datasets/st-open-data.yaml

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,12 @@ Tags:
1111
- disaster response
1212
- geospatial
1313
- image processing
14-
License: Creative Commons Attribution-NonCommercial 4.0 International
14+
License: |
15+
Creative Commons Attribution 4.0 International (CC BY 4.0).
16+
For more information, see the document "ST-1 Product Terms of Use" at [our Terms of Use webpage](https://si-imaging.com/page/73).
17+
Citation: |
18+
When publicly posting ST products Open Data unedited, use: "SpaceEye-T © [Year] Satrec Initiative (Licensed under CC BY 4.0)".
19+
When publicly posting derivative data created using ST products Open Data, use: "SpaceEye-T-derived data © [Year] Satrec Initiative (Originally licensed under CC BY 4.0)".
1520
ManagedBy: "[SI Imaging Services](https://www.si-imaging.com/)"
1621
Resources:
1722
- Description: SpaceEye-T Imagery Collection

datasets/surface-pm2-5-v6gl.yaml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,15 @@
11
Name: SatPM2.5
22
Description: Fine particulate matter (PM2.5) concentrations are estimated using information from satellite-, simulation- and monitor-based sources. Aerosol optical depth from multiple satellites (MODIS, VIIRS, MISR, and SeaWiFS) and their respective retrievals (Dark Target, Deep Blue, MAIAC) is combined with simulation (GEOS-Chem) based upon their relative uncertainties as determined using ground-based sun photometer (AERONET) observations to produce geophysical estimates that explain most of the variance in ground-based PM2.5 measurements. A subsequent statistical fusion incorporates additional information from ground-based PM2.5 measurements.
33
Documentation: https://sites.wustl.edu/acag/datasets/surface-pm2-5/#V6.GL.02.04
4-
4+
55
ManagedBy: "https://sites.wustl.edu/acag/"
66
UpdateFrequency: Yearly
77
Collabs:
88
ASDI:
99
Tags:
1010
- climate
1111
Tags:
12+
- aws-pds
1213
- atmosphere
1314
- netcdf
1415
- environmental
