|
| 1 | +--- |
| 2 | +jupytext: |
| 3 | + text_representation: |
| 4 | + extension: .md |
| 5 | + format_name: myst |
| 6 | + format_version: 0.13 |
| 7 | + jupytext_version: 1.16.3 |
| 8 | +kernelspec: |
| 9 | + display_name: irsa-tutorials |
| 10 | + language: python |
| 11 | + name: python3 |
| 12 | +--- |
| 13 | + |
| 14 | +# Euclid Quick Release 1: Cloud Access |
| 15 | + |
| 16 | ++++ |
| 17 | + |
| 18 | +## Learning Goals |
| 19 | +- Learn where Euclid QR1 data is present in the cloud |
| 20 | +- Find an image and retireve its cutout from cloud |
| 21 | +- Find an object and retrieve its spectrum from cloud |
| 22 | + |
| 23 | ++++ |
| 24 | + |
| 25 | +## Introduction |
| 26 | +TODO: fill with links of docs/specs? |
| 27 | + |
| 28 | ++++ |
| 29 | + |
| 30 | +## Imports |
| 31 | + |
| 32 | +```{code-cell} ipython3 |
| 33 | +# Uncomment the next line to install dependencies if needed. |
| 34 | +# !pip install s3fs astropy astroquery matploltib |
| 35 | +``` |
| 36 | + |
| 37 | +```{code-cell} ipython3 |
| 38 | +import s3fs |
| 39 | +from astropy.coordinates import SkyCoord |
| 40 | +import astropy.units as u |
| 41 | +from astropy.visualization import ImageNormalize, PercentileInterval, AsinhStretch |
| 42 | +from astropy.io import fits |
| 43 | +from astropy.nddata import Cutout2D |
| 44 | +from astropy.wcs import WCS |
| 45 | +from astropy.table import Table |
| 46 | +from astroquery.ipac.irsa import Irsa |
| 47 | +from matplotlib import pyplot as plt |
| 48 | +import json |
| 49 | +``` |
| 50 | + |
| 51 | +## Browse Euclid QR1 Bucket |
| 52 | + |
| 53 | +```{code-cell} ipython3 |
| 54 | +BUCKET_NAME='nasa-irsa-euclid-q1' # internal to IPAC until public release (use LAN or VPN w/ Tunnel-all) |
| 55 | +``` |
| 56 | + |
| 57 | +```{code-cell} ipython3 |
| 58 | +s3 = s3fs.S3FileSystem(anon=True) |
| 59 | +``` |
| 60 | + |
| 61 | +```{code-cell} ipython3 |
| 62 | +s3.ls(f'{BUCKET_NAME}/q1') |
| 63 | +``` |
| 64 | + |
| 65 | +## Find images for a coordinate search |
| 66 | + |
| 67 | ++++ |
| 68 | + |
| 69 | +### Locate MER images in the bucket |
| 70 | + |
| 71 | +```{code-cell} ipython3 |
| 72 | +s3.ls(f'{BUCKET_NAME}/q1/MER') |
| 73 | +``` |
| 74 | + |
| 75 | +```{code-cell} ipython3 |
| 76 | +s3.ls(f'{BUCKET_NAME}/q1/MER/102018211') |
| 77 | +``` |
| 78 | + |
| 79 | +```{code-cell} ipython3 |
| 80 | +s3.ls(f'{BUCKET_NAME}/q1/MER/102018211/VIS') |
| 81 | +``` |
| 82 | + |
| 83 | +As per doc specification, we need `MER/{tile_id}/{instrument}/EUC_MER_BGSUB-MOSAIC*.fits` for displaying background-subtracted mosiac images. But these images are stored under TILE IDs so first we need to find TILE ID for a coordinate search we are interested in. We will use astroquery (in next section) to retrieve FITS file paths for our coordinates. |
| 84 | + |
| 85 | ++++ |
| 86 | + |
| 87 | +### Get image file paths for a coordinate search of interest |
| 88 | + |
| 89 | +```{code-cell} ipython3 |
| 90 | +# Star TYC 4429-1677-1 |
| 91 | +ra = 273.11730230 * u.deg |
| 92 | +dec = 68.20761349 * u.deg |
| 93 | +``` |
| 94 | + |
| 95 | +```{code-cell} ipython3 |
| 96 | +coord = SkyCoord(ra=ra, dec=dec) |
| 97 | +search_radius = 10 * u.arcsec |
| 98 | +``` |
| 99 | + |
| 100 | +List all Simple Image Access (SIA) collections for IRSA. |
| 101 | + |
| 102 | +```{code-cell} ipython3 |
| 103 | +tbl = Irsa.list_collections(servicetype='SIA') |
| 104 | +len(tbl) |
| 105 | +``` |
| 106 | + |
| 107 | +Filter to only those containing "euclid": |
| 108 | + |
| 109 | +```{code-cell} ipython3 |
| 110 | +tbl[['euclid' in v for v in tbl['collection']]] |
| 111 | +``` |
| 112 | + |
| 113 | +We identify the collection we need for MER images: |
| 114 | + |
| 115 | +```{code-cell} ipython3 |
| 116 | +img_collection = 'euclid_DpdMerBksMosaic' |
| 117 | +``` |
| 118 | + |
| 119 | +```{code-cell} ipython3 |
| 120 | +img_tbl = Irsa.query_sia(pos=(coord, search_radius), collection=img_collection).to_table() |
| 121 | +img_tbl |
| 122 | +``` |
| 123 | + |
| 124 | +Now we narrow it down to the images with science dataproduct subtype and Euclid facility: |
| 125 | + |
| 126 | +```{code-cell} ipython3 |
| 127 | +euclid_sci_img_tbl = img_tbl[[row['facility_name']=='Euclid' and row['dataproduct_subtype']=='science' for row in img_tbl]] |
| 128 | +euclid_sci_img_tbl |
| 129 | +``` |
| 130 | + |
| 131 | +We can see there's a `cloud_access` column that gives us the location info of the image files we are interested in. So let's extract the S3 bucket file path from it. |
| 132 | + |
| 133 | +```{code-cell} ipython3 |
| 134 | +def get_s3_fpath(cloud_access): |
| 135 | + cloud_info = json.loads(cloud_access) # converts str to dict |
| 136 | + bucket_name = cloud_info['aws']['bucket_name'] |
| 137 | + key = cloud_info['aws']['key'] |
| 138 | +
|
| 139 | + return f'{bucket_name}/{key}' |
| 140 | +``` |
| 141 | + |
| 142 | +```{code-cell} ipython3 |
| 143 | +[get_s3_fpath(row['cloud_access']) for row in euclid_sci_img_tbl] |
| 144 | +``` |
| 145 | + |
| 146 | +Let's also extract filter names to use when displaying the images: |
| 147 | + |
| 148 | +```{code-cell} ipython3 |
| 149 | +def get_filter_name(instrument, bandpass): |
| 150 | + return f'{instrument}_{bandpass}' if instrument!=bandpass else instrument |
| 151 | +``` |
| 152 | + |
| 153 | +```{code-cell} ipython3 |
| 154 | +[get_filter_name(row['instrument_name'], row['energy_bandpassname']) for row in euclid_sci_img_tbl] |
| 155 | +``` |
| 156 | + |
| 157 | +## Retrieve image cutouts from the cloud |
| 158 | +These image files are very big (~1.4GB), so we use astropy's lazy-loading capability of FITS for better performance. (See [Obtaining subsets from cloud-hosted FITS files](https://docs.astropy.org/en/stable/io/fits/usage/cloud.html#fits-io-cloud).) |
| 159 | + |
| 160 | +```{code-cell} ipython3 |
| 161 | +cutout_size = 1 * u.arcmin |
| 162 | +``` |
| 163 | + |
| 164 | +```{code-cell} ipython3 |
| 165 | +cutouts = [] |
| 166 | +filters = [] |
| 167 | +
|
| 168 | +for row in euclid_sci_img_tbl: |
| 169 | + s3_fpath = get_s3_fpath(row['cloud_access']) |
| 170 | + filter_name = get_filter_name(row['instrument_name'], row['energy_bandpassname']) |
| 171 | +
|
| 172 | + with fits.open(f's3://{s3_fpath}', fsspec_kwargs={"anon": True}) as hdul: |
| 173 | + print(f'Retrieving cutout for {filter_name} ...') |
| 174 | + cutout = Cutout2D(hdul[0].section, |
| 175 | + position=coord, |
| 176 | + size=cutout_size, |
| 177 | + wcs=WCS(hdul[0].header)) |
| 178 | + cutouts.append(cutout) |
| 179 | + filters.append(filter_name) |
| 180 | +``` |
| 181 | + |
| 182 | +```{code-cell} ipython3 |
| 183 | +fig, axes = plt.subplots(2, 2, figsize=(4 * 2, 4 * 2), subplot_kw={'projection': cutouts[0].wcs}) |
| 184 | +
|
| 185 | +for idx, ax in enumerate(axes.flat): |
| 186 | + norm = ImageNormalize(cutouts[idx].data, interval=PercentileInterval(99), stretch=AsinhStretch()) |
| 187 | + ax.imshow(cutouts[idx].data, cmap='gray', origin='lower', norm=norm) |
| 188 | + ax.set_xlabel('RA') |
| 189 | + ax.set_ylabel('Dec') |
| 190 | + ax.text(0.95, 0.05, filters[idx], color='white', fontsize=14, transform=ax.transAxes, va='bottom', ha='right') |
| 191 | +
|
| 192 | +plt.tight_layout() |
| 193 | +``` |
| 194 | + |
| 195 | +## Find objects for the coordinates of our interest |
| 196 | + |
| 197 | ++++ |
| 198 | + |
| 199 | +### Locate MER catalogs in the bucket |
| 200 | + |
| 201 | +```{code-cell} ipython3 |
| 202 | +s3.ls(f'{BUCKET_NAME}/q1/catalogs') |
| 203 | +``` |
| 204 | + |
| 205 | +```{code-cell} ipython3 |
| 206 | +s3.ls(f'{BUCKET_NAME}/q1/catalogs/MER_FINAL_CATALOG') |
| 207 | +``` |
| 208 | + |
| 209 | +```{code-cell} ipython3 |
| 210 | +s3.ls(f'{BUCKET_NAME}/q1/catalogs/MER_FINAL_CATALOG/102018211') |
| 211 | +``` |
| 212 | + |
| 213 | +As per doc specification, we need `catalogs/MER_FINAL_CATALOG/{tile_id}/EUC_MER_FINAL-CAT*.fits` for listing the objects catalogued. But we only need to find objects in our coordinates of interest so we will use astroquery to do a spatial search in MER catalog (combined for all tiles). |
| 214 | + |
| 215 | ++++ |
| 216 | + |
| 217 | +### Get object IDs for the coordinates of our interest |
| 218 | + |
| 219 | +```{code-cell} ipython3 |
| 220 | +tbl_catalogs = Irsa.list_catalogs(full=True).to_table() |
| 221 | +len(tbl_catalogs) |
| 222 | +``` |
| 223 | + |
| 224 | +```{code-cell} ipython3 |
| 225 | +tbl_catalogs[['euclid' in v for v in tbl_catalogs['schema_name']]] |
| 226 | +``` |
| 227 | + |
| 228 | +From this table, we can extract the MER catalog name. We also see several other interesting catalogs, let's also extract spectral file association catalog for retrieving spectra later. |
| 229 | + |
| 230 | +```{code-cell} ipython3 |
| 231 | +euclid_mer_catalog = 'euclid_q1_mer_catalogue' |
| 232 | +euclid_spec_association_catalog = 'euclid.objectid_spectrafile_association_q1' |
| 233 | +``` |
| 234 | + |
| 235 | +Now, we do a TAP search with spatial constraints for our coordinates. We use cone of 5 arcsec around our source to pinpoint its object ID in Euclid catalog. |
| 236 | + |
| 237 | +```{code-cell} ipython3 |
| 238 | +search_radius = (5 * u.arcsec).to('deg') |
| 239 | +
|
| 240 | +adql_query = f"SELECT * \ |
| 241 | + FROM {euclid_mer_catalog} \ |
| 242 | + WHERE CONTAINS(POINT('ICRS', ra, dec), \ |
| 243 | + CIRCLE('ICRS', {ra.value}, {dec.value}, {search_radius.value})) = 1" |
| 244 | +
|
| 245 | +mer_catalog_tbl = Irsa.query_tap(query=adql_query).to_table() |
| 246 | +mer_catalog_tbl |
| 247 | +``` |
| 248 | + |
| 249 | +```{code-cell} ipython3 |
| 250 | +object_id = int(mer_catalog_tbl['object_id'][0]) |
| 251 | +object_id |
| 252 | +``` |
| 253 | + |
| 254 | +## Find spectra for the coordinates of our interest |
| 255 | +Using the object ID(s) we extracted above, we can narrow down the spectral file association catalog to identify spectra file path(s). |
| 256 | + |
| 257 | +```{code-cell} ipython3 |
| 258 | +adql_query = f"SELECT * FROM {euclid_spec_association_catalog} \ |
| 259 | + WHERE objectid = {object_id}" |
| 260 | +
|
| 261 | +spec_association_tbl = Irsa.query_tap(adql_query).to_table() |
| 262 | +spec_association_tbl |
| 263 | +``` |
| 264 | + |
| 265 | +We can see the `uri` column that gives us location of spectra file on IBE, we can map it to S3 bucket key to retrieve spectra file from the cloud. This is a very big FITS spectra file with multiple extensions where each extension contains spectrum of one object. The `hdu` column gives us the extension number for our object. So let's extract both of these. |
| 266 | + |
| 267 | +```{code-cell} ipython3 |
| 268 | +spec_fpath_key = spec_association_tbl['uri'][0].replace('ibe/data/euclid/', '') |
| 269 | +spec_fpath_key |
| 270 | +``` |
| 271 | + |
| 272 | +```{code-cell} ipython3 |
| 273 | +object_hdu_idx = int(spec_association_tbl['hdu'][0]) |
| 274 | +object_hdu_idx |
| 275 | +``` |
| 276 | + |
| 277 | +## Retrieve spectrum from the cloud |
| 278 | +Again we use astropy's lazy-loading capability of FITS to only retrieve the spectrum table of our object from the S3 bucket. |
| 279 | + |
| 280 | +```{code-cell} ipython3 |
| 281 | +with fits.open(f's3://{BUCKET_NAME}/{spec_fpath_key}', fsspec_kwargs={'anon': True}) as hdul: |
| 282 | + spec_hdu = hdul[object_hdu_idx] |
| 283 | + spec_tbl = Table.read(spec_hdu) |
| 284 | +``` |
| 285 | + |
| 286 | +```{code-cell} ipython3 |
| 287 | +spec_tbl |
| 288 | +``` |
| 289 | + |
| 290 | +```{code-cell} ipython3 |
| 291 | +plt.plot(spec_tbl['WAVELENGTH'], spec_tbl['SIGNAL']) |
| 292 | +plt.xlabel(spec_tbl['WAVELENGTH'].unit.to_string('latex_inline')) |
| 293 | +plt.ylabel(spec_tbl['SIGNAL'].unit.to_string('latex_inline')) |
| 294 | +
|
| 295 | +plt.title(f'Euclid Object ID: {object_id}'); |
| 296 | +``` |
| 297 | + |
| 298 | +## About this Notebook |
| 299 | +Author: Jaladh Singhal (IRSA Developer) in conjunction with Tiffany Meshkat, Vandana Desai, Brigitta Sipőcz, and the IPAC Science Platform team |
| 300 | + |
| 301 | +Updated: 2025-03-13 |
| 302 | + |
| 303 | +Contact: the [IRSA Helpdesk](https://irsa.ipac.caltech.edu/docs/help_desk.html) with questions or reporting problems. |
0 commit comments