Commit 02a9ff9

Merge pull request #211 from troyraen/raen/issues/191/add-euclid-hats-tutorials-II
Add Euclid HATS Magnitude tutorial
2 parents 4793f83 + e9155bd commit 02a9ff9

4 files changed: +408 -23 lines changed

toc.yml

Lines changed: 3 additions & 1 deletion
@@ -16,7 +16,9 @@ project:
  - file: tutorials/euclid_access/2_Euclid_intro_MER_catalog.md
  - file: tutorials/euclid_access/4_Euclid_intro_PHZ_catalog.md
  - file: tutorials/euclid_access/5_Euclid_intro_SPE_catalog.md
- - file: tutorials/parquet-catalog-demos/euclid-q1-hats/1-euclid-q1-hats-intro.md
+ - title: Merged Objects HATS Catalog
+ children:
+ - pattern: tutorials/parquet-catalog-demos/euclid-q1-hats/*-euclid-q1-hats-*.md
  - file: tutorials/cloud_access/euclid-cloud-access.md
  - file: tutorials/euclid_access/Euclid_ERO.md
  - title: WISE
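
Note on the new `pattern` entry: it replaces a single hard-coded `file` entry with a glob, so any tutorial in that directory whose name fits `*-euclid-q1-hats-*.md` is picked up. The sketch below illustrates which paths such a glob would select, using Python's `fnmatch` as a stand-in. The table-of-contents tool's actual matching rules are an assumption here and may differ in detail (for example, in how `*` treats `/`). The first two candidate paths appear in this commit; the third is an unrelated entry from the same `toc.yml`.

```python
from fnmatch import fnmatch

# Glob copied from the new toc.yml entry.
PATTERN = "tutorials/parquet-catalog-demos/euclid-q1-hats/*-euclid-q1-hats-*.md"

candidates = [
    "tutorials/parquet-catalog-demos/euclid-q1-hats/1-euclid-q1-hats-intro.md",
    "tutorials/parquet-catalog-demos/euclid-q1-hats/4-euclid-q1-hats-magnitudes.md",
    "tutorials/cloud_access/euclid-cloud-access.md",
]

# Keep only the paths the glob selects; sorting is for display only.
matched = sorted(path for path in candidates if fnmatch(path, PATTERN))
for path in matched:
    print(path)
```

Only the two `euclid-q1-hats` tutorials are selected, so tutorials added to the series later should not need a `toc.yml` change as long as they follow the same naming scheme.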

tutorials/euclid_access/euclid.md

Lines changed: 3 additions & 1 deletion
@@ -24,7 +24,9 @@ Data products include MERged mosaics of calibrated and stacked frames; combined
  - [PHZ Catalogs](4_Euclid_intro_PHZ_catalog.md) — Join the PHZ and MER catalogs and do a box search for galaxies with quality redshifts, load a MER mosaic cutout of the box, and plot the cutout with the catalog results overlaid.
  Then plot the SIR spectrum of the brightest galaxy and look at a MER mosaic cutout of the galaxy in Firefly.
  - [SPE Catalogs](5_Euclid_intro_SPE_catalog.md) — Join the SPE and MER catalogs and query for galaxies with H-alpha line detections, then plot the SIR spectrum of a galaxy with a high SNR H-alpha line measurement.
- - [Merged Objects HATS Catalog](../parquet-catalog-demos/euclid-q1-hats/1-euclid-q1-hats-intro.md) — Understand the content and format of the Euclid Q1 Merged Objects HATS Catalog, then perform a basic query.
+ - **Merged Objects HATS Catalog** — This product was created by IRSA and contains the Euclid MER, PHZ, and SPE catalogs in a single [HATS](https://hats.readthedocs.io/en/latest/) catalog.
+ - [Introduction](../parquet-catalog-demos/euclid-q1-hats/1-euclid-q1-hats-intro.md) — Understand the content and format of the Euclid Q1 Merged Objects HATS Catalog, then perform a basic query.
+ - [Magnitudes](../parquet-catalog-demos/euclid-q1-hats/4-euclid-q1-hats-magnitudes.md) — Review the types of flux measurements available, load template-fit and aperture magnitudes, and plot distributions and comparisons for different object types.

  ## Special Topics

tutorials/parquet-catalog-demos/euclid-q1-hats/1-euclid-q1-hats-intro.md

Lines changed: 24 additions & 21 deletions
@@ -1,11 +1,12 @@
  ---
- short_title: "Merged Objects HATS Catalog"
+ short_title: Introduction
  jupytext:
  text_representation:
  extension: .md
  format_name: myst
  format_version: 0.13
  jupytext_version: 1.18.1
+ root_level_metadata_filter: -short_title
  kernelspec:
  display_name: Python 3 (ipykernel)
  language: python
@@ -18,16 +19,17 @@ kernelspec:

  This tutorial is an introduction to the content and format of the Euclid Q1 Merged Objects HATS Catalog.
  Later tutorials in this series will show how to load quality samples.
+ See [Euclid Tutorial Notebooks: Catalogs](../../euclid_access/euclid.md#catalogs) for a list of tutorials in this series.

  +++

  ## Learning Goals

  In this tutorial, we will:

- - Learn about the Euclid Merged Objects catalog that IRSA created by combining information from multiple Euclid Quick Release 1 catalogs
+ - Learn about the Euclid Merged Objects catalog that IRSA created by combining information from multiple Euclid Quick Release 1 (Q1) catalogs.
  - Find columns of interest.
- - Perform a basic spatial query in each of the Euclid Deep Fields using the Python library PyArrow.
+ - Perform a basic query using the Python library PyArrow.

  +++

@@ -51,12 +53,12 @@ Access is free and no credentials are required.

  ## 2. Imports

- ```{code-cell} python3
+ ```{code-cell} ipython3
  # # Uncomment the next line to install dependencies if needed.
  # %pip install hpgeom pandas pyarrow
  ```

- ```{code-cell} python3
+ ```{code-cell} ipython3
  import hpgeom # Find HEALPix indexes from RA and Dec
  import pyarrow.compute as pc # Filter the catalog
  import pyarrow.dataset # Load the catalog
@@ -70,7 +72,7 @@ First we'll load the Parquet schema (column information) of the Merged Objects c
  The Parquet schema is accessible from a few locations, all of which include the column names and types.
  Here, we load it from the `_common_metadata` file because it also includes the column units and descriptions.

- ```{code-cell} python3
+ ```{code-cell} ipython3
  # AWS S3 paths.
  s3_bucket = "nasa-irsa-euclid-q1"
  dataset_prefix = "contributed/q1/merged_objects/hats/euclid_q1_merged_objects-hats/dataset"
@@ -82,7 +84,7 @@ schema_path = f"{dataset_path}/_common_metadata"
  s3 = pyarrow.fs.S3FileSystem(anonymous=True)
  ```

- ```{code-cell} python3
+ ```{code-cell} ipython3
  # Load the Parquet schema.
  schema = pyarrow.parquet.read_schema(schema_path, filesystem=s3)
@@ -136,7 +138,7 @@ The tables are:

  Find all columns from these tables in the Parquet schema:

- ```{code-cell} python3
+ ```{code-cell} ipython3
  mer_prefixes = ["mer_", "morph_", "cutouts_"]
  mer_col_counts = {p: len([n for n in schema.names if n.startswith(p)]) for p in mer_prefixes}
@@ -193,7 +195,7 @@ The tables are:

  Find all columns from these tables in the Parquet schema:

- ```{code-cell} python3
+ ```{code-cell} ipython3
  phz_prefixes = ["phz_", "class_", "physparam_", "galaxysed_", "physparamqso_",
  "starclass_", "starsed_", "physparamnir_"]
  phz_col_counts = {p: len([n for n in schema.names if n.startswith(p)]) for p in phz_prefixes}
@@ -240,7 +242,7 @@ The tables are:

  Find all columns from these tables in the Parquet schema:

- ```{code-cell} python3
+ ```{code-cell} ipython3
  spe_prefixes = ["z_", "lines_", "models_"]
  spe_col_counts = {p: len([n for n in schema.names if n.startswith(p)]) for p in spe_prefixes}
@@ -272,7 +274,7 @@ They are useful for spatial queries, as demonstrated in the Euclid Deep Fields s

  The HEALPix, Euclid object ID, and Euclid tile ID columns appear first:

- ```{code-cell} python3
+ ```{code-cell} ipython3
  schema.names[:5]
  ```

@@ -288,7 +290,7 @@ However, PyArrow automatically makes them available as regular columns when the

  The HATS columns appear at the end:

- ```{code-cell} python3
+ ```{code-cell} ipython3
  schema.names[-3:]
  ```

@@ -297,12 +299,12 @@ schema.names[-3:]
  The subsections above show how to find all columns from a given Euclid table as well as the additional columns.
  Here we show some additional techniques for finding columns.

- ```{code-cell} python3
+ ```{code-cell} ipython3
  # Access the data type using the `field` method.
  schema.field("mer_flux_y_2fwhm_aper")
  ```

- ```{code-cell} python3
+ ```{code-cell} ipython3
  # The column metadata includes unit and description.
  # Parquet metadata is always stored as bytestrings, which are denoted by a leading 'b'.
  schema.field("mer_flux_y_2fwhm_aper").metadata
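
Note on the bytestring metadata: because keys and values come back as `bytes`, it can be convenient to decode them before display or comparison. The following is a minimal sketch that uses a hand-written dictionary in place of the real `schema.field(...).metadata`. The `unit` key is the one the tutorial filters on; the `description` key name, the helper name `decode_metadata`, and both values are placeholders rather than content copied from the catalog.

```python
def decode_metadata(metadata):
    """Return a copy of a bytes-keyed metadata mapping with str keys and values."""
    if metadata is None:  # a field may carry no metadata at all
        return {}
    return {key.decode("utf-8"): value.decode("utf-8") for key, value in metadata.items()}


# Placeholder standing in for schema.field("mer_flux_y_2fwhm_aper").metadata.
example_metadata = {b"unit": b"uJy", b"description": b"placeholder description"}

decoded = decode_metadata(example_metadata)
print(decoded["unit"])
```

Handling `None` explicitly keeps the helper usable on fields that have no metadata attached.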
@@ -311,7 +313,7 @@ schema.field("mer_flux_y_2fwhm_aper").metadata
  Euclid Q1 offers many flux measurements, both from Euclid detections and from external ground-based surveys.
  They are given in microjanskys, so all flux columns can be found by searching the metadata for this unit.

- ```{code-cell} python3
+ ```{code-cell} ipython3
  # Find all flux columns.
  flux_columns = [field.name for field in schema if field.metadata[b"unit"] == b"uJy"]
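
Note on units: fluxes in microjanskys convert to AB magnitudes with the standard relation m_AB = -2.5 log10(f / uJy) + 23.9. The helper below is a sketch of that conversion in plain Python. The function name is hypothetical, and returning NaN for missing or non-positive fluxes is a choice made here, not something taken from the tutorials.

```python
import math


def ujy_to_ab_mag(flux_ujy):
    """Convert a flux density in microjanskys to an AB magnitude."""
    # A magnitude is undefined for missing, NaN, zero, or negative fluxes.
    if flux_ujy is None or math.isnan(flux_ujy) or flux_ujy <= 0:
        return math.nan
    return -2.5 * math.log10(flux_ujy) + 23.9


print(ujy_to_ab_mag(1.0))     # 1 uJy corresponds to the 23.9 zero point
print(ujy_to_ab_mag(100.0))   # 100x brighter is 5 magnitudes smaller
print(ujy_to_ab_mag(-0.3))    # noise-dominated measurement, no magnitude
```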
@@ -321,7 +323,7 @@ flux_columns[:4]

  Columns associated with external surveys are identified by the inclusion of "ext" in the name.

- ```{code-cell} python3
+ ```{code-cell} ipython3
  external_flux_columns = [name for name in flux_columns if "ext" in name]
  print(f"{len(external_flux_columns)} flux columns from external surveys. First four are:")
  external_flux_columns[:4]
@@ -332,14 +334,14 @@ external_flux_columns[:4]
  +++

  Euclid Q1 includes data from three Euclid Deep Fields: EDF-N (North), EDF-S (South), EDF-F (Fornax; also in the southern hemisphere).
- There is also a small amount of data from a fourth field: LDN1641 (Lynds' Dark Nebula 1641), which was observed for technical reasons during Euclid's verification phase and mostly ignored here.
+ There is also a small amount of data from a fourth field: LDN1641 (Lynds' Dark Nebula 1641), which was observed for technical reasons during Euclid's verification phase.
  The fields are described in [Euclid Collaboration: Aussel et al., 2025](https://arxiv.org/pdf/2503.15302) and can be seen on this [skymap](https://irsa.ipac.caltech.edu/data/download/parquet/euclid/q1/merged_objects/hats/euclid_q1_merged_objects-hats/skymap.png).

  The regions are well separated, so we can distinguish them using a simple cone search without having to be too picky about the radius.
  We can load data more efficiently using the HEALPix order 9 pixels that cover each area rather than using RA and Dec values directly.
  These will be used in later tutorials.

- ```{code-cell} python3
+ ```{code-cell} ipython3
  # EDF-N (Euclid Deep Field - North)
  ra, dec, radius = 269.733, 66.018, 4 # 20 sq deg
  edfn_k9_pixels = hpgeom.query_circle(hpgeom.order_to_nside(9), ra, dec, radius, inclusive=True)
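
Note on scale: the size of the search cone and of the order 9 pixels can be estimated with a little arithmetic. The sketch below assumes only the standard HEALPix relations (nside = 2^order, and 12 nside^2 equal-area pixels over the full sky) and the spherical-cap area formula; it does not call `hpgeom`, and the function names are hypothetical. It suggests the 4 degree cone spans roughly 50 sq deg, so the `# 20 sq deg` comment presumably refers to the nominal field size rather than the cone, which fits the text's remark about not being picky about the radius.

```python
import math

SQ_DEG_PER_STERADIAN = (180 / math.pi) ** 2
FULL_SKY_SQ_DEG = 4 * math.pi * SQ_DEG_PER_STERADIAN  # about 41253 sq deg


def cone_area_sq_deg(radius_deg):
    """Area of a spherical cap with the given angular radius, in square degrees."""
    solid_angle_sr = 2 * math.pi * (1 - math.cos(math.radians(radius_deg)))
    return solid_angle_sr * SQ_DEG_PER_STERADIAN


def healpix_pixel_area_sq_deg(order):
    """Area of one HEALPix pixel at the given order (all pixels have equal area)."""
    nside = 2 ** order
    return FULL_SKY_SQ_DEG / (12 * nside ** 2)


cone = cone_area_sq_deg(4)            # radius used for EDF-N
pixel = healpix_pixel_area_sq_deg(9)  # order 9 pixels
print(f"cone area: {cone:.1f} sq deg")
print(f"order 9 pixel area: {pixel:.4f} sq deg")
print(f"pixels needed to tile the cone: about {cone / pixel:.0f}")
```

A query with `inclusive=True` would be expected to return somewhat more pixels than this estimate, since pixels that only partly overlap the cone are also included.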
@@ -360,9 +362,10 @@ ldn_k9_pixels = hpgeom.query_circle(hpgeom.order_to_nside(9), ra, dec, radius, i
  ## 6. Basic Query

  To demonstrate a basic query, we'll search for objects with a galaxy photometric redshift estimate of 6.0 (largest possible).
- Other tutorials in this series will show more complex queries and describe the redshifts and other data in more detail.
+ Other tutorials in this series will show more complex queries, and describe the redshifts and other data in more detail.
+ PyArrow dataset filters are described at [Filtering by Expressions](https://arrow.apache.org/docs/python/compute.html#filtering-by-expressions), and the list of available functions is at [Compute Functions](https://arrow.apache.org/docs/python/api/compute.html).

- ```{code-cell} python3
+ ```{code-cell} ipython3
  dataset = pyarrow.dataset.dataset(dataset_path, partitioning="hive", filesystem=s3, schema=schema)

  highz_objects = dataset.to_table(
@@ -375,6 +378,6 @@ highz_objects

  **Authors:** Troy Raen, Vandana Desai, Andreas Faisst, Shoubaneh Hemmati, Jaladh Singhal, Brigitta Sipőcz, Jessica Krick, the IRSA Data Science Team, and the Euclid NASA Science Center at IPAC (ENSCI).

- **Updated:** 2025-12-22
+ **Updated:** 2025-12-23

  **Contact:** [IRSA Helpdesk](https://irsa.ipac.caltech.edu/docs/help_desk.html)
