Euclid Q1 Catalog in HATS format available for testing #416

@troyraen

[Updated 2025-06-29] This dataset is now public at:

bucket_name = 'nasa-irsa-euclid-q1'
euclid_prefix = 'contributed/q1/merged_objects/hats/'

Euclid Q1 tables are available in HATS catalog format for testing. This includes twelve Q1 tables, joined on the column 'OBJECT_ID' into a single dataset. Basic information is provided below. IRSA's tutorial notebook for this dataset is being prepared in Caltech-IPAC/irsa-tutorials#108 (comments welcome).

If you have trouble with these data products or want to suggest changes, please comment below. This issue will remain open until the products are released publicly.

S3 bucket and prefixes

# Testing bucket. Access restricted to IPAC IP addresses and the Fornax Science Console. See above for public bucket info.
bucket_name = 'irsa-fornax-testdata'

euclid_prefix = 'EUCLID/q1/catalogues/'

Data products

The prefix given above points to a HATS Collection that includes the products described below. (Note: lsdb users only need to know that these products exist, not their exact paths or how to use them, since lsdb handles most of that under the hood. Users of other libraries will benefit from the additional detail.)

HATS Catalog

  • This is the main data product, including metadata and a Parquet dataset which holds all Q1 catalog data joined on 'object_id'. It is partitioned by 'Norder', 'Dir', and 'Npix'.
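For readers not using lsdb, the partitioning can be pictured as a directory path built from the three partition keys. The sketch below assumes the usual HATS convention that 'Dir' groups pixels in blocks of 10,000 (that is, 'Dir' is 'Npix' rounded down to a multiple of 10,000). The function name is hypothetical, and the exact leaf layout under the Npix level may differ between HATS versions, so check the actual bucket listing before relying on it.

```python
def hats_partition_dir(norder: int, npix: int) -> str:
    """Return the hive-style partition path for one HEALPix pixel.

    Assumption: Dir = (Npix // 10_000) * 10_000, the usual HATS grouping.
    The result is relative to the Parquet dataset root inside the catalog.
    """
    dir_value = (npix // 10_000) * 10_000
    return f"Norder={norder}/Dir={dir_value}/Npix={npix}"


# Example: pixel 12345 at order 6 falls in the Dir=10000 group.
print(hats_partition_dir(6, 12345))
```

With a path like this, libraries such as pyarrow or pandas can read a single partition directly instead of scanning the whole dataset.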

HATS Margin Cache at 10"

  • hats_margin_10arcsec/
  • Includes metadata and a Parquet dataset that stores the Q1 catalog data that is located in the margin around each Catalog partition.
  • This is useful for cross matching, or any time you want to load a partition (= HEALPix pixel) plus a little extra padding around the outside. For example, when processing in parallel (and thus handling partitions independently), the margin ensures that spatial searches don't miss rows whose ra/dec put them just outside the given pixel/partition.
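To illustrate why the margin matters, here is a toy sketch in plain Python. It does not read the real data: the rows, coordinates, and the rectangular "partition" edge at ra = 11 deg are all made up for illustration (real partitions are HEALPix pixels). It only shows that a 10" search near a partition edge misses a neighbor unless the margin rows are included.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine formula, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


# Made-up rows as (name, ra_deg, dec_deg). The toy partition ends at ra = 11 deg.
partition_rows = [("a", 10.5, 0.0)]
margin_rows = [("b", 11.0 + 5 / 3600, 0.0)]  # 5" outside the partition edge

# Search position 2" inside the edge, with a 10" search radius.
target_ra, target_dec = 11.0 - 2 / 3600, 0.0


def neighbors(rows, radius_arcsec=10.0):
    """Names of rows within radius_arcsec of the target position."""
    return [name for name, ra, dec in rows
            if angular_sep_arcsec(target_ra, target_dec, ra, dec) <= radius_arcsec]


print(neighbors(partition_rows))                # partition only
print(neighbors(partition_rows + margin_rows))  # partition plus margin
```

The first search finds nothing, while the second picks up the row that sits 7" away on the other side of the edge.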

HATS Index Table for Euclid MER Object IDs

  • hats_index_object_id/
  • Includes metadata and a Parquet dataset that maps the Euclid MER Object IDs ('object_id') to their Catalog partitions. The dataset contains the IDs, their partitions (values of 'Norder', etc.), and their HEALPix pixel indexes at orders 29, 19, and 9. It is partitioned by ranges of Euclid MER Object IDs.
  • This can be used to look up which Catalog partition a Euclid object is in, given its ID.
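The lookup this index enables is a two-step process: first find the index partition whose ID range covers the object, then read that object's row to get its Catalog partition. The sketch below mimics this with made-up in-memory data; the ID ranges, IDs, and pixel values are invented for illustration, and the function name is hypothetical. With the real product, each dictionary would instead be one Parquet file under hats_index_object_id/.

```python
from bisect import bisect_right

# Made-up lower bounds of the object_id range covered by each index partition.
index_range_starts = [0, 1_000, 2_000]

# Made-up index rows per index partition: object_id -> (Norder, Npix).
index_partitions = [
    {17: (6, 12345), 512: (6, 12346)},
    {1_204: (7, 49380)},
    {2_999: (6, 12345)},
]


def find_catalog_partition(object_id):
    """Return (Norder, Npix) for object_id, or None if it is not in the index."""
    # Step 1: pick the index partition whose ID range contains object_id.
    i = bisect_right(index_range_starts, object_id) - 1
    if i < 0:
        return None
    # Step 2: look up the row for this ID within that index partition.
    return index_partitions[i].get(object_id)


print(find_catalog_partition(1_204))
```

Because the index is partitioned by ID range, only one small index file needs to be read per lookup rather than the full catalog.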

Read

The basic read with lsdb.read_hats() is:

import lsdb

# Read with lsdb. This picks up the entire HATS Collection.
euclid_lsdb = lsdb.read_hats(f's3://{bucket_name}/{euclid_prefix}')

Note that this gives you a small subset of columns by default. To see a list of all 1324 columns, use euclid_lsdb.all_columns. To load different columns, use the columns keyword described in the docs linked above.
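A minimal sketch of loading a chosen set of columns from the public bucket, using the columns keyword mentioned above. The helper name read_euclid is hypothetical, and lsdb is imported inside the function so the snippet can be pasted without triggering a network read. Column names should be taken from euclid_lsdb.all_columns, since the exact spelling is not confirmed here. Depending on your environment, unauthenticated S3 access may need extra configuration that is not shown.

```python
bucket_name = "nasa-irsa-euclid-q1"
euclid_prefix = "contributed/q1/merged_objects/hats/"
euclid_uri = f"s3://{bucket_name}/{euclid_prefix}"


def read_euclid(columns=None):
    """Open the Euclid Q1 HATS Collection, optionally with a custom column list."""
    import lsdb  # imported lazily; requires lsdb to be installed

    if columns is None:
        # Default read: lsdb returns a small default subset of columns.
        return lsdb.read_hats(euclid_uri)
    return lsdb.read_hats(euclid_uri, columns=columns)


# Example call (needs network access and lsdb installed):
# euclid_lsdb = read_euclid(columns=["object_id"])
```

Requesting only the columns you need keeps reads small, which matters with more than a thousand columns available.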

Schema

See the tutorial linked above.
