Commit 171dc26

Merge pull request #85 from bsipocz/euclid_spectra
Euclid: Simplify spectra notebook
2 parents bee65b1 + 0fd4317 commit 171dc26

File tree

1 file changed

+74
-67
lines changed

tutorials/euclid_access/3_Euclid_intro_1D_spectra.md

Lines changed: 74 additions & 67 deletions
Original file line number | Diff line number | Diff line change
@@ -47,126 +47,133 @@ If you have questions about it, please contact the [IRSA helpdesk](https://irsa.
4747

4848
```{code-cell} ipython3
4949
# Uncomment the next line to install dependencies if needed
50-
# !pip install matplotlib pandas requests astropy pyvo
50+
# !pip install matplotlib astropy 'astroquery>=0.4.10'
5151
```
5252

5353
```{code-cell} ipython3
54-
from io import BytesIO
54+
import urllib
5555
56+
import numpy as np
5657
import matplotlib.pyplot as plt
57-
import pandas as pd
58-
import requests
5958
6059
from astropy.io import fits
61-
from astropy.table import Table
60+
from astropy.table import QTable
61+
from astropy import units as u
62+
from astropy.visualization import quantity_support
6263
63-
import pyvo as vo
64+
from astroquery.ipac.irsa import Irsa
6465
```
6566

66-
## 1. Download 1D spectra from IRSA directly to this notebook
67+
## 1. Search for the spectrum of a specific galaxy
6768

68-
Search for all tables in IRSA labeled as euclid
69+
First, explore what Euclid catalogs are available. Note that we need the object ID of each target to download its spectrum.
6970

70-
```{code-cell} ipython3
71-
service = vo.dal.TAPService("https://irsa.ipac.caltech.edu/TAP")
71+
Search for all tables in IRSA labeled as "euclid".
7272

73-
tables = service.tables
74-
for tablename in tables.keys():
75-
    if "tap_schema" not in tablename and "euclid" in tablename:
76-
        tables[tablename].describe()
73+
```{code-cell} ipython3
74+
Irsa.list_catalogs(filter='euclid')
7775
```
7876

7977
```{code-cell} ipython3
80-
table_mer= 'euclid_q1_mer_catalogue'
81-
table_1dspectra= 'euclid.objectid_spectrafile_association_q1'
82-
table_phz= 'euclid_q1_phz_photo_z'
83-
table_galaxy_candidates= 'euclid_q1_spectro_zcatalog_spe_galaxy_candidates'
78+
table_1dspectra = 'euclid.objectid_spectrafile_association_q1'
8479
```
8580

81+
## 2. Search for the spectrum of a specific galaxy in the 1D spectra table
82+
8683
```{code-cell} ipython3
87-
## Change the settings so we can see all the columns in the dataframe and the full column width
88-
## (to see the full long URL)
89-
pd.set_option('display.max_columns', None)
90-
pd.set_option('display.max_colwidth', None)
84+
obj_id = 2689918641685825137
85+
```
86+
87+
We will use TAP and an ADQL query to find the spectral data for our galaxy. (ADQL is the [IVOA Astronomical Data Query Language](https://www.ivoa.net/documents/latest/ADQL.html) and is based on SQL.)
9188

89+
```{code-cell} ipython3
90+
adql_object = f"SELECT * FROM {table_1dspectra} WHERE objectid = {obj_id}"
9291
93-
## Can use the following lines to reset the max columns and column width of pandas
94-
# pd.reset_option('display.max_columns')
95-
# pd.reset_option('display.max_colwidth')
92+
# Pull the data on this particular galaxy
93+
result = Irsa.query_tap(adql_object).to_table()
9694
```
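The query above interpolates the table name and the object ID into an ADQL string. As a minimal sketch of that string construction (the helper `build_spectra_query` is a hypothetical name, not part of astroquery), the ID can be coerced to an integer before formatting so that only a numeric literal reaches the query:

```python
def build_spectra_query(table: str, object_id: int) -> str:
    """Return an ADQL query selecting the spectrum-association row(s) for one object.

    Hypothetical helper for illustration; it only builds the string.
    """
    # int() raises ValueError/TypeError for anything that is not a clean integer,
    # so no free-form text can end up in the WHERE clause.
    return f"SELECT * FROM {table} WHERE objectid = {int(object_id)}"


query = build_spectra_query(
    "euclid.objectid_spectrafile_association_q1", 2689918641685825137
)
print(query)
```

The resulting string is the same one that the cell above passes to `Irsa.query_tap`.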
9795

98-
## 2. Search for the spectrum of a specific galaxy in the 1D spectra table
96+
Pull out the file name from the ``result`` table:
9997

10098
```{code-cell} ipython3
101-
obj_id=2739401293646823742
99+
file_uri = urllib.parse.urljoin(Irsa.tap_url, result['uri'][0])
100+
file_uri
101+
```
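To see what `urljoin` does here, the sketch below resolves a relative path against a TAP endpoint. It assumes `Irsa.tap_url` is `https://irsa.ipac.caltech.edu/TAP` (the TAP address used by the earlier version of this notebook); the file path is a made-up placeholder, not a real IRSA URI. It also imports `urllib.parse` explicitly rather than relying on another package having loaded the submodule.

```python
from urllib.parse import urljoin

# Assumed value of Irsa.tap_url (see lead-in).
tap_url = "https://irsa.ipac.caltech.edu/TAP"

# Hypothetical relative path standing in for result['uri'][0].
relative_uri = "ibe/data/euclid/example/spectra.fits"

# The base URL has no trailing slash, so its last path segment ("TAP") is
# replaced by the relative path and the file resolves against the host root.
file_uri = urljoin(tap_url, relative_uri)
print(file_uri)

# A path with a leading slash resolves to the same location.
file_uri_abs = urljoin(tap_url, "/" + relative_uri)
```

Either form of the stored path therefore yields a URL on the IRSA host rather than under `/TAP/`.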
102+
103+
## 3. Read in the spectrum for only our specific object
102104

103-
## Pull the data on these objects
104-
adql_object = f"SELECT * \
105-
FROM {table_1dspectra} \
106-
WHERE objectid = {obj_id} \
107-
AND uri IS NOT NULL "
105+
Currently IRSA has the spectra stored in very large files containing multiple (14220) extensions with spectra of many targets within one tile. You can choose to read in the big file below to see what it looks like (takes a few mins to load) or skip this step and just read in the specific extension we want for the 1D spectra (recommended).
108106

109-
## Pull the data on this particular galaxy
110-
result2 = service.search(adql_object)
111-
df2=result2.to_table().to_pandas()
112-
df2
107+
```{code-cell} ipython3
108+
# hdul = fits.open(file_uri)
109+
# hdul.info()
113110
```
114111

115-
### Create the full filename/url
112+
Open the large FITS file without loading it entirely into memory, pulling out just the extension we want for the 1D spectra of our object
116113

117114
```{code-cell} ipython3
118-
irsa_url='https://irsa.ipac.caltech.edu/'
115+
with fits.open(file_uri) as hdul:
116+
    spectra = QTable.read(hdul[result['hdu'][0]], format='fits')
119117
120-
file_url=irsa_url+df2['uri'].iloc[0]
121-
file_url
118+
    spec_header = hdul[result['hdu'][0]].header
122119
```
123120

124-
## 3. Read in the spectrum using the file_url and the extension just for this object
125-
126-
Currently IRSA has the spectra stored in very large files containing multiple (14220) extensions with spectra of many targets within one tile. You can choose to read in the big file below to see what it looks like (takes a few mins to load) or skip this step and just read in the specific extension we want for the 1D spectra (recommended).
121+
```{code-cell} ipython3
122+
spectra
123+
```
127124

128125
```{code-cell} ipython3
129-
#### Code to read in the large file with many extensions and spectra from a tile
130-
#### Currently commented out
126+
spec_header
127+
```
131128

132-
# ## Complete file url with the irsa url at the start
133-
# url = file_url
134-
# response = requests.get(url)
129+
## 4. Plot the image of the extracted spectrum
135130

136-
# hdul = fits.open(BytesIO(response.content)) # Open FITS file from memory
137-
# hdul.info() # Show file info
131+
```{tip}
132+
As we use astropy.visualization's ``quantity_support``, matplotlib automatically picks up the axis units from the quantities we plot.
138133
```
139134

140-
### Open the large FITS file without loading it entirely into memory, pulling out just the extension we want for the 1D spectra of our object
141-
142135
```{code-cell} ipython3
143-
response = requests.get(file_url)
136+
quantity_support()
137+
```
138+
139+
```{note}
140+
The 1D combined spectra table contains 6 columns; below are a few highlights:
144141
145-
with fits.open(BytesIO(response.content), memmap=True) as hdul:
146-
    hdu = hdul[df2['hdu'].iloc[0]]
147-
    dat = Table.read(hdu, format='fits', hdu=1)
148-
    df_obj_irsa = dat.to_pandas()
142+
- WAVELENGTH is in Angstroms by default
143+
- SIGNAL is the flux and should be multiplied by the FSCALE factor in the header
144+
- MASK values can be used to determine which flux bins to discard. An odd MASK value or MASK >= 64 means the flux bin should not be used.
149145
```
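The MASK rule in the note can be checked on a handful of values. This is a plain-Python sketch of the rule as stated there (odd, or at least 64, means "do not use"); the function name `is_bad_bin` and the sample values are illustrative only.

```python
def is_bad_bin(mask_value: int) -> bool:
    """Return True if a flux bin should be discarded under the stated MASK rule."""
    # Odd values have the lowest bit set; values >= 64 are flagged regardless of parity.
    return mask_value % 2 == 1 or mask_value >= 64


# Illustrative MASK values, not taken from a real spectrum.
sample_masks = [0, 1, 2, 62, 63, 64, 66]
flags = [is_bad_bin(m) for m in sample_masks]

for m, bad in zip(sample_masks, flags):
    print(f"MASK={m:3d} -> {'do not use' if bad else 'keep'}")
```

Even values below 64 (such as 2 or 62) are kept; everything else is discarded.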
150146

151-
### Plot the image of the extracted spectrum
147+
```{code-cell} ipython3
148+
signal_scaled = spectra['SIGNAL'] * spec_header['FSCALE']
149+
```
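The same FSCALE factor applies to the uncertainty. VAR is a variance, so its square root carries the units of SIGNAL and scales linearly with FSCALE. The numbers below are hypothetical placeholders, not values from a Euclid file; this is only a sketch of the arithmetic.

```python
import math

# Hypothetical header value and column entries, for illustration only.
fscale = 1e-16
signal = [0.12, -0.03, 0.25]
var = [0.0004, 0.0009, 0.0016]

# Flux: multiply each SIGNAL entry by FSCALE.
flux = [s * fscale for s in signal]

# Error: take the square root of VAR first, then apply the same scale factor.
flux_err = [math.sqrt(v) * fscale for v in var]

for f, e in zip(flux, flux_err):
    print(f"{f:.3e} +/- {e:.3e}")
```

Scaling the variance itself would require FSCALE squared, which is why the square root is taken before multiplying.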
150+
151+
We investigate the MASK column to see which flux bins are recommended to keep vs "Do Not Use"
152+
153+
```{code-cell} ipython3
154+
plt.plot(spectra['WAVELENGTH'].to(u.micron), spectra['MASK'])
155+
plt.ylabel('Mask value')
156+
plt.title('Values of MASK by flux bin')
157+
```
152158

153-
- Convert the wavelength to microns
159+
We use the MASK column to create a boolean mask for values to ignore. We use the inverse of this mask to mark the flux bins to use.
154160

155161
```{code-cell} ipython3
156-
## Now the data are read in, show an image
162+
bad_mask = (spectra['MASK'].value % 2 == 1) | (spectra['MASK'].value >= 64)
157163
158-
## Converting from Angstrom to microns
159-
plt.plot(df_obj_irsa['WAVELENGTH']/10000., df_obj_irsa['SIGNAL'])
164+
plt.plot(spectra['WAVELENGTH'].to(u.micron), np.ma.masked_where(bad_mask, signal_scaled), color='black', label='Spectrum')
165+
plt.plot(spectra['WAVELENGTH'], np.ma.masked_where(~bad_mask, signal_scaled), color='red', label='Do not use')
166+
plt.plot(spectra['WAVELENGTH'], np.sqrt(spectra['VAR']) * spec_header['FSCALE'], color='grey', label='Error')
160167
161-
plt.xlabel('Wavelength (microns)')
162-
plt.ylabel('Flux'+dat['SIGNAL'].unit.to_string('latex_inline'))
163-
plt.title(obj_id)
168+
plt.legend(loc='upper right')
169+
plt.ylim(-0.15E-16, 0.25E-16)
170+
plt.title(f'Object ID {obj_id}')
164171
```
165172

166173
## About this Notebook
167174

168-
**Author**: Tiffany Meshkat (IPAC Scientist)
175+
**Author**: Tiffany Meshkat, Anahita Alavi, Anastasia Laity, Andreas Faisst, Brigitta Sipőcz, Dan Masters, Harry Teplitz, Jaladh Singhal, Shoubaneh Hemmati, Vandana Desai
169176

170-
**Updated**: 2025-03-19
177+
**Updated**: 2025-03-31
171178

172179
**Contact:** [the IRSA Helpdesk](https://irsa.ipac.caltech.edu/docs/help_desk.html) with questions or reporting problems.
