tutorials/spherex/spherex_cutouts.md
+26 -15 (26 additions & 15 deletions)
Original file line number
Diff line number
Diff line change
@@ -17,11 +17,13 @@ kernelspec:
17
17
18
18
- Perform a query for the list of SPHEREx Spectral Image Multi-Extension FITS files (MEFs) that overlap a given coordinate.
19
19
- Retrieve cutouts for every entry in this list and package the cutouts as a new MEF.
20
-
- Learn how to use parallel or serial processing to retrieve the cutouts
20
+
- Learn how to use parallel or serial processing to retrieve the cutouts.
21
21
22
22
## 2. SPHEREx Overview
23
23
24
-
SPHEREx is a NASA Astrophysics Medium Explorer mission that launched in March 2025. During its planned two-year mission, SPHEREx will obtain 0.75-5 micron spectroscopy over the entire sky, with deeper data in the SPHEREx Deep Fields. SPHEREx data will be used to:
24
+
SPHEREx is a NASA Astrophysics Medium Explorer mission that launched in March 2025.
25
+
During its planned two-year mission, SPHEREx will obtain 0.75-5 micron spectroscopy over the entire sky, with deeper data in the SPHEREx Deep Fields.
26
+
SPHEREx data will be used to:
25
27
26
28
* **constrain the physics of inflation** by measuring its imprints on the three-dimensional large-scale distribution of matter,
27
29
* **trace the history of galactic light production** through a deep multi-band measurement of large-scale clustering,
## 5. Query IRSA for a list of cutouts that satisfy the criteria specified above.
84
86
85
87
Here we show how to use the `pyvo` TAP SQL query to retrieve all images that overlap with the position defined above.
86
-
This query will retrieve a table of URLs that link to the MEF cutouts. Each row in the table corresponds to a single cutout and includes the data access URL and an observation timestamp. The results are sorted from oldest to newest.
88
+
This query will retrieve a table of URLs that link to the MEF cutouts.
89
+
Each row in the table corresponds to a single cutout and includes the data access URL and an observation timestamp.
90
+
The results are sorted from oldest to newest.
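The query described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual cell: the table name `spherex.obscore`, the ObsCore-style column names (`access_url`, `t_min`, `s_region`), and the helper names `build_overlap_query` and `run_query` are assumptions made for this example.

```python
def build_overlap_query(ra_deg, dec_deg, table="spherex.obscore"):
    """Return an ADQL query for images whose footprint contains a position.

    The table name and the column names (access_url, t_min, s_region)
    are assumptions for this sketch, following ObsCore naming.
    """
    return (
        "SELECT access_url, t_min "
        f"FROM {table} "
        f"WHERE CONTAINS(POINT('ICRS', {ra_deg:.6f}, {dec_deg:.6f}), s_region) = 1 "
        "ORDER BY t_min ASC"  # oldest observation first
    )


def run_query(query, tap_url="https://irsa.ipac.caltech.edu/TAP"):
    """Send the query to a TAP service with pyvo (needs network access)."""
    import pyvo  # imported lazily so the query builder works without pyvo

    service = pyvo.dal.TAPService(tap_url)
    return service.search(query)


# Build (but do not send) a query for an arbitrary example position.
query = build_overlap_query(10.0, -20.5)
print(query)
```

Calling `run_query(query)` would return a result table with one row per matching image. Keeping the query builder separate from the network call makes it easy to inspect the ADQL string before sending it.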
87
91
88
92
```{code-cell} ipython3
89
93
# Define the service endpoint for IRSA's Table Access Protocol (TAP)
@@ -115,6 +119,7 @@ print("Number of images found: {}".format(len(results)))
115
119
## 6. Define a function that processes a list of SPHEREx Spectral Image Cutouts
116
120
117
121
This function takes in a row of the catalog that we created above and does the following:
122
+
118
123
- It downloads the cutout
119
124
- It computes the wavelength of the center pixel of the cutout (in micro-meters)
120
125
- It combines the image HDUs into a new HDU and adds it to the table row.
This process can take a while. If run in series, it can take about 5 minutes for 700 images on a typical laptop machine.
170
-
Here, we therefore exploit two different methods. First we show the serial approach and next we show how to parallelize the methods. The later can be run on many CPUs and is therefore significantly faster.
174
+
This process can take a while.
175
+
If run in series, it can take about 5 minutes for 700 images on a typical laptop machine.
176
+
Here, we therefore exploit two different methods.
177
+
First we show the serial approach and next we show how to parallelize the methods.
178
+
The latter can be run on many CPUs and is therefore significantly faster.
171
179
172
180
### 7.1 Serial Approach
173
181
174
182
First, we implement the serial approach -- a simple `for` loop.
175
183
Before that, we turn the results into an astropy table and add some placeholders that will be filled in by the `process_cutout()` function.
176
184
177
185
```{warning}
178
-
Running the cell below may take a while for a large number of cutouts. Approximately 5-7 minutes for 700 images of cutout size 0.01 degree on a typical machine.
186
+
Running the cell below may take a while for a large number of cutouts.
187
+
Approximately 5-7 minutes for 700 images of cutout size 0.01 degree on a typical machine.
179
188
```
180
189
181
190
```{tip}
@@ -203,14 +212,19 @@ print("Time to create cutouts in serial mode: {:2.2f} minutes.".format((time.tim
203
212
### 7.2 Parallel Approach
204
213
205
214
Next, we implement parallel processing, which will make the cutout creation faster.
206
-
The maximal number of workers can be limited by setting the `max_workers` argument. The choice of this value depends on the number of cores but also on the number of parallel calls that can be digested by the IRSA server.
215
+
The maximal number of workers can be limited by setting the `max_workers` argument.
216
+
The choice of this value depends on the number of cores but also on the number of parallel calls that can be digested by the IRSA server.
207
217
208
218
```{tip}
209
219
A good value for the maximum number of workers is between 7 and 12 for a machine with 8 cores.
210
220
```
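The effect of `max_workers` can be illustrated with a small standalone sketch. It assumes a thread pool from the standard library's `concurrent.futures` module, which may differ from what the notebook uses; `process_item` is a hypothetical stand-in for `process_cutout()` that only sleeps briefly to imitate waiting on a download.

```python
import concurrent.futures
import time


def process_item(item):
    """Hypothetical stand-in for process_cutout(): imitate one download."""
    time.sleep(0.01)  # simulates waiting on a network request
    return item * 2


def run_serial(items):
    """Process the items one after another."""
    return [process_item(item) for item in items]


def run_parallel(items, max_workers=8):
    """Process the items concurrently with at most max_workers threads."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the inputs
        return list(executor.map(process_item, items))


items = list(range(20))
serial_results = run_serial(items)
parallel_results = run_parallel(items, max_workers=8)
print(serial_results == parallel_results)
```

Threads suit this kind of workload because most of the time is spent waiting on the server rather than computing, which is also why the useful number of workers can exceed the number of cores.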
211
221
212
222
```{tip}
213
-
The astropy `fits.open()` supports a caching argument. This can be passed through in the `process_cutout()` function. If cache=True is set, the images are cached and the cutout creation is sped up next time the code is run (even if the Jupyter kernel is restarted!). The downside is that the images are saved on the machine where this notebook is run. If many cutouts are created, this can sum up to a large cached data volume, in which case cache=False is preferred.
223
+
The astropy `fits.open()` supports a caching argument.
224
+
This can be passed through in the `process_cutout()` function.
225
+
If cache=True is set, the images are cached and the cutout creation is sped up next time the code is run (even if the Jupyter kernel is restarted!).
226
+
The downside is that the images are saved on the machine where this notebook is run.
227
+
If many cutouts are created, this can sum up to a large cached data volume, in which case cache=False is preferred.
214
228
```
215
229
216
230
Again, before running the cutout processing we define some placeholders.
@@ -233,7 +247,7 @@ print("Time to create cutouts in parallel mode: {:2.2f} minutes.".format((time.t
233
247
In the following, we continue to use the output of the parallel mode.
234
248
The following cell does the following:
235
249
236
-
- Create a summary FITS table
250
+
- Create a summary FITS table.
237
251
- Create the final FITS HDU including the summary table.
238
252
239
253
```{code-cell} ipython3
@@ -309,16 +323,13 @@ plt.show()
309
323
310
324
## About this notebook
311
325
312
-
**Authors:** IRSA Data Science Team, including Vandana Desai, Andreas Faisst, Troy Raen, Brigitta Sipőcz, Jessica Krick,
313
-
Shoubaneh Hemmati
326
+
**Authors:** IRSA Data Science Team, including Vandana Desai, Andreas Faisst, Troy Raen, Brigitta Sipőcz, Jessica Krick, Shoubaneh Hemmati
314
327
315
328
**Updated:** 2025-09-10
316
329
317
-
**Contact:** [IRSA Helpdesk](https://irsa.ipac.caltech.edu/docs/help_desk.html) with questions
318
-
or problems.
330
+
**Contact:** [IRSA Helpdesk](https://irsa.ipac.caltech.edu/docs/help_desk.html) with questions or problems.
319
331
320
-
**Runtime:** As of the date above, this notebook takes about 3 minutes to run to completion on
321
-
a machine with 8GB RAM and 4 CPU.
332
+
**Runtime:** As of the date above, this notebook takes about 3 minutes to run to completion on a machine with 8GB RAM and 4 CPU.
322
333
(Note: This notebook doesn't take significant time to run, but please report actual numbers and
323
334
machine details for your notebook if it is expected to run longer or requires specific machines,
324
335
e.g., on Fornax. Also, if querying archives, please include a statement like, "This runtime is
tutorials/spherex/spherex_psf.md
+17 -9 (17 additions & 9 deletions)
Original file line number
Diff line number
Diff line change
@@ -25,7 +25,9 @@ kernelspec:
25
25
26
26
## 2. SPHEREx Overview
27
27
28
-
SPHEREx is a NASA Astrophysics Medium Explorer mission that launched in March 2025. During its planned two-year mission, SPHEREx will obtain 0.75-5 micron spectroscopy over the entire sky, with deeper data in the SPHEREx Deep Fields. SPHEREx data will be used to:
28
+
SPHEREx is a NASA Astrophysics Medium Explorer mission that launched in March 2025.
29
+
During its planned two-year mission, SPHEREx will obtain 0.75-5 micron spectroscopy over the entire sky, with deeper data in the SPHEREx Deep Fields.
30
+
SPHEREx data will be used to:
29
31
30
32
* **constrain the physics of inflation** by measuring its imprints on the three-dimensional large-scale distribution of matter,
31
33
* **trace the history of galactic light production** through a deep multi-band measurement of large-scale clustering,
@@ -62,7 +64,9 @@ from astropy.wcs import WCS
62
64
63
65
## 4. Get SPHEREx Cutout
64
66
65
-
We first obtain a SPHEREx cutout for a given coordinate of interest from IRSA archive. For this we define a coordinate and a size of the cutout. Both should be defined using `astropy` units.
67
+
We first obtain a SPHEREx cutout for a given coordinate of interest from IRSA archive.
68
+
For this we define a coordinate and a size of the cutout.
69
+
Both should be defined using `astropy` units.
66
70
The goal is to obtain the cutout and then extract the PSF corresponding to the coordinates of interest.
67
71
68
72
```{tip}
@@ -128,16 +132,19 @@ with fits.open(spectral_image_url) as hdul:
128
132
```
129
133
130
134
The downloaded SPHEREx image cutout contains 5 FITS layers, which are described in the [SPHEREx Explanatory Supplement](https://irsa.ipac.caltech.edu/data/SPHEREx/docs/SPHEREx_Expsupp_QR.pdf).
131
-
We focus in this example on the extensions `IMAGE` and `PSF`. We have already loaded their data as well as their header.
135
+
We focus in this example on the extensions `IMAGE` and `PSF`.
136
+
We have already loaded their data as well as their header.
132
137
133
138
```{code-cell} ipython3
134
139
psfcube.shape
135
140
```
136
141
137
-
The shape of the `psfcube` is (121,101,101). This corresponds to a grid of 11x11 PSFs across the image, each of them of the size 101x101 pixels.
142
+
The shape of the `psfcube` is (121,101,101).
143
+
This corresponds to a grid of 11x11 PSFs across the image, each of them of the size 101x101 pixels.
138
144
139
145
```{note}
140
-
Remember that the PSFs are oversampled by a factor of 10. This means that the actual size of the PSFs is about 10x10 SPHEREx pixels, which corresponds to about 60x60 arcseconds.
146
+
Remember that the PSFs are oversampled by a factor of 10.
147
+
This means that the actual size of the PSFs is about 10x10 SPHEREx pixels, which corresponds to about 60x60 arcseconds.
141
148
```
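The arithmetic behind these numbers can be checked with a short sketch. The pixel scale of 6 arcseconds is an assumed round value implied by the figures quoted in the text (about 60 arcseconds across about 10 pixels), not a value read from a header, and `psf_footprint` is a hypothetical helper name.

```python
def psf_footprint(n_samples=101, oversamp=10, pixel_scale_arcsec=6.0):
    """Convert an oversampled PSF width into native pixels and arcseconds.

    pixel_scale_arcsec is an assumed round number, not a header value.
    """
    native_pixels = n_samples / oversamp
    return native_pixels, native_pixels * pixel_scale_arcsec


n_psfs, ny, nx = (121, 101, 101)  # shape of psfcube quoted in the text
grid_side = round(n_psfs ** 0.5)  # 121 PSFs arranged as an 11x11 grid
native, arcsec = psf_footprint(nx)
print(f"{grid_side}x{grid_side} grid, {native:.1f} native pixels per side, "
      f"roughly {arcsec:.0f} arcsec per side")
```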
142
149
143
150
+++
@@ -149,11 +156,13 @@ psf_header[22:40]
149
156
```
150
157
151
158
We confirm that the oversampling factor (`OVERSAMP`) is 10.
152
-
The PSFs are distributed in an even grid with 11x11 zones. Each of the 121 PSFs is responsible for one of these zones.
159
+
The PSFs are distributed in an even grid with 11x11 zones.
160
+
Each of the 121 PSFs is responsible for one of these zones.
153
161
The PSF header therefore includes the center position of these zones as well as the width of the zones.
154
162
These center coordinates are specified with `XCTR_i` and `YCTR_i`, respectively, where i = 1...121.
155
163
The widths are specified with `XWID_i` and `YWID_i`, respectively, where again i = 1...121.
156
-
The zones have equal widths and are arranged in an even grid. In principle, the zones can have any size, but this arrangement is enough to capture well the changes of the PSF size and structure with wavelength and spatial coordinates.
164
+
The zones have equal widths and are arranged in an even grid.
165
+
In principle, the zones can have any size, but this arrangement is enough to capture well the changes of the PSF size and structure with wavelength and spatial coordinates.
157
166
158
167
The goal of this tutorial now is to find the PSF corresponding to our input coordinates of interest.
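One way to approach this lookup is to pick the zone whose center lies closest to the pixel position of interest. The sketch below uses a plain dictionary as a stand-in for the PSF header. The mock grid, the 2040-pixel image size, the row-by-row zone ordering, and the helper names `make_mock_header` and `find_psf_zone` are all assumptions for illustration and may not match the notebook's own method.

```python
def make_mock_header(n_side=11, image_size=2040):
    """Build a dict that imitates the XCTR_i/YCTR_i/XWID_i/YWID_i keywords.

    The image size and the row-by-row zone ordering are assumptions.
    """
    width = image_size / n_side
    header = {}
    zone = 1
    for row in range(n_side):
        for col in range(n_side):
            header[f"XCTR_{zone}"] = (col + 0.5) * width
            header[f"YCTR_{zone}"] = (row + 0.5) * width
            header[f"XWID_{zone}"] = width
            header[f"YWID_{zone}"] = width
            zone += 1
    return header


def find_psf_zone(header, x, y, n_zones=121):
    """Return the 1-based index of the zone whose center is closest to (x, y)."""
    best_zone, best_dist = None, float("inf")
    for zone in range(1, n_zones + 1):
        dx = x - header[f"XCTR_{zone}"]
        dy = y - header[f"YCTR_{zone}"]
        dist = dx * dx + dy * dy  # squared distance is enough for ranking
        if dist < best_dist:
            best_zone, best_dist = zone, dist
    return best_zone


header = make_mock_header()
zone = find_psf_zone(header, x=1000.0, y=50.0)
print(zone)
```

For an even grid of equal-width zones, the nearest center is also the zone that contains the position. If the cube planes follow the same ordering as the header keywords (an assumption to verify against the header), the matching PSF would be `psfcube[zone - 1]`.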
159
168
@@ -262,7 +271,6 @@ plt.show()
262
271
263
272
**Updated:** 2025-09-25
264
273
265
-
**Contact:** Contact [IRSA Helpdesk](https://irsa.ipac.caltech.edu/docs/help_desk.html) with questions
266
-
or problems.
274
+
**Contact:** Contact [IRSA Helpdesk](https://irsa.ipac.caltech.edu/docs/help_desk.html) with questions or problems.