@@ -101,37 +101,151 @@ Accessing and manipulating data in Xarray
101101
102102We can select a Data variable from the dataset using a dictionary-like syntax: ::
103103
104- temperature_data = ds['Temperature_isobaric']
104+ >>> temperature_data = ds['Temperature_isobaric']
105+ >>> temperature_data
106+ <xarray.DataArray 'Temperature_isobaric' (time1: 1, isobaric1: 29, y: 119,
107+ x: 268)> Size: 4MB
108+ [924868 values with dtype=float32]
109+ Coordinates:
110+ * time1 (time1) datetime64[ns] 8B 1993-03-13
111+ * isobaric1 (isobaric1) float32 116B 100.0 125.0 150.0 ... 950.0 975.0 1e+03
112+ * y (y) float32 476B -3.117e+03 -3.084e+03 -3.052e+03 ... 681.6 714.1
113+ * x (x) float32 1kB -3.324e+03 -3.292e+03 ... 5.311e+03 5.343e+03
114+ Attributes:
115+ long_name: Temperature @ Isobaric surface
116+ units: K
117+ description: Temperature
118+ grid_mapping: LambertConformal_Projection
119+ Grib_Variable_Id: VAR_7-15-131-11_L100
120+ Grib1_Center: 7
121+ Grib1_Subcenter: 15
122+ Grib1_TableVersion: 131
123+ Grib1_Parameter: 11
124+ Grib1_Level_Type: 100
125+ Grib1_Level_Desc: Isobaric surface
126+
105127
106128The new variable ``temperature_data`` is a ``DataArray`` object. An xarray ``Dataset`` typically consists of multiple ``DataArray`` objects.
107129
108130Xarray uses NumPy(-like) arrays under the hood, so we can always access the raw data using the ``.values`` attribute: ::
109131
110- temperature_numpy = ds['Temperature_isobaric'].values
132+ >>> temperature_numpy = ds['Temperature_isobaric'].values
133+ >>> temperature_numpy
134+ array([[[[201.88957, 202.2177 , 202.49895, ..., 195.10832, 195.23332,
135+ 195.37395],
136+ [201.68645, 202.0302 , 202.3427 , ..., 195.24895, 195.38957,
137+ 195.51457],
138+ [201.5302 , 201.87395, 202.20207, ..., 195.37395, 195.51457,
139+ 195.63957],
140+ ...,
141+ [276.735 , 276.70374, 276.6881 , ..., 289.235 , 289.1725 ,
142+ 289.07874],
143+ [276.86 , 276.84436, 276.78186, ..., 289.1881 , 289.11 ,
144+ 289.01624],
145+ [277.01624, 276.82874, 276.82874, ..., 289.14124, 289.0475 ,
146+ 288.96936]]]], dtype=float32)
147+
111148
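As a minimal, self-contained sketch of the two steps above, the snippet below builds a small synthetic dataset (the name ``toy_ds`` and all values are made up for illustration; only the variable and dimension names mirror the tutorial data) and shows that dictionary-style access returns a ``DataArray`` while ``.values`` returns a plain NumPy array:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the tutorial dataset; the numbers are made up.
temps = np.arange(24, dtype="float32").reshape(1, 2, 3, 4)
toy_ds = xr.Dataset(
    {"Temperature_isobaric": (("time1", "isobaric1", "y", "x"), temps)},
    coords={
        "time1": np.array(["1993-03-13"], dtype="datetime64[ns]"),
        "isobaric1": np.array([500.0, 1000.0]),
        "y": np.array([0.0, 1.0, 2.0]),
        "x": np.array([10.0, 20.0, 30.0, 40.0]),
    },
)

# Dictionary-like access returns a labelled DataArray.
toy_da = toy_ds["Temperature_isobaric"]

# .values drops the labels and returns the underlying NumPy array.
raw = toy_da.values

print(type(toy_da).__name__, toy_da.dims)
print(type(raw).__name__, raw.shape)
```

The labels (dimension names and coordinates) live on the ``DataArray``; once you take ``.values`` you are back to tracking axis order yourself.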
112149Xarray allows you to select data using the ``.sel()`` method, which uses the labels of the dimensions to extract data: ::
113150
114- ds['Temperature_isobaric'].sel(x='-3292.0078')
151+ >>> ds['Temperature_isobaric'].sel(x='-3292.0078')
152+ <xarray.DataArray 'Temperature_isobaric' (time1: 1, isobaric1: 29, y: 119)> Size: 14kB
153+ array([[[202.2177 , 202.0302 , ..., 219.67082, 219.74895],
154+ [202.58566, 202.58566, ..., 219.16379, 219.28879],
155+ ...,
156+ [292.1622 , 292.14658, ..., 275.05283, 275.11533],
157+ [294.1256 , 294.14124, ..., 276.84436, 276.82874]]], dtype=float32)
158+ Coordinates:
159+ * time1 (time1) datetime64[ns] 8B 1993-03-13
160+ * isobaric1 (isobaric1) float32 116B 100.0 125.0 150.0 ... 950.0 975.0 1e+03
161+ * y (y) float32 476B -3.117e+03 -3.084e+03 -3.052e+03 ... 681.6 714.1
162+ x float32 4B -3.292e+03
163+ Attributes:
164+ long_name: Temperature @ Isobaric surface
165+ units: K
166+ description: Temperature
167+ grid_mapping: LambertConformal_Projection
168+ Grib_Variable_Id: VAR_7-15-131-11_L100
169+ Grib1_Center: 7
170+ Grib1_Subcenter: 15
171+ Grib1_TableVersion: 131
172+ Grib1_Parameter: 11
173+ Grib1_Level_Type: 100
174+ Grib1_Level_Desc: Isobaric surface
175+
115176
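Label-based selection on floating-point coordinates can be fragile, because the label has to match a stored coordinate value. One hedged alternative is the ``method="nearest"`` argument of ``.sel()``, sketched here on a small synthetic array (``toy_da`` and its coordinate values are illustrative assumptions, not the tutorial data):

```python
import numpy as np
import xarray as xr

# Toy 2-D array with a float x coordinate; values are made up.
toy_da = xr.DataArray(
    np.arange(6, dtype="float32").reshape(2, 3),
    dims=("y", "x"),
    coords={"y": [0.0, 1.0], "x": [-3324.5, -3292.0, -3259.5]},
)

# Ask for a label that is close to, but not exactly, a stored value.
# method="nearest" picks the closest coordinate instead of raising KeyError.
nearest = toy_da.sel(x=-3292.01, method="nearest")

print(float(nearest["x"]))  # the coordinate that was actually selected
print(nearest.values)       # the column belonging to that coordinate
```

Checking the ``x`` coordinate of the result, as done above, is a cheap way to confirm which grid point was picked.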
116177We can also select the same data by integer position using the ``.isel()`` method: ::
117178
118- ds['Temperature_isobaric'].isel(x=1)
179+ >>> ds['Temperature_isobaric'].isel(x=1)
180+ <xarray.DataArray 'Temperature_isobaric' (time1: 1, isobaric1: 29, y: 119)> Size: 14kB
181+ array([[[202.2177 , 202.0302 , ..., 219.67082, 219.74895],
182+ [202.58566, 202.58566, ..., 219.16379, 219.28879],
183+ ...,
184+ [292.1622 , 292.14658, ..., 275.05283, 275.11533],
185+ [294.1256 , 294.14124, ..., 276.84436, 276.82874]]], dtype=float32)
186+ Coordinates:
187+ * time1 (time1) datetime64[ns] 8B 1993-03-13
188+ * isobaric1 (isobaric1) float32 116B 100.0 125.0 150.0 ... 950.0 975.0 1e+03
189+ * y (y) float32 476B -3.117e+03 -3.084e+03 -3.052e+03 ... 681.6 714.1
190+ x float32 4B -3.292e+03
191+ Attributes:
192+ long_name: Temperature @ Isobaric surface
193+ units: K
194+ description: Temperature
195+ grid_mapping: LambertConformal_Projection
196+ Grib_Variable_Id: VAR_7-15-131-11_L100
197+ Grib1_Center: 7
198+ Grib1_Subcenter: 15
199+ Grib1_TableVersion: 131
200+ Grib1_Parameter: 11
201+ Grib1_Level_Type: 100
202+ Grib1_Level_Desc: Isobaric surface
203+
119204
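To make the relationship between ``.sel()`` and ``.isel()`` concrete, here is a small sketch on synthetic data (the array ``toy_da`` and its coordinates are assumptions for illustration). Selecting the label stored at position 1 and selecting position 1 directly give the same result:

```python
import numpy as np
import xarray as xr

# Toy array; coordinate values are made up for illustration.
toy_da = xr.DataArray(
    np.arange(6, dtype="float32").reshape(2, 3),
    dims=("y", "x"),
    coords={"y": [0.0, 1.0], "x": [10.0, 20.0, 30.0]},
)

by_label = toy_da.sel(x=20.0)  # select by coordinate label
by_index = toy_da.isel(x=1)    # select by integer position

# Both calls return the same column, with x kept as a scalar coordinate.
same = by_label.equals(by_index)
print(same)

# isel also accepts slices, just like positional NumPy indexing.
first_two = toy_da.isel(x=slice(0, 2))
print(first_two.shape)
```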
120205Xarray also provides a wide range of aggregation methods such as ``sum()``, ``mean()``, ``median()``, ``min()``, and ``max()``. We can use these methods to aggregate data over one or multiple dimensions: ::
121206
122- # Calculate the mean over the 'isobaric1' dimension
123- ds['Temperature_isobaric'].mean(dim='isobaric1')
207+ >>> # Calculate the mean over the 'isobaric1' dimension
208+ >>> ds['Temperature_isobaric'].mean(dim='isobaric1')
209+ <xarray.DataArray 'Temperature_isobaric' (time1: 1, y: 119, x: 268)> Size: 128kB
210+ array([[[259.88446, 259.90222, 259.91678, ..., 262.61667, 262.6285 ,
211+ 262.65167],
212+ [259.74866, 259.76752, 259.78638, ..., 262.5757 , 262.58218,
213+ 262.57516],
214+ [259.6156 , 259.63498, 259.65115, ..., 262.52075, 262.51215,
215+ 262.4976 ],
216+ ...,
217+ [249.8796 , 249.83649, 249.79501, ..., 254.43617, 254.49059,
218+ 254.54985],
219+ [249.8505 , 249.80202, 249.75244, ..., 254.37044, 254.42378,
220+ 254.47711],
221+ [249.82195, 249.75998, 249.71204, ..., 254.30956, 254.35805,
222+ 254.41139]]], dtype=float32)
223+ Coordinates:
224+ * time1 (time1) datetime64[ns] 8B 1993-03-13
225+ * y (y) float32 476B -3.117e+03 -3.084e+03 -3.052e+03 ... 681.6 714.1
226+ * x (x) float32 1kB -3.324e+03 -3.292e+03 ... 5.311e+03 5.343e+03
227+
124228
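The example above reduces over a single dimension. As a sketch of reducing over several dimensions at once, ``dim`` can also be given a list of names. The array below is synthetic (``toy_da`` and its values are assumptions chosen so the result is easy to check by hand):

```python
import numpy as np
import xarray as xr

# Toy array with dims (isobaric1, y, x) holding the values 0..23.
toy_da = xr.DataArray(
    np.arange(24, dtype="float64").reshape(2, 3, 4),
    dims=("isobaric1", "y", "x"),
)

# Reduce over one dimension: the result keeps (y, x).
level_mean = toy_da.mean(dim="isobaric1")

# Reduce over two dimensions at once: the result keeps only isobaric1.
area_mean = toy_da.mean(dim=["y", "x"])

print(level_mean.dims, level_mean.shape)
print(area_mean.dims, area_mean.values)
```

Dimensions that are not named in ``dim`` are preserved, along with their coordinates if they have any.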
125229Let's take a look at a concrete example and compare it to NumPy. We will calculate the max temperature over the 'isobaric1' dimension at a specific value for x: ::
126230
127- # Xarray
128- ds['Temperature_isobaric'].sel(x='-3259.5447').max(dim='isobaric1').values
231+ >>> # Xarray
232+ >>> ds['Temperature_isobaric'].sel(x='-3259.5447').max(dim='isobaric1').values
233+ array([[294.11 , 294.14124, 294.1256 , 294.0475 , 293.90686, 293.6256 ,
234+ ...,
235+ 276.46936, 276.59436, 276.6881 , 276.78186, 276.82874]],
236+ dtype=float32)
237+
238+
239+ >>> # NumPy
240+ >>> np.max(temperature_numpy[:, :, :, 2], axis=1)
241+ array([[294.11 , 294.14124, 294.1256 , 294.0475 , 293.90686, 293.6256 ,
242+ ...,
243+ 276.46936, 276.59436, 276.6881 , 276.78186, 276.82874]],
244+ dtype=float32)
129245
130- # NumPy
131- np.max(temperature_numpy[:, :, :, 2 ], axis = 1)
132246
133247
134- As you can see, the Xarray code is much more readable and we didn't need to keep track of the right indexes and order of the dimensions.
248+ As you can see, the Xarray code is much more readable and we didn't need to keep track of the right indices and order of the dimensions.
135249
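The comparison can be reproduced on synthetic data. The sketch below (``toy_da`` and its values are illustrative assumptions) computes the same maximum both ways and uses ``DataArray.get_axis_num()`` to look up the axis number that the NumPy version has to hard-code:

```python
import numpy as np
import xarray as xr

# Toy array with the same dimension order as the tutorial variable.
toy_da = xr.DataArray(
    np.arange(24, dtype="float32").reshape(1, 2, 3, 4),
    dims=("time1", "isobaric1", "y", "x"),
)
raw = toy_da.values

# Xarray: dimensions are referred to by name.
xr_result = toy_da.isel(x=2).max(dim="isobaric1").values

# NumPy: we must know that x is the last axis and isobaric1 is axis 1.
np_result = np.max(raw[:, :, :, 2], axis=1)

# get_axis_num translates a dimension name into its positional axis.
level_axis = toy_da.get_axis_num("isobaric1")

print(level_axis)
print(np.array_equal(xr_result, np_result))
```

If the dimension order of the data ever changed, the Xarray expression would keep working, while the NumPy indices and ``axis`` argument would need to be updated by hand.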
136250Plotting data in Xarray
137251-----------------------