Commit 80042c0

Merge pull request #316 from gregordecristoforo/fix_#295
Update exercise 2 in xarray lecture
2 parents 37f16c9 + 5937bc5 commit 80042c0

2 files changed: 42 additions and 34 deletions

content/xarray.rst

Lines changed: 41 additions & 34 deletions
@@ -380,54 +380,61 @@ Exercises 2
380380

381381
.. challenge:: Exercises: Xarray-2
382382

383-
Let's change from climate science to finance for this example. Put the stock prices and trading volumes of three companies over ten days in one dataset. Create an Xarray Dataset that uses time and company as dimensions and contains two DataArrays: ``stock_price`` and ``trading_volume``. You can choose the values for the stock prices and trading volumes yourself. As a last thing, add the currency of the stock prices as an attribute to the Dataset.
383+
Let's change from climate science to finance for this example. Put the stock prices and trading volumes of three companies in one dataset. Create an Xarray Dataset that uses time and company as dimensions and contains two DataArrays: ``stock_price`` and ``trading_volume``. You can download the data as a pandas DataFrame with the following code: ::
384+
385+
import yfinance as yf
386+
387+
AAPL_df = yf.download("AAPL", start="2020-01-01", end="2024-01-01")
388+
GOOGL_df = yf.download("GOOGL", start="2020-01-01", end="2024-01-01")
389+
MSFT_df = yf.download("MSFT", start="2020-01-01", end="2024-01-01")
390+
391+
392+
As a last thing, add the currency of the stock prices as an attribute to the Dataset.
384393

385394
.. solution:: Solutions: Xarray-2
386395

387396
We can use a script similar to this one: ::
388397

389398
import xarray as xr
390399
import numpy as np
400+
import yfinance as yf
401+
402+
start_date = "2020-01-01"
403+
end_date = "2024-01-01"
404+
405+
AAPL_df = yf.download("AAPL", start=start_date, end=end_date)
406+
GOOGL_df = yf.download("GOOGL", start=start_date, end=end_date)
407+
MSFT_df = yf.download("MSFT", start=start_date, end=end_date)
408+
409+
410+
stock_prices = np.array(
411+
[
412+
AAPL_df["Close"].values,
413+
GOOGL_df["Close"].values,
414+
MSFT_df["Close"].values,
415+
]
416+
)
417+
418+
trading_volumes = np.array(
419+
[
420+
AAPL_df["Volume"].values,
421+
GOOGL_df["Volume"].values,
422+
MSFT_df["Volume"].values,
423+
]
424+
)
425+
391426

392-
time = [
393-
"2023-01-01",
394-
"2023-01-02",
395-
"2023-01-03",
396-
"2023-01-04",
397-
"2023-01-05",
398-
"2023-01-06",
399-
"2023-01-07",
400-
"2023-01-08",
401-
"2023-01-09",
402-
"2023-01-10",
403-
]
404427
companies = ["AAPL", "GOOGL", "MSFT"]
405-
stock_prices = np.random.normal(loc=[100, 1500, 200], scale=[10, 50, 20], size=(10, 3))
406-
trading_volumes = np.random.randint(1000, 10000, size=(10, 3))
428+
time = AAPL_df.index[:].strftime("%Y-%m-%d").tolist()
429+
407430
ds = xr.Dataset(
408-
data_vars = {
409-
"stock_price": (["time", "company"], stock_prices),
410-
"trading_volume": (["time", "company"], trading_volumes),
431+
{
432+
"stock_price": (["company", "time"], stock_prices[:, :, 0]),
433+
"trading_volume": (["company", "time"], trading_volumes[:, :, 0]),
411434
},
412435
coords={"time": time, "company": companies},
413436
attrs={"currency": "USD"},
414437
)
415-
print(ds)
416-
417-
The output should then resemble this: ::
418-
419-
> python exercise.py
420-
<xarray.Dataset> Size: 940B
421-
Dimensions: (time: 10, company: 3)
422-
Coordinates:
423-
* time (time) <U10 400B '2023-01-01' '2023-01-02' ... '2023-01-10'
424-
* company (company) <U5 60B 'AAPL' 'GOOGL' 'MSFT'
425-
Data variables:
426-
stock_price (time, company) float64 240B 101.1 1.572e+03 ... 217.8
427-
trading_volume (time, company) int64 240B 1214 7911 4578 ... 4338 6861 6958
428-
Attributes:
429-
currency: USD
430-
431438

432439

433440
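
Note on the ``[:, :, 0]`` indexing in the updated solution: it only works if each ``df["Close"].values`` is two-dimensional with shape ``(n_days, 1)``, so that the stacked array has shape ``(3, n_days, 1)``. That is the case when the downloaded DataFrame has a second column level holding the ticker, which appears to be what the solution assumes; with flat columns the stacked array would be two-dimensional and the indexing would raise an ``IndexError``. The sketch below is a hedged illustration rather than the lesson's solution. It replaces the ``yf.download`` calls with a hypothetical helper, ``make_frame``, that builds small synthetic DataFrames (no network access), and it flattens each column with ``ravel()`` so the same code handles both column layouts.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_frame(ticker, dates, seed):
    """Hypothetical stand-in for yf.download(): synthetic data, two-level columns."""
    rng = np.random.default_rng(seed)
    # Tuple keys produce MultiIndex columns: (field, ticker).
    return pd.DataFrame(
        {
            ("Close", ticker): rng.normal(100.0, 5.0, size=len(dates)),
            ("Volume", ticker): rng.integers(1_000, 10_000, size=len(dates)),
        },
        index=dates,
    )


dates = pd.date_range("2020-01-01", periods=5, freq="D")
companies = ["AAPL", "GOOGL", "MSFT"]
frames = {name: make_frame(name, dates, seed) for seed, name in enumerate(companies)}


def stack(field):
    # ravel() turns both (n_days,) and (n_days, 1) into (n_days,),
    # so the result always has shape (n_companies, n_days).
    return np.array([frames[name][field].to_numpy().ravel() for name in companies])


stock_prices = stack("Close")
trading_volumes = stack("Volume")

ds = xr.Dataset(
    {
        "stock_price": (["company", "time"], stock_prices),
        "trading_volume": (["company", "time"], trading_volumes),
    },
    coords={"time": dates.strftime("%Y-%m-%d").tolist(), "company": companies},
    attrs={"currency": "USD"},
)
print(ds)
```

Flattening per company keeps the dimension order ``(company, time)`` used in the commit and avoids depending on how many column levels the download returns.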

software/environment.yml

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ dependencies:
2424
- vega_datasets
2525
- xarray
2626
- netcdf4
27+
- yfinance
2728
- pip
2829
- pip:
2930
- pythia_datasets
