Read netcdf files from web into xarray #3

@fostermh

This appears to work:

import xarray as xr
from bs4 import BeautifulSoup
import requests

url = 'https://pac-dev2.cioos.org/dev/Amundsen_Bottle_Files/'
ext = 'nc'

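def listFD(url, ext=''):
    # Return the full URL of every link on the index page whose href ends with ext.
    page = requests.get(url).text
    soup = BeautifulSoup(page, 'html.parser')
    # href=True skips anchors without an href, which would otherwise raise AttributeError.
    return [url + node['href'] for node in soup.find_all('a', href=True) if node['href'].endswith(ext)]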



files = listFD(url, ext)
# The #mode=bytes suffix asks the netCDF-C library to read each remote file over HTTP byte ranges.
files = ["%s#mode=bytes" % x for x in files]

len(files)
files_subset = files[1:10]
ds = xr.open_mfdataset(files_subset, compat='override', coords='all')
ds

and results in:

<xarray.Dataset>
Dimensions:           (bottle: 24)
Coordinates:
  * bottle            (bottle) float64 1.0 2.0 3.0 4.0 ... 21.0 22.0 23.0 24.0
Data variables: (12/33)
    filename          (bottle) object '1601010.btl' '1601010.btl' ... nan nan
    file_header_text  (bottle) object '* Sea-Bird SBE 9 Data File:\n* FileNam...
    instrument_model  (bottle) object '9' '9' '9' '9' '9' ... nan nan nan nan
    instrument_type   (bottle) object 'CTD-bottle' 'CTD-bottle' ... nan nan
    start_latitude    (bottle) float64 68.5 68.5 68.5 68.5 ... nan nan nan nan
    start_longitude   (bottle) float64 -58.52 -58.52 -58.52 ... nan nan nan
    ...                ...
    svCM              (bottle) float64 dask.array<chunksize=(24,), meta=np.ndarray>
    svDM              (bottle) float64 dask.array<chunksize=(24,), meta=np.ndarray>
    svWM              (bottle) float64 dask.array<chunksize=(24,), meta=np.ndarray>
    wetCDOM           (bottle) float64 dask.array<chunksize=(24,), meta=np.ndarray>
    Upoly1            (bottle) float64 dask.array<chunksize=(24,), meta=np.ndarray>
    cpar              (bottle) float64 dask.array<chunksize=(24,), meta=np.ndarray>
Attributes: (12/16)
    history:               2021-11-30T14:50:13.901854 Read by seabird Python ...
    DATE_CREATION:         20211130145013
    LATITUDE:              58.55866666666667
    LONGITUDE:             -52.838166666666666
    date_created:          2016-06-07T18:13:14
    date_modified:         2021-11-30T22:50:13.901854
    ...                    ...
    md5:                   753b4bf26928c9c670f5a1b79dfd1ae5
    original_header:       * Sea-Bird SBE 9 Data File:\n* FileName = E:\CTD-R...
    original_header_json:  {\n  "instrument_header": {\n    "FileName": "E:\\...
    sbe_model:             9
    seasave:               V 7.23.2
    start_time:            Jun 07 2016 18:07:01 [NMEA time, header]
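
The link-collection step can be checked without touching the network. The following is a minimal sketch, not the code used above: it swaps BeautifulSoup for the standard library's `html.parser`, joins links with `urljoin` rather than string concatenation (so absolute hrefs also resolve), and the names `_LinkCollector`, `byte_range_urls`, and the `example.org` URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors that carry no href attribute
                self.hrefs.append(href)


def byte_range_urls(html, base_url, ext=".nc"):
    """Return absolute URLs for links ending in ext, with '#mode=bytes' appended."""
    parser = _LinkCollector()
    parser.feed(html)
    return [
        urljoin(base_url, href) + "#mode=bytes"
        for href in parser.hrefs
        if href.endswith(ext)
    ]


# A tiny stand-in for a directory index page (made-up content).
sample = (
    '<a href="../">Parent</a>'
    '<a name="top">no href here</a>'
    '<a href="a.nc">a</a>'
    '<a href="/dev/b.nc">b</a>'
)
urls = byte_range_urls(sample, "https://example.org/dev/files/")
print(urls)
```

The resulting list could then be passed to `xr.open_mfdataset` in the same way as `files_subset` above.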
