Working With Xarray and WindKit

Many data structures in PyWAsP are Xarray objects, which means they are collections of largely self-describing multidimensional arrays and collections of arrays. Each xarray.Dataset holds one or several xarray.DataAarray’s, as well as labels and names for the array-dimensions and coordinates.

The advantages of labeled arrays are many. Xarray’s Quick Overview goes through a lot of them, so we will highlight just a few here:

  • Select data and apply operations by name rather than axis/index

  • Automatic alingment of coordiantes and broadcasting of dimensions

  • Attach attributes to data with it’s .attrs attribute

In WindKit and PyWAsP, we leverage these capabilities to a great extend by defining specific object-types as xarray.Dataset’s with specific names for dimensions and variables (you could call it a schema). For example, PyWAsP recognises a “binned wind climate” object as:

  • A xarray.Dataset object

  • Has variables wsfreq and wdfreq for the wind speed and direction frequencies

  • wsfreq has (at least, but not limited to) the dimensions wsbin and sector

  • wdfreq has sector as a dimension.

In this example, wsbin and sector are core dimensions and need to be present. These required naming patterns enables us to write Xarray-based code where assumptions can be made about the object, but as long as the expected patterns are there, the objects are allowed to vary in shape and size and, by e.g. having other non-core dimensions. We go into depth with this in the WindKit Introduction, but we will quickly illustrate it here as well.

Let’s read in a binned wind climate and print out inspect the xarray.Dataset:

In [1]: from pathlib import Path

In [2]: import pywasp as pw

In [3]: import windkit as wk

In [4]: path = Path("../modules/examples/tutorial_4/data")

In [5]: bwc = wk.read_bwc(path / "SerraSantaLuzia.omwc", crs="EPSG:4326")

In [6]: print(bwc)
<xarray.Dataset>
Dimensions:       (point: 1, sector: 12, wsbin: 32)
Coordinates:
    height        (point) float64 25.3
    south_north   (point) float64 41.74
    west_east     (point) float64 -8.823
    crs           int8 0
  * wsbin         (wsbin) float64 0.5 1.5 2.5 3.5 4.5 ... 28.5 29.5 30.5 31.5
    wsceil        (wsbin) float64 1.0 2.0 3.0 4.0 5.0 ... 29.0 30.0 31.0 32.0
    wsfloor       (wsbin) float64 0.0 1.0 2.0 3.0 4.0 ... 28.0 29.0 30.0 31.0
  * sector        (sector) float64 0.0 30.0 60.0 90.0 ... 270.0 300.0 330.0
    sector_ceil   (sector) float64 15.0 45.0 75.0 105.0 ... 285.0 315.0 345.0
    sector_floor  (sector) float64 345.0 15.0 45.0 75.0 ... 255.0 285.0 315.0
Dimensions without coordinates: point
Data variables:
    wdfreq        (sector, point) float64 0.05314 0.03321 ... 0.1148 0.0707
    wsfreq        (wsbin, sector, point) float64 0.02601 0.04219 ... 0.0 0.0
Attributes:
    Conventions:      CF-1.8
    history:          2024-02-16T10:48:55:\twindkit==0.7.1.dev17+ga2ef68d\tcr...
    wasp_header:      SerraSantaluzia
    Package name:     windkit
    Package version:  0.7.1.dev17+ga2ef68d
    Creation date:    2024-02-16T10:48:55
    Object type:      Binned Wind Climate
    author:           DTU CI Config
    author_email:     pywasp@dtu.dk
    institution:      DTU Wind

We can see the “binned wind climate” has the pattern described above, and that in this case it has one extra dimension point with associated geospatial information. To calculate the mean wind speed for this object windkit.wind_climate.mean_windspeed() can be used:

In [7]: mean_ws = wk.mean_windspeed(bwc)

In [8]: print(mean_ws)
<xarray.DataArray 'wspd_combined' (point: 1)>
array([6.28573163])
Coordinates:
    height       (point) float64 25.3
    south_north  (point) float64 41.74
    west_east    (point) float64 -8.823
    crs          int8 0
Dimensions without coordinates: point
Attributes:
    cell_methods:   sector: combined
    description:    Mean wind speed for all wind directions, representing the...
    long_name:      Horizontal Wind Speed
    standard_name:  wind_speed
    units:          m s-1
    grid_mapping:   crs

We can see that all the “core” dimensions have been reduced and we obtain the mean wind speed for the single point. If the object has more points or other extra dimensions they would be looped-over and one value per extra dimension returned:

In [9]: bwc = bwc.expand_dims(extra_dim=["one", "two", "three"])

In [10]: mean_ws = wk.mean_windspeed(bwc)

In [11]: print(mean_ws)
<xarray.DataArray 'wspd_combined' (extra_dim: 3, point: 1)>
array([[6.28573163],
       [6.28573163],
       [6.28573163]])
Coordinates:
  * extra_dim    (extra_dim) object 'one' 'two' 'three'
    height       (point) float64 25.3
    south_north  (point) float64 41.74
    west_east    (point) float64 -8.823
    crs          int8 0
Dimensions without coordinates: point
Attributes:
    cell_methods:   sector: combined
    description:    Mean wind speed for all wind directions, representing the...
    long_name:      Horizontal Wind Speed
    standard_name:  wind_speed
    units:          m s-1
    grid_mapping:   crs

What is happening “under the hood” can be expressed using just the built-in Xarray methods as:

# Multiply wind speeds with frequencies and sum over wind speed and direction bins
In [12]: mean_ws = (bwc["wsbin"] * bwc["wsfreq"] * bwc["wdfreq"]).sum(dim=("wsbin", "sector"))

In [13]: print(mean_ws)
<xarray.DataArray (extra_dim: 3, point: 1)>
array([[6.28573163],
       [6.28573163],
       [6.28573163]])
Coordinates:
    crs          int8 0
  * extra_dim    (extra_dim) object 'one' 'two' 'three'
    height       (point) float64 25.3
    south_north  (point) float64 41.74
    west_east    (point) float64 -8.823
Dimensions without coordinates: point

Becoming proficient with PyWAsP means, at a minimum, becoming familiar with Xarray, and what it can do. However, we encourage users to spend enough time with Xarray to become comfortable with it. It will make your experience using PyWAsP much easier.