Working With Xarray and WindKit

Many data structures in PyWAsP are Xarray objects. These objects are collections of largely self-describing multidimensional arrays. Each xarray.Dataset holds one or more xarray.DataArray objects, along with labels and names for the array dimensions and coordinates.

Labeled arrays have many advantages. Xarray’s Quick Overview covers many of them, but we will highlight a few here:

  • Select data and apply operations by name rather than by axis or index.

  • Align coordinates and broadcast dimensions automatically.

  • Attach arbitrary metadata to an object through its .attrs attribute.
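The three points above can be tried on a small array. This is a minimal sketch using only xarray and NumPy; the variable names (freq, weights) and all numbers are made up for illustration and do not come from WindKit or PyWAsP.

```python
import numpy as np
import xarray as xr

# Hypothetical toy data: frequencies on a (wsbin, sector) grid.
freq = xr.DataArray(
    np.array([[0.1, 0.2, 0.3], [0.2, 0.1, 0.1]]),
    dims=("wsbin", "sector"),
    coords={"wsbin": [0.5, 1.5], "sector": [0.0, 120.0, 240.0]},
    name="freq",
)

# 1. Select and reduce by name instead of by axis number or position.
north = freq.sel(sector=0.0)        # label-based selection
per_sector = freq.sum(dim="wsbin")  # reduction over a named dimension

# 2. Alignment and broadcasting: the weights are given in a different
#    sector order and have no wsbin dimension. Xarray matches them by
#    coordinate label and broadcasts them over wsbin.
weights = xr.DataArray(
    [3.0, 1.0, 2.0],
    dims="sector",
    coords={"sector": [240.0, 0.0, 120.0]},
)
weighted = freq * weights

# 3. Attach metadata to the array.
freq.attrs["units"] = "1"

print(weighted.sel(wsbin=0.5, sector=240.0).item())
```

The last line multiplies the frequency at sector 240 (0.3) by the weight labeled 240 (3.0), even though that weight is stored first in the weights array.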

In WindKit and PyWAsP, we rely on these capabilities extensively: specific object types are defined as xarray.Dataset objects with prescribed names for dimensions and variables (this can be thought of as a schema). For example, PyWAsP recognizes an object as a “binned wind climate” when:

  • It is an xarray.Dataset object.

  • It has variables wsfreq and wdfreq for the wind speed and wind direction frequencies.

  • wsfreq has dimensions wsbin and sector (among others).

  • wdfreq has sector as a dimension.

In this example, wsbin and sector are core dimensions and must be present. These required naming patterns enable us to write Xarray-based code where assumptions can be made about the object. As long as the expected patterns are present, the objects can vary in shape and size, for example, by having other non-core dimensions. We cover this in more depth in the WindKit Introduction, but we will also illustrate it here.
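To make the idea of a schema concrete, the pattern above can be written as a small check. This is only a sketch: BWC_SCHEMA and follows_bwc_pattern are hypothetical names invented for this illustration, not WindKit's or PyWAsP's own validation code, and the toy dataset contains placeholder values.

```python
import numpy as np
import xarray as xr

# Hypothetical schema: variable name -> core dimensions it must have.
# This mirrors the description above; it is NOT WindKit's internal definition.
BWC_SCHEMA = {
    "wsfreq": ("wsbin", "sector"),
    "wdfreq": ("sector",),
}


def follows_bwc_pattern(ds: xr.Dataset) -> bool:
    """Return True if ds has the expected variables and core dimensions."""
    for var, core_dims in BWC_SCHEMA.items():
        if var not in ds.data_vars:
            return False
        # Extra (non-core) dimensions are allowed, so only require a subset.
        if not set(core_dims).issubset(ds[var].dims):
            return False
    return True


# Toy dataset with an extra, non-core dimension "point".
ds = xr.Dataset(
    {
        "wsfreq": (("wsbin", "sector", "point"), np.full((2, 3, 1), 0.5)),
        "wdfreq": (("sector", "point"), np.full((3, 1), 1 / 3)),
    }
)

ok = follows_bwc_pattern(ds)  # core dimensions present
not_ok = follows_bwc_pattern(ds.rename_dims({"sector": "direction"}))  # sector missing
print(ok, not_ok)
```

The check only requires the core dimensions to be a subset of each variable's dimensions, which is why the extra point dimension does not cause it to fail.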

Let’s read in a binned wind climate and inspect the xarray.Dataset:

In [1]: from pathlib import Path

In [2]: import pywasp as pw

In [3]: import windkit as wk

In [4]: path = Path("../modules/examples/tutorial_1/data")

In [5]: bwc = wk.read_bwc(path / "SerraSantaLuzia.omwc", crs="EPSG:4326")

In [6]: print(bwc)
<xarray.Dataset> Size: 4kB
Dimensions:       (point: 1, sector: 12, wsbin: 32)
Coordinates:
    height        (point) float64 8B 25.3
    south_north   (point) float64 8B 41.74
    west_east     (point) float64 8B -8.823
    crs           int8 1B 0
    wsceil        (wsbin) float64 256B 1.0 2.0 3.0 4.0 ... 29.0 30.0 31.0 32.0
    wsfloor       (wsbin) float64 256B 0.0 1.0 2.0 3.0 ... 28.0 29.0 30.0 31.0
    sector_ceil   (sector) float64 96B 15.0 45.0 75.0 ... 285.0 315.0 345.0
    sector_floor  (sector) float64 96B 345.0 15.0 45.0 ... 255.0 285.0 315.0
  * wsbin         (wsbin) float64 256B 0.5 1.5 2.5 3.5 ... 28.5 29.5 30.5 31.5
  * sector        (sector) float64 96B 0.0 30.0 60.0 90.0 ... 270.0 300.0 330.0
Dimensions without coordinates: point
Data variables:
    wdfreq        (sector, point) float64 96B 0.05314 0.03321 ... 0.1148 0.0707
    wsfreq        (wsbin, sector, point) float64 3kB 0.02601 0.04219 ... 0.0 0.0
Attributes:
    Conventions:      CF-1.8
    history:          2025-07-22T14:10:38+00:00:\twindkit==1.0.3.dev1+ga61276...
    description:      SerraSantaluzia
    Package name:     windkit
    Package version:  1.0.3.dev1+ga612767
    Creation date:    2025-07-22T14:10:38+00:00
    Object type:      Binned Wind Climate
    author:           Default User
    author_email:     default_email@example.com
    institution:      Default Institution

We can see that the “binned wind climate” follows the pattern described above. In this case, it also has an extra dimension, point, with associated geospatial information. To calculate the mean wind speed for this object, you can use windkit.mean_wind_speed():

In [7]: mean_ws = wk.mean_wind_speed(bwc)

In [8]: print(mean_ws)
<xarray.DataArray (point: 1)> Size: 8B
array([6.28573163])
Coordinates:
    height       (point) float64 8B 25.3
    south_north  (point) float64 8B 41.74
    west_east    (point) float64 8B -8.823
    crs          int8 1B 0
Dimensions without coordinates: point
Attributes:
    history:  2025-07-22T14:10:38+00:00:\twindkit==1.0.3.dev1+ga612767\tbysec...

We can see that all the core dimensions have been reduced, and we now have the mean wind speed for the single point. If the object had more points or other extra dimensions, the calculation would be applied along each of them, returning one value for every position along the extra dimensions:

In [9]: bwc = bwc.expand_dims(extra_dim=["one", "two", "three"])

In [10]: mean_ws = wk.mean_wind_speed(bwc)

In [11]: print(mean_ws)
<xarray.DataArray (extra_dim: 3, point: 1)> Size: 24B
array([[6.28573163],
       [6.28573163],
       [6.28573163]])
Coordinates:
  * extra_dim    (extra_dim) object 24B 'one' 'two' 'three'
    height       (point) float64 8B 25.3
    south_north  (point) float64 8B 41.74
    west_east    (point) float64 8B -8.823
    crs          int8 1B 0
Dimensions without coordinates: point
Attributes:
    history:  2025-07-22T14:10:38+00:00:\twindkit==1.0.3.dev1+ga612767\tbysec...

What is happening “under the hood” can be expressed using only the built-in Xarray methods:

# Multiply wind speeds with frequencies and sum over wind speed and direction bins
In [12]: mean_ws = (bwc["wsbin"] * bwc["wsfreq"] * bwc["wdfreq"]).sum(dim=("wsbin", "sector"))

In [13]: print(mean_ws)
<xarray.DataArray (extra_dim: 3, point: 1)> Size: 24B
array([[6.28573163],
       [6.28573163],
       [6.28573163]])
Coordinates:
    crs          int8 1B 0
  * extra_dim    (extra_dim) object 24B 'one' 'two' 'three'
    height       (point) float64 8B 25.3
    south_north  (point) float64 8B 41.74
    west_east    (point) float64 8B -8.823
Dimensions without coordinates: point
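The same reduction can be checked by hand on a dataset small enough to compute mentally. The sketch below uses only xarray and NumPy with made-up numbers: two wind speed bins, two sectors, and a single point. It assumes, as in the expression above, that wsfreq holds the wind speed frequencies within each sector and wdfreq holds the sector frequencies.

```python
import numpy as np
import xarray as xr

# Hypothetical toy climate: in sector 0 all wind falls in the 0.5 m/s bin,
# in sector 180 all wind falls in the 1.5 m/s bin.
ds = xr.Dataset(
    {
        "wsfreq": (
            ("wsbin", "sector", "point"),
            np.array([[[1.0], [0.0]], [[0.0], [1.0]]]),
        ),
        "wdfreq": (("sector", "point"), np.array([[0.25], [0.75]])),
    },
    coords={"wsbin": [0.5, 1.5], "sector": [0.0, 180.0]},
)

# Weight each bin-center speed by its frequency and sum over the core dimensions.
mean_ws = (ds["wsbin"] * ds["wsfreq"] * ds["wdfreq"]).sum(dim=("wsbin", "sector"))
print(mean_ws.values)  # 0.25 * 0.5 + 0.75 * 1.5 = 1.25

# An extra dimension is carried through without changing the expression.
ds3 = ds.expand_dims(extra_dim=["one", "two", "three"])
mean_ws3 = (ds3["wsbin"] * ds3["wsfreq"] * ds3["wdfreq"]).sum(dim=("wsbin", "sector"))
print(dict(mean_ws3.sizes))
```

Because the reduction names only wsbin and sector, every other dimension (here point and extra_dim) survives in the result.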

Becoming proficient with PyWAsP means, at a minimum, becoming familiar with Xarray and its capabilities. We encourage you to spend enough time with Xarray to become comfortable with it. This will make your experience using PyWAsP much easier.