Working With Xarray and WindKit
Many data structures in PyWAsP are Xarray objects. These objects are collections of largely self-describing multidimensional arrays. Each xarray.Dataset holds one or more xarray.DataArray objects, along with labels and names for the array dimensions and coordinates.
Labeled arrays have many advantages. Xarray’s Quick Overview covers many of them, but we will highlight a few here:
Select data and apply operations by name rather than by axis or index.
Automatic alignment of coordinates and broadcasting of dimensions.
Attach metadata to the data through the .attrs attribute.
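As a minimal sketch of these three points, the snippet below builds a small synthetic xarray.DataArray. The dimension names, values, and the units attribute are made up for illustration and do not come from a real PyWAsP object.

```python
import numpy as np
import xarray as xr

# Synthetic 2-D array with named dimensions and coordinate labels
# (illustrative values only, not real wind data)
freq = xr.DataArray(
    np.array([[0.1, 0.2, 0.3], [0.2, 0.1, 0.1]]),
    dims=("wsbin", "sector"),
    coords={"wsbin": [0.5, 1.5], "sector": [0.0, 120.0, 240.0]},
    attrs={"units": "1"},
)

# 1. Select and reduce by name rather than by axis number or index
north = freq.sel(sector=0.0)        # values labelled with sector 0.0
per_sector = freq.sum(dim="wsbin")  # sum over the wind speed bins

# 2a. Broadcasting by dimension name: (wsbin,) * (wsbin, sector)
weighted = freq["wsbin"] * freq

# 2b. Alignment by coordinate label: reordering the sectors does not
#     matter, because values are matched on their labels before subtracting
shuffled = freq.isel(sector=[2, 0, 1])
difference = freq - shuffled

# 3. Metadata is stored on the object itself
units = freq.attrs["units"]

print(per_sector.values, weighted.dims, units)
```

Because the subtraction is aligned on the sector labels, every element of difference is zero even though shuffled stores the sectors in a different order.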
In WindKit and PyWAsP, we leverage these capabilities extensively by defining specific object types as xarray.Dataset objects with specific names for dimensions and variables (this can be thought of as a schema). For example, PyWAsP recognizes a “binned wind climate” object as:
An xarray.Dataset object.
It has variables wsfreq and wdfreq for the wind speed and direction frequencies.
wsfreq has dimensions wsbin and sector (among others).
wdfreq has sector as a dimension.
In this example, wsbin and sector are core dimensions and must be present.
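The schema idea can be sketched as a plain function that checks the names listed above. The helper looks_like_bwc is hypothetical and written only for illustration; it is not part of WindKit or PyWAsP, whose own validation may check more than this. The dataset is synthetic.

```python
import numpy as np
import xarray as xr


def looks_like_bwc(ds):
    """Hypothetical schema check (not a WindKit or PyWAsP function)."""
    if not isinstance(ds, xr.Dataset):
        return False
    # Both frequency variables must be present
    if "wsfreq" not in ds.data_vars or "wdfreq" not in ds.data_vars:
        return False
    # Core dimensions must be present; extra dimensions are allowed
    wsfreq_ok = {"wsbin", "sector"} <= set(ds["wsfreq"].dims)
    wdfreq_ok = "sector" in ds["wdfreq"].dims
    return wsfreq_ok and wdfreq_ok


# Synthetic dataset that follows the naming pattern, with an extra "point" dimension
ds = xr.Dataset(
    {
        "wsfreq": (("wsbin", "sector", "point"), np.full((2, 3, 1), 0.5)),
        "wdfreq": (("sector", "point"), np.full((3, 1), 1 / 3)),
    },
    coords={"wsbin": [0.5, 1.5], "sector": [0.0, 120.0, 240.0]},
)

ok = looks_like_bwc(ds)  # all expected names are present

# Selecting a single wind speed bin removes the core dimension wsbin
not_ok = looks_like_bwc(ds.isel(wsbin=0, drop=True))

print(ok, not_ok)
```

The check only looks at names, so the same function accepts objects of any size and with any number of additional dimensions.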
These required naming patterns enable us to write Xarray-based code where assumptions can be made about the object.
As long as the expected patterns are present, the objects can vary in shape and size, for example, by having other non-core dimensions. We cover this in more depth in the WindKit Introduction,
but we will also illustrate it here.
Let’s read in a binned wind climate and inspect the xarray.Dataset:
In [1]: from pathlib import Path
In [2]: import pywasp as pw
In [3]: import windkit as wk
In [4]: path = Path("../modules/examples/tutorial_1/data")
In [5]: bwc = wk.read_bwc(path / "SerraSantaLuzia.omwc", crs="EPSG:4326")
In [6]: print(bwc)
<xarray.Dataset> Size: 4kB
Dimensions: (point: 1, sector: 12, wsbin: 32)
Coordinates:
height (point) float64 8B 25.3
south_north (point) float64 8B 41.74
west_east (point) float64 8B -8.823
crs int8 1B 0
wsceil (wsbin) float64 256B 1.0 2.0 3.0 4.0 ... 29.0 30.0 31.0 32.0
wsfloor (wsbin) float64 256B 0.0 1.0 2.0 3.0 ... 28.0 29.0 30.0 31.0
sector_ceil (sector) float64 96B 15.0 45.0 75.0 ... 285.0 315.0 345.0
sector_floor (sector) float64 96B 345.0 15.0 45.0 ... 255.0 285.0 315.0
* wsbin (wsbin) float64 256B 0.5 1.5 2.5 3.5 ... 28.5 29.5 30.5 31.5
* sector (sector) float64 96B 0.0 30.0 60.0 90.0 ... 270.0 300.0 330.0
Dimensions without coordinates: point
Data variables:
wdfreq (sector, point) float64 96B 0.05314 0.03321 ... 0.1148 0.0707
wsfreq (wsbin, sector, point) float64 3kB 0.02601 0.04219 ... 0.0 0.0
Attributes:
Conventions: CF-1.8
history: 2025-07-22T14:10:38+00:00:\twindkit==1.0.3.dev1+ga61276...
description: SerraSantaluzia
Package name: windkit
Package version: 1.0.3.dev1+ga612767
Creation date: 2025-07-22T14:10:38+00:00
Object type: Binned Wind Climate
author: Default User
author_email: default_email@example.com
institution: Default Institution
We can see that the “binned wind climate” follows the pattern described above.
In this case, it also has an extra dimension, point, with associated geospatial information. To calculate the mean wind speed for this object, you can use windkit.mean_wind_speed():
In [7]: mean_ws = wk.mean_wind_speed(bwc)
In [8]: print(mean_ws)
<xarray.DataArray (point: 1)> Size: 8B
array([6.28573163])
Coordinates:
height (point) float64 8B 25.3
south_north (point) float64 8B 41.74
west_east (point) float64 8B -8.823
crs int8 1B 0
Dimensions without coordinates: point
Attributes:
history: 2025-07-22T14:10:38+00:00:\twindkit==1.0.3.dev1+ga612767\tbysec...
We can see that all the core dimensions have been reduced, and we now have the mean wind speed for the single point. If the object had more points or other extra dimensions, the calculation would be applied across them, and one value would be returned for each element along those dimensions:
In [9]: bwc = bwc.expand_dims(extra_dim=["one", "two", "three"])
In [10]: mean_ws = wk.mean_wind_speed(bwc)
In [11]: print(mean_ws)
<xarray.DataArray (extra_dim: 3, point: 1)> Size: 24B
array([[6.28573163],
[6.28573163],
[6.28573163]])
Coordinates:
* extra_dim (extra_dim) object 24B 'one' 'two' 'three'
height (point) float64 8B 25.3
south_north (point) float64 8B 41.74
west_east (point) float64 8B -8.823
crs int8 1B 0
Dimensions without coordinates: point
Attributes:
history: 2025-07-22T14:10:38+00:00:\twindkit==1.0.3.dev1+ga612767\tbysec...
What is happening “under the hood” can be expressed using only the built-in Xarray methods:
# Multiply wind speeds with frequencies and sum over wind speed and direction bins
In [12]: mean_ws = (bwc["wsbin"] * bwc["wsfreq"] * bwc["wdfreq"]).sum(dim=("wsbin", "sector"))
In [13]: print(mean_ws)
<xarray.DataArray (extra_dim: 3, point: 1)> Size: 24B
array([[6.28573163],
[6.28573163],
[6.28573163]])
Coordinates:
crs int8 1B 0
* extra_dim (extra_dim) object 24B 'one' 'two' 'three'
height (point) float64 8B 25.3
south_north (point) float64 8B 41.74
west_east (point) float64 8B -8.823
Dimensions without coordinates: point
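The same reduction can be tried without any data file. The sketch below uses a tiny synthetic dataset that follows the naming pattern; the numbers are invented, and it assumes that wsfreq sums to one over wsbin within each sector and that wdfreq sums to one over sector. The function mean_ws_sketch is a hypothetical name wrapping the expression shown above and is not the WindKit implementation.

```python
import numpy as np
import xarray as xr

# Synthetic values (assumption: wsfreq is normalised per sector,
# wdfreq is normalised over all sectors)
wsfreq = np.array([[0.2, 0.5], [0.3, 0.5], [0.5, 0.0]])[:, :, np.newaxis]
wdfreq = np.array([0.25, 0.75])[:, np.newaxis]

bwc_like = xr.Dataset(
    {
        "wsfreq": (("wsbin", "sector", "point"), wsfreq),
        "wdfreq": (("sector", "point"), wdfreq),
    },
    coords={"wsbin": [0.5, 1.5, 2.5], "sector": [0.0, 180.0]},
)


def mean_ws_sketch(ds):
    """Hypothetical helper: weight bin-centre speeds by frequency and sum."""
    weighted = ds["wsbin"] * ds["wsfreq"] * ds["wdfreq"]
    # Only the core dimensions are reduced; any extra dimensions remain
    return weighted.sum(dim=("wsbin", "sector"))


single = mean_ws_sketch(bwc_like)
stacked = mean_ws_sketch(bwc_like.expand_dims(extra_dim=["one", "two"]))

print(single.dims, stacked.dims)
```

Since the sum names only wsbin and sector, the function needs no changes when point or extra_dim grow: Xarray broadcasts the product over every remaining dimension.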
Becoming proficient with PyWAsP means, at a minimum, becoming familiar with Xarray and its capabilities. We encourage you to spend enough time with Xarray to become comfortable with it. This will make your experience using PyWAsP much easier.