Working With Xarray and WindKit
Many data structures in PyWAsP are Xarray objects, which means they are collections of largely
self-describing multidimensional arrays and collections of arrays. Each xarray.Dataset
holds one or several xarray.DataAarray
’s, as well as labels and names for the array-dimensions and coordinates.
The advantages of labeled arrays are many. Xarray’s Quick Overview goes through a lot of them, so we will highlight just a few here:
Select data and apply operations by name rather than axis/index
Automatic alingment of coordiantes and broadcasting of dimensions
Attach attributes to data with it’s
.attrs
attribute
In WindKit and PyWAsP, we leverage these capabilities to a great extend by defining specific
object-types as xarray.Dataset
’s with specific names for dimensions and variables
(you could call it a schema). For example, PyWAsP recognises a “binned wind climate” object as:
A
xarray.Dataset
objectHas variables
wsfreq
andwdfreq
for the wind speed and direction frequencies
wsfreq
has (at least, but not limited to) the dimensionswsbin
andsector
wdfreq
hassector
as a dimension.
In this example, wsbin
and sector
are core dimensions and need to be present.
These required naming patterns enables us to write Xarray-based code where assumptions can be made about the object,
but as long as the expected patterns are there, the objects are allowed to vary in shape and size and,
by e.g. having other non-core dimensions. We go into depth with this in the WindKit Introduction,
but we will quickly illustrate it here as well.
Let’s read in a binned wind climate and print out inspect the xarray.Dataset
:
In [1]: from pathlib import Path
In [2]: import pywasp as pw
In [3]: import windkit as wk
In [4]: path = Path("../modules/examples/tutorial_1/data")
In [5]: bwc = wk.read_bwc(path / "SerraSantaLuzia.omwc", crs="EPSG:4326")
In [6]: print(bwc)
<xarray.Dataset>
Dimensions: (point: 1, sector: 12, wsbin: 32)
Coordinates:
height (point) float64 25.3
south_north (point) float64 41.74
west_east (point) float64 -8.823
crs int8 0
* wsbin (wsbin) float64 0.5 1.5 2.5 3.5 4.5 ... 28.5 29.5 30.5 31.5
wsceil (wsbin) float64 1.0 2.0 3.0 4.0 5.0 ... 29.0 30.0 31.0 32.0
wsfloor (wsbin) float64 0.0 1.0 2.0 3.0 4.0 ... 28.0 29.0 30.0 31.0
* sector (sector) float64 0.0 30.0 60.0 90.0 ... 270.0 300.0 330.0
sector_ceil (sector) float64 15.0 45.0 75.0 105.0 ... 285.0 315.0 345.0
sector_floor (sector) float64 345.0 15.0 45.0 75.0 ... 255.0 285.0 315.0
Dimensions without coordinates: point
Data variables:
wdfreq (sector, point) float64 0.05314 0.03321 ... 0.1148 0.0707
wsfreq (wsbin, sector, point) float64 0.02601 0.04219 ... 0.0 0.0
Attributes:
Conventions: CF-1.8
history: 2024-06-11T13:35:36+00:00:\twindkit==0.8.0\tcreate_data...
wasp_header: SerraSantaluzia
Package name: windkit
Package version: 0.8.0
Creation date: 2024-06-11T13:35:36+00:00
Object type: Binned Wind Climate
author: DTU CI Config
author_email: pywasp@dtu.dk
institution: DTU Wind
We can see the “binned wind climate” has the pattern described above,
and that in this case it has one extra dimension point
with associated geospatial
information. To calculate the mean wind speed for this object windkit.wind_climate.mean_windspeed()
can be used:
In [7]: mean_ws = wk.mean_windspeed(bwc)
In [8]: print(mean_ws)
<xarray.DataArray 'wspd_combined' (point: 1)>
array([6.28573163])
Coordinates:
height (point) float64 25.3
south_north (point) float64 41.74
west_east (point) float64 -8.823
crs int8 0
Dimensions without coordinates: point
Attributes:
cell_methods: sector: combined
description: Mean wind speed for all wind directions, representing the...
long_name: Horizontal Wind Speed
standard_name: wind_speed
units: m s-1
grid_mapping: crs
We can see that all the “core” dimensions have been reduced and we obtain the mean wind speed for the single point. If the object has more points or other extra dimensions they would be looped-over and one value per extra dimension returned:
In [9]: bwc = bwc.expand_dims(extra_dim=["one", "two", "three"])
In [10]: mean_ws = wk.mean_windspeed(bwc)
In [11]: print(mean_ws)
<xarray.DataArray 'wspd_combined' (extra_dim: 3, point: 1)>
array([[6.28573163],
[6.28573163],
[6.28573163]])
Coordinates:
* extra_dim (extra_dim) object 'one' 'two' 'three'
height (point) float64 25.3
south_north (point) float64 41.74
west_east (point) float64 -8.823
crs int8 0
Dimensions without coordinates: point
Attributes:
cell_methods: sector: combined
description: Mean wind speed for all wind directions, representing the...
long_name: Horizontal Wind Speed
standard_name: wind_speed
units: m s-1
grid_mapping: crs
What is happening “under the hood” can be expressed using just the built-in Xarray methods as:
# Multiply wind speeds with frequencies and sum over wind speed and direction bins
In [12]: mean_ws = (bwc["wsbin"] * bwc["wsfreq"] * bwc["wdfreq"]).sum(dim=("wsbin", "sector"))
In [13]: print(mean_ws)
<xarray.DataArray (extra_dim: 3, point: 1)>
array([[6.28573163],
[6.28573163],
[6.28573163]])
Coordinates:
crs int8 0
* extra_dim (extra_dim) object 'one' 'two' 'three'
height (point) float64 25.3
south_north (point) float64 41.74
west_east (point) float64 -8.823
Dimensions without coordinates: point
Becoming proficient with PyWAsP means, at a minimum, becoming familiar with Xarray, and what it can do. However, we encourage users to spend enough time with Xarray to become comfortable with it. It will make your experience using PyWAsP much easier.