Models
2 In:
from rich import print
import ewatercycle.models
ERROR 1: PROJ: proj_create_from_database: Open of /home/bart/micromamba/envs/ewc3.11/share/proj failed
In eWaterCycle models can be added as plugins. The ewatercycle package itself does not ship with any models. Depending on who set up your system, some or all of the following models will already be available:
The process for adding new models is documented in Adding models
To show the currently available models do:
3 In:
print(ewatercycle.models.sources)
ModelSources[ "Hype", "LeakyBucket", "Lisflood", "MarrmotM01", "MarrmotM14", "PCRGlobWB", "Wflow", ]
Creating, setting up, and initializing a model instance
The way models are created, setup, and initialized matches PyMT as much as possible. There are three steps:
instantiate (create a python object that represents the model)
setup (create a container with the right model, directories, and configuration files)
initialize (start the model inside the container)
To a new user, these steps can be confusing as they seem to be related to “starting a model”. However, you will see that there are some useful things that we can do in between each of these steps. As a side effect, splitting these steps also makes it easier to run a lot of models in parallel (e.g. for calibration). Experience tells us that you will quickly get used to it.
When a model instance is created, we have to specify the version and pass in a suitable parameter set and forcing.
4 In:
import ewatercycle.forcing
import ewatercycle.models
import ewatercycle.parameter_sets
parameter_set = ewatercycle.parameter_sets.available_parameter_sets(
target_model="wflow"
)["wflow_rhine_sbm_nc"]
forcing = ewatercycle.forcing.sources["WflowForcing"](
directory=str(parameter_set.directory),
start_time="1991-01-01T00:00:00Z",
end_time="1991-12-31T00:00:00Z",
shape=None,
# Additional information about the external forcing data needed for the model configuration
netcdfinput="inmaps.nc",
Precipitation="/P",
EvapoTranspiration="/PET",
Temperature="/TEMP",
)
5 In:
model_instance = ewatercycle.models.Wflow(
version="2020.1.3", parameter_set=parameter_set, forcing=forcing
)
WARNING:ewatercycle_wflow.model:Config file from parameter set is missing API section, adding section
WARNING:ewatercycle_wflow.model:Config file from parameter set is missing RiverRunoff option in API section, added it with value '2, m/s option'
In some specific cases the parameter set (e.g. for marrmot) or the forcing (e.g. when it is already included in the parameter set) is not needed.
Most models have a variety of parameters that can be set. An opiniated subset of these parameters is exposed through the eWaterCycle API. We focus on those settings that are relevant from a scientific point of view and prefer to hide technical settings. These parameters and their default values can be inspected as follows:
6 In:
model_instance.parameters
6 Out:
dict_items([('start_time', '1991-01-01T00:00:00Z'), ('end_time', '1991-12-31T00:00:00Z')])
The start date and end date are automatically set based on the forcing data.
Alternative values for each of these parameters can be passed on to the setup function:
7 In:
cfg_file, cfg_dir = model_instance.setup(
end_time="1991-12-15T00:00:00Z",
# use `cfg_dir="/path/to/output_dir"` to specify the output directory
)
The setup
function does the following:
Create a config directory which serves as the current working directory for the mode instance
Creates a configuration file in this directory based on the settings
Starts a container with the requested model version and access to the forcing and parameter sets.
Input is mounted read-only, the working directory is mounted read-write (if a model cannot cope with inputs outside the working directory, the input will be copied).
Setup will complain about incompatible model version, parameter_set, and forcing.
After setup
but before initialize
everything is good-to-go, but nothing has been done yet. This is an opportunity to inspect the generated configuration file, and make any changes manually that could not be done through the setup method.
To modify the config file: print the path, open it in an editor, and save:
7 In:
print(cfg_file)
/home/bart/ewatercycle/output/wflow_20240312_151922/wflow_ewatercycle.ini
Once you’re happy with the setup, it is time to initialize the model. You’ll have to pass in the config file, even if you’ve not made any changes:
8 In:
model_instance.initialize(cfg_file) # for some models, this step can take some time
Running (and interacting with) a model
A model instance can be controlled by calling functions for running a single timestep (update
), setting variables, and getting variables. Besides the rather lowlevel BMI functions like get_value
and set_value
, we also added convenience functions such as get_value_as_xarray
, get_value_at_coords
, time_as_datetime
, and time_as_isostr
. These make it even more pleasant to interact with the model.
For example, to run our model instance from start to finish, fetching the value of variable discharge
at the location of a grdc station:
9 In:
grdc_latitude = 51.756918
grdc_longitude = 6.395395
10 In:
output = []
while model_instance.time < model_instance.end_time:
model_instance.update()
discharge = model_instance.get_value_at_coords(
"RiverRunoff", lon=[grdc_longitude], lat=[grdc_latitude]
)[0]
output.append(discharge)
# Here you could do whatever you like, e.g. update soil moisture values before doing the next timestep.
print(
model_instance.time_as_isostr,
end="\r",
flush=True,
) # "\r" clears the output before printing the next timestamp
1991-12-15T00:00:00Z
We can also get the entire model field at a single time step. To simply plot it:
11 In:
model_instance.get_value_as_xarray("RiverRunoff").plot()
11 Out:
<matplotlib.collections.QuadMesh at 0x7f82e83849d0>
To get the RiverRunoff at certain location
12 In:
model_instance.get_value_at_coords("RiverRunoff", lat=[50.0], lon=[8.05])
12 Out:
array([1850.8435], dtype=float32)
If you want to know which variables are available, you can use
13 In:
model_instance.output_var_names
13 Out:
('RiverRunoff',)
Destroying the model
A model instance running in a container can take up quite a bit of resources on the system. When you’re done with an experiment, it is good practice to always finalize the model. This will make sure the model properly performs any tear-down tasks and eventually the container will be destroyed.
14 In:
model_instance.finalize()
Observations
eWaterCycle also includes utilities to easily load observations. Currently, eWaterCycle systems provide access to GRDC and USGS data, and we’re hoping to expand this in the future.
15 In:
import ewatercycle.observation.grdc
To load GRDC station data:
16 In:
grdc_station_id = "6335020"
observations, metadata = ewatercycle.observation.grdc.get_grdc_data(
station_id=grdc_station_id,
start_time="1990-01-01T00:00:00Z", # or: model_instance.start_time_as_isostr
end_time="1990-12-15T00:00:00Z",
column="GRDC",
)
observations.head()
16 Out:
GRDC | |
---|---|
time | |
1990-01-01 | 2200.0 |
1990-01-02 | 1990.0 |
1990-01-03 | 1840.0 |
1990-01-04 | 1720.0 |
1990-01-05 | 1620.0 |
Since not all GRDC stations are complete, some information is stored in metadata to inform you about the data.
17 In:
print(metadata)
{'grdc_file_name': '/home/bart/ewatercycle/grdc-observations/6335020_Q_Day.Cmd.txt', 'id_from_grdc': 6335020, 'file_generation_date': '2019-03-27', 'river_name': 'RHINE RIVER', 'station_name': 'REES', 'country_code': 'DE', 'grdc_latitude_in_arc_degree': 51.756918, 'grdc_longitude_in_arc_degree': 6.395395, 'grdc_catchment_area_in_km2': 159300.0, 'altitude_masl': 8.0, 'dataSetContent': 'MEAN DAILY DISCHARGE (Q)', 'units': 'm³/s', 'time_series': '1814-11 - 2016-12', 'no_of_years': 203, 'last_update': '2018-05-24', 'nrMeasurements': 73841, 'UserStartTime': '1990-01-01T00:00:00Z', 'UserEndTime': '1990-12-15T00:00:00Z', 'nrMissingData': 0}
Analysis
To easily analyse model output, eWaterCycle also includes an analysis
module.
18 In:
import ewatercycle.analysis
For example, we will plot a hydrograph of the model run and GRDC observations. To this end, we combine the two timeseries in a single dataframe
20 In:
combined_discharge = observations
combined_discharge["wflow"] = output
21 In:
ewatercycle.analysis.hydrograph(
discharge=combined_discharge,
reference="GRDC",
)
21 Out:
(<Figure size 1000x1000 with 2 Axes>,
(<Axes: title={'center': 'Hydrograph'}, xlabel='time', ylabel='Discharge (m$^3$ s$^{-1}$)'>,
<Axes: >))
In: