Welcome to ewatercycle’s documentation!
The eWaterCycle Python package brings together many components from the eWaterCycle project. An overall goal of this project is to make hydrological modelling fully reproducible, open, and FAIR.
Modelled after PyMT, it enables interactively running a model from a Python environment like so:
from ewatercycle.models import Wflow
model = Wflow(version="2020.1.1", parameter_set=example_parameter_set, forcing=example_forcing)
cfg_file, cfg_dir = model.setup(end_time="2020-01-01T00:00:00Z")
model.initialize(cfg_file)
output = []
while model.time < model.end_time:
    model.update()
    discharge = model.get_value_at_coords("RiverRunoff", lat=[52.3], lon=[5.2])
    output.append(discharge)
To learn how to use the package, see the User guide and example pages.
Typically the eWaterCycle platform is deployed on a system that can be accessed through the browser via JupyterHub, and comes preconfigured with readily available parameter sets, meteorological forcing data, model images, etcetera. This makes it possible for researchers to quickly run an experiment without the hassle of installing a model or creating suitable input data. To learn more about the system setup, read our System setup page.
In general eWaterCycle tries to strike a balance between making it easy to use standard available elements of an experiment (datasets, models, analysis algorithms), and supplying custom elements. This does mean that a simple use case sometimes requires slightly more lines of code than strictly necessary, for the sake of making it easy to adapt this code to more complex and/or custom use cases.
Glossary
To avoid miscommunication, here we define explicitly what we mean by some terms that are commonly used throughout this documentation.
Experiment: A notebook running one or more hydrological models and producing a scientific result.
Model: Software implementation of an algorithm. Note this excludes data required for this model.
Forcing: all time dependent data needed to run a model, and that is not impacted by the model.
Model Parameters: fixed parameters (depth of river, land use, irrigation channels, dams). Considered constant during a model run.
Parameter Set: File based collection of parameters for a certain model, resolution, and possibly area.
Model instance: single running instance of a model, including all data required, and with a current state.
1 In:
# Suppress distracting outputs in these examples
import logging
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
logger = logging.getLogger("esmvalcore")
logger.setLevel(logging.WARNING)
User guide
This user manual will explain how the eWaterCycle Python package can be used to perform hydrological experiments. We will walk through the following chapters:
parameter sets
forcing data
model instances
using observations
analysis
Each of these chapters corresponds to a so-called “subpackage” of the eWaterCycle Python package. Before we continue, however, we will briefly explain the configuration file.
Configuration
To be able to find all needed data and models, eWaterCycle comes with a configuration object. This configuration contains system settings for eWaterCycle (which container technology to use, where the data is located, etc.). In general these should not need to be changed by the user for a specific experiment, and ideally a user would never need to touch this configuration on a properly managed system. However, it is good to know that it is there.
You can see the default configuration on your system like so:
2 In:
from ewatercycle import CFG
CFG
2 Out:
Config({'container_engine': 'singularity',
'ewatercycle_config': PosixPath('/home/fakhereh/.config/ewatercycle/ewatercycle.yaml'),
'grdc_location': PosixPath('/projects/0/wtrcycle/GRDC/GRDC_GCOSGTN-H_27_03_2019'),
'output_dir': PosixPath('/scratch-shared/ewatercycle'),
'parameter_sets': {'lisflood_fraser': {'config': 'lisflood_fraser/settings_lat_lon-Run.xml',
'directory': 'lisflood_fraser',
'doi': 'N/A',
'supported_model_versions': {'20.10'},
'target_model': 'lisflood'},
'lisflood_global-masked_01degree': {'config': 'lisflood_global-masked_01degree/settings_lisflood_ERA5.xml',
'directory': 'lisflood_global-masked_01degree',
'supported_model_versions': {'20.10'},
'target_model': 'lisflood'},
'pcrglobwb_merrimack_05min': {'config': 'pcrglobwb_global/merrimack_05min_era5.ini',
'directory': 'pcrglobwb_global',
'doi': 'https://doi.org/10.5281/zenodo.1045339',
'supported_model_versions': {'setters'},
'target_model': 'pcrglobwb'},
'pcrglobwb_rhine_05min': {'config': 'pcrglobwb_global/rhine_05min_era5.ini',
'directory': 'pcrglobwb_global',
'doi': 'https://doi.org/10.5281/zenodo.1045339',
'supported_model_versions': {'setters'},
'target_model': 'pcrglobwb'},
'pcrglobwb_rhinemeuse_30min': {'config': 'pcrglobwb_rhinemeuse_30min/setup_natural_test.ini',
'directory': 'pcrglobwb_rhinemeuse_30min',
'doi': 'https://doi.org/10.5281/zenodo.1045339',
'supported_model_versions': {'setters'},
'target_model': 'pcrglobwb'},
'wflow_Doring_ERA5-calibrated': {'config': 'wflow_Doring_ERA5-calibrated/wflow_sbm_era5.ini',
'directory': 'wflow_Doring_ERA5-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Great-Kei_ERA5-calibrated': {'config': 'wflow_Great-Kei_ERA5-calibrated/wflow_sbm_era5.ini',
'directory': 'wflow_Great-Kei_ERA5-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Great-Kei_ERA_Interim-calibrated': {'config': 'wflow_Great-Kei_ERA_Interim-calibrated/wflow_sbm_era-interim.ini',
'directory': 'wflow_Great-Kei_ERA_Interim-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Merrimack_ERA5-calibrated': {'config': 'wflow_Merrimack_ERA5-calibrated/wflow_sbm_era5.ini',
'directory': 'wflow_Merrimack_ERA5-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Merrimack_ERA_Interim-calibrated': {'config': 'wflow_Merrimack_ERA_Interim-calibrated/wflow_sbm_era-interim.ini',
'directory': 'wflow_Merrimack_ERA_Interim-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Meuse_ERA5-calibrated': {'config': 'wflow_Meuse_ERA5-calibrated/wflow_sbm_era5.ini',
'directory': 'wflow_Meuse_ERA5-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Meuse_ERA_Interim-calibrated': {'config': 'wflow_Meuse_ERA_Interim-calibrated/wflow_sbm_era-interim.ini',
'directory': 'wflow_Meuse_ERA_Interim-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Rhine_ERA5-calibrated': {'config': 'wflow_Rhine_ERA5-calibrated/wflow_sbm_era5.ini',
'directory': 'wflow_Rhine_ERA5-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_Savannah_ERA-Interim-calibrated': {'config': 'wflow_Savannah_ERA-Interim-calibrated/wflow_sbm_era-interim.ini',
'directory': 'wflow_Savannah_ERA-Interim-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_merrimack_techpaper': {'config': 'wflow_merrimack_techpaper/wflow_sbm_era5_test.ini',
'directory': 'wflow_merrimack_techpaper',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_rhine_ERA_Interim-calibrated': {'config': 'wflow_rhine_ERA_Interim-calibrated/wflow_sbm_era-interim.ini',
'directory': 'wflow_rhine_ERA_Interim-calibrated',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'},
'wflow_rhine_sbm_nc': {'config': 'wflow_rhine_sbm_nc/wflow_sbm_NC.ini',
'directory': 'wflow_rhine_sbm_nc',
'doi': 'N/A',
'supported_model_versions': {'2020.1.1',
'2020.1.2'},
'target_model': 'wflow'}},
'parameterset_dir': PosixPath('/projects/0/wtrcycle/parameter-sets'),
'singularity_dir': PosixPath('/projects/0/wtrcycle/singularity-images')})
Note: a path on the local filesystem is always denoted as “dir” (short for directory), instead of folder, path, or location. Especially location can be confusing in the context of geospatial modeling.
It is also possible to store and load custom configuration files. For more information, see System setup.
Parameter sets
Parameter sets are an essential part of many hydrological models, and for the eWaterCycle package as well.
3 In:
import ewatercycle.parameter_sets
The default system setup includes a number of example parameter sets that can be used directly. System administrators can also add parameter sets that are globally available to all users. In the future, we’re hoping to add functionality to fetch new parameter sets using a DOI as well.
To see the available parameter sets:
4 In:
ewatercycle.parameter_sets.available_parameter_sets()
4 Out:
('lisflood_fraser',
'pcrglobwb_rhinemeuse_30min',
'wflow_rhine_sbm_nc',
'wflow_Rhine_ERA5-calibrated',
'wflow_Great-Kei_ERA5-calibrated',
'wflow_Doring_ERA5-calibrated',
'wflow_Merrimack_ERA5-calibrated',
'wflow_Meuse_ERA5-calibrated',
'wflow_Savannah_ERA-Interim-calibrated',
'lisflood_global-masked_01degree',
'pcrglobwb_merrimack_05min',
'pcrglobwb_rhine_05min',
'wflow_merrimack_techpaper')
Since most parameter sets are model specific, you can filter the results as well:
5 In:
ewatercycle.parameter_sets.available_parameter_sets(target_model="wflow")
5 Out:
('wflow_rhine_sbm_nc',
'wflow_Rhine_ERA5-calibrated',
'wflow_Great-Kei_ERA5-calibrated',
'wflow_Doring_ERA5-calibrated',
'wflow_Merrimack_ERA5-calibrated',
'wflow_Meuse_ERA5-calibrated',
'wflow_Savannah_ERA-Interim-calibrated',
'wflow_merrimack_techpaper')
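The filtering above can be pictured as a plain selection over the parameter_sets mapping shown in the configuration output. The following is a minimal sketch and not the package's implementation: filter_by_target_model is a hypothetical helper, and the dictionary is a hand-copied excerpt of the configuration.

```python
# Hand-copied excerpt of the 'parameter_sets' mapping from the configuration
# output shown earlier; only the 'target_model' key is kept.
PARAMETER_SETS = {
    "lisflood_fraser": {"target_model": "lisflood"},
    "pcrglobwb_rhinemeuse_30min": {"target_model": "pcrglobwb"},
    "wflow_rhine_sbm_nc": {"target_model": "wflow"},
    "wflow_merrimack_techpaper": {"target_model": "wflow"},
}


def filter_by_target_model(parameter_sets, target_model=None):
    """Hypothetical helper: names of sets for target_model (all names if None)."""
    return tuple(
        name
        for name, info in parameter_sets.items()
        if target_model is None or info["target_model"] == target_model
    )


print(filter_by_target_model(PARAMETER_SETS, target_model="wflow"))
```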
Once you have found a suitable parameter set, you can load it and see some more details:
6 In:
parameter_set = ewatercycle.parameter_sets.get_parameter_set("wflow_rhine_sbm_nc")
print(parameter_set)
Parameter set
-------------
name=wflow_rhine_sbm_nc
directory=/gpfs/work1/0/wtrcycle/parameter-sets/wflow_rhine_sbm_nc
config=/gpfs/work1/0/wtrcycle/parameter-sets/wflow_rhine_sbm_nc/wflow_sbm_NC.ini
doi=N/A
target_model=wflow
supported_model_versions={'2020.1.1', '2020.1.2'}
Or you can access individual attributes of the parameter set:
7 In:
parameter_set.supported_model_versions
7 Out:
{'2020.1.1', '2020.1.2'}
Should you wish to configure your own parameter set (e.g. for PCRGlobWB in this case), this is also possible:
8 In:
custom_parameter_set = ewatercycle.parameter_sets.ParameterSet(
    name="custom_parameter_set",
    directory="~/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min",
    config="~/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini",
    target_model="pcrglobwb",
    doi="https://doi.org/10.5281/zenodo.1045339",
    supported_model_versions={"setters"},
)
As you can see, an eWaterCycle parameter set is defined fully by a directory and a configuration file. The configuration file typically informs the model about the structure of the parameter set (e.g. “what is the filename of the land use data”). It is possible to change these settings later, when setting up the model.
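To make the "directory plus configuration file" idea concrete, here is a small standalone sketch using only the Python standard library. It does not use eWaterCycle, and the file name, section name, and option name are invented for illustration; real model configuration files have their own layout.

```python
import configparser
import tempfile
from pathlib import Path

# Build a throwaway "parameter set": a directory plus one config file.
# The section and option names below are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp) / "my_parameter_set"
    directory.mkdir()
    config = directory / "settings.ini"
    config.write_text("[inputs]\nland_use = landuse.map\n")

    parser = configparser.ConfigParser()
    parser.read(config)
    # The config file tells the model which file holds the land use data,
    # relative to the parameter set directory.
    land_use_file = directory / parser["inputs"]["land_use"]
    relative_name = land_use_file.relative_to(directory).as_posix()

print(relative_name)
```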
Forcing data
eWaterCycle can load or generate forcing data for a model using the forcing module.
8 In:
import ewatercycle.forcing
Existing forcing from external source
We first show how existing forcing data can be loaded with eWaterCycle. The wflow example parameter set already includes forcing data that was generated manually by the scientists at Deltares.
9 In:
forcing = ewatercycle.forcing.load_foreign(
    directory=str(parameter_set.directory),
    target_model="wflow",
    start_time="1991-01-01T00:00:00Z",
    end_time="1991-12-31T00:00:00Z",
    shape=None,
    forcing_info=dict(
        # Additional information about the external forcing data needed for the model configuration
        netcdfinput="inmaps.nc",
        Precipitation="/P",
        EvapoTranspiration="/PET",
        Temperature="/TEMP",
    ),
)
print(forcing)
Forcing data for Wflow
----------------------
Directory: /gpfs/work1/0/wtrcycle/parameter-sets/wflow_rhine_sbm_nc
Start time: 1991-01-01T00:00:00Z
End time: 1991-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
- netcdfinput: inmaps.nc
- Precipitation: /P
- Temperature: /TEMP
- EvapoTranspiration: /PET
- Inflow: None
As you can see, the forcing consists of a generic part which is the same for all eWaterCycle models, and a model-specific part (forcing_info). If you’re familiar with wflow, you might recognize that the model-specific settings map directly to wflow configuration settings.
Generating forcing data
In most cases, you will not have access to tailor-made forcing data, and manually pre-processing existing datasets can be quite a pain. eWaterCycle includes a forcing generator that can do all the required steps to go from the available datasets (ERA5, ERA-Interim, etc) to whatever format the models require. This is done through ESMValTool recipes. For some models (e.g. lisflood) additional computations are done, as some steps require data and/or code that is not available to ESMValTool.
Apart from some standard parameters (start time, datasets, etc.), the forcing generator sometimes requires additional model-specific options. For our wflow example case, we need to pass the DEM file to the ESMValTool recipe as well. All model-specific options are listed in the API documentation.
ESMValTool configuration
As eWaterCycle relies on ESMValTool for processing forcing data, configuration for forcing is mostly deferred to the ESMValTool configuration file. Which ESMValTool configuration file to use can be specified in the system setup.
10 In:
forcing = ewatercycle.forcing.generate(
    target_model="wflow",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-01-31T00:00:00Z",
    shape="~/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp",
    model_specific_options={
        "dem_file": f"{parameter_set.directory}/staticmaps/wflow_dem.map",
    },
)
print(forcing)
{'auxiliary_data_dir': PosixPath('/projects/0/wtrcycle/comparison/recipes_auxiliary_datasets'),
'compress_netcdf': False,
'config_developer_file': None,
'config_file': PosixPath('/home/fakhereh/.esmvaltool/config-user.yml'),
'drs': {'CMIP5': 'default', 'CMIP6': 'default'},
'exit_on_warning': False,
'extra_facets_dir': (),
'log_level': 'info',
'max_parallel_tasks': 1,
'output_dir': PosixPath('/scratch-shared/ewatercycle/recipe_wflow_20211129_150429'),
'output_file_type': 'png',
'plot_dir': PosixPath('/scratch-shared/ewatercycle/recipe_wflow_20211129_150429/plots'),
'preproc_dir': PosixPath('/scratch-shared/ewatercycle/recipe_wflow_20211129_150429/preproc'),
'profile_diagnostic': False,
'remove_preproc_dir': True,
'rootpath': {'CMIP5': [PosixPath('/home/fakhereh/cmip5_inputpath1'),
PosixPath('/home/fakhereh/cmip5_inputpath2')],
'CMIP6': [PosixPath('/home/fakhereh/cmip6_inputpath1'),
PosixPath('/home/fakhereh/cmip6_inputpath2')],
'OBS6': [PosixPath('/projects/0/wtrcycle/comparison/obs6')],
'RAWOBS': [PosixPath('/projects/0/wtrcycle/comparison/rawobs')],
'default': [PosixPath('/projects/0/wtrcycle/comparison')]},
'run_dir': PosixPath('/scratch-shared/ewatercycle/recipe_wflow_20211129_150429/run'),
'save_intermediary_cubes': False,
'work_dir': PosixPath('/scratch-shared/ewatercycle/recipe_wflow_20211129_150429/work'),
'write_netcdf': True,
'write_plots': True}
Shapefile /gpfs/home2/fakhereh/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp is not in forcing directory /gpfs/scratch1/shared/ewatercycle/recipe_wflow_20211129_150429/work/wflow_daily/script. So, it won't be saved in /gpfs/scratch1/shared/ewatercycle/recipe_wflow_20211129_150429/work/wflow_daily/script/ewatercycle_forcing.yaml.
Forcing data for Wflow
----------------------
Directory: /gpfs/scratch1/shared/ewatercycle/recipe_wflow_20211129_150429/work/wflow_daily/script
Start time: 1990-01-01T00:00:00Z
End time: 1990-01-31T00:00:00Z
Shapefile: /gpfs/home2/fakhereh/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp
Additional information for model config:
- netcdfinput: wflow_ERA5_Rhine_1990_1990.nc
- Precipitation: /pr
- Temperature: /tas
- EvapoTranspiration: /pet
- Inflow: None
Generated forcing is automatically saved to the ESMValTool output directory. A yaml file is stored there as well, such that you can easily reload the forcing later without having to generate it again.
ewatercycle_forcing.yaml:
!WflowForcing
start_time: '1990-01-01T00:00:00Z'
end_time: '1990-12-31T00:00:00Z'
shape:
netcdfinput: wflow_ERA5_Rhine_1990_1990.nc
Precipitation: /pr
EvapoTranspiration: /pet
Temperature: /tas
Inflow:
11 In:
reloaded_forcing = ewatercycle.forcing.load(
    directory="/scratch-shared/ewatercycle/recipe_wflow_20211129_103921/work/wflow_daily/script"
)
Models
12 In:
import ewatercycle.models
eWaterCycle currently integrates several hydrological models, including wflow, lisflood, PCRGlobWB, and marrmot, and we’re expecting to add more models soon. The process for adding new models is documented in Adding models.
Model versions
To help with reproducibility the version of a model must always be specified when creating a model instance. The available versions can be seen like so:
13 In:
import ewatercycle.models
ewatercycle.models.Wflow.available_versions
13 Out:
('2020.1.1', '2020.1.2')
Creating, setting up, and initializing a model instance
The way models are created, set up, and initialized matches PyMT as much as possible. There are three steps:
instantiate (create a python object that represents the model)
setup (create a container with the right model, directories, and configuration files)
initialize (start the model inside the container)
To a new user, these steps can be confusing as they seem to be related to “starting a model”. However, you will see that there are some useful things that we can do in between each of these steps. As a side effect, splitting these steps also makes it easier to run a lot of models in parallel (e.g. for calibration). Experience tells us that you will quickly get used to it.
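The three steps can be illustrated with a toy class. This is only a sketch of the life cycle under simplifying assumptions: ToyModel is a hypothetical stand-in that runs no container, uses integer time steps, and returns made-up file names. It is not an eWaterCycle class.

```python
class ToyModel:
    """Hypothetical stand-in mimicking the three-step life cycle."""

    def __init__(self, version):
        # Step 1, instantiate: only remember what was asked for.
        self.version = version
        self.state = "instantiated"
        self.time = None
        self.end_time = None

    def setup(self, end_time=10):
        # Step 2, setup: fix the settings and hand back a (fake) config
        # file name and config directory.
        self.end_time = end_time
        self.state = "set up"
        return f"toy_{self.version}.ini", "toy_work_dir"

    def initialize(self, cfg_file):
        # Step 3, initialize: start the model from the config file.
        self.cfg_file = cfg_file
        self.time = 0
        self.state = "initialized"

    def update(self):
        # Advance the model by one time step.
        self.time += 1


model = ToyModel(version="2020.1.2")
cfg_file, cfg_dir = model.setup(end_time=3)
# Between setup and initialize the config file could still be inspected or edited.
model.initialize(cfg_file)
while model.time < model.end_time:
    model.update()
print(model.state, model.time)
```

Because nothing is started until initialize is called, several instances can be created and set up first and then started together, which is the property that helps with running many models in parallel.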
When a model instance is created, we have to specify the version and pass in a suitable parameter set and forcing.
14 In:
model_instance = ewatercycle.models.Wflow(
    version="2020.1.2", parameter_set=parameter_set, forcing=forcing
)
Config file from parameter set is missing API section, adding section
Config file from parameter set is missing RiverRunoff option in API section, added it with value '2, m/s option'
In some specific cases the parameter set (e.g. for marrmot) or the forcing (e.g. when it is already included in the parameter set) is not needed.
Most models have a variety of parameters that can be set. An opinionated subset of these parameters is exposed through the eWaterCycle API. We focus on those settings that are relevant from a scientific point of view and prefer to hide technical settings. These parameters and their default values can be inspected as follows:
15 In:
model_instance.parameters
15 Out:
[('start_time', '1990-01-01T00:00:00Z'), ('end_time', '1990-01-31T00:00:00Z')]
The start date and end date are automatically set based on the forcing data.
Alternative values for each of these parameters can be passed on to the setup function:
16 In:
cfg_file, cfg_dir = model_instance.setup(end_time="1990-12-15T00:00:00Z")
Running /projects/0/wtrcycle/singularity-images/ewatercycle-wflow-grpc4bmi_2020.1.2.sif singularity container on port 35805
The setup function does the following:
Creates a config directory which serves as the current working directory for the model instance
Creates a configuration file in this directory based on the settings
Starts a container with the requested model version and access to the forcing and parameter sets.
Input is mounted read-only, the working directory is mounted read-write (if a model cannot cope with inputs outside the working directory, the input will be copied).
Setup will complain about incompatible model version, parameter_set, and forcing.
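The kind of consistency check that setup performs can be sketched as follows. This is an illustration only: ToyParameterSet and check_compatibility are hypothetical names, and the actual checks and error messages in eWaterCycle may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ToyParameterSet:
    """Hypothetical stand-in carrying two attributes of a parameter set."""

    target_model: str
    supported_model_versions: set = field(default_factory=set)


def check_compatibility(model_name, version, parameter_set):
    """Raise ValueError when model, version, and parameter set do not match."""
    if parameter_set.target_model != model_name:
        raise ValueError(
            f"Parameter set targets {parameter_set.target_model!r}, not {model_name!r}"
        )
    if version not in parameter_set.supported_model_versions:
        raise ValueError(
            f"Version {version!r} not in {sorted(parameter_set.supported_model_versions)}"
        )


wflow_set = ToyParameterSet("wflow", {"2020.1.1", "2020.1.2"})
check_compatibility("wflow", "2020.1.2", wflow_set)  # compatible, passes silently

try:
    check_compatibility("wflow", "2019.1", wflow_set)
    message = ""
except ValueError as err:
    message = str(err)
print(message)
```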
After setup but before initialize everything is good-to-go, but nothing has been done yet. This is an opportunity to inspect the generated configuration file, and make any changes manually that could not be done through the setup method.
To modify the config file: print the path, open it in an editor, and save:
17 In:
model_instance.work_dir
17 Out:
PosixPath('/gpfs/scratch1/shared/ewatercycle/wflow_20211129_150535')
18 In:
print(cfg_file)
/gpfs/scratch1/shared/ewatercycle/wflow_20211129_150535/wflow_ewatercycle.ini
Once you’re happy with the setup, it is time to initialize the model. You’ll have to pass in the config file, even if you’ve not made any changes:
19 In:
model_instance.initialize(cfg_file) # for some models, this step can take some time
Running (and interacting with) a model
A model instance can be controlled by calling functions for running a single timestep (update), setting variables, and getting variables. Besides the rather low-level BMI functions like get_value and set_value, we also added convenience functions such as get_value_as_xarray, get_value_at_coords, time_as_datetime, and time_as_isostr. These make it even more pleasant to interact with the model.
For example, to run our model instance from start to finish, fetching the value of variable discharge at the location of a GRDC station:
20 In:
grdc_latitude = 51.756918
grdc_longitude = 6.395395
21 In:
output = []
while model_instance.time < model_instance.end_time:
    model_instance.update()
    discharge = model_instance.get_value_at_coords(
        "RiverRunoff", lon=[grdc_longitude], lat=[grdc_latitude]
    )[0]
    output.append(discharge)
    # Here you could do whatever you like, e.g. update soil moisture values before doing the next timestep.
    print(
        model_instance.time_as_isostr, end="\r"
    )  # "\r" moves the cursor back to the start of the line, so the next timestamp overwrites this one
1990-12-15T00:00:00Z
We can also get the entire model field at a single time step. To simply plot it:
22 In:
model_instance.get_value_as_xarray("RiverRunoff").plot()
22 Out:
<matplotlib.collections.QuadMesh at 0x1460a949f100>

If you want to know which variables are available, you can use
23 In:
model_instance.output_var_names
23 Out:
('RiverRunoff',)
Destroying the model
A model instance running in a container can take up quite a bit of resources on the system. When you’re done with an experiment, it is good practice to always finalize the model. This will make sure the model properly performs any tear-down tasks and eventually the container will be destroyed.
24 In:
model_instance.finalize()
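One way to make sure finalize is always called, even when a run fails halfway, is a try/finally block. The sketch below uses a hypothetical ToyModel stand-in that deliberately fails on its third step; the pattern itself is plain Python and not specific to eWaterCycle.

```python
class ToyModel:
    """Hypothetical stand-in with only the methods needed here."""

    def __init__(self):
        self.time = 0
        self.finalized = False

    def update(self):
        self.time += 1
        if self.time == 3:
            # Simulate a failure halfway through the run.
            raise RuntimeError("model crashed")

    def finalize(self):
        self.finalized = True


model = ToyModel()
try:
    for _ in range(5):
        model.update()
except RuntimeError as err:
    print(f"Run stopped early: {err}")
finally:
    # Runs whether or not the loop failed, so resources are always released.
    model.finalize()

print(model.finalized)
```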
Observations
eWaterCycle also includes utilities to easily load observations. Currently, eWaterCycle systems provide access to GRDC and USGS data, and we’re hoping to expand this in the future.
26 In:
import ewatercycle.observation.grdc
To load GRDC station data:
27 In:
grdc_station_id = "6335020"
observations, metadata = ewatercycle.observation.grdc.get_grdc_data(
    station_id=grdc_station_id,
    start_time="1990-01-01T00:00:00Z",  # or: model_instance.start_time_as_isostr
    end_time="1990-12-15T00:00:00Z",
    column="GRDC",
)
observations.head()
GRDC station 6335020 is selected. The river name is: RHINE RIVER.The coordinates are: (51.756918, 6.395395).The catchment area in km2 is: 159300.0. There are 0 missing values during 1990-01-01T00:00:00Z_1990-12-15T00:00:00Z at this station. See the metadata for more information.
27 Out:
time | GRDC |
---|---|
1990-01-01 | 2200.0 |
1990-01-02 | 1990.0 |
1990-01-03 | 1840.0 |
1990-01-04 | 1720.0 |
1990-01-05 | 1620.0 |
Since not all GRDC stations are complete, some information is stored in metadata to inform you about the data.
28 In:
print(metadata)
{'grdc_file_name': '/gpfs/work1/0/wtrcycle/GRDC/GRDC_GCOSGTN-H_27_03_2019/6335020_Q_Day.Cmd.txt', 'id_from_grdc': 6335020, 'file_generation_date': '2019-03-27', 'river_name': 'RHINE RIVER', 'station_name': 'REES', 'country_code': 'DE', 'grdc_latitude_in_arc_degree': 51.756918, 'grdc_longitude_in_arc_degree': 6.395395, 'grdc_catchment_area_in_km2': 159300.0, 'altitude_masl': 8.0, 'dataSetContent': 'MEAN DAILY DISCHARGE (Q)', 'units': 'm³/s', 'time_series': '1814-11 - 2016-12', 'no_of_years': 203, 'last_update': '2018-05-24', 'nrMeasurements': 73841, 'UserStartTime': '1990-01-01T00:00:00Z', 'UserEndTime': '1990-12-15T00:00:00Z', 'nrMissingData': 0}
Analysis
To easily analyse model output, eWaterCycle also includes an analysis module.
29 In:
import ewatercycle.analysis
For example, we will plot a hydrograph of the model run and GRDC observations. To this end, we combine the two timeseries in a single dataframe:
30 In:
combined_discharge = observations
combined_discharge["wflow"] = output
31 In:
ewatercycle.analysis.hydrograph(
    discharge=combined_discharge,
    reference="GRDC",
)
31 Out:
(<Figure size 720x720 with 2 Axes>,
(<AxesSubplot:title={'center':'Hydrograph'}, xlabel='time', ylabel='Discharge (m$^3$ s$^{-1}$)'>,
<AxesSubplot:>))

System setup
To use the eWaterCycle package you need to set up the system with software and data.
This chapter is for system administrators or Research Software Engineers who need to set up a system for the eWaterCycle platform.
The setup steps:
Conda environment
The eWaterCycle Python package uses a lot of geospatial dependencies which can be installed using the Conda package management system.
Install Conda by using the miniconda installer.
After conda is installed you can install the software dependencies with a conda environment file.
wget https://raw.githubusercontent.com/eWaterCycle/ewatercycle/main/environment.yml
conda install mamba -n base -c conda-forge -y
mamba env create --file environment.yml
conda activate ewatercycle
Do not forget that any terminal or Jupyter kernel should activate the conda environment before the eWaterCycle Python package can be used.
Install eWaterCycle package
The Python package can be installed using pip
pip install ewatercycle
Configure ESMValTool
ESMValTool is used to generate forcing (temperature, precipitation, etc.) files from climate data for hydrological models. The ESMValTool has been installed as a dependency of the package.
See https://docs.esmvaltool.org/en/latest/quickstart/configuration.html for how to configure ESMValTool.
Download climate data
The ERA5 and ERA-Interim data can be used to generate forcings.
ERA5
To download ERA5 data files you can use the era5cli tool.
pip install era5cli
Follow instructions to get access to data.
As an example, the hourly ERA5 data for the years 1990 and 1991 and for variables pr, psl, tas, tasmin, tasmax, tdps, uas, vas, rsds, rsdt and fx orog are downloaded as:
cd <ESMValTool ERA5 raw directory for example /projects/0/wtrcycle/comparison/rawobs/Tier3/ERA5/1>
era5cli hourly --startyear 1990 --endyear 1991 --variables total_precipitation
era5cli hourly --startyear 1990 --endyear 1991 --variables mean_sea_level_pressure
era5cli hourly --startyear 1990 --endyear 1991 --variables 2m_temperature
era5cli hourly --startyear 1990 --endyear 1991 --variables minimum_2m_temperature_since_previous_post_processing
era5cli hourly --startyear 1990 --endyear 1991 --variables maximum_2m_temperature_since_previous_post_processing
era5cli hourly --startyear 1990 --endyear 1991 --variables 2m_dewpoint_temperature
era5cli hourly --startyear 1990 --endyear 1991 --variables 10m_u_component_of_wind
era5cli hourly --startyear 1990 --endyear 1991 --variables 10m_v_component_of_wind
era5cli hourly --startyear 1990 --endyear 1991 --variables surface_solar_radiation_downwards
era5cli hourly --startyear 1990 --endyear 1991 --variables toa_incident_solar_radiation
era5cli hourly --startyear 1990 --endyear 1991 --variables orography
cd -
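Since the download commands above differ only in the variable name, they can also be generated from a list. The sketch below only composes the command lines shown above as strings and prints them; it does not run era5cli. The helper name era5cli_command is hypothetical.

```python
import shlex

# ERA5 variable names, copied from the era5cli commands listed above.
VARIABLES = [
    "total_precipitation",
    "mean_sea_level_pressure",
    "2m_temperature",
    "minimum_2m_temperature_since_previous_post_processing",
    "maximum_2m_temperature_since_previous_post_processing",
    "2m_dewpoint_temperature",
    "10m_u_component_of_wind",
    "10m_v_component_of_wind",
    "surface_solar_radiation_downwards",
    "toa_incident_solar_radiation",
    "orography",
]


def era5cli_command(variable, start_year=1990, end_year=1991):
    """Hypothetical helper: argument list for one hourly download."""
    return [
        "era5cli", "hourly",
        "--startyear", str(start_year),
        "--endyear", str(end_year),
        "--variables", variable,
    ]


commands = [era5cli_command(variable) for variable in VARIABLES]
for command in commands:
    # shlex.join quotes arguments where needed, giving a copy-pasteable line.
    print(shlex.join(command))
```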
The hourly data needs to be converted to daily data using an ESMValTool recipe:
esmvaltool run cmorizers/recipe_era5.yml
ERA-Interim
ERA-Interim has been superseded by ERA5, but could be useful for reproduction studies and for its smaller size. The ERA-Interim data files can be downloaded at https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era-interim
Or you can use the download_era_interim.py script to download ERA-Interim data files. See the first lines of the script for documentation.
The files should be downloaded to the ESMValTool ERA-Interim raw directory, for example /projects/0/wtrcycle/comparison/rawobs/Tier3/ERA-Interim.
The ERA-Interim raw data files need to be cmorized using:
cmorize_obs -o ERA-Interim
Install container engine
In the eWaterCycle package, the hydrological models are run in containers with engines like Singularity or Docker. At least one of Singularity or Docker should be installed.
Installing a container engine requires root permission on the machine.
Singularity
Install Singularity using instructions.
Docker
Install Docker using instructions. Docker should be configured so it can be called without sudo.
Configure eWaterCycle
The eWaterCycle package simplifies the API by reading some of the directories and settings from a configuration file.
The configuration can be set in Python with
import logging
logging.basicConfig(level=logging.INFO)
import ewatercycle
import ewatercycle.parameter_sets
# Which container engine is used to run the hydrological models
ewatercycle.CFG['container_engine'] = 'singularity' # or 'docker'
# If container_engine==singularity then where can the singularity images files (*.sif) be found.
ewatercycle.CFG['singularity_dir'] = './singularity-images'
# Directory in which output of model runs is stored. Each model run will generate a sub directory inside output_dir
ewatercycle.CFG['output_dir'] = './'
# Where can GRDC observation files (<station identifier>_Q_Day.Cmd.txt) be found.
ewatercycle.CFG['grdc_location'] = './grdc-observations'
# Where can parameter sets prepared by the system administrator be found
ewatercycle.CFG['parameterset_dir'] = './parameter-sets'
# Where is the configuration saved or loaded from
ewatercycle.CFG['ewatercycle_config'] = './ewatercycle.yaml'
and then written to disk with
ewatercycle.CFG.save_to_file()
Later it can be loaded by using:
ewatercycle.CFG.load_from_file('./ewatercycle.yaml')
To make the ewatercycle configuration load by default for the current user, it should be copied to ~/.config/ewatercycle/ewatercycle.yaml.
To make the ewatercycle configuration available to all users on the system, it should be copied to /etc/ewatercycle.yaml.
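A lookup over these two locations could be sketched as below. This is an illustration, not the package's loading code: find_config is a hypothetical helper, and the precedence used here (a user file wins over the system-wide file) is an assumption. The demonstration uses temporary stand-in directories rather than the real ~/.config and /etc paths.

```python
import tempfile
from pathlib import Path


def find_config(candidates):
    """Hypothetical helper: first existing file among candidates, or None."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None


# Demonstration with temporary stand-ins for the user and system locations.
with tempfile.TemporaryDirectory() as tmp:
    user_cfg = Path(tmp) / "home" / ".config" / "ewatercycle" / "ewatercycle.yaml"
    system_cfg = Path(tmp) / "etc" / "ewatercycle.yaml"
    system_cfg.parent.mkdir(parents=True)
    system_cfg.write_text("container_engine: singularity\n")

    # Only the system-wide file exists, so that is the one found.
    found_system = find_config([user_cfg, system_cfg]) == system_cfg

    user_cfg.parent.mkdir(parents=True)
    user_cfg.write_text("container_engine: docker\n")
    # Once a user file exists it is found first (assumed precedence).
    found_user = find_config([user_cfg, system_cfg]) == user_cfg

print(found_system, found_user)
```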
Configuration file for Snellius system
Users part of the eWaterCycle project can use the following configurations on the Snellius system of SURF:
container_engine: singularity
singularity_dir: /projects/0/wtrcycle/singularity-images
output_dir: /scratch-shared/ewatercycle
grdc_location: /projects/0/wtrcycle/GRDC/GRDC_GCOSGTN-H_27_03_2019
parameterset_dir: /projects/0/wtrcycle/parameter-sets
The /scratch-shared/ewatercycle output directory will be automatically removed if its content is older than 14 days. If the output directory is missing it can be recreated with
mkdir /scratch-shared/ewatercycle
chgrp wtrcycle /scratch-shared/ewatercycle
chmod 2770 /scratch-shared/ewatercycle
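For readers unfamiliar with the octal mode 2770 used above, the following standalone snippet decodes it with Python's stat module. It only inspects the number and does not touch the file system.

```python
import stat

mode = 0o2770  # the octal value passed to chmod above

# Render the mode the way `ls -l` would show it for a directory.
listing = stat.filemode(stat.S_IFDIR | mode)
print(listing)

# setgid on a directory: new entries inherit the directory's group.
setgid = bool(mode & stat.S_ISGID)
# Group members may read, write, and enter the directory.
group_rwx = (mode & stat.S_IRWXG) == stat.S_IRWXG
# Users outside the owner and the group have no access at all.
others_none = (mode & stat.S_IRWXO) == 0
print(setgid, group_rwx, others_none)
```

Combined with the chgrp command, this means that output written by one member of the wtrcycle group remains accessible to the other members.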
Configuration file for eWaterCycle Jupyter machine
Users can use the following configurations on systems constructed with the eWaterCycle application on SURF Research Cloud:
container_engine: singularity
singularity_dir: /mnt/data/singularity-images
output_dir: /scratch
grdc_location: /mnt/data/GRDC
parameterset_dir: /mnt/data/parameter-sets
Model container images
As hydrological models run in containers, their container images should be made available on the system.
The names of the images can be found in the ewatercycle.models.* classes.
Docker
Docker images will be downloaded with docker pull:
docker pull ewatercycle/lisflood-grpc4bmi:20.10
docker pull ewatercycle/marrmot-grpc4bmi:2020.11
docker pull ewatercycle/pcrg-grpc4bmi:setters
docker pull ewatercycle/wflow-grpc4bmi:2020.1.1
docker pull ewatercycle/wflow-grpc4bmi:2020.1.2
docker pull ewatercycle/wflow-grpc4bmi:2020.1.3
docker pull ewatercycle/hype-grpc4bmi:feb2021
Singularity
Singularity images should be stored in the configured directory (ewatercycle.CFG['singularity_dir']) and can be built from Docker images with:
cd {ewatercycle.CFG['singularity_dir']}
singularity build ewatercycle-lisflood-grpc4bmi_20.10.sif docker://ewatercycle/lisflood-grpc4bmi:20.10
singularity build ewatercycle-marrmot-grpc4bmi_2020.11.sif docker://ewatercycle/marrmot-grpc4bmi:2020.11
singularity build ewatercycle-pcrg-grpc4bmi_setters.sif docker://ewatercycle/pcrg-grpc4bmi:setters
singularity build ewatercycle-wflow-grpc4bmi_2020.1.1.sif docker://ewatercycle/wflow-grpc4bmi:2020.1.1
singularity build ewatercycle-wflow-grpc4bmi_2020.1.2.sif docker://ewatercycle/wflow-grpc4bmi:2020.1.2
singularity build ewatercycle-wflow-grpc4bmi_2020.1.3.sif docker://ewatercycle/wflow-grpc4bmi:2020.1.3
singularity build ewatercycle-hype-grpc4bmi_feb2021.sif docker://ewatercycle/hype-grpc4bmi:feb2021
cd -
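The build commands above follow a fixed naming pattern: the `/` in the Docker repository becomes `-`, and the tag is appended after `_`. As an illustration, the sketch below derives the command from an image reference; the function names are hypothetical and not part of the ewatercycle package.

```python
def singularity_image_name(docker_image):
    """Return the .sif file name for a Docker image reference with a tag."""
    repository, _, tag = docker_image.partition(":")
    if not tag:
        raise ValueError(f"Docker image {docker_image!r} has no tag")
    return f"{repository.replace('/', '-')}_{tag}.sif"


def build_command(docker_image):
    """Return the `singularity build` command line for one Docker image."""
    sif = singularity_image_name(docker_image)
    return f"singularity build {sif} docker://{docker_image}"


# Image references taken from the list above.
for image in [
    "ewatercycle/lisflood-grpc4bmi:20.10",
    "ewatercycle/wflow-grpc4bmi:2020.1.1",
    "ewatercycle/hype-grpc4bmi:feb2021",
]:
    print(build_command(image))
```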
Download example parameter sets
To quickly run the models, it is advised to set up an example parameter set for each model.
ewatercycle.parameter_sets.download_example_parameter_sets()
INFO:ewatercycle.parameter_sets._example:Downloading example parameter set wflow_rhine_sbm_nc to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc...
INFO:ewatercycle.parameter_sets._example:Download complete.
INFO:ewatercycle.parameter_sets._example:Adding parameterset wflow_rhine_sbm_nc to ewatercycle.CFG...
INFO:ewatercycle.parameter_sets._example:Downloading example parameter set pcrglobwb_rhinemeuse_30min to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min...
INFO:ewatercycle.parameter_sets._example:Download complete.
INFO:ewatercycle.parameter_sets._example:Adding parameterset pcrglobwb_rhinemeuse_30min to ewatercycle.CFG...
INFO:ewatercycle.parameter_sets._example:Downloading example parameter set lisflood_fraser to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/lisflood_fraser...
INFO:ewatercycle.parameter_sets._example:Download complete.
INFO:ewatercycle.parameter_sets._example:Adding parameterset lisflood_fraser to ewatercycle.CFG...
INFO:ewatercycle.parameter_sets:3 example parameter sets were downloaded
INFO:ewatercycle.config._config_object:Config written to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/ewatercycle.yaml
INFO:ewatercycle.parameter_sets:Saved parameter sets to configuration file /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/ewatercycle.yaml
Example parameter sets have been downloaded and added to the configuration file.
cat ./ewatercycle.yaml
container_engine: null
grdc_location: None
output_dir: None
parameter_sets:
lisflood_fraser:
config: lisflood_fraser/settings_lat_lon-Run.xml
directory: lisflood_fraser
doi: N/A
supported_model_versions: !!set {'20.10': null}
target_model: lisflood
pcrglobwb_rhinemeuse_30min:
config: pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
directory: pcrglobwb_rhinemeuse_30min
doi: N/A
supported_model_versions: !!set {setters: null}
target_model: pcrglobwb
wflow_rhine_sbm_nc:
config: wflow_rhine_sbm_nc/wflow_sbm_NC.ini
directory: wflow_rhine_sbm_nc
doi: N/A
supported_model_versions: !!set {2020.1.1: null}
target_model: wflow
parameterset_dir: /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets
singularity_dir: None
ewatercycle.parameter_sets.available_parameter_sets()
('lisflood_fraser', 'pcrglobwb_rhinemeuse_30min', 'wflow_rhine_sbm_nc')
parameter_set = ewatercycle.parameter_sets.get_parameter_set('pcrglobwb_rhinemeuse_30min')
print(parameter_set)
Parameter set
-------------
name=pcrglobwb_rhinemeuse_30min
directory=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min
config=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
doi=N/A
target_model=pcrglobwb
supported_model_versions={'setters'}
The parameter_set variable can be passed to a model class constructor.
Prepare other parameter sets
The example parameter sets downloaded in the previous section are nice to show off the platform features but are a bit small.
To perform more advanced experiments, additional parameter sets are needed.
Users could use ewatercycle.parameter_sets.ParameterSet to construct parameter sets themselves.
Alternatively, parameter sets can be made available via ewatercycle.parameter_sets.available_parameter_sets() and ewatercycle.parameter_sets.get_parameter_set() by extending the configuration file (ewatercycle.yaml).
A new parameter set should be added as a key/value pair in the parameter_sets map of the configuration file.
The key should be a unique string on the current system.
The value is a dictionary with the following items:
directory: Location on disk where files of the parameter set are stored. If the path is relative, it is relative to ewatercycle.CFG['parameterset_dir'].
config: Model configuration file which uses files from directory. If the path is relative, it is relative to ewatercycle.CFG['parameterset_dir'].
doi: Persistent identifier of the parameter set. For example, a DOI for a Zenodo record.
target_model: Name of the model that the parameter set can work with.
supported_model_versions: Set of model versions that are supported by this parameter set. If not set, the parameter set is treated as supported by all versions of the model.
For example, the parameter set for PCR-GLOBWB from https://doi.org/10.5281/zenodo.1045339, after downloading and unpacking to /data/pcrglobwb2_input/, could be added with the following config:
pcrglobwb_rhinemeuse_30min:
directory: /data/pcrglobwb2_input/global_30min/
config: /data/pcrglobwb2_input/global_30min/iniFileExample/setup_30min_non-natural.ini
doi: https://doi.org/10.5281/zenodo.1045339
target_model: pcrglobwb
supported_model_versions: !!set {setters: null}
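To make the path rule for directory and config concrete, here is a minimal sketch of how relative paths could be resolved against parameterset_dir. It is an illustration, not the package's implementation: the function name `resolve_parameter_set` and the choice of which keys are treated as required are assumptions.

```python
from pathlib import Path

# Assumption for this sketch: these keys must be present in every entry.
REQUIRED_KEYS = {"directory", "config", "target_model"}


def resolve_parameter_set(entry, parameterset_dir):
    """Return a copy of `entry` with directory and config made absolute."""
    missing = REQUIRED_KEYS - set(entry)
    if missing:
        raise KeyError(f"Missing keys: {sorted(missing)}")
    base = Path(parameterset_dir)
    resolved = dict(entry)
    for key in ("directory", "config"):
        # Joining with `/` keeps an absolute right-hand path unchanged and
        # prefixes a relative one with the base directory.
        resolved[key] = base / Path(entry[key])
    return resolved


entry = {
    "directory": "lisflood_fraser",
    "config": "lisflood_fraser/settings_lat_lon-Run.xml",
    "target_model": "lisflood",
}
print(resolve_parameter_set(entry, "/mnt/data/parameter-sets")["config"])
```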
Download example forcing
To be able to run the Marrmot example notebooks, you need a forcing file.
You can use ewatercycle.forcing.generate() to make it or use an already prepared forcing file.
cd docs/examples
wget https://github.com/wknoben/MARRMoT/raw/master/BMI/Config/BMI_testcase_m01_BuffaloRiver_TN_USA.mat
cd -
Download observation data
Observation data is needed to calculate metrics of the model performance or to plot a hydrograph. The ewatercycle package can use Global Runoff Data Centre (GRDC) or U.S. Geological Survey Water Services (USGS) data.
The GRDC daily data files can be ordered at https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/riverdischarge_node.html.
The GRDC files should be stored in the ewatercycle.CFG['grdc_location'] directory.
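After copying the files, it can be useful to check which stations are actually present in that directory. The sketch below lists station ids from file names. It assumes that the daily files are named `<station id>_Q_Day.Cmd.txt`; that naming is an assumption of this example, as is the helper name, so adjust the suffix if your GRDC delivery differs.

```python
from pathlib import Path

# Assumed naming of GRDC daily discharge files; adjust if yours differ.
GRDC_SUFFIX = "_Q_Day.Cmd.txt"


def list_grdc_station_ids(grdc_location):
    """Return the sorted station ids that have a daily file in the directory."""
    directory = Path(grdc_location)
    return sorted(
        path.name[: -len(GRDC_SUFFIX)]
        for path in directory.glob(f"*{GRDC_SUFFIX}")
    )
```

For example, `list_grdc_station_ids("/mnt/data/GRDC")` would return the ids found in the directory configured for the Jupyter machine above.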
Adding a model
Integrating a new model into the eWaterCycle system involves the following steps:
Create model as subclass of AbstractModel (src/ewatercycle/models/abstract.py)
Import model in src/ewatercycle/models/__init__.py
Add src/ewatercycle/forcing/<model>.py
Register model in src/ewatercycle/forcing/__init__.py:FORCING_CLASSES
Add model to docs/conf.py
Write example notebook
Write tests?
If model needs custom parameter set class add it in src/ewatercycle/parameter_sets/_<model name>.py
Add example parameter set in src/ewatercycle/parameter_sets/__init__.py
Add container image to System setup
Add container image to infrastructure data preparation scripts
We will expand this documentation in due time.
Adding a new version of a model
A model can have different versions. A model version in the eWaterCycle Python package corresponds to the tag of the Docker image and the version in the Singularity container image filename. The version of the container image should preferably be one of the release versions of the model code. Alternatively, the version could be the name of a feature branch or a date.
Parameter sets can also specify which versions of a model they support.
Adding a new version of a model involves the following steps:
Create container image
Create Docker container image named ewatercycle/<model>-grpc4bmi:<version> with grpc4bmi server running as entrypoint
Host Docker container image on Docker Hub
Create Singularity image from Docker with singularity build ./ewatercycle-<model>-grpc4bmi_<version>.sif docker://ewatercycle/<model>-grpc4bmi:<version>
Add to Python package
Add container image to System setup page by editing docs/system_setup.rst
In src/ewatercycle/models/<model>.py, add the new version to the available_versions class property and add support for the new version to the __init__() method
Optionally: Add new version to existing example parameter set or add new parameter set in src/ewatercycle/parameter_sets/_<model>.py:example_parameter_sets()
Add new version to supported parameter sets in local eWaterCycle config file (/etc/ewatercycle.yaml and ~/.config/ewatercycle/ewatercycle.yaml)
Test it out locally
Create pull request and get it merged
Create new release of Python package. Done by package maintainers
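As an illustration of the available_versions step, the sketch below shows the general shape of a version check in a model class. `ExampleModel` is a hypothetical stand-in written for this page; it does not subclass AbstractModel and does not reproduce the package's code.

```python
class ExampleModel:
    """Hypothetical stand-in for a model class, for illustration only."""

    # Adding a new version means extending this tuple with the new image tag.
    available_versions = ("2020.1.1", "2020.1.2", "2020.1.3")

    def __init__(self, version):
        if version not in self.available_versions:
            raise ValueError(
                f"Version {version!r} is not supported, "
                f"choose one of {self.available_versions}"
            )
        self.version = version

    @property
    def docker_image(self):
        """Docker image reference, following the naming convention above."""
        return f"ewatercycle/example-grpc4bmi:{self.version}"


model = ExampleModel(version="2020.1.3")
print(model.docker_image)
```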
Add to platform
For platform developers and deployers.
Add Singularity image to dCache shared folder ewcdcache:/singularity-images/<model>-grpc4bmi_<version>.sif
Add container image to infrastructure repository data preparation scripts
Install version/branch of eWaterCycle Python package with new model version on any running virtual machines
Optionally: Add example parameter set to explorer catalog. The forcing, parameter set and model image should be available on Jupyter server connected to explorer.
Examples
Generate forcing in eWaterCycle with ESMValTool
This notebook shows how to generate forcing data using ERA5 data and ESMValTool hydrological recipes. More information about data, configuration and installation instructions can be found in the System setup in the eWaterCycle documentation.
1 In:
import logging
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
logger = logging.getLogger("esmvalcore")
logger.setLevel(logging.WARNING)
2 In:
import xarray as xr
import ewatercycle.forcing
Wflow
Generate forcing
Forcing for Wflow is created using the ESMValTool recipe. It produces one file that contains three variables: temperature, precipitation, and potential evapotranspiration. You can set the start and end date, and the region. See eWaterCycle documentation for more information.
To download wflow_dem.map, see the instructions.
3 In:
wflow_forcing = ewatercycle.forcing.generate(
    target_model="wflow",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Rhine/Rhine.shp",
    model_specific_options={
        "dem_file": "./wflow_rhine_sbm_nc/staticmaps/wflow_dem.map",
    },
)
{'auxiliary_data_dir': PosixPath('/home/sarah/GitHub/ewatercycle/docs/examples'),
'compress_netcdf': False,
'config_developer_file': None,
'config_file': PosixPath('/home/sarah/.esmvaltool/config-user.yml'),
'drs': {'CMIP5': 'default', 'CMIP6': 'default'},
'exit_on_warning': False,
'log_level': 'debug',
'max_parallel_tasks': 1,
'output_dir': PosixPath('/home/sarah/temp/output'),
'output_file_type': 'png',
'plot_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/plots'),
'preproc_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/preproc'),
'profile_diagnostic': False,
'remove_preproc_dir': True,
'rootpath': {'OBS6': [PosixPath('/home/sarah/temp/ForRecipe')]},
'run_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/run'),
'save_intermediary_cubes': False,
'work_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/work'),
'write_netcdf': True,
'write_plots': True}
7 In:
print(wflow_forcing)
Forcing data for Wflow
----------------------
Directory: /home/sarah/temp/output/recipe_wflow_20210713_095838/work/wflow_daily/script
Start time: 1990-01-01T00:00:00Z
End time: 1990-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
- netcdfinput: wflow_ERA5_Rhine_1990_1990.nc
- Precipitation: /pr
- Temperature: /tas
- EvapoTranspiration: /pet
- Inflow: None
Plot forcing
8 In:
dataset = xr.load_dataset(f"{wflow_forcing.directory}/{wflow_forcing.netcdfinput}")
print(dataset)
for var in ["pr", "tas", "pet"]:
    dataset[var].isel(time=1).plot(cmap="coolwarm", robust=True, size=5)
<xarray.Dataset>
Dimensions: (bnds: 2, lat: 169, lon: 187, time: 365)
Coordinates:
* time (time) datetime64[ns] 1990-01-01T12:00:00 ... 1990-12-31T12:00:00
* lat (lat) float64 52.05 52.02 51.98 51.94 ... 46.0 45.97 45.93 45.89
* lon (lon) float64 5.227 5.264 5.3 5.337 ... 11.94 11.97 12.01 12.05
height float64 2.0
Dimensions without coordinates: bnds
Data variables:
pr (time, lat, lon) float32 0.2794 0.2794 0.2794 ... nan nan nan
time_bnds (time, bnds) datetime64[ns] 1990-01-01 1990-01-02 ... 1991-01-01
lat_bnds (lat, bnds) float64 52.07 52.04 52.04 52.0 ... 45.91 45.91 45.88
lon_bnds (lon, bnds) float64 5.209 5.245 5.245 5.282 ... 12.03 12.03 12.07
tas (time, lat, lon) float32 0.09246 0.07101 0.03317 ... nan nan nan
pet (time, lat, lon) float32 0.5102 0.5103 0.5106 ... nan nan nan
Attributes:
Conventions: CF-1.7
provenance: <?xml version='1.0' encoding='ASCII'?>\n<prov:document xmln...
software: Created with ESMValTool v2.2.0
caption: Forcings for the wflow hydrological model.
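The `time: 365` dimension in the output above corresponds to one value per day between the requested start and end time. The following standard-library sketch shows that arithmetic; the helper names are illustrative and not part of ewatercycle.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(timestamp):
    """Parse a timestamp such as 1990-01-01T00:00:00Z as UTC."""
    return datetime.strptime(timestamp, TIME_FORMAT).replace(tzinfo=timezone.utc)


def daily_steps(start_time, end_time):
    """Number of daily values from start to end, counting both days."""
    return (parse_utc(end_time) - parse_utc(start_time)).days + 1


print(daily_steps("1990-01-01T00:00:00Z", "1990-12-31T00:00:00Z"))  # prints 365
```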



PCRGlobWB
Generate forcing
Forcing for PCRGlobWB is created using the ESMValTool recipe. It produces one file per variable: temperature and precipitation. You can set the start and end date, and the region. See eWaterCycle documentation for more information.
3 In:
pcrglobwb_forcing = ewatercycle.forcing.generate(
    target_model="pcrglobwb",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Rhine/Rhine.shp",
    model_specific_options={
        "start_time_climatology": "1990-01-01T00:00:00Z",
        "end_time_climatology": "1990-01-01T00:00:00Z",
    },
)
{'auxiliary_data_dir': PosixPath('/home/sarah/GitHub/ewatercycle/docs/examples'),
'compress_netcdf': False,
'config_developer_file': None,
'config_file': PosixPath('/home/sarah/.esmvaltool/config-user.yml'),
'drs': {'CMIP5': 'default', 'CMIP6': 'default'},
'exit_on_warning': False,
'log_level': 'debug',
'max_parallel_tasks': 1,
'output_dir': PosixPath('/home/sarah/temp/output'),
'output_file_type': 'png',
'plot_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/plots'),
'preproc_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/preproc'),
'profile_diagnostic': False,
'remove_preproc_dir': True,
'rootpath': {'OBS6': [PosixPath('/home/sarah/temp/ForRecipe')]},
'run_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/run'),
'save_intermediary_cubes': False,
'work_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work'),
'write_netcdf': True,
'write_plots': True}
Shapefile /home/sarah/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp is not in forcing directory /home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work/diagnostic_daily/script. So, it won't be saved in /home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work/diagnostic_daily/script/ewatercycle_forcing.yaml.
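The message above suggests that the shapefile is only recorded in ewatercycle_forcing.yaml when it lies inside the forcing directory. A minimal sketch of such a containment check is shown below; it is an assumption about the logic, not the package's code, and `shape_relative_to` is a hypothetical name.

```python
from pathlib import Path


def shape_relative_to(directory, shape):
    """Return `shape` relative to `directory`, or None when it lies outside."""
    try:
        return Path(shape).resolve().relative_to(Path(directory).resolve())
    except ValueError:
        # relative_to raises ValueError when shape is not below directory.
        return None


print(shape_relative_to("/data/forcing/run1", "/data/forcing/run1/shapes/Rhine.shp"))
print(shape_relative_to("/data/forcing/run1", "/data/examples/Rhine.shp"))
```

Copying the shapefile into the forcing directory before saving would be one way to keep the forcing self-contained.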
4 In:
print(pcrglobwb_forcing)
Forcing data for PCRGlobWB
--------------------------
Directory: /home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work/diagnostic_daily/script
Start time: 1990-01-01T00:00:00Z
End time: 1990-12-31T00:00:00Z
Shapefile: /home/sarah/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp
Additional information for model config:
- temperatureNC: pcrglobwb_OBS6_ERA5_reanaly_1_day_tas_1990-1990_Rhine.nc
- precipitationNC: pcrglobwb_OBS6_ERA5_reanaly_1_day_pr_1990-1990_Rhine.nc
Plot forcing
8 In:
for file_name in [pcrglobwb_forcing.temperatureNC, pcrglobwb_forcing.precipitationNC]:
    dataset = xr.load_dataset(f"{pcrglobwb_forcing.directory}/{file_name}")
    print(dataset)
    print("------------------------")
    var = list(dataset.data_vars.keys())[0]
    dataset[var].isel(time=-1).plot(cmap="coolwarm", robust=True, size=5)
<xarray.Dataset>
Dimensions: (bnds: 2, lat: 23, lon: 31, time: 730)
Coordinates:
* time (time) datetime64[ns] 1989-01-01 1989-01-02 ... 1990-12-31
* lat (lat) float32 52.0 51.75 51.5 51.25 ... 47.25 47.0 46.75 46.5
* lon (lon) float32 4.251 4.501 4.751 5.001 ... 11.0 11.25 11.5 11.75
height float64 2.0
Dimensions without coordinates: bnds
Data variables:
tas (time, lat, lon) float32 273.6 273.2 273.0 ... 271.6 268.9 267.0
time_bnds (time, bnds) datetime64[ns] 1988-12-31T12:00:00 ... 1990-12-31...
lat_bnds (lat, bnds) float32 51.88 52.12 51.62 51.88 ... 46.88 46.38 46.62
lon_bnds (lon, bnds) float32 4.125 4.375 4.375 4.625 ... 11.62 11.62 11.88
Attributes:
comment: Contains modified Copernicus Climate Change Service Informa...
Conventions: CF-1.7
provenance: <?xml version='1.0' encoding='ASCII'?>\n<prov:document xmln...
software: Created with ESMValTool v2.2.0
caption: Forcings for the PCR-GLOBWB hydrological model.
------------------------
<xarray.Dataset>
Dimensions: (bnds: 2, lat: 23, lon: 31, time: 730)
Coordinates:
* time (time) datetime64[ns] 1989-01-01 1989-01-02 ... 1990-12-31
* lat (lat) float32 52.0 51.75 51.5 51.25 ... 47.25 47.0 46.75 46.5
* lon (lon) float32 4.251 4.501 4.751 5.001 ... 11.0 11.25 11.5 11.75
Dimensions without coordinates: bnds
Data variables:
pr (time, lat, lon) float32 9.197e-06 2.069e-05 ... 0.0002843
time_bnds (time, bnds) datetime64[ns] 1988-12-31T12:00:00 ... 1990-12-31...
lat_bnds (lat, bnds) float32 51.88 52.12 51.62 51.88 ... 46.88 46.38 46.62
lon_bnds (lon, bnds) float32 4.125 4.375 4.375 4.625 ... 11.62 11.62 11.88
Attributes:
comment: Contains modified Copernicus Climate Change Service Informa...
Conventions: CF-1.7
provenance: <?xml version='1.0' encoding='ASCII'?>\n<prov:document xmln...
software: Created with ESMValTool v2.2.0
caption: Forcings for the PCR-GLOBWB hydrological model.
------------------------


LISFLOOD
Generate forcing
Forcing for LISFLOOD is created using the ESMValTool recipe. It produces one file per variable: temperature, precipitation, maximum temperature, minimum temperature, u component of wind, v component of wind, surface solar radiation downwards, and dewpoint temperature. Running LISVAP is shown below. In this example, the LISFLOOD forcing data ‘e0’, ‘es0’ and ‘et0’ are not generated. However, the recipe creates LISVAP input data that can be found in lisflood_forcing.directory. You can set the start and end date, and the region. See eWaterCycle documentation for more information.
3 In:
lisflood_forcing = ewatercycle.forcing.generate(
    target_model="lisflood",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Rhine/Rhine.shp",
)
WARNING:ewatercycle.forcing._lisflood:target_grid was not given, guestimating from shape
WARNING:ewatercycle.forcing._lisflood:Parameter `run_lisvap` is set to False. No forcing data will be generated for 'e0', 'es0' and 'et0'. However, the recipe creates LISVAP input data that can be found in /home/vagrant/ewatercycle/docs/examples/esmvaltool_output/recipe_lisflood_20220330_104829/work/diagnostic_daily/script.
4 In:
print(lisflood_forcing)
eWaterCycle forcing
-------------------
start_time=1990-01-01T00:00:00Z
end_time=1990-12-31T00:00:00Z
directory=/home/vagrant/ewatercycle/docs/examples/esmvaltool_output/recipe_lisflood_20220330_104829/work/diagnostic_daily/script
shape=/home/vagrant/ewatercycle/docs/examples/data/Rhine/Rhine.shp
PrefixPrecipitation=lisflood_ERA5_Rhine_pr_1990_1990.nc
PrefixTavg=lisflood_ERA5_Rhine_tas_1990_1990.nc
PrefixE0=e0.nc
PrefixES0=es0.nc
PrefixET0=et0.nc
Plot forcing
7 In:
lisvap_input_files = [
    "lisflood_ERA5_Rhine_e_1990_1990.nc",
    "lisflood_ERA5_Rhine_sfcWind_1990_1990.nc",
    "lisflood_ERA5_Rhine_rsds_1990_1990.nc",
    "lisflood_ERA5_Rhine_tasmax_1990_1990.nc",
    "lisflood_ERA5_Rhine_tasmin_1990_1990.nc",
]
for file_name in [
    lisflood_forcing.PrefixTavg,
    lisflood_forcing.PrefixPrecipitation,
] + lisvap_input_files:
    dataset = xr.load_dataset(f"{lisflood_forcing.directory}/{file_name}")
    var = list(dataset.data_vars.keys())[0]
    dataset[var].isel(time=1).plot(cmap="coolwarm", robust=True, size=5)







Generate forcing using LISVAP
Forcing for LISFLOOD is created using the ESMValTool recipe and the LISVAP model. The ESMValTool recipe produces one file per variable, cropped to the catchment: temperature, precipitation, maximum temperature, minimum temperature, u component of wind, v component of wind, surface solar radiation downwards, and dewpoint temperature. Some of these are LISVAP input data. Then, LISVAP generates forcing data ‘e0’, ‘es0’ and ‘et0’, again one file per variable but with global extents. It also generates global datasets from the other forcing files and stores them in the forcing directory, see the example below.
Running LISVAP needs some model parameters. Currently, ewatercycle supports only a global parameter-set with a resolution of 0.1 degrees masked by six catchments (the convex hull shapes). The available version of both LISVAP and LISFLOOD only works with this parameter-set. Therefore, we need to prepare some LISVAP-specific data and pass them to the ewatercycle.forcing.generate function.
Input arguments of LISVAP:
lisvap_config: a configuration file in xml format, e.g. settings_lisvap.xml. A template file is available in era5-comparison/lisflood/utils/settings_templates and also under its parameter-set directory.
mask_map: a mask for the catchment. This file should have a global extent that matches our global parameter-set. We explain below how to get this file.
version: LISVAP/LISFLOOD model version supported by ewatercycle.
parameterset_dir: the directory of the global parameter-set that can be obtained by ewatercycle.parameter_sets.get_parameter_set.
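These four arguments are passed together as the run_lisvap entry of model_specific_options, as in the ewatercycle.forcing.generate call later in this notebook. As a convenience, the dictionary could be assembled with a small helper like the sketch below. The helper name `lisvap_options` and its default config file name are assumptions of this example, not part of ewatercycle.

```python
from pathlib import Path


def lisvap_options(parameterset_dir, mask_map, version,
                   config_name="settings_lisvap.xml"):
    """Build a model_specific_options dict for a run with LISVAP.

    Assumes the LISVAP settings file sits directly in the parameter-set
    directory.
    """
    parameterset_dir = Path(parameterset_dir)
    return {
        "run_lisvap": {
            "lisvap_config": str(parameterset_dir / config_name),
            "mask_map": str(mask_map),
            "version": version,
            "parameterset_dir": str(parameterset_dir),
        }
    }


options = lisvap_options(
    "/mnt/data/parameter-sets/lisflood_global-masked_01degree",
    "./data/Lisvap/model_mask_doring.nc",
    "20.10",
)
print(options["run_lisvap"]["lisvap_config"])
```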
Generate convex hull shapefile and mask map:
For our example below, we want to generate a shapefile and a model mask for the Doring catchment. The shapefile is passed to the ESMValTool recipe, whereas the model mask is passed to LISVAP. An auxiliary LISFLOOD file called catchment_masks.nc is available in eWaterCycle/recipes_auxiliary_datasets.
Here we provide sample code to compute the convex hull of Doring and save it as a shapefile. Note that you need to install the package geopandas:
import numpy as np
import xarray as xr
import shapely as shply
from geopandas import GeoSeries
masks = xr.open_dataarray("./data/Lisvap/catchment_masks.nc").load()
buffer = 0.05 # degrees lat/lon
# Compute convex hull
lat, lon = [
    masks[v].values[np.where(masks.loc["Doring"].values)[i]]
    for v, i in zip(["lat", "lon"], [0, 1])
]
hull = GeoSeries(
    [shply.geometry.Point(x, y) for x, y in zip(lon, lat)]
).unary_union.convex_hull.buffer(buffer)
# Save it to shapefile
GeoSeries(hull, crs="EPSG:4326").to_file("./data/Lisvap/Doring_convex.shp")
And here we provide sample code to produce the model mask of Doring and save it as a NetCDF file:
import xarray as xr
masks = xr.open_dataarray("./data/Lisvap/catchment_masks.nc").load()
doring = masks.sel(basin="Doring")
doring.to_netcdf("./data/Lisvap/model_mask_doring.nc")
More information is provided by the era5-comparison study.
First, we get the global parameter-set of LISFLOOD for ERA5 as an example. parameter_set provides useful information like directory and supported_model_versions:
3 In:
import ewatercycle.parameter_sets
4 In:
parameter_set = ewatercycle.parameter_sets.get_parameter_set(
    "lisflood_global-masked_01degree_ERA5"
)
print(parameter_set)
Parameter set
-------------
name=lisflood_global-masked_01degree_ERA5
directory=/gpfs/work1/0/wtrcycle/parameter-sets/lisflood_global-masked_01degree
config=/gpfs/work1/0/wtrcycle/parameter-sets/lisflood_global-masked_01degree/settings_lisflood_ERA5.xml
doi=N/A
target_model=lisflood
supported_model_versions={'20.10'}
Second, we use ewatercycle.forcing.generate together with model specific options for LISVAP:
5 In:
lisflood_forcing = ewatercycle.forcing.generate(
    target_model="lisflood",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Lisvap/Doring_convex.shp",
    model_specific_options=dict(
        run_lisvap=dict(
            lisvap_config=f"{parameter_set.directory}/settings_lisvap.xml",
            mask_map="./data/Lisvap/model_mask_doring.nc",
            version="20.10",
            parameterset_dir=parameter_set.directory,
        ),
    ),
)
/gpfs/home2/fakhereh/mambaforge-pypy3/envs/ewatercycle/lib/python3.10/site-packages/xarray/core/indexing.py:1234: PerformanceWarning: Slicing is producing a large chunk. To accept the large
chunk and silence this warning, set the option
>>> with dask.config.set(**{'array.slicing.split_large_chunks': False}):
... array[indexer]
To avoid creating the large chunks, set the option
>>> with dask.config.set(**{'array.slicing.split_large_chunks': True}):
... array[indexer]
6 In:
print(lisflood_forcing)
eWaterCycle forcing
-------------------
start_time=1990-01-01T00:00:00Z
end_time=1990-12-31T00:00:00Z
directory=/gpfs/scratch1/shared/ewatercycle/recipe_lisflood_20220218_100026/work/diagnostic_daily/script/global
shape=/gpfs/home2/fakhereh/GitHub/ewatercycle/docs/examples/data/Lisvap/Doring_convex.shp
PrefixPrecipitation=lisflood_ERA5_Doring_convex_pr_1990_1990.nc
PrefixTavg=lisflood_ERA5_Doring_convex_tas_1990_1990.nc
PrefixE0=lisflood_ERA5_Doring_convex_e0_1990_1990.nc
PrefixES0=lisflood_ERA5_Doring_convex_es0_1990_1990.nc
PrefixET0=lisflood_ERA5_Doring_convex_et0_1990_1990.nc
In:
forcing_files = [
    lisflood_forcing.PrefixPrecipitation,
    lisflood_forcing.PrefixTavg,
    lisflood_forcing.PrefixE0,
    lisflood_forcing.PrefixES0,
    lisflood_forcing.PrefixET0,
]
# Loading global dataset takes a few minutes
for file_name in forcing_files:
    dataset = xr.load_dataset(f"{lisflood_forcing.directory}/{file_name}")
    var = list(dataset.data_vars.keys())[0]
    dataset[var].isel(time=1).sel(lon=slice(18, 22), lat=slice(-31, -34)).plot(
        cmap="coolwarm", robust=True, size=5
    )
Hype
Forcing for Hype is created using the ESMValTool recipe. It produces one file per variable: temperature and precipitation. You can set the start and end date.
3 In:
hype_forcing = ewatercycle.forcing.generate(
    target_model="hype",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Rhine/Rhine.shp",
)
print(hype_forcing)
WARNING:esmvalcore._recipe:Missing data for fx variable 'areacella' of dataset ERA5
WARNING:esmvalcore._recipe:Missing data for fx variable 'areacello' of dataset ERA5
eWaterCycle forcing
-------------------
start_time=1990-01-01T00:00:00Z
end_time=1990-12-31T00:00:00Z
directory=/home/sarah/temp/esmvaltool_output/recipe_hype_20220607_123122/work/hype/script/ERA5
shape=/home/sarah/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp
Pobs=Pobs.txt
TMAXobs=TMAXobs.txt
TMINobs=TMINobs.txt
Tobs=Tobs.txt
Running LISFLOOD model using eWaterCycle package (on SURF Research Cloud)
This notebook shows how to run the LISFLOOD model. Please note that the lisflood-grpc4bmi Docker image in eWaterCycle is compatible only with forcing data and parameter sets on eWaterCycle infrastructure, such as a server on the SURF Research Cloud. More information about data, configuration and installation instructions can be found in the System setup in the eWaterCycle documentation.
1 In:
import logging
import warnings
logger = logging.getLogger("grpc4bmi")
logger.setLevel(logging.WARNING)
warnings.filterwarnings("ignore", category=UserWarning)
2 In:
import pandas as pd
import ewatercycle.forcing
import ewatercycle.models
import ewatercycle.parameter_sets
Load forcing data
For this example notebook, lisflood_ERA-Interim_*_1990_1990.nc data are copied from /projects/0/wtrcycle/comparison/forcing/lisflood to /scratch/shared/ewatercycle/lisflood_example/lisflood_forcing_data. Also the lisvap output files ‘e0’, ‘es0’ and ‘et0’ are generated and stored in the same directory. These data are made by running the ESMValTool recipe and lisvap. We can now use those files to run the Lisflood model.
3 In:
forcing = ewatercycle.forcing.load_foreign(
    target_model="lisflood",
    directory="/mnt/data/forcing/lisflood_ERA5_1990_global-masked",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    forcing_info={
        "PrefixPrecipitation": "lisflood_ERA5_pr_1990_1990.nc",
        "PrefixTavg": "lisflood_ERA5_tas_1990_1990.nc",
        "PrefixE0": "lisflood_ERA5_e0_1990_1990.nc",
        "PrefixES0": "lisflood_ERA5_es0_1990_1990.nc",
        "PrefixET0": "lisflood_ERA5_et0_1990_1990.nc",
    },
)
print(forcing)
eWaterCycle forcing
-------------------
start_time=1990-01-01T00:00:00Z
end_time=1990-12-31T00:00:00Z
directory=/mnt/data/forcing/lisflood_ERA5_1990_global-masked
shape=None
PrefixPrecipitation=lisflood_ERA5_pr_1990_1990.nc
PrefixTavg=lisflood_ERA5_tas_1990_1990.nc
PrefixE0=lisflood_ERA5_e0_1990_1990.nc
PrefixES0=lisflood_ERA5_es0_1990_1990.nc
PrefixET0=lisflood_ERA5_et0_1990_1990.nc
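Before starting a model run, it can help to confirm that every file named in forcing_info really exists in the forcing directory. The sketch below does this with the standard library only; `missing_forcing_files` is a hypothetical helper, not an ewatercycle function.

```python
from pathlib import Path


def missing_forcing_files(directory, forcing_info):
    """Return the sorted keys of forcing_info whose file is not in directory."""
    base = Path(directory)
    return sorted(
        key for key, name in forcing_info.items() if not (base / name).is_file()
    )
```

Called with the directory and forcing_info used above, an empty list would mean that all five files are in place.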
Load parameter set
This example uses a parameter set from SURF dCache storage.
4 In:
parameterset = ewatercycle.parameter_sets.ParameterSet(
    name="Lisflood01degree_masked",
    directory="/mnt/data/parameter-sets/lisflood_global-masked_01degree",
    config="/mnt/data/parameter-sets/lisflood_global-masked_01degree/settings_lisflood_ERA5.xml",
    target_model="lisflood",
)
print(parameterset)
Parameter set
-------------
name=Lisflood01degree_masked
directory=/mnt/data/parameter-sets/lisflood_global-masked_01degree
config=/mnt/data/parameter-sets/lisflood_global-masked_01degree/settings_lisflood_ERA5.xml
doi=N/A
target_model=lisflood
supported_model_versions=set()
Set up the model
To create the model object, we need to select a version.
5 In:
ewatercycle.models.Lisflood.available_versions
5 Out:
('20.10',)
6 In:
model = ewatercycle.models.Lisflood(
version="20.10", parameter_set=parameterset, forcing=forcing
)
print(model)
Model version 20.10 is not explicitly listed in the supported model versions of this parameter set. This can lead to compatibility issues.
eWaterCycle Lisflood
-------------------
Version = 20.10
Parameter set =
Parameter set
-------------
name=Lisflood01degree_masked
directory=/mnt/data/parameter-sets/lisflood_global-masked_01degree
config=/mnt/data/parameter-sets/lisflood_global-masked_01degree/settings_lisflood_ERA5.xml
doi=N/A
target_model=lisflood
supported_model_versions=set()
Forcing =
eWaterCycle forcing
-------------------
start_time=1990-01-01T00:00:00Z
end_time=1990-12-31T00:00:00Z
directory=/mnt/data/forcing/lisflood_ERA5_1990_global-masked
shape=None
PrefixPrecipitation=lisflood_ERA5_pr_1990_1990.nc
PrefixTavg=lisflood_ERA5_tas_1990_1990.nc
PrefixE0=lisflood_ERA5_e0_1990_1990.nc
PrefixES0=lisflood_ERA5_es0_1990_1990.nc
PrefixET0=lisflood_ERA5_et0_1990_1990.nc
7 In:
model.parameters
7 Out:
[('IrrigationEfficiency', '0.75'),
('MaskMap', '/data/input/areamaps/model_mask'),
('start_time', '1990-01-01T00:00:00Z'),
('end_time', '1990-12-31T00:00:00Z')]
Set up the model with a custom model_mask, an IrrigationEfficiency of 0.8 instead of 0.75, and an earlier end time, so that the total model time is just one month.
8 In:
model_mask = "/mnt/data/climate-data/aux/LISFLOOD/model_mask.nc"
config_file, config_dir = model.setup(
IrrigationEfficiency="0.8", end_time="1990-01-31T00:00:00Z", MaskMap=model_mask
)
print(config_file)
print(config_dir)
Running /mnt/data/singularity-images/ewatercycle-lisflood-grpc4bmi_20.10.sif singularity container on port 41619
/home/vagrant/ewatercycle/docs/examples/ewatercycle_output/lisflood_20210930_093520/lisflood_setting.xml
/home/vagrant/ewatercycle/docs/examples/ewatercycle_output/lisflood_20210930_093520
9 In:
model.parameters
9 Out:
[('IrrigationEfficiency', '0.8'),
('MaskMap', '/mnt/data/climate-data/aux/LISFLOOD/model_mask'),
('start_time', '1990-01-01T00:00:00Z'),
('end_time', '1990-01-31T00:00:00Z')]
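The difference between the two model.parameters listings can be pictured as keyword overrides merged into a list of defaults. The following is a minimal sketch of that idea only: apply_overrides is a hypothetical helper, not part of the ewatercycle API, and the real setup() does more (it also writes the config file and starts the container).

```python
def apply_overrides(defaults, **overrides):
    """Return a new (name, value) list with the overrides applied.

    Hypothetical helper for illustration. Unknown names raise an error
    so that a typo in a parameter name does not pass silently.
    """
    known = {name for name, _ in defaults}
    unknown = set(overrides) - known
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return [(name, overrides.get(name, value)) for name, value in defaults]


# Default values copied from the first model.parameters listing
defaults = [
    ("IrrigationEfficiency", "0.75"),
    ("MaskMap", "/data/input/areamaps/model_mask"),
    ("start_time", "1990-01-01T00:00:00Z"),
    ("end_time", "1990-12-31T00:00:00Z"),
]
updated = apply_overrides(
    defaults, IrrigationEfficiency="0.8", end_time="1990-01-31T00:00:00Z"
)
print(updated)
```

Returning a new list instead of mutating the defaults keeps the original values available for comparison.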
Initialize the model with the config file:
10 In:
model.initialize(config_file)
Get model variable names
11 In:
model.output_var_names
11 Out:
('Discharge',)
Run the model
Store simulated values at one target location until the model end time. In this example, we use the coordinates of the Merrimack observation station as the target coordinates.
12 In:
target_longitude = [-71.35]
target_latitude = [42.64]
target_discharge = []
time_range = []
end_time = model.end_time
while model.time < end_time:
    model.update()
    target_discharge.append(
        model.get_value_at_coords(
            "Discharge", lon=target_longitude, lat=target_latitude
        )[0]
    )
    time_range.append(model.time_as_datetime.date())
    print(model.time_as_isostr)
1990-01-03T00:00:00Z
1990-01-04T00:00:00Z
1990-01-05T00:00:00Z
1990-01-06T00:00:00Z
1990-01-07T00:00:00Z
1990-01-08T00:00:00Z
1990-01-09T00:00:00Z
1990-01-10T00:00:00Z
1990-01-11T00:00:00Z
1990-01-12T00:00:00Z
1990-01-13T00:00:00Z
1990-01-14T00:00:00Z
1990-01-15T00:00:00Z
1990-01-16T00:00:00Z
1990-01-17T00:00:00Z
1990-01-18T00:00:00Z
1990-01-19T00:00:00Z
1990-01-20T00:00:00Z
1990-01-21T00:00:00Z
1990-01-22T00:00:00Z
1990-01-23T00:00:00Z
1990-01-24T00:00:00Z
1990-01-25T00:00:00Z
1990-01-26T00:00:00Z
1990-01-27T00:00:00Z
1990-01-28T00:00:00Z
1990-01-29T00:00:00Z
1990-01-30T00:00:00Z
1990-01-31T00:00:00Z
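The loop above reads one value per time step at a longitude/latitude pair. One way to picture such a lookup is to pick the grid cell whose centre lies nearest to the requested point. The sketch below assumes that nearest-cell behaviour, which the text above does not state, and uses made-up coordinates and values; nearest_index and value_at_coords are hypothetical names.

```python
def nearest_index(axis_values, target):
    """Index of the axis value closest to target."""
    return min(range(len(axis_values)), key=lambda i: abs(axis_values[i] - target))


def value_at_coords(grid, lats, lons, lat, lon):
    """Value of the grid cell whose centre is nearest to (lat, lon)."""
    return grid[nearest_index(lats, lat)][nearest_index(lons, lon)]


# Made-up cell centres (0.1 degree spacing) and made-up discharge values
lats = [42.55, 42.65, 42.75]
lons = [-71.45, -71.35, -71.25]
grid = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
picked = value_at_coords(grid, lats, lons, lat=42.64, lon=-71.35)
print(picked)
```

Because the target point rarely coincides with a cell centre, it is worth checking on a map that the selected cell lies on the river of interest.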
Store simulated values for all locations of the model grid at end time.
13 In:
discharge = model.get_value_as_xarray("Discharge")
14 In:
model.finalize()
Inspect the results
The discharge time series at Merrimack observation station:
15 In:
simulated_target_discharge = pd.DataFrame(
{"simulation": target_discharge}, index=pd.to_datetime(time_range)
)
simulated_target_discharge.plot(figsize=(12, 8))
15 Out:
<AxesSubplot:>

The Lisflood output has a global extent. In this example, we plot the discharge values in the Merrimack catchment at the last time step.
16 In:
lc = discharge.coords["longitude"]
la = discharge.coords["latitude"]
discharge_map = discharge.loc[
dict(longitude=lc[(lc > -73) & (lc < -70)], latitude=la[(la > 42) & (la < 45)])
].plot(robust=True, cmap="GnBu", figsize=(12, 8))
discharge_map.axes.scatter(
target_longitude, target_latitude, s=250, c="r", marker="x", lw=2
)
16 Out:
<matplotlib.collections.PathCollection at 0x7fa2c01215b0>

Running MARRMoT M01 model using eWaterCycle package
This notebook shows how to run the MARRMoT M01 model using an example use case. More information about data, configuration and installation instructions can be found in the System setup chapter of the eWaterCycle documentation.
1 In:
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
1 In:
import pandas as pd
import ewatercycle.forcing
import ewatercycle.models
Load forcing data
To download the example forcing file BMI_testcase_m01_BuffaloRiver_TN_USA.mat, see this instruction.
2 In:
forcing = ewatercycle.forcing.load_foreign(
"marrmot",
directory=".",
start_time="1989-01-01T00:00:00Z",
end_time="1992-12-31T00:00:00Z",
forcing_info={"forcing_file": "BMI_testcase_m01_BuffaloRiver_TN_USA.mat"},
)
print(forcing)
eWaterCycle forcing
-------------------
start_time=1989-01-01T00:00:00Z
end_time=1992-12-31T00:00:00Z
directory=/home/sarah/GitHub/ewatercycle/docs/examples
shape=None
forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
Set up the model
To create the model object, we need to select a version.
3 In:
ewatercycle.models.MarrmotM01.available_versions
3 Out:
('2020.11',)
4 In:
model = ewatercycle.models.MarrmotM01(version="2020.11", forcing=forcing)
print(model)
eWaterCycle MarrmotM01
-------------------
Version = 2020.11
Parameter set =
None
Forcing =
eWaterCycle forcing
-------------------
start_time=1989-01-01T00:00:00Z
end_time=1992-12-31T00:00:00Z
directory=/home/sarah/GitHub/ewatercycle/docs/examples
shape=None
forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
5 In:
model.parameters
5 Out:
[('maximum_soil_moisture_storage', 10.0),
('initial_soil_moisture_storage', 5.0),
('solver',
Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
('start time', '1989-01-01T00:00:00Z'),
('end time', '1992-12-31T00:00:00Z')]
Set up the model with a maximum soil moisture storage of 12.0 instead of 10.0 and an earlier end time, so that the total model time is just one month.
6 In:
cfg_file, cfg_dir = model.setup(
maximum_soil_moisture_storage=12.0,
end_time="1989-02-01T00:00:00Z",
)
print(cfg_file)
print(cfg_dir)
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135130/marrmot-m01_config.mat
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135130
7 In:
model.parameters
7 Out:
[('maximum_soil_moisture_storage', 12.0),
('initial_soil_moisture_storage', 5.0),
('solver',
Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
('start time', '1989-01-01T00:00:00Z'),
('end time', '1989-02-01T00:00:00Z')]
Initialize the model with the config file:
8 In:
model.initialize(cfg_file)
Get the model variable names; only flux_out_Q is supported for now.
9 In:
model.output_var_names
9 Out:
('P',
'T',
'Ep',
'S(t)',
'par',
'sol_resnorm_tolerance',
'sol_resnorm_maxiter',
'flux_out_Q',
'flux_out_Ea',
'wb')
Run the model
10 In:
discharge = []
time_range = []
end_time = model.end_time
while model.time < end_time:
    model.update()
    discharge.append(model.get_value("flux_out_Q")[0])
    time_range.append(model.time_as_datetime.date())
    print(model.time_as_isostr)
1989-01-02T00:00:00Z
1989-01-03T00:00:00Z
1989-01-04T00:00:00Z
1989-01-05T00:00:00Z
1989-01-06T00:00:00Z
1989-01-07T00:00:00Z
1989-01-08T00:00:00Z
1989-01-09T00:00:00Z
1989-01-10T00:00:00Z
1989-01-11T00:00:00Z
1989-01-12T00:00:00Z
1989-01-13T00:00:00Z
1989-01-14T00:00:00Z
1989-01-15T00:00:00Z
1989-01-16T00:00:00Z
1989-01-17T00:00:00Z
1989-01-18T00:00:00Z
1989-01-19T00:00:00Z
1989-01-20T00:00:00Z
1989-01-21T00:00:00Z
1989-01-22T00:00:00Z
1989-01-23T00:00:00Z
1989-01-24T00:00:00Z
1989-01-25T00:00:00Z
1989-01-26T00:00:00Z
1989-01-27T00:00:00Z
1989-01-28T00:00:00Z
1989-01-29T00:00:00Z
1989-01-30T00:00:00Z
1989-01-31T00:00:00Z
1989-02-01T00:00:00Z
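The printed timestamps run from 1989-01-02 to 1989-02-01 because each iteration first advances the model and only then reads its state. The toy class below is a hypothetical stand-in, not the ewatercycle model class: it mimics only the members used in the loop, counts time in whole days since the start, and returns placeholder values.

```python
from datetime import datetime, timedelta


class DummyModel:
    """Toy stand-in exposing only the members used in the loop."""

    def __init__(self, start, end):
        self._start = start
        self._steps_done = 0
        self.end_time = (end - start).days  # model end, in days since start

    @property
    def time(self):
        """Current model time, in days since start."""
        return self._steps_done

    @property
    def time_as_isostr(self):
        now = self._start + timedelta(days=self._steps_done)
        return now.strftime("%Y-%m-%dT%H:%M:%SZ")

    def update(self):
        """Advance the model by one daily time step."""
        self._steps_done += 1

    def get_value(self, name):
        """Return a one-element placeholder instead of a real flux."""
        return [float(self._steps_done)]


model = DummyModel(datetime(1989, 1, 1), datetime(1989, 2, 1))
discharge = []
stamps = []
while model.time < model.end_time:
    model.update()  # advance first ...
    discharge.append(model.get_value("flux_out_Q")[0])  # ... then read
    stamps.append(model.time_as_isostr)

print(len(stamps), stamps[0], stamps[-1])
```

With a start of 1989-01-01 and an end of 1989-02-01, the loop body runs 31 times, once per day of January.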
11 In:
model.finalize()
Inspect the results
12 In:
simulated_discharge = pd.DataFrame(
{"simulation": discharge}, index=pd.to_datetime(time_range)
)
13 In:
simulated_discharge.plot(figsize=(12, 8))
13 Out:
<AxesSubplot:>

Running MARRMoT M14 model using eWaterCycle package
This notebook shows how to run the MARRMoT M14 model using an example use case. More information about data, configuration and installation instructions can be found in the System setup chapter of the eWaterCycle documentation.
1 In:
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
1 In:
import pandas as pd
import ewatercycle.forcing
import ewatercycle.models
Load forcing data
To download the example forcing file BMI_testcase_m01_BuffaloRiver_TN_USA.mat, see this instruction.
2 In:
forcing = ewatercycle.forcing.load_foreign(
"marrmot",
directory=".",
start_time="1989-01-01T00:00:00Z",
end_time="1992-12-31T00:00:00Z",
forcing_info={"forcing_file": "BMI_testcase_m01_BuffaloRiver_TN_USA.mat"},
)
print(forcing)
eWaterCycle forcing
-------------------
start_time=1989-01-01T00:00:00Z
end_time=1992-12-31T00:00:00Z
directory=/home/sarah/GitHub/ewatercycle/docs/examples
shape=None
forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
Set up the model
To create the model object, we need to select a version.
3 In:
ewatercycle.models.MarrmotM14.available_versions
3 Out:
('2020.11',)
4 In:
model = ewatercycle.models.MarrmotM14(version="2020.11", forcing=forcing)
print(model)
The length of parameters in forcing /home/sarah/GitHub/ewatercycle/docs/examples/BMI_testcase_m01_BuffaloRiver_TN_USA.mat does not match the length of M14 parameters that is seven.
The length of initial stores in forcing /home/sarah/GitHub/ewatercycle/docs/examples/BMI_testcase_m01_BuffaloRiver_TN_USA.mat does not match the length of M14 iniatial stores that is two.
eWaterCycle MarrmotM14
-------------------
Version = 2020.11
Parameter set =
None
Forcing =
eWaterCycle forcing
-------------------
start_time=1989-01-01T00:00:00Z
end_time=1992-12-31T00:00:00Z
directory=/home/sarah/GitHub/ewatercycle/docs/examples
shape=None
forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
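The two messages at the top of this output appear because the forcing file was prepared for M01, whereas M14 expects seven parameters and two initial stores. A check of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the package's implementation: check_lengths and the dictionary layout are hypothetical, and the M01-like values are taken from the M01 parameter listing earlier on this page.

```python
# Expected lengths for M14, as stated in the two messages above
EXPECTED_LENGTHS = {"parameters": 7, "initial stores": 2}


def check_lengths(found):
    """Return one message per entry whose length differs from what M14 expects."""
    messages = []
    for key, expected in EXPECTED_LENGTHS.items():
        actual = len(found.get(key, []))
        if actual != expected:
            messages.append(
                f"The length of {key} is {actual}, but M14 expects {expected}."
            )
    return messages


# M01-like content: one parameter and one initial store
m01_like = {"parameters": [10.0], "initial stores": [5.0]}
for message in check_lengths(m01_like):
    print(message)
```

When such a mismatch is reported, the model's own default parameters (listed in the next cell) are the values to inspect and override.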
5 In:
model.parameters
5 Out:
[('maximum_soil_moisture_storage', 1000.0),
('threshold_flow_generation_evap_change', 0.5),
('leakage_saturated_zone_flow_coefficient', 0.5),
('zero_deficit_base_flow_speed', 100.0),
('baseflow_coefficient', 0.5),
('gamma_distribution_chi_parameter', 4.25),
('gamma_distribution_phi_parameter', 2.5),
('initial_upper_zone_storage', 900.0),
('initial_saturated_zone_storage', 900.0),
('solver',
Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
('start time', '1989-01-01T00:00:00Z'),
('end time', '1992-12-31T00:00:00Z')]
Set up the model with a maximum soil moisture storage of 12.0 instead of 1000.0 and an earlier end time, so that the total model time is just one month.
6 In:
cfg_file, cfg_dir = model.setup(
maximum_soil_moisture_storage=12.0,
end_time="1989-02-01T00:00:00Z",
)
print(cfg_file)
print(cfg_dir)
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135152/marrmot-m14_config.mat
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135152
7 In:
model.parameters
7 Out:
[('maximum_soil_moisture_storage', 12.0),
('threshold_flow_generation_evap_change', 0.5),
('leakage_saturated_zone_flow_coefficient', 0.5),
('zero_deficit_base_flow_speed', 100.0),
('baseflow_coefficient', 0.5),
('gamma_distribution_chi_parameter', 4.25),
('gamma_distribution_phi_parameter', 2.5),
('initial_upper_zone_storage', 900.0),
('initial_saturated_zone_storage', 900.0),
('solver',
Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
('start time', '1989-01-01T00:00:00Z'),
('end time', '1989-02-01T00:00:00Z')]
Initialize the model with the config file:
8 In:
model.initialize(cfg_file)
Get the model variable names; only flux_out_Q is supported for now.
9 In:
model.output_var_names
9 Out:
('P',
'T',
'Ep',
'S(t)',
'par',
'sol_resnorm_tolerance',
'sol_resnorm_maxiter',
'flux_out_Q',
'flux_out_Ea',
'wb')
Run the model
10 In:
discharge = []
time_range = []
end_time = model.end_time
while model.time < end_time:
    model.update()
    discharge.append(model.get_value("flux_out_Q")[0])
    time_range.append(model.time_as_datetime.date())
    print(model.time_as_isostr)
1989-01-02T00:00:00Z
1989-01-03T00:00:00Z
1989-01-04T00:00:00Z
1989-01-05T00:00:00Z
1989-01-06T00:00:00Z
1989-01-07T00:00:00Z
1989-01-08T00:00:00Z
1989-01-09T00:00:00Z
1989-01-10T00:00:00Z
1989-01-11T00:00:00Z
1989-01-12T00:00:00Z
1989-01-13T00:00:00Z
1989-01-14T00:00:00Z
1989-01-15T00:00:00Z
1989-01-16T00:00:00Z
1989-01-17T00:00:00Z
1989-01-18T00:00:00Z
1989-01-19T00:00:00Z
1989-01-20T00:00:00Z
1989-01-21T00:00:00Z
1989-01-22T00:00:00Z
1989-01-23T00:00:00Z
1989-01-24T00:00:00Z
1989-01-25T00:00:00Z
1989-01-26T00:00:00Z
1989-01-27T00:00:00Z
1989-01-28T00:00:00Z
1989-01-29T00:00:00Z
1989-01-30T00:00:00Z
1989-01-31T00:00:00Z
1989-02-01T00:00:00Z
11 In:
model.finalize()
Inspect the results
12 In:
simulated_discharge = pd.DataFrame(
{"simulation": discharge}, index=pd.to_datetime(time_range)
)
13 In:
simulated_discharge.plot(figsize=(12, 8))
13 Out:
<AxesSubplot:>

PCRGlobWB example use case
This example shows how the PCRGlobWB model can be used within the eWaterCycle system. It is based on the example use case from https://github.com/UU-Hydro/PCR-GLOBWB_input_example.
This example use case assumes that the ewatercycle platform has been installed and configured on your system. See our system setup documentation for instructions if this is not the case.
1 In:
# This cell is only used to suppress some distracting output messages
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
2 In:
import matplotlib.pyplot as plt
from cartopy import crs
from cartopy import feature as cfeature
import ewatercycle.forcing
import ewatercycle.models
import ewatercycle.parameter_sets
Loading a parameter set
A number of example parameter sets come pre-installed on the eWaterCycle system (see system setup if this is not the case).
3 In:
ewatercycle.parameter_sets.available_parameter_sets()
3 Out:
('lisflood_fraser', 'pcrglobwb_rhinemeuse_30min', 'wflow_rhine_sbm_nc')
Existing parameter sets can easily be loaded:
4 In:
parameter_set = ewatercycle.parameter_sets.get_parameter_set(
"pcrglobwb_rhinemeuse_30min"
)
print(parameter_set)
Parameter set
-------------
name=pcrglobwb_rhinemeuse_30min
directory=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min
config=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
doi=N/A
target_model=pcrglobwb
supported_model_versions={'setters'}
It is also possible to load a custom parameter set by passing in the relevant parameters directly:
5 In:
custom_parameter_set = ewatercycle.parameter_sets.ParameterSet(
name="custom_parameter_set",
directory="/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min",
config="/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini",
target_model="pcrglobwb",
supported_model_versions={"setters"},
)
Load forcing data
For this example case, the forcing is already included in the parameter set and configured correctly. Therefore, in principle, this step can be skipped. However, for the purpose of illustration, we show how the forcing would be loaded using the ewatercycle.forcing module, as if it came from another source. To learn about forcing generation, see our preprocessing examples.
6 In:
forcing = ewatercycle.forcing.load_foreign(
target_model="pcrglobwb",
start_time="2001-01-01T00:00:00Z",
end_time="2010-12-31T00:00:00Z",
directory="./parameter-sets/pcrglobwb_rhinemeuse_30min/forcing",
shape=None, # if available, it can be used e.g. for plotting
forcing_info=dict(
# model-specific options
precipitationNC="precipitation_2001to2010.nc",
temperatureNC="temperature_2001to2010.nc",
),
)
print(forcing)
Forcing data for PCRGlobWB
--------------------------
Directory: /home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/forcing
Start time: 2001-01-01T00:00:00Z
End time: 2010-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
- temperatureNC: temperature_2001to2010.nc
- precipitationNC: precipitation_2001to2010.nc
Setting up the model
Note that the model version and the model versions supported by the parameter set should be compatible.
7 In:
ewatercycle.models.PCRGlobWB.available_versions
7 Out:
('setters',)
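The compatibility requirement can be pictured as a membership test of the chosen version in the parameter set's supported_model_versions. The sketch below uses a simplified stand-in class rather than the real ParameterSet; ParameterSetSketch, compatibility_issue and the version string "2021.1" are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ParameterSetSketch:
    """Simplified stand-in holding only the fields needed for the check."""

    name: str
    target_model: str
    supported_model_versions: set = field(default_factory=set)


def compatibility_issue(parameter_set, model_name, version):
    """Return a description of the problem, or None if the combination looks fine."""
    if parameter_set.target_model != model_name:
        return f"Parameter set targets {parameter_set.target_model}, not {model_name}."
    if version not in parameter_set.supported_model_versions:
        return f"Model version {version} is not listed in the supported model versions."
    return None


ps = ParameterSetSketch("pcrglobwb_rhinemeuse_30min", "pcrglobwb", {"setters"})
print(compatibility_issue(ps, "pcrglobwb", "setters"))
print(compatibility_issue(ps, "pcrglobwb", "2021.1"))  # made-up version string
```

A parameter set with an empty supported_model_versions set, as in the Lisflood example, would fail this membership test for every version, which matches the warning shown there.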
8 In:
pcrglob = ewatercycle.models.PCRGlobWB(
version="setters", parameter_set=parameter_set, forcing=forcing
)
print(pcrglob)
eWaterCycle PCRGlobWB
-------------------
Version = setters
Parameter set =
Parameter set
-------------
name=pcrglobwb_rhinemeuse_30min
directory=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min
config=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
doi=N/A
target_model=pcrglobwb
supported_model_versions={'setters'}
Forcing =
Forcing data for PCRGlobWB
--------------------------
Directory: /home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/forcing
Start time: 2001-01-01T00:00:00Z
End time: 2010-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
- temperatureNC: temperature_2001to2010.nc
- precipitationNC: precipitation_2001to2010.nc
eWaterCycle exposes a selected set of configurable parameters. These can be modified in the setup() method.
9 In:
pcrglob.parameters
9 Out:
[('start_time', '2001-01-01T00:00:00Z'),
('end_time', '2001-01-01T00:00:00Z'),
('routing_method', 'accuTravelTime'),
('max_spinups_in_years', '20')]
Calling setup() will start up a docker or singularity container. Be careful with calling it multiple times!
10 In:
cfg_file, cfg_dir = pcrglob.setup(
end_time="2001-02-28T00:00:00Z", max_spinups_in_years=5
)
cfg_file, cfg_dir
Running /home/peter/ewatercycle/ewatercycle/ewatercycle-pcrg-grpc4bmi-setters.sif singularity container on port 50639
10 Out:
('/home/peter/ewatercycle/ewatercycle/docs/examples/pcrglobwb_20210714_141432/pcrglobwb_ewatercycle.ini',
'/home/peter/ewatercycle/ewatercycle/docs/examples/pcrglobwb_20210714_141432')
11 In:
pcrglob.parameters
11 Out:
[('start_time', '2001-01-01T00:00:00Z'),
('end_time', '2001-02-28T00:00:00Z'),
('routing_method', 'accuTravelTime'),
('max_spinups_in_years', '5')]
Note that the parameters have been changed. A new config file that incorporates these updated parameters has been generated as well. If you want to see or modify any additional model settings, you can access this file directly. When you’re ready, pass the path to the config file to initialize().
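Because the generated config is an .ini file, it can be inspected and edited with Python's standard configparser module. The sketch below works on a small made-up file in a temporary directory instead of the real config; the section name globalOptions and the option names startTime and endTime are assumptions chosen for illustration.

```python
import configparser
import tempfile
from pathlib import Path

# Made-up config content; the section and option names are assumptions
EXAMPLE_INI = """\
[globalOptions]
startTime = 2001-01-01
endTime = 2001-02-28
"""


def load_config(path):
    """Read an .ini file while preserving the case of option names."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # configparser lowercases option names by default
    parser.read(path)
    return parser


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "example.ini"
    original.write_text(EXAMPLE_INI)

    config = load_config(original)
    print(dict(config["globalOptions"]))

    # Change one setting and write the result to a separate file
    config["globalOptions"]["endTime"] = "2001-01-31"
    edited = Path(tmp) / "example_edited.ini"
    with edited.open("w") as handle:
        config.write(handle)

    reloaded = load_config(edited)
    end_time = reloaded["globalOptions"]["endTime"]

print(end_time)
```

Writing the edited settings to a separate file keeps the generated config intact, so the two can be compared afterwards.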
12 In:
pcrglob.initialize(cfg_file)
Running the model
Simply running the model from start to end is straightforward. At each time step we can retrieve information from the model.
13 In:
while pcrglob.time < pcrglob.end_time:
    print(pcrglob.time_as_isostr, end="\r")
    pcrglob.update()
2001-02-27T00:00:00Z
Interacting with the model
PCRGlobWB exposes many variables. Just a few of them are shown here:
14 In:
list(pcrglob.output_var_names)[-15:-5]
14 Out:
('total_abstraction',
'livestockWaterWithdrawalVolume',
'desalination_source_abstraction',
'discharge',
'temperature',
'upper_soil_transpiration',
'snow_water_equivalent',
'total_runoff',
'transpiration_from_irrigation',
'fraction_of_surface_water')
Model fields can be fetched as xarray objects (or as flat numpy arrays using get_value()):
15 In:
da = pcrglob.get_value_as_xarray("discharge")
da.thin(5) # only show every 5th value in each dim
15 Out:
<xarray.DataArray 'discharge' (latitude: 3, longitude: 4)>
array([[         nan,          nan,          nan,          nan],
       [         nan,  74.54685211,  10.38944435,          nan],
       [         nan, 188.07923889,          nan,          nan]])
Coordinates:
  * longitude  (longitude) float64 3.75 6.25 8.75 11.25
  * latitude   (latitude) float64 46.25 48.75 51.25
    time       object 2001-02-28 00:00:00
Attributes:
    units:    m3.s-1
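The call da.thin(5) keeps every fifth value along each dimension, which is why the output above has only 3 x 4 cells. The same selection can be sketched on a plain nested list with slicing. The 11 x 16 grid size here is made up, chosen so that the result has the same 3 x 4 shape; thin below is a hypothetical helper, not the xarray method.

```python
def thin(rows, step):
    """Keep every `step`-th row and every `step`-th column of a 2D list."""
    return [row[::step] for row in rows[::step]]


# Made-up 11 x 16 grid whose cells record their own (row, column) position
grid = [[(r, c) for c in range(16)] for r in range(11)]
thinned = thin(grid, 5)

print(len(thinned), len(thinned[0]))
print(thinned[1])
```

Thinning only reduces what is displayed; the full-resolution field is still available in da.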
Xarray makes it very easy to plot the data. In the figure below, we add three points that we will use to illustrate that we can also access individual grid cells.
16 In:
fig = plt.figure(dpi=120)
ax = fig.add_subplot(111, projection=crs.PlateCarree())
da.plot(ax=ax, cmap="GnBu")
# Overlay ocean and coastlines
ax.add_feature(cfeature.OCEAN)
ax.add_feature(cfeature.RIVERS, color="k")
ax.coastlines()
# Add some verification points
target_longitudes = [7.8, 10.2, 11]
target_latitudes = [50.3, 49.8, 47]
ax.scatter(target_longitudes, target_latitudes, s=250, c="r", marker="x", lw=2)
16 Out:
<matplotlib.collections.PathCollection at 0x7f636aa5c4f0>

We can get (or set) the values at custom points as well:
17 In:
pcrglob.get_value_at_coords("discharge", lon=target_longitudes, lat=target_latitudes)
17 Out:
array([713.2911377 , 84.76369476, nan])
Cleaning up
Models usually perform some “wrap up tasks” at the end of a model run, such as writing the last outputs to disk and releasing memory. In the case of eWaterCycle, another important teardown task is destroying the docker or singularity container in which the model was running. This can free up a lot of resources on your system. Therefore it is good practice to always call finalize() when you’re done with an experiment.
18 In:
pcrglob.finalize()
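One way to make sure finalize() also runs when a time step fails is to wrap the run in try/finally. The sketch below demonstrates the pattern with a hypothetical stand-in class, FailingModel, whose update() always raises; it is not an ewatercycle class.

```python
class FailingModel:
    """Toy stand-in whose update() always fails, to demonstrate cleanup."""

    def __init__(self):
        self.finalized = False

    def update(self):
        raise RuntimeError("simulated failure during a time step")

    def finalize(self):
        # A real model would stop its container here
        self.finalized = True


model = FailingModel()
try:
    model.update()
except RuntimeError as error:
    print(f"Run failed: {error}")
finally:
    # Runs whether or not update() raised
    model.finalize()

print(model.finalized)
```

In a notebook the same effect is usually achieved by running the cell that calls finalize() by hand, but the pattern is useful in scripts that run unattended.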
Running Wflow using the ewatercycle system
This notebook shows how to run the Wflow model using an example use case. More information about data, configuration and installation instructions can be found in the System setup chapter in the eWaterCycle documentation.
1 In:
import logging
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
logging.basicConfig(level=logging.WARN)
2 In:
import ewatercycle.forcing
import ewatercycle.models
import ewatercycle.parameter_sets
Setting up the model
The model needs a parameter set and forcing. The parameter set can be retrieved from the parameter sets available on the system, and the forcing can be derived from the parameter set.
3 In:
parameter_set = ewatercycle.parameter_sets.get_parameter_set("wflow_rhine_sbm_nc")
print(parameter_set)
Parameter set
-------------
name=wflow_rhine_sbm_nc
directory=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
config=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc/wflow_sbm_NC.ini
doi=N/A
target_model=wflow
supported_model_versions={'2020.1.2', '2020.1.1'}
4 In:
forcing = ewatercycle.forcing.load_foreign(
directory=str(parameter_set.directory),
target_model=parameter_set.target_model,
start_time="1991-01-01T00:00:00Z",
end_time="1991-12-31T00:00:00Z",
forcing_info=dict(
# Additional information about the external forcing data needed for the model configuration
netcdfinput="inmaps.nc",
Precipitation="/P",
EvapoTranspiration="/PET",
Temperature="/TEMP",
),
)
print(forcing)
Forcing data for Wflow
----------------------
Directory: /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
Start time: 1991-01-01T00:00:00Z
End time: 1991-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
- netcdfinput: inmaps.nc
- Precipitation: /P
- Temperature: /TEMP
- EvapoTranspiration: /PET
- Inflow: None
Pick a version of the Wflow model, so that the model code that is executed understands the parameter set and forcing.
5 In:
ewatercycle.models.Wflow.available_versions
5 Out:
('2020.1.1', '2020.1.2')
6 In:
model = ewatercycle.models.Wflow(
version="2020.1.2", parameter_set=parameter_set, forcing=forcing
)
WARNING:ewatercycle.models.wflow:Config file from parameter set is missing API section, adding section
WARNING:ewatercycle.models.wflow:Config file from parameter set is missing RiverRunoff option in API section, added it with value '2, m/s option'
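The two warnings say that an API section and a RiverRunoff option were added to the config because the parameter set did not provide them. The step can be pictured with the standard configparser module as in the sketch below. This is an illustration only, not the package's code: ensure_api_option is a hypothetical helper, the input config is made up, and the value "2, m/s" is taken from the warning text.

```python
import configparser


def ensure_api_option(config_text):
    """Add an [API] section with a RiverRunoff option if either is missing.

    Returns the parser and a list describing what was added.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(config_text)
    added = []
    if not parser.has_section("API"):
        parser.add_section("API")
        added.append("API section")
    if not parser.has_option("API", "RiverRunoff"):
        parser.set("API", "RiverRunoff", "2, m/s")
        added.append("RiverRunoff option")
    return parser, added


# Made-up config without an API section
parser, added = ensure_api_option("[model]\nreinit = 1\n")
print(added)
print(parser["API"]["RiverRunoff"])
```

Reporting what was added, rather than changing the config silently, mirrors the warnings shown above and makes the adjustment visible in the notebook output.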
7 In:
print(model)
eWaterCycle Wflow
-------------------
Version = 2020.1.2
Parameter set =
Parameter set
-------------
name=wflow_rhine_sbm_nc
directory=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
config=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc/wflow_sbm_NC.ini
doi=N/A
target_model=wflow
supported_model_versions={'2020.1.2', '2020.1.1'}
Forcing =
Forcing data for Wflow
----------------------
Directory: /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
Start time: 1991-01-01T00:00:00Z
End time: 1991-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
- netcdfinput: inmaps.nc
- Precipitation: /P
- Temperature: /TEMP
- EvapoTranspiration: /PET
- Inflow: None
The pre-configured parameters are shown below and can be overridden with setup().
8 In:
model.parameters
8 Out:
[('start_time', '1991-01-01T00:00:00Z'), ('end_time', '1991-12-31T00:00:00Z')]
9 In:
cfg_file, cfg_dir = model.setup(end_time="1991-02-28T00:00:00Z")
10 In:
print(cfg_file)
print(cfg_dir)
/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/output/wflow_20211008_084304/wflow_ewatercycle.ini
/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/output/wflow_20211008_084304
The config file can be edited, but for now we will initialize the model with the config file as is.
11 In:
model.initialize(cfg_file)
Running the model
12 In:
while model.time < model.end_time:
    model.update()
    print(model.time_as_isostr)
1991-01-01T00:00:00Z
1991-01-02T00:00:00Z
1991-01-03T00:00:00Z
1991-01-04T00:00:00Z
1991-01-05T00:00:00Z
1991-01-06T00:00:00Z
1991-01-07T00:00:00Z
1991-01-08T00:00:00Z
1991-01-09T00:00:00Z
1991-01-10T00:00:00Z
1991-01-11T00:00:00Z
1991-01-12T00:00:00Z
1991-01-13T00:00:00Z
1991-01-14T00:00:00Z
1991-01-15T00:00:00Z
1991-01-16T00:00:00Z
1991-01-17T00:00:00Z
1991-01-18T00:00:00Z
1991-01-19T00:00:00Z
1991-01-20T00:00:00Z
1991-01-21T00:00:00Z
1991-01-22T00:00:00Z
1991-01-23T00:00:00Z
1991-01-24T00:00:00Z
1991-01-25T00:00:00Z
1991-01-26T00:00:00Z
1991-01-27T00:00:00Z
1991-01-28T00:00:00Z
1991-01-29T00:00:00Z
1991-01-30T00:00:00Z
1991-01-31T00:00:00Z
1991-02-01T00:00:00Z
1991-02-02T00:00:00Z
1991-02-03T00:00:00Z
1991-02-04T00:00:00Z
1991-02-05T00:00:00Z
1991-02-06T00:00:00Z
1991-02-07T00:00:00Z
1991-02-08T00:00:00Z
1991-02-09T00:00:00Z
1991-02-10T00:00:00Z
1991-02-11T00:00:00Z
1991-02-12T00:00:00Z
1991-02-13T00:00:00Z
1991-02-14T00:00:00Z
1991-02-15T00:00:00Z
1991-02-16T00:00:00Z
1991-02-17T00:00:00Z
1991-02-18T00:00:00Z
1991-02-19T00:00:00Z
1991-02-20T00:00:00Z
1991-02-21T00:00:00Z
1991-02-22T00:00:00Z
1991-02-23T00:00:00Z
1991-02-24T00:00:00Z
1991-02-25T00:00:00Z
1991-02-26T00:00:00Z
1991-02-27T00:00:00Z
1991-02-28T00:00:00Z
Inspect the results
The RiverRunoff values of the current model state can be fetched as an xarray DataArray.
13 In:
da = model.get_value_as_xarray("RiverRunoff")
da
13 Out:
<xarray.DataArray 'RiverRunoff' (latitude: 169, longitude: 187)> array([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]]) Coordinates: * longitude (longitude) float64 5.227 5.264 5.3 5.337 ... 11.97 12.01 12.05 * latitude (latitude) float64 45.89 45.93 45.97 46.0 ... 51.98 52.02 52.05 time object 1991-02-28 00:00:00 Attributes: units: m/s
- latitude: 169
- longitude: 187
- 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
array([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]])
- longitude(longitude)float645.227 5.264 5.3 ... 12.01 12.05
array([ 5.227163, 5.26383 , 5.300497, 5.337163, 5.37383 , 5.410497, 5.447163, 5.48383 , 5.520497, 5.557163, 5.59383 , 5.630497, 5.667163, 5.70383 , 5.740497, 5.777164, 5.81383 , 5.850497, 5.887163, 5.92383 , 5.960497, 5.997163, 6.03383 , 6.070497, 6.107163, 6.14383 , 6.180497, 6.217164, 6.25383 , 6.290497, 6.327163, 6.36383 , 6.400496, 6.437163, 6.47383 , 6.510497, 6.547163, 6.58383 , 6.620497, 6.657163, 6.69383 , 6.730497, 6.767163, 6.80383 , 6.840497, 6.877163, 6.91383 , 6.950497, 6.987164, 7.02383 , 7.060497, 7.097163, 7.13383 , 7.170496, 7.207163, 7.24383 , 7.280497, 7.317163, 7.35383 , 7.390497, 7.427163, 7.46383 , 7.500497, 7.537163, 7.57383 , 7.610497, 7.647163, 7.68383 , 7.720497, 7.757164, 7.79383 , 7.830497, 7.867163, 7.90383 , 7.940496, 7.977163, 8.01383 , 8.050497, 8.087163, 8.12383 , 8.160497, 8.197164, 8.23383 , 8.270496, 8.307163, 8.34383 , 8.380497, 8.417163, 8.45383 , 8.490497, 8.527164, 8.56383 , 8.600496, 8.637163, 8.67383 , 8.710497, 8.747164, 8.78383 , 8.820497, 8.857163, 8.89383 , 8.930496, 8.967163, 9.00383 , 9.040497, 9.077164, 9.11383 , 9.150496, 9.187163, 9.22383 , 9.260497, 9.297163, 9.33383 , 9.370497, 9.407164, 9.44383 , 9.480496, 9.517163, 9.55383 , 9.590497, 9.627163, 9.66383 , 9.700497, 9.737164, 9.77383 , 9.810496, 9.847163, 9.88383 , 9.920497, 9.957163, 9.99383 , 10.030497, 10.067163, 10.10383 , 10.140496, 10.177163, 10.21383 , 10.250497, 10.287164, 10.32383 , 10.360497, 10.397163, 10.43383 , 10.470497, 10.507163, 10.54383 , 10.580497, 10.617164, 10.65383 , 10.690496, 10.727163, 10.76383 , 10.800497, 10.837163, 10.87383 , 10.910497, 10.947164, 10.98383 , 11.020496, 11.057163, 11.09383 , 11.130497, 11.167163, 11.20383 , 11.240497, 11.277164, 11.31383 , 11.350496, 11.387163, 11.42383 , 11.460497, 11.497164, 11.53383 , 11.570497, 11.607163, 11.64383 , 11.680496, 11.717163, 11.75383 , 11.790497, 11.827164, 11.86383 , 11.900496, 11.937163, 11.97383 , 12.010497, 12.047163])
- latitude(latitude)float6445.89 45.93 45.97 ... 52.02 52.05
array([45.894268, 45.930935, 45.967602, 46.004265, 46.040932, 46.077599, 46.114265, 46.150932, 46.187599, 46.224266, 46.260933, 46.2976 , 46.334267, 46.370934, 46.4076 , 46.444267, 46.480934, 46.517601, 46.554268, 46.590935, 46.627602, 46.664268, 46.700932, 46.737598, 46.774265, 46.810932, 46.847599, 46.884266, 46.920933, 46.9576 , 46.994267, 47.030933, 47.0676 , 47.104267, 47.140934, 47.177601, 47.214268, 47.250935, 47.287601, 47.324268, 47.360935, 47.397598, 47.434265, 47.470932, 47.507599, 47.544266, 47.580933, 47.617599, 47.654266, 47.690933, 47.7276 , 47.764267, 47.800934, 47.837601, 47.874268, 47.910934, 47.947601, 47.984268, 48.020935, 48.057598, 48.094265, 48.130932, 48.167599, 48.204266, 48.240932, 48.277599, 48.314266, 48.350933, 48.3876 , 48.424267, 48.460934, 48.497601, 48.534267, 48.570934, 48.607601, 48.644268, 48.680935, 48.717602, 48.754265, 48.790932, 48.827599, 48.864265, 48.900932, 48.937599, 48.974266, 49.010933, 49.0476 , 49.084267, 49.120934, 49.1576 , 49.194267, 49.230934, 49.267601, 49.304268, 49.340935, 49.377602, 49.414268, 49.450932, 49.487598, 49.524265, 49.560932, 49.597599, 49.634266, 49.670933, 49.7076 , 49.744267, 49.780933, 49.8176 , 49.854267, 49.890934, 49.927601, 49.964268, 50.000935, 50.037601, 50.074268, 50.110935, 50.147598, 50.184265, 50.220932, 50.257599, 50.294266, 50.330933, 50.367599, 50.404266, 50.440933, 50.4776 , 50.514267, 50.550934, 50.587601, 50.624268, 50.660934, 50.697601, 50.734268, 50.770935, 50.807598, 50.844265, 50.880932, 50.917599, 50.954266, 50.990932, 51.027599, 51.064266, 51.100933, 51.1376 , 51.174267, 51.210934, 51.247601, 51.284267, 51.320934, 51.357601, 51.394268, 51.430935, 51.467602, 51.504265, 51.540932, 51.577599, 51.614265, 51.650932, 51.687599, 51.724266, 51.760933, 51.7976 , 51.834267, 51.870934, 51.9076 , 51.944267, 51.980934, 52.017601, 52.054268])
- time () object 1991-02-28 00:00:00
array(cftime.DatetimeGregorian(1991, 2, 28, 0, 0, 0, 0), dtype=object)
- units :
- m/s
14 In:
print(da)
<xarray.DataArray 'RiverRunoff' (latitude: 169, longitude: 187)>
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])
Coordinates:
* longitude (longitude) float64 5.227 5.264 5.3 5.337 ... 11.97 12.01 12.05
* latitude (latitude) float64 45.89 45.93 45.97 46.0 ... 51.98 52.02 52.05
time object 1991-02-28 00:00:00
Attributes:
units: m/s
15 In:
qm = da.plot(robust=True, cmap="GnBu", figsize=(10, 8))
# Add some verification points
target_longitudes = [8.4, 10, 11]
target_latitudes = [50, 50.15, 49]
# Add some crosses to check that 'get_value_at_coords' works correctly below
qm.axes.scatter(target_longitudes, target_latitudes, s=250, c="r", marker="x", lw=2)
15 Out:
<matplotlib.collections.PathCollection at 0x7fc7fd4816d0>

Instead of getting the whole spatial grid, you can also get RiverRunoff values at specific coordinates (the red crosses in the plot above).
16 In:
model.get_value_at_coords("RiverRunoff", lon=target_longitudes, lat=target_latitudes)
16 Out:
array([200.49531555, 44.60787582, 0. ])
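To give a feel for what a coordinate lookup like this involves, here is a minimal sketch of a nearest-neighbour search on a regular grid in plain Python. It is an illustration only and not necessarily how get_value_at_coords is implemented; the helper names and the toy grid are hypothetical.

```python
def nearest_index(axis_values, target):
    """Return the index of the grid coordinate closest to target."""
    return min(range(len(axis_values)), key=lambda i: abs(axis_values[i] - target))


# Toy 3 x 4 grid (hypothetical values, not model output)
latitudes = [49.0, 49.5, 50.0]
longitudes = [8.0, 9.0, 10.0, 11.0]
field = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 9.0, 10.0, 11.0],
]


def value_at_coords(lat, lon):
    """Pick the value of the grid cell whose center is closest to (lat, lon)."""
    row = nearest_index(latitudes, lat)
    col = nearest_index(longitudes, lon)
    return field[row][col]


picked = [value_at_coords(lat, lon) for lat, lon in [(50.0, 8.4), (49.4, 10.2)]]
print(picked)  # [8.0, 6.0]
```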
We are done with the model so let’s clean it up.
17 In:
model.finalize()
Running Hype model using eWaterCycle package
This notebook shows how to run the Hype model using a demo use case.
15 In:
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
16 In:
import pandas as pd
import ewatercycle.models
import ewatercycle.parameter_sets
Load parameter set
The parameter set demo.zip should be downloaded from https://sourceforge.net/projects/hype/files/release_hype_5_6_2/ and unzipped.
2 In:
parameter_set_dir = "<path where demo.zip was extracted to>"
parameter_set = ewatercycle.parameter_sets.ParameterSet(
name="hype_demo",
directory=parameter_set_dir,
config=parameter_set_dir + "/info.txt",
target_model="hype",
)
Setting up the model
Note that the model version and the parameter set version should be compatible.
3 In:
ewatercycle.models.Hype.available_versions
3 Out:
('feb2021',)
4 In:
model = ewatercycle.models.Hype(version="feb2021", parameter_set=parameter_set)
eWaterCycle exposes a selected set of configurable parameters. These can be modified in the setup() method.
5 In:
model.parameters
5 Out:
[('start_time', '1961-01-01T00:00:00Z'),
('end_time', '1963-12-31T00:00:00Z'),
('crit_time', '1962-01-01T00:00:00Z')]
Calling setup() will start up a Docker or Singularity container. Be careful about calling it multiple times!
6 In:
cfg_file, cfg_dir = model.setup(end_time="1962-06-30T00:00:00Z")
cfg_file, cfg_dir
6 Out:
('/tmp/hype_20220607_121055/info.txt', '/tmp/hype_20220607_121055')
7 In:
model.parameters
7 Out:
[('start_time', '1961-01-01T00:00:00Z'),
('end_time', '1962-06-30T00:00:00Z'),
('crit_time', '1962-01-01T00:00:00Z')]
Note that the parameters have been changed. A new config file that incorporates these updated parameters has been generated as well. If you want to see or modify any additional model settings, you can access this file directly. When you’re ready, pass the path to the config file to initialize().
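The list of (name, value) tuples returned by model.parameters converts directly into a dictionary, which makes it easy to check a setting before initializing. The sketch below uses the values printed above and computes the length of the configured simulation period; the helper parse_utc is hypothetical and only assumes the timestamp format shown in the output.

```python
from datetime import datetime

# Values copied from the model.parameters output above
parameters = [
    ("start_time", "1961-01-01T00:00:00Z"),
    ("end_time", "1962-06-30T00:00:00Z"),
    ("crit_time", "1962-01-01T00:00:00Z"),
]


def parse_utc(timestamp):
    """Parse a string of the form YYYY-MM-DDTHH:MM:SSZ."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


settings = dict(parameters)
run_length = parse_utc(settings["end_time"]) - parse_utc(settings["start_time"])
print(run_length.days)  # 545
```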
8 In:
model.initialize(cfg_file)
Running the model
Simply running the model from start to end is straightforward. At each time step we can retrieve information from the model.
9 In:
discharge = []
time_range = []
end_time = model.end_time
while model.time < end_time:
    model.update()
    # The demo parameter set has a single sub catchment so store first value of array
    discharge.append(model.get_value("comp outflow olake")[0])
    time_range.append(model.time_as_datetime.date())
    print(model.time_as_isostr, end="\r")
1962-06-30T00:00:00Z
Interacting with the model
The Hype model exposes many variables.
10 In:
model.output_var_names
10 Out:
('comp outflow olake',
'rec outflow olake',
'air temperature',
'corrected air temper',
'precipitation',
'corr precipitation',
'subbasin evaporation',
'computed runoff',
'computed soil water')
Hype is a lumped sub-basin model, so there are values per sub-basin. The outflow of all sub-basins can be requested with:
11 In:
da = model.get_value("comp outflow olake")
da
11 Out:
array([6.71861417e-05])
The model has some info about the sub-basins:
12 In:
[
model.bmi.get_grid_size(1),
model.bmi.get_grid_rank(1),
model.bmi.get_grid_type(1),
model.bmi.get_grid_shape(1),
model.bmi.get_grid_x(1),
model.bmi.get_grid_y(1),
]
12 Out:
[1, 1, 'unstructured', (1,), array([0.]), array([0.])]
Inspect the results
17 In:
simulated_discharge = pd.DataFrame(
{"simulation": discharge}, index=pd.to_datetime(time_range)
)
18 In:
simulated_discharge.plot(figsize=(12, 8))
18 Out:
<AxesSubplot:>

Cleaning up
Models usually perform some “wrap up tasks” at the end of a model run, such as writing the last outputs to disk and releasing memory. In the case of eWaterCycle, another important teardown task is destroying the Docker or Singularity container in which the model was running. This can free up a lot of resources on your system. Therefore it is good practice to always call finalize() when you’re done with an experiment.
10 In:
model.finalize()
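One way to make sure the teardown also happens when a run fails halfway is to wrap the model in a context manager. The sketch below is a generic Python pattern, not part of the eWaterCycle API; finalizing and DummyModel are hypothetical names, and the dummy only stands in for an initialized model.

```python
from contextlib import contextmanager


@contextmanager
def finalizing(model):
    """Yield the model and always call model.finalize() afterwards."""
    try:
        yield model
    finally:
        model.finalize()


class DummyModel:
    """Stand-in for an initialized model (hypothetical)."""

    def __init__(self):
        self.finalized = False

    def update(self):
        raise RuntimeError("something went wrong during the run")

    def finalize(self):
        self.finalized = True


dummy = DummyModel()
try:
    with finalizing(dummy) as model:
        model.update()
except RuntimeError:
    pass

print(dummy.finalized)  # True
```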
Brute force irrigation experiment
This example notebook shows how the eWaterCycle system can be used to quickly assess the impact of irrigation on river discharge. We will manually overwrite the soil moisture values in an experiment with the PCRGlobWB model, to mimic the effect of irrigation. Obviously, this is not a realistic scenario - the eWaterCycle developers are not accountable for any consequences of implementing a real irrigation system after this example.
1 In:
# This cell is only used to suppress some distracting output messages
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
2 In:
import matplotlib.pyplot as plt
import pandas as pd
from cartopy import crs
from cartopy import feature as cfeature
import ewatercycle.models
import ewatercycle.parameter_sets
We will run 2 versions of the same model:
1. A reference run with the default setup
2. An irrigation experiment where we will overwrite soil moisture values
We will set up the models with identical parameters and settings. We will use a standard dataset with global parameters on 5 and 30 minutes resolution. The example parameter sets also include forcing data.
3 In:
merrimack_parameterset = ewatercycle.parameter_sets.ParameterSet(
name="custom_parameter_set",
directory="/mnt/data/examples/technical_paper/pcr-globwb/input",
config="./pcrglobwb_merrimack.ini",
target_model="pcrglobwb",
doi="10.5281/zenodo.1045339",
supported_model_versions={"setters"},
)
print(merrimack_parameterset)
Parameter set
-------------
name=custom_parameter_set
directory=/mnt/data/examples/technical_paper/pcr-globwb/input
config=/mnt/home/user37/ewatercycle/docs/examples/pcrglobwb_merrimack.ini
doi=10.5281/zenodo.1045339
target_model=pcrglobwb
supported_model_versions={'setters'}
We’ll track a grid cell near a GRDC station with the following coordinates:
4 In:
grdc_latitude = 42.6459
grdc_longitude = -71.2984
Reference experiment
For the purpose of illustration, we start by running the reference experiment. Then, in the irrigation experiment, we can focus on the differences with respect to the reference experiment.
5 In:
# Instantiate the model instance
reference = ewatercycle.models.PCRGlobWB(
version="setters", parameter_set=merrimack_parameterset
)
# Create experiment folder, set up the model configuration,
# and start the container in which the model will run
reference_config, reference_dir = reference.setup()
Initialize the model inside the container. Depending on your system this may take a few minutes. Log messages will start to appear in the output directory.
6 In:
reference.initialize(reference_config)
Create an empty dataframe to store the modelled discharge
7 In:
time = pd.date_range(reference.start_time_as_isostr, reference.end_time_as_isostr)
timeseries = pd.DataFrame(
index=pd.Index(time, name="time"), columns=["reference", "experiment"]
)
timeseries.head()
7 Out:
| time | reference | experiment |
|---|---|---|
| 2002-01-01 00:00:00+00:00 | NaN | NaN |
| 2002-01-02 00:00:00+00:00 | NaN | NaN |
| 2002-01-03 00:00:00+00:00 | NaN | NaN |
| 2002-01-04 00:00:00+00:00 | NaN | NaN |
| 2002-01-05 00:00:00+00:00 | NaN | NaN |
8 In:
while reference.time < reference.end_time:
    reference.update()
    # Track discharge at station location
    discharge_at_station = reference.get_value_at_coords(
        "discharge", lat=[grdc_latitude], lon=[grdc_longitude]
    )
    time = reference.time_as_isostr
    timeseries["reference"][time] = discharge_at_station[0]
    # Show progress
    print(time, end="\r")  # "\r" returns to the line start, so the next timestamp overwrites this one
2002-12-31T00:00:00Z
Intermediate insights
Before we continue with the experiment, let’s have a look at the intermediate results. First of all, notice that the reference column in our timeseries dataframe has been filled.
9 In:
timeseries.head()
9 Out:
| time | reference | experiment |
|---|---|---|
| 2002-01-01 00:00:00+00:00 | 71.991348 | NaN |
| 2002-01-02 00:00:00+00:00 | 78.788757 | NaN |
| 2002-01-03 00:00:00+00:00 | 79.178329 | NaN |
| 2002-01-04 00:00:00+00:00 | 79.046112 | NaN |
| 2002-01-05 00:00:00+00:00 | 78.232491 | NaN |
We can also make a map of discharge at the last model step
10 In:
# Use matplotlib to make the figure slightly nicer
fig = plt.figure(dpi=120)
ax = fig.add_subplot(111, projection=crs.PlateCarree())
# Plotting the model field is a one-liner
reference.get_value_as_xarray("discharge").plot(ax=ax, cmap="GnBu")
# Also plot the station location
ax.scatter(grdc_longitude, grdc_latitude, s=25, c="r")
# Overlay ocean and coastlines
ax.add_feature(cfeature.OCEAN, zorder=2)
ax.add_feature(cfeature.RIVERS, zorder=2, color="k")
ax.coastlines(zorder=3)
10 Out:
<cartopy.mpl.feature_artist.FeatureArtist at 0x7fadd0837cd0>

You can see that the GRDC location indeed represents a cell that we would identify as a river.
We can also have a quick look at the discharge timeseries we have tracked, to see if it makes any sense.
11 In:
timeseries.plot()
11 Out:
<AxesSubplot:xlabel='time'>

Running the irrigation experiment
Before we initialize the experiment, let’s use the reference model to illustrate the concept of what we will do.
We will fetch the soil moisture field and overwrite a part of it so that the soil will be fully saturated.
12 In:
soil_moisture = reference.get_value_as_xarray("upper_soil_saturation_degree")
# Copy the field and manually overwrite a random part of the domain
irrigated_soil_moisture = soil_moisture.copy()
irrigated_soil_moisture[31:41, 18:28] = 1
Let’s visualize the difference
13 In:
fig = plt.figure(figsize=(10, 5), dpi=120)
left_axes = fig.add_subplot(121, projection=crs.PlateCarree())
right_axes = fig.add_subplot(122, projection=crs.PlateCarree())
soil_moisture.plot(ax=left_axes, cmap="GnBu", vmin=0.3, vmax=1)
irrigated_soil_moisture.plot(ax=right_axes, cmap="GnBu", vmin=0.3, vmax=1)
# Decoration
left_axes.set_title("Reference")
right_axes.set_title("Irrigated patch")
for axes in [left_axes, right_axes]:
    axes.add_feature(cfeature.OCEAN, zorder=2)
    axes.add_feature(cfeature.RIVERS, zorder=2, color="k")
    axes.coastlines(zorder=3)

From here on we will do exactly the same as before, except that we’ll add three extra lines to overwrite soil moisture at every time step.
14 In:
experiment = ewatercycle.models.PCRGlobWB(
version="setters", parameter_set=merrimack_parameterset
)
experiment_config, experiment_dir = experiment.setup()
15 In:
experiment.initialize(experiment_config)
# this may take a few minutes, log messages will start to appear in the output directory.
16 In:
while experiment.time < experiment.end_time:
    experiment.update()
    # Overwrite soil moisture field
    soil_moisture = experiment.get_value_as_xarray(
        "upper_soil_saturation_degree",
    )
    soil_moisture[31:41, 18:28] = 1
    experiment.set_value("upper_soil_saturation_degree", soil_moisture.values.flatten())
    # Track discharge at station location
    discharge_at_station = experiment.get_value_at_coords(
        "discharge", lat=[grdc_latitude], lon=[grdc_longitude]
    )
    time = experiment.time_as_isostr
    timeseries["experiment"][time] = discharge_at_station[0]
    # Show progress
    print(time, end="\r")  # "\r" returns to the line start, so the next timestamp overwrites this one
2002-12-31T00:00:00Z
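The overwrite step in the loop above combines two ideas: a 2D slice assignment (rows 31 to 40 and columns 18 to 27, so a block of 10 x 10 cells) and a row-major flattening before the values are handed back to the model. The sketch below reproduces both on a small toy grid using plain Python lists; the grid size and slice bounds are made up for readability.

```python
n_rows, n_cols = 6, 5  # toy grid; the real field is much larger

# Start from a field of zeros, stored as a list of rows
field = [[0.0] * n_cols for _ in range(n_rows)]

# Equivalent of field[2:4, 1:4] = 1 on a 2D array
for row in range(2, 4):
    for col in range(1, 4):
        field[row][col] = 1.0

# Row-major flattening: cell (row, col) ends up at index row * n_cols + col
flat = [value for row in field for value in row]

saturated = [index for index, value in enumerate(flat) if value == 1.0]
print(saturated)  # [11, 12, 13, 16, 17, 18]
```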
Final analysis
17 In:
fig, ax = plt.subplots(dpi=120)
timeseries.plot(ax=ax)
ax.set_title("Increased discharge due to irrigation")
17 Out:
Text(0.5, 1.0, 'Increased discharge due to irrigation')

Clean up
It is good practice to remove model instances once you’re done with an experiment. This will free up resources on the system.
18 In:
reference.finalize()
experiment.finalize()
Migrate from HPC to Cluster (Snellius) guide
The HPC node jupyter.ewatercycle.org can be used for small test experiments. To do actual work you will need to run your notebook/script on the cluster (Snellius). On Snellius the forcing data is already present and many users can run jobs at the same time without interfering with each other.
Familiarize yourself with Linux by reading this simple guide:
Migration Preparation
1. Create GitHub repository
Start by creating a GitHub repository to store (only) your code by following these guides:
https://docs.github.com/en/github/getting-started-with-github/set-up-git
https://docs.github.com/en/github/getting-started-with-github/create-a-repo
2. Create Conda environment.yml (not required)
For ease of transfer it can be helpful to create an environment.yml file. This file contains a list of all the packages you use for running your code. This is good practice because it allows users of your GitHub repository to quickly install the necessary packages.
3. Copy files from HPC to Snellius
To copy files from the eWaterCycle HPC to Snellius the following command example can be used:
scp -r {YourUserNameOnTheHPC}@jupyter.ewatercycle.org:/mnt/{YourUserNameOnTheHPC}/{PathToFolder}/ /home/{YourUserNameOnTheSnellius}/{PathToFolder}/
When prompted, enter your eWaterCycle HPC password.
Login to Snellius
1. VPN Connection
Institutes that host cluster computers have a strict policy on which IP addresses are allowed to connect to the cluster (Snellius). For this reason you first need to establish a VPN connection to your university or research institute, which has a whitelisted IP address.
2. MobaXterm
To connect to Snellius an SSH client is required. One such free client is MobaXterm, which can be downloaded here: https://mobaxterm.mobatek.net/.
After installation, open the client and click on the session tab (top left), click on SSH, fill in “snellius.surf.nl” at remote host, tick the specify username box, fill in your Snellius username and click OK (bottom). Fill in your Snellius password when prompted.
3. Login Node & Compute Node
Once you are logged in you are on the login node. This node should not be used to run scripts, as it is only a portal to communicate with the compute nodes running in the background (the actual computers). The compute nodes are where you will do the calculations. We communicate with compute nodes using Bash (.sh) scripts. This will be explained later.
4. Home Directory & Scratch Directory
When you login you are directed to your Home Directory:
/home/{YourUserNameOnTheSnellius}/
The Home Directory has slower disk speeds than the Scratch Directory. The Scratch Directory needs to be created using the following commands:
cd /scratch-shared/
mkdir {YourUserNameOnTheSnellius}
You can now access the Scratch Directory at /scratch-shared/{YourUserNameOnTheSnellius}/. Best practice is to modify your code such that it first copies all the required files (excluding code) to the Scratch Directory, then runs the code, and after completion copies the output files back to the Home Directory and cleans up the Scratch Directory.
First Run preparations
1. Clone GitHub repository
Clone the GitHub repository containing your scripts using:
git clone https://github.com/example_user/example_repo
2. Install MiniConda
Go to home directory:
cd /home/username/
Download MiniConda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Install MiniConda:
bash Miniconda3-latest-Linux-x86_64.sh
Restart the connection with Snellius
conda update conda
3. Create Conda environment
Create a Conda environment and install required packages following the description:
Make sure that Jupyter Lab is installed in the Conda environment:
wget https://raw.githubusercontent.com/eWaterCycle/ewatercycle/main/environment.yml
conda install mamba -n base -c conda-forge -y
mamba env create --file environment.yml
conda activate ewatercycle
conda install -c conda-forge jupyterlab
Install the eWaterCycle package:
pip install ewatercycle
4. Create Singularity Container
On Snellius, Docker requires root access and can therefore not be used. Singularity is similar to, and integrates well with, Docker. It also requires root access, but it is pre-installed on the compute nodes on Snellius.
The first step to run the model on a compute node is thus to use Singularity to create a Singularity image (.sif file) based on the Docker image. This is done with (note the srun command to access the compute node):
srun -N 1 -t 40 -p short singularity build --disable-cache ewatercycle-wflow-grpc4bmi.sif docker://ewatercycle/wflow-grpc4bmi:latest
This is an example for the wflow_sbm model. For other models, change the Docker image accordingly:
docker://ewatercycle/{model}-grpc4bmi:{version}
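If you need images for several models, the command can be composed from the model name and version. The sketch below only builds the command string; the function name is hypothetical, and naming the output file ewatercycle-{model}-grpc4bmi.sif is an assumption that follows the wflow example above.

```python
import shlex


def singularity_build_command(model, version="latest", minutes=40):
    """Compose the srun command that builds a .sif image from a Docker image."""
    image_file = f"ewatercycle-{model}-grpc4bmi.sif"  # assumed naming scheme
    docker_url = f"docker://ewatercycle/{model}-grpc4bmi:{version}"
    command = [
        "srun", "-N", "1", "-t", str(minutes), "-p", "short",
        "singularity", "build", "--disable-cache", image_file, docker_url,
    ]
    return shlex.join(command)


command = singularity_build_command("wflow")
print(command)
```

For the default arguments this reproduces the wflow command shown above; check that the image and tag you request actually exist before submitting the job.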
5. Adjust code to run Singularity container
Code should be adjusted to run Singularity instead of Docker following:
from grpc4bmi.bmi_client_singularity import BmiClientSingularity
model = BmiClientSingularity(image='ewatercycle-wflow-grpc4bmi.sif', input_dirs=[input_dir], work_dir=work_dir)
...
6. Adjust code to use Scratch directory
Before running the model copy the model instance to the scratch directory:
/scratch-shared/{YourUsernameOnTheSnellius}/
Run the model from this directory and copy the output back to the home directory:
/home/{YourUsernameOnTheSnellius}/
Cleanup files in the scratch directory.
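The copy, run, copy back and clean up sequence described above can be sketched in Python as follows. This is a minimal illustration using only the standard library; run_on_scratch and the placeholder job are hypothetical, and temporary directories stand in for the home and scratch directories so that the example can be run anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def run_on_scratch(input_dir, scratch_root, output_dir, job):
    """Copy inputs to scratch, run the job there, copy results back, clean up."""
    scratch_root = Path(scratch_root)
    scratch_root.mkdir(parents=True, exist_ok=True)
    work_dir = Path(tempfile.mkdtemp(dir=scratch_root))
    try:
        run_dir = work_dir / "run"
        shutil.copytree(input_dir, run_dir)
        job(run_dir)  # the job reads and writes inside run_dir
        # Copies everything back, inputs included; filter here if needed
        shutil.copytree(run_dir, output_dir, dirs_exist_ok=True)
    finally:
        # Always clean up the scratch directory, even if the job failed
        shutil.rmtree(work_dir, ignore_errors=True)


# Demonstration with temporary directories standing in for home and scratch
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    scratch = Path(tmp) / "scratch"
    (home / "input").mkdir(parents=True)
    (home / "input" / "forcing.txt").write_text("1.0\n")

    def job(run_dir):
        # Placeholder for the actual model run
        (run_dir / "result.txt").write_text("done\n")

    run_on_scratch(home / "input", scratch, home / "output", job)
    result = (home / "output" / "result.txt").read_text()
    leftovers = list(scratch.iterdir())
```

On Snellius, scratch_root would be your own folder under /scratch-shared/ and the other two paths would live in your home directory.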
Submitting Jupyter Job on Cluster node
Here we briefly explain general SBATCH parameters and how to launch a Jupyter Lab environment on Snellius. Start by opening a text editor on Snellius (e.g. nano) or (easier) your local machine (e.g. notepad). Copy the following text inside your text editor, edit the Conda environment name, and save as run_jupyter_on_snellius.sh (make sure the extension is .sh):
#!/bin/bash
# Serve a jupyter lab environment from a compute node on Snellius
# usage: sbatch run_jupyter_on_snellius.sh
# SLURM settings
#SBATCH -J jupyter_lab
#SBATCH -t 09:00:00
#SBATCH -N 1
#SBATCH -p normal
#SBATCH --output=slurm_%j.out
#SBATCH --error=slurm_%j.out
# Use an appropriate conda environment
. ~/miniconda3/etc/profile.d/conda.sh
conda activate {YourEnvironmentName}
# Some security: stop script on error and undefined variables
set -euo pipefail
# Specify (random) port to serve the notebook
port=8123
host=$(hostname -s)
# Print command to create ssh tunnel in log file
echo -e "
Command to create ssh tunnel (run from another terminal session on your local machine):
ssh -L ${port}:${host}:${port} $(whoami)@snellius.surf.nl
Below, jupyter will print a number of addresses at which the notebook is served.
Due to the way the tunnel is set up, only the latter option will work.
It's the one that looks like
http://127.0.0.1:${port}/?token=<long_access_token_very_important_to_copy_as_well>
Copy this address in your local browser and you're good to go
Starting notebooks server
**************************************************
"
# Start the jupyter lab session
jupyter lab --no-browser --port ${port} --ip=${host}
Explanation of SBATCH Parameters
#SBATCH -J jupyter_lab
Here you can set the job name.
#SBATCH -t 09:00:00
Here you specify the job runtime. On Snellius we have a budget: each half hour of CPU runtime costs 1 point. A node consists of 24 cores, meaning that the specified runtime (9 hours) costs 24*2*9 = 432 points on the budget.
#SBATCH -N 1
Specifies the number of nodes used by the run; keep at the default value of 1.
#SBATCH -p normal
Specifies the type of Node, keep at default value of “normal”.
#SBATCH --output=slurm_%j.out
Specifies the location and name of the job log file.
More information on SBATCH parameters can be found here: https://servicedesk.surfsara.nl/wiki/display/WIKI/Creating+and+running+jobs
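The budget arithmetic from the runtime parameter can be written as a small helper, which is convenient when comparing different job lengths. The function is a hypothetical sketch that assumes the numbers given above (24 cores per node, 1 point per half hour of CPU time per core).

```python
def job_cost(hours, cores_per_node=24, nodes=1, points_per_core_hour=2):
    """Budget points for a job, at 1 point per half hour of CPU time per core."""
    return hours * cores_per_node * nodes * points_per_core_hour


cost = job_cost(9)
print(cost)  # 432
```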
Specifying job runtime
A good way to estimate job runtime is to first run the model for, say, 1 year and measure how long that takes. Multiply this by the total number of years in your study and add a time buffer of around 10-20 percent.
For example: 1 year takes 2 hours and the total run is 10 years, so 20 hours in total. With the time buffer, the estimated runtime is 22-24 hours.
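The same estimate can be computed and formatted as an hours:minutes:seconds string of the kind used for the -t setting in the script. The helper below is a hypothetical sketch; it rounds up to whole minutes.

```python
import math


def estimate_walltime(hours_per_year, n_years, buffer_percent):
    """Return an HH:MM:SS time string including a safety buffer."""
    hours = hours_per_year * n_years * (100 + buffer_percent) / 100
    minutes = math.ceil(hours * 60)
    return f"{minutes // 60:02d}:{minutes % 60:02d}:00"


low = estimate_walltime(2, 10, 10)
high = estimate_walltime(2, 10, 20)
print(low, high)  # 22:00:00 24:00:00
```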
Running the bash (.sh) script
Enter this command to run the bash script:
sbatch run_jupyter_on_snellius.sh
(If you get DOS and UNIX linebreak errors, run the following command:)
dos2unix run_jupyter_on_snellius.sh
Job control
To view which jobs are running you can enter:
squeue -u {YourUserNameOnTheSnellius}
To cancel a running job you can enter:
scancel {jobID}
More information on job control can be found here: https://userinfo.surfsara.nl/systems/lisa/user-guide/creating-and-running-jobs#interacting
Launching Jupyter Lab on Cluster Node
1. Open Slurm output log file
Open the Slurm output log file by double clicking it in the file browser or by using a text editor (nano) and read the output carefully.
2. Create ssh tunnel between local machine and cluster
To create an SSH tunnel between your local machine and the cluster you need to open a command prompt on your local machine, for example PowerShell or cmd on Windows.
Copy the line
ssh -L ${port}:${host}:${port} $(whoami)@snellius.surf.nl
from the Slurm log file (not the bash script) into the command prompt and run it.
3. Connect through browser
Open a browser (e.g. Chrome) and go to the url:
localhost:8123/lab
4. Enter the access token
Copy the access token from the Slurm output log file and paste it in the browser at access token or password.
You have now successfully launched a Jupyter Lab environment on a cluster node.
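If you prefer not to search the log file by eye, the tunnel command and the access token can be extracted with a few regular expressions. The log excerpt below is hypothetical (node name, user name and token are made up), and the exact layout of the lines printed by Jupyter may differ between versions, so treat the patterns as a starting point.

```python
import re

# Hypothetical excerpt of a slurm_<jobid>.out file; the token is made up
log_text = """
Command to create ssh tunnel (run from another terminal session on your local machine):
ssh -L 8123:tcn123:8123 user@snellius.surf.nl
    http://tcn123:8123/lab?token=abc123
 or http://127.0.0.1:8123/lab?token=abc123
"""

# The tunnel command is the line that starts with "ssh -L"
tunnel_command = re.search(r"^ssh -L \S+ \S+$", log_text, flags=re.MULTILINE).group(0)
# Only the 127.0.0.1 address works through the tunnel
local_url = re.search(r"http://127\.0\.0\.1:\d+/\S*", log_text).group(0)
token = local_url.split("token=")[1]

print(tunnel_command)
print(local_url)
```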
Observations
The eWaterCycle platform supports observations relevant for calibrating and validating models. We currently support USGS and GRDC river discharge observations.
USGS
The U.S. Geological Survey Water Services provides public discharge data for a large number of US-based stations. In eWaterCycle we make use of the USGS web service to automatically retrieve this data. The discharge timestamps are converted to the UTC timezone. Units are converted from cubic feet per second to cubic meters per second.
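The two conversions mentioned for USGS data are simple to reproduce. The sketch below is not the eWaterCycle implementation; it only illustrates the arithmetic on a made-up observation, using the exact definition 1 ft = 0.3048 m.

```python
from datetime import datetime, timedelta, timezone

CUBIC_FEET_TO_CUBIC_METERS = 0.3048 ** 3  # 1 ft = 0.3048 m exactly


def to_cubic_meters_per_second(discharge_cfs):
    """Convert discharge from cubic feet per second to cubic meters per second."""
    return discharge_cfs * CUBIC_FEET_TO_CUBIC_METERS


# Hypothetical observation: 1000 ft3/s at noon, US Eastern Standard Time (UTC-5)
eastern = timezone(timedelta(hours=-5))
local_time = datetime(2002, 1, 1, 12, 0, tzinfo=eastern)

utc_time = local_time.astimezone(timezone.utc)
discharge = to_cubic_meters_per_second(1000.0)

print(utc_time.isoformat())  # 2002-01-01T17:00:00+00:00
print(round(discharge, 3))  # 28.317
```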
GRDC
The Global Runoff Data Centre provides discharge data for a large number of stations around the world. In eWaterCycle we support GRDC data. This data is not downloaded automatically, but must be present on the infrastructure where the eWaterCycle platform is deployed. By special permission from GRDC our own instance contains data from the ArcticHYCOS and GCOS/GTN-H, GTN-R projects.
ewatercycle package
Subpackages
ewatercycle.analysis package
- ewatercycle.analysis.hydrograph(discharge: pandas.DataFrame, *, reference: str, precipitation: Optional[pandas.DataFrame] = None, dpi: Optional[int] = None, title: str = 'Hydrograph', discharge_units: str = 'm$^3$ s$^{-1}$', precipitation_units: str = 'mm day$^{-1}$', figsize: Tuple[float, float] = (10, 10), filename: Optional[Union[PathLike, str]] = None, nbars: Optional[int] = None, **kwargs) Tuple[matplotlib.pyplot.Figure, Tuple[matplotlib.pyplot.Axes, matplotlib.pyplot.Axes]]
Plot a hydrograph.
This utility function makes it convenient to create a hydrograph from a set of discharge data from a pandas.DataFrame. A column must be marked as the reference, so that the agreement metrics can be calculated.
Optionally, the corresponding precipitation data can be plotted for comparison.
- Parameters
discharge (pd.DataFrame) – Dataframe containing time series of discharge data to be plotted.
reference (str) – Name of the reference data, must correspond to a column in the discharge dataframe. Metrics are calculated between the reference column and each of the other columns.
precipitation (pd.DataFrame, optional) – Optional dataframe containing time series of precipitation data to be plotted from the top of the hydrograph.
dpi (int, optional) – DPI for the plot.
title (str, optional) – Title of the hydrograph.
discharge_units (str, optional) – Units for the discharge data.
precipitation_units (str, optional) – Units for the precipitation data.
figsize ((float, float), optional) – Width, height of the plot in inches.
filename (str or Path, optional) – If specified, a copy of the plot will be saved to this path.
nbars (int, optional) – Number of bars to use for downsampling precipitation.
**kwargs – Options to pass to the matplotlib plotting function
- Returns
fig (matplotlib.figure.Figure)
ax, ax_tbl (tuple of matplotlib.axes.Axes)
ewatercycle.config package
Config
Configuration of eWaterCycle is done via the Config object. The global configuration can be imported from the eWaterCycle module as CFG:
>>> from ewatercycle import CFG
>>> CFG
Config({'container_engine': None,
'grdc_location': None,
'output_dir': None,
'singularity_dir': None,
'wflow.docker_image': None,
'wflow.singularity_image': None})
By default all values are initialized as None.
CFG is essentially a Python dictionary with a few extra functions, similar to matplotlib.rcParams. This means that values can be updated like this:
>>> CFG['output_dir'] = '~/output'
>>> CFG['output_dir']
PosixPath('/home/user/output')
Notice that CFG automatically converts the path to an instance of pathlib.Path and expands the home directory. All values entered into the config are validated to prevent mistakes. For example, it will warn you if you make a typo in the key:
>>> CFG['output_directory'] = '~/output'
InvalidConfigParameter: `output_directory` is not a valid config parameter.
Or, if the value entered cannot be converted to the expected type:
>>> CFG['output_dir'] = 123
InvalidConfigParameter: Key `output_dir`: Expected a path, but got 123
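The behaviour shown above (path conversion, rejection of unknown keys and of values of the wrong type) can be mimicked with a small dictionary subclass. This is a simplified sketch to illustrate the idea, not the actual Config implementation; ValidatedDict, validate_path and the two registered keys are chosen for the example only.

```python
from pathlib import Path


class InvalidConfigParameter(Exception):
    """Raised for unknown keys or values of the wrong type."""


def validate_path(value):
    """Convert a string to an expanded Path; reject anything else."""
    if value is None:
        return None
    if not isinstance(value, (str, Path)):
        raise ValueError(f"Expected a path, but got {value}")
    return Path(value).expanduser()


class ValidatedDict(dict):
    """Dictionary that validates keys and values on assignment."""

    validators = {"output_dir": validate_path, "grdc_location": validate_path}

    def __setitem__(self, key, value):
        if key not in self.validators:
            raise InvalidConfigParameter(f"`{key}` is not a valid config parameter.")
        try:
            value = self.validators[key](value)
        except ValueError as error:
            raise InvalidConfigParameter(f"Key `{key}`: {error}") from None
        super().__setitem__(key, value)


cfg = ValidatedDict()
cfg["output_dir"] = "~/output"
print(cfg["output_dir"])  # the home directory has been expanded

try:
    cfg["output_dir"] = 123
except InvalidConfigParameter as error:
    message = str(error)
    print(message)
```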
By default, the config is loaded from the default location (i.e. ~/.config/ewatercycle/ewatercycle.yaml). If it does not exist, it falls back to the default values. To load a different file:
>>> CFG.load_from_file('~/my-config.yml')
Or to reload the current config:
>>> CFG.reload()
- ewatercycle.config.CFG
eWaterCycle configuration object.
The configuration is loaded from:
1. ~/$XDG_CONFIG_HOME/ewatercycle/ewatercycle.yaml
2. ~/.config/ewatercycle/ewatercycle.yaml
3. /etc/ewatercycle.yaml
4. Fall back to empty configuration
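A lookup over a list of candidate locations can be sketched as below. It assumes that the locations are tried in the order listed and that the first existing file wins, which the text does not spell out; find_config is a hypothetical helper, and the demonstration uses a temporary directory instead of the real locations.

```python
import tempfile
from pathlib import Path


def find_config(candidates):
    """Return the first candidate path that exists, or None."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None  # the caller falls back to an empty configuration


# Demonstration: only the second candidate exists
with tempfile.TemporaryDirectory() as tmp:
    user_file = Path(tmp) / "user" / "ewatercycle.yaml"
    system_file = Path(tmp) / "etc" / "ewatercycle.yaml"
    system_file.parent.mkdir()
    system_file.write_text("output_dir: /scratch\n")

    found_name = find_config([user_file, system_file]).parent.name
    missing = find_config([user_file])

print(found_name)  # etc
```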
The ewatercycle.yaml is formatted in YAML and could for example look like:
grdc_location: /data/grdc
container_engine: singularity
singularity_dir: /data/singularity-images
output_dir: /scratch
# Created with cd /data/singularity-images &&
# singularity pull docker://ewatercycle/wflow-grpc4bmi:2020.1.1
wflow.singularity_images: wflow-grpc4bmi_2020.1.1.sif
wflow.docker_images: ewatercycle/wflow-grpc4bmi:2020.1.1
- class ewatercycle.config.Config(*args, **kwargs)
Bases: ValidatedConfig
Configuration object.
Do not instantiate this class directly, but use ewatercycle.CFG instead.
- save_to_file(config_file: Optional[Union[PathLike, str]] = None)
Write conf object to a file.
- Parameters
config_file – File to write the configuration object to. If not given, the CFG[‘ewatercycle_config’] location is used; if CFG[‘ewatercycle_config’] is not set, the location in the user’s home directory is used.
ewatercycle.forcing package
- ewatercycle.forcing.load(directory: str)
Load previously generated or imported forcing data.
- Parameters
directory – forcing data directory; must contain ewatercycle_forcing.yaml file
Returns: Forcing object
- ewatercycle.forcing.load_foreign(target_model, start_time: str, end_time: str, directory: str = '.', shape: Optional[str] = None, forcing_info: Optional[Dict] = None)
Load existing forcing data generated from an external source.
- Parameters
target_model – Name of the hydrological model for which the forcing will be used
start_time – Start time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
end_time – End time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
directory – forcing data directory
shape – Path to a shape file. Used for spatial selection.
forcing_info – Dictionary with model-specific information about forcing data. See below for the available options for each model.
- Returns
Forcing object
Examples
For Marrmot
from ewatercycle.forcing import load_foreign
forcing = load_foreign(
    'marrmot',
    directory='/data/marrmot-forcings-case1',
    start_time='1989-01-02T00:00:00Z',
    end_time='1999-01-02T00:00:00Z',
    forcing_info={
        'forcing_file': 'marrmot-1989-1999.mat'
    }
)
For LisFlood
from ewatercycle.forcing import load_foreign
forcing = load_foreign(
    target_model='lisflood',
    directory='/data/lisflood-forcings-case1',
    start_time='1989-01-02T00:00:00Z',
    end_time='1999-01-02T00:00:00Z',
    forcing_info={
        'PrefixPrecipitation': 'tp.nc',
        'PrefixTavg': 'ta.nc',
        'PrefixE0': 'e.nc',
        'PrefixES0': 'es.nc',
        'PrefixET0': 'et.nc'
    }
)
Model-specific forcing info:
- Hype
Pobs (str) – Input file for precipitation data.
TMAXobs (str) – Input file for maximum temperature data.
TMINobs (str) – Input file for minimum temperature data.
Tobs (str) – Input file for temperature data.
- Lisflood
PrefixPrecipitation – Path to a NetCDF or pcraster file with precipitation data
PrefixTavg – Path to a NetCDF or pcraster file with average temperature data
PrefixE0 – Path to a NetCDF or pcraster file with potential evaporation rate from open water surface data
PrefixES0 – Path to a NetCDF or pcraster file with potential evaporation rate from bare soil surface data
PrefixET0 – Path to a NetCDF or pcraster file with potential (reference) evapotranspiration rate data
- Marrmot
forcing_file – Matlab file that contains forcings for Marrmot models. See format forcing file in model implementation.
- Pcrglobwb
precipitationNC (str) – Input file for precipitation data.
temperatureNC (str) – Input file for temperature data.
- Wflow
netcdfinput (str) – Path to forcing file.
Precipitation (str) – Variable name of precipitation data in input file.
EvapoTranspiration (str) – Variable name of evapotranspiration data in input file.
Temperature (str) – Variable name of temperature data in input file.
Inflow (str) – Variable name of inflow data in input file.
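Because forcing_info is a free-form dictionary, a misspelled key is easy to miss. The sketch below encodes the keys listed above in a lookup table and reports any key that is not in the list; the helper is hypothetical and not part of ewatercycle.forcing.

```python
# Keys copied from the list above; the helper itself is hypothetical
FORCING_INFO_KEYS = {
    "hype": {"Pobs", "TMAXobs", "TMINobs", "Tobs"},
    "lisflood": {"PrefixPrecipitation", "PrefixTavg", "PrefixE0", "PrefixES0", "PrefixET0"},
    "marrmot": {"forcing_file"},
    "pcrglobwb": {"precipitationNC", "temperatureNC"},
    "wflow": {"netcdfinput", "Precipitation", "EvapoTranspiration", "Temperature", "Inflow"},
}


def unknown_forcing_keys(target_model, forcing_info):
    """Return the keys in forcing_info that are not listed for target_model."""
    allowed = FORCING_INFO_KEYS[target_model]
    return sorted(set(forcing_info) - allowed)


typos = unknown_forcing_keys(
    "lisflood", {"PrefixPrecipitation": "tp.nc", "PrefixTAvg": "ta.nc"}
)
print(typos)  # ['PrefixTAvg']
```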
- ewatercycle.forcing.generate(target_model: str, dataset: str, start_time: str, end_time: str, shape: str, directory: Optional[str] = None, model_specific_options: Optional[Dict] = None)
Generate forcing data with ESMValTool.
- Parameters
target_model – Name of the model
dataset – Name of the source dataset. See datasets.
start_time – Start time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
end_time – End time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
shape – Path to a shape file. Used for spatial selection.
directory – Directory in which forcing should be written. If not given will create timestamped directory.
model_specific_options – Dictionary with model-specific recipe settings. See below for the available options for each model.
- Returns
Forcing object
Model-specific options that can be passed to generate:
- Hype
None – Hype does not have model-specific generate options.
- Lisflood
target_grid (dict) – the target_grid should be a dict with the following keys:
- start_longitude: longitude at the center of the first grid cell.
- end_longitude: longitude at the center of the last grid cell.
- step_longitude: constant longitude distance between grid cell centers.
- start_latitude: latitude at the center of the first grid cell.
- end_latitude: latitude at the center of the last grid cell.
- step_latitude: constant latitude distance between grid cell centers.
Make sure the target grid matches up with the grid in the mask_map and files in parameterset_dir. Also the shape should be within the target grid.
If not given will guestimate target grid from shape using a 0.1x0.1 grid with 0.05 offset.
run_lisvap (dict) – Lisvap specification. Default is None. If Lisvap should be run, give a dictionary with the following key/value pairs:
- lisvap_config: Name of Lisvap configuration file.
- mask_map: A mask for the spatial selection. This file should have the same extent and resolution as the parameter set.
- version: LISVAP/LISFLOOD model version supported by ewatercycle. Pick from available_versions.
- parameterset_dir: Directory of the parameter set. The directory should contain the Lisvap config file and the files the config points to.
- Marrmot
None – Marrmot does not have model-specific generate options.
- Pcrglobwb
start_time_climatology (str) – Start time for the climatology data
end_time_climatology (str) – End time for the climatology data
extract_region (dict) – Region specification, dictionary must contain start_longitude, end_longitude, start_latitude, end_latitude
- Wflow
dem_file (str) – Name of the dem_file to use. Also defines the basin param.
extract_region (dict) – Region specification, dictionary must contain start_longitude, end_longitude, start_latitude, end_latitude
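The Lisflood target_grid keys listed above describe a regular grid by the centers of its first and last cells plus a constant step. As a minimal sketch, the dictionary can be written out by hand and checked by listing the cell centers it implies. The helper name cell_centers and the coordinate values are hypothetical and only serve the illustration; they are not part of ewatercycle.

```python
from typing import Dict, List


def cell_centers(start: float, end: float, step: float, ndigits: int = 6) -> List[float]:
    """List cell-center coordinates from start to end (inclusive) at a constant step."""
    # Number of steps between the first and last center, plus the first cell itself.
    count = round((end - start) / step) + 1
    return [round(start + i * step, ndigits) for i in range(count)]


# Hypothetical grid: 0.1 degree cells whose centers sit on a 0.05 degree offset.
target_grid: Dict[str, float] = {
    "start_longitude": 4.05,
    "end_longitude": 4.45,
    "step_longitude": 0.1,
    "start_latitude": 51.05,
    "end_latitude": 51.25,
    "step_latitude": 0.1,
}

longitudes = cell_centers(
    target_grid["start_longitude"],
    target_grid["end_longitude"],
    target_grid["step_longitude"],
)
latitudes = cell_centers(
    target_grid["start_latitude"],
    target_grid["end_latitude"],
    target_grid["step_latitude"],
)
print(len(longitudes), "x", len(latitudes), "cells")
```

A dictionary of this shape would be passed as model_specific_options={"target_grid": target_grid} to generate; listing the centers first is a cheap way to confirm that the grid lines up with the mask map.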
Submodules
Supported datasets for ESMValTool recipes.
Currently supported: ERA5 and ERA-Interim.
ewatercycle.models package
Submodules
- class ewatercycle.models.abstract.AbstractModel(version: str, parameter_set: Optional[ParameterSet] = None, forcing: Optional[ForcingT] = None)
Bases: Generic[ForcingT]
Abstract class of an eWaterCycle model.
- available_versions: ClassVar[Tuple[str, ...]] = ()
Versions of model that are available in this class
- bmi: basic_modeling_interface.Bmi
Basic Modeling Interface object
- abstract setup(*args, **kwargs) Tuple[str, str]
Performs model setup.
Creates config file and config directory
Start bmi container and store as self.bmi
- Parameters
*args – Positional arguments. Sub class should specify each arg.
**kwargs – Named arguments. Sub class should specify each arg.
- Returns
Path to config file and path to config directory
- initialize(config_file: str) None
Initialize the model.
- Parameters
config_file – Name of initialization file.
- get_value(name: str) numpy.ndarray
Get a copy of values of the given variable.
- Parameters
name – Name of variable
- get_value_at_coords(name, lat: Iterable[float], lon: Iterable[float]) numpy.ndarray
Get a copy of values of the given variable at lat/lon coordinates.
- Parameters
name – Name of variable
lat – Latitudinal value
lon – Longitudinal value
- set_value(name: str, value: numpy.ndarray) None
Specify a new value for a model variable.
- Parameters
name – Name of variable
value – The new value for the specified variable.
- set_value_at_coords(name: str, lat: Iterable[float], lon: Iterable[float], values: numpy.ndarray) None
Specify a new value for a model variable at lat/lon coordinates.
- Parameters
name – Name of variable
lat – Latitudinal value
lon – Longitudinal value
values – The new value for the specified variable.
- abstract get_value_as_xarray(name: str) xarray.DataArray
Get a copy of values of the given variable as an xarray DataArray.
The xarray object also contains coordinate information and additional attributes such as the units.
- Parameters
name – Name of the variable
- property start_time_as_isostr: str
Start time of the model.
In UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
- property end_time_as_isostr: str
End time of the model.
In UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
- class ewatercycle.models.hype.Hype(version: str, parameter_set: ParameterSet, forcing: Optional[HypeForcing] = None)
Bases: AbstractModel[HypeForcing]
eWaterCycle implementation of Hype hydrological model.
Model documentation at http://www.smhi.net/hype/wiki/doku.php .
- Parameters
version – pick a version from available_versions.
parameter_set – instance of ParameterSet.
forcing – ewatercycle forcing container; see ewatercycle.forcing.
- available_versions: ClassVar[Tuple[str, ...]] = ('feb2021',)
Versions of model that are available in this class
- setup(start_time: Optional[str] = None, end_time: Optional[str] = None, crit_time: Optional[str] = None, cfg_dir: Optional[str] = None) Tuple[str, str]
Configure model run.
Creates config file and config directory based on the forcing variables and time range.
Start bmi container and store as bmi.
- Parameters
start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing start time is used.
end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing end time is used.
crit_time – Start date for the output of results and calculations of criteria. e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then start_time is used.
cfg_dir – a run directory given by user or created for user.
- Returns
Path to config file and path to config directory
- get_value_as_xarray(name: str) xarray.DataArray
Get value as xarray
- Parameters
name – Name of value to retrieve.
- Returns
Xarray with values for each sub catchment
- forcing: Optional[ForcingT]
- bmi: Bmi
Basic Modeling Interface object
eWaterCycle wrapper around Lisflood BMI.
- class ewatercycle.models.lisflood.Lisflood(version: str, parameter_set: ParameterSet, forcing: LisfloodForcing)
Bases: AbstractModel[LisfloodForcing]
eWaterCycle implementation of Lisflood hydrological model.
- Parameters
version – pick a version for which a grpc4bmi docker image is available.
parameter_set – LISFLOOD input files. Any included forcing data will be ignored.
forcing – a LisfloodForcing object.
Example
See examples/lisflood.ipynb in ewatercycle repository
- available_versions: ClassVar[Tuple[str, ...]] = ('20.10',)
Versions for which ewatercycle grpc4bmi docker images are available.
- setup(IrrigationEfficiency: Optional[str] = None, start_time: Optional[str] = None, end_time: Optional[str] = None, MaskMap: Optional[str] = None, cfg_dir: Optional[str] = None) Tuple[str, str]
Configure model run.
Creates config file and config directory based on the forcing variables and time range.
Start bmi container and store as bmi.
- Parameters
IrrigationEfficiency – Field application irrigation efficiency. max 1, ~0.90 drip irrigation, ~0.75 sprinkling
start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing start time is used.
end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing end time is used.
MaskMap – Mask map to use instead of one supplied in parameter set. Path to a NetCDF or pcraster file with same dimensions as parameter set map files and a boolean variable.
cfg_dir – a run directory given by user or created for user.
- Returns
Path to config file and path to config directory
- forcing: Optional[ForcingT]
- bmi: Bmi
Basic Modeling Interface object
eWaterCycle wrapper around Marrmot BMI.
- class ewatercycle.models.marrmot.Solver(name: str = 'createOdeApprox_IE', resnorm_tolerance: float = 0.1, resnorm_maxiter: float = 6.0)
Bases: object
Container for properties of the solver.
For current implementations see here.
- class ewatercycle.models.marrmot.MarrmotM01(version: str, forcing: MarrmotForcing)
Bases: AbstractModel[MarrmotForcing]
eWaterCycle implementation of Marrmot Collie River 1 (traditional bucket) model.
It sets the MarrmotM01 parameter to an initial value that is the mean of the range specified in the model parameter range file.
- Parameters
version – pick a version for which an ewatercycle grpc4bmi docker image is available.
forcing – a MarrmotForcing object. If forcing file contains parameter and other settings, those are used and can be changed in setup().
Example
See examples/marrmotM01.ipynb in ewatercycle repository
- model_name = 'm_01_collie1_1p_1s'
Name of model in Matlab code.
- available_versions: ClassVar[Tuple[str, ...]] = ('2020.11',)
Versions for which ewatercycle grpc4bmi docker images are available.
- setup(maximum_soil_moisture_storage: Optional[float] = None, initial_soil_moisture_storage: Optional[float] = None, start_time: Optional[str] = None, end_time: Optional[str] = None, solver: Optional[Solver] = None, cfg_dir: Optional[str] = None, delay: int = 0) Tuple[str, str]
Configure model run.
Creates config file and config directory based on the forcing variables and time range.
Start bmi container and store as bmi.
- Parameters
maximum_soil_moisture_storage – in mm. Range is specified in model parameter range file.
initial_soil_moisture_storage – in mm.
start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing start time is used.
end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing end time is used.
solver – Solver settings
cfg_dir – a run directory given by user or created for user.
delay – Number of seconds to wait before communicating with model. Increase the delay when model takes a while to start.
- Returns
Path to config file and path to config directory
- forcing: Optional[ForcingT]
- bmi: Bmi
Basic Modeling Interface object
- class ewatercycle.models.marrmot.MarrmotM14(version: str, forcing: MarrmotForcing)
Bases: AbstractModel[MarrmotForcing]
eWaterCycle implementation of Marrmot Top Model hydrological model.
It sets each MarrmotM14 parameter to an initial value that is the mean of the range specified in the model parameter range file.
- Parameters
version – pick a version for which an ewatercycle grpc4bmi docker image is available.
forcing – a MarrmotForcing object. If forcing file contains parameter and other settings, those are used and can be changed in setup().
Example
See examples/marrmotM14.ipynb in ewatercycle repository
- model_name = 'm_14_topmodel_7p_2s'
Name of model in Matlab code.
- available_versions: ClassVar[Tuple[str, ...]] = ('2020.11',)
Versions for which ewatercycle grpc4bmi docker images are available.
- setup(maximum_soil_moisture_storage: Optional[float] = None, threshold_flow_generation_evap_change: Optional[float] = None, leakage_saturated_zone_flow_coefficient: Optional[float] = None, zero_deficit_base_flow_speed: Optional[float] = None, baseflow_coefficient: Optional[float] = None, gamma_distribution_chi_parameter: Optional[float] = None, gamma_distribution_phi_parameter: Optional[float] = None, initial_upper_zone_storage: Optional[float] = None, initial_saturated_zone_storage: Optional[float] = None, start_time: Optional[str] = None, end_time: Optional[str] = None, solver: Optional[Solver] = None, cfg_dir: Optional[str] = None, delay: int = 0) Tuple[str, str]
Configure model run.
Creates config file and config directory based on the forcing variables and time range.
Start bmi container and store as bmi.
- Parameters
maximum_soil_moisture_storage – in mm. Range is specified in model parameter range file.
threshold_flow_generation_evap_change
leakage_saturated_zone_flow_coefficient – in mm/d.
zero_deficit_base_flow_speed – in mm/d.
baseflow_coefficient – in mm-1.
gamma_distribution_chi_parameter
gamma_distribution_phi_parameter
initial_upper_zone_storage – in mm.
initial_saturated_zone_storage – in mm.
start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing start time is used.
end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing end time is used.
solver – Solver settings
cfg_dir – a run directory given by user or created for user.
delay – Number of seconds to wait before communicating with model. Increase the delay when model takes a while to start.
- Returns
Path to config file and path to config directory
- forcing: Optional[ForcingT]
- bmi: Bmi
Basic Modeling Interface object
eWaterCycle wrapper around PCRGlobWB BMI.
- class ewatercycle.models.pcrglobwb.PCRGlobWB(version: str, parameter_set: ParameterSet, forcing: Optional[PCRGlobWBForcing] = None)
Bases: AbstractModel[PCRGlobWBForcing]
eWaterCycle implementation of PCRGlobWB hydrological model.
- Parameters
version – pick a version from available_versions.
parameter_set – instance of ParameterSet.
forcing – ewatercycle forcing container; see ewatercycle.forcing.
- available_versions: ClassVar[Tuple[str, ...]] = ('setters',)
Versions of model that are available in this class
- setup(cfg_dir: Optional[str] = None, **kwargs) Tuple[str, str]
Start model inside container and return config file and work dir.
- Parameters
cfg_dir – a run directory given by user or created for user.
**kwargs – Use parameters() to see the current values of the configurable options for this model.
- Returns
Path to config file and work dir
- forcing: Optional[ForcingT]
- bmi: Bmi
Basic Modeling Interface object
eWaterCycle wrapper around WFlow BMI.
- class ewatercycle.models.wflow.Wflow(version: str, parameter_set: ParameterSet, forcing: Optional[WflowForcing] = None)
Bases: AbstractModel[WflowForcing]
Create an instance of the Wflow model class.
- Parameters
version – pick a version from available_versions.
parameter_set – instance of ParameterSet.
forcing – instance of WflowForcing or None. If None, it is assumed that forcing is included with the parameter_set.
- available_versions: ClassVar[Tuple[str, ...]] = ('2020.1.1', '2020.1.2', '2020.1.3')
Show supported WFlow versions in eWaterCycle
- setup(cfg_dir: Optional[str] = None, **kwargs) Tuple[str, str]
Start the model inside a container and return a valid config file.
- Parameters
cfg_dir – a run directory given by user or created for user.
**kwargs (optional, dict) – see parameters for all configurable model parameters.
- Returns
Path to config file and working directory
- forcing: Optional[ForcingT]
- bmi: Bmi
Basic Modeling Interface object
ewatercycle.observation package
Submodules
Global Runoff Data Centre module.
- ewatercycle.observation.grdc.get_grdc_data(station_id: str, start_time: str, end_time: str, parameter: str = 'Q', data_home: Optional[str] = None, column: str = 'streamflow') Tuple[pandas.core.frame.DataFrame, Dict[str, Union[str, int, float]]]
Get river discharge data from Global Runoff Data Centre (GRDC).
Requires the GRDC daily data files in a local directory. The GRDC daily data files can be ordered at https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/riverdischarge_node.html
- Parameters
station_id – The station id to get. The station id can be found in the catalogues at https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/212_prjctlgs/project_catalogue_node.html
start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.
parameter – optional. The parameter code to get, e.g. (‘Q’) discharge, cubic meters per second.
data_home – optional. The directory where the daily grdc data is located. If left out will use the grdc_location in the eWaterCycle configuration file.
column – optional. Name of column in dataframe. Default: “streamflow”.
- Returns
grdc data in a dataframe and metadata.
Examples
from ewatercycle.observation.grdc import get_grdc_data
df, meta = get_grdc_data('6335020', '2000-01-01T00:00Z', '2001-01-01T00:00Z')
df.describe()
         streamflow
count   4382.000000
mean    2328.992469
std     1190.181058
min      881.000000
25%     1550.000000
50%     2000.000000
75%     2730.000000
max    11300.000000
meta
{'grdc_file_name': '/home/myusername/git/eWaterCycle/ewatercycle/6335020_Q_Day.Cmd.txt',
 'id_from_grdc': 6335020,
 'file_generation_date': '2019-03-27',
 'river_name': 'RHINE RIVER',
 'station_name': 'REES',
 'country_code': 'DE',
 'grdc_latitude_in_arc_degree': 51.756918,
 'grdc_longitude_in_arc_degree': 6.395395,
 'grdc_catchment_area_in_km2': 159300.0,
 'altitude_masl': 8.0,
 'dataSetContent': 'MEAN DAILY DISCHARGE (Q)',
 'units': 'm³/s',
 'time_series': '1814-11 - 2016-12',
 'no_of_years': 203,
 'last_update': '2018-05-24',
 'nrMeasurements': 'NA',
 'UserStartTime': '2000-01-01T00:00Z',
 'UserEndTime': '2001-01-01T00:00Z',
 'nrMissingData': 0}
- ewatercycle.observation.usgs.get_usgs_data(station_id, start_date, end_date, parameter='00060', cache_dir=None)
Get river discharge data from the USGS REST web service.
See U.S. Geological Survey Water Services (USGS)
- Parameters
station_id (str) – The station id to get
start_date (str) – String for start date in the format: ‘YYYY-MM-dd’, e.g. ‘1980-01-01’
end_date (str) – String for end date in the format: ‘YYYY-MM-dd’, e.g. ‘2018-12-31’
parameter (str) – The parameter code to get, e.g. (‘00060’) discharge, cubic feet per second
cache_dir (str) – Directory where files retrieved from the web service are cached. If set to None then USGS_DATA_HOME env var will be used as cache directory.
Examples
>>> from ewatercycle.observation.usgs import get_usgs_data
>>> data = get_usgs_data('03109500', '2000-01-01', '2000-12-31', cache_dir='.')
>>> data
<xarray.Dataset>
Dimensions:     (time: 8032)
Coordinates:
  * time        (time) datetime64[ns] 2000-01-04T05:00:00 ... 2000-12-23T04:00:00
Data variables:
    Streamflow  (time) float32 8.296758 10.420501 ... 10.647034 11.694747
Attributes:
    title:      USGS Data from streamflow data
    station:    Little Beaver Creek near East Liverpool OH
    stationid:  03109500
    location:   (40.6758974, -80.5406244)
ewatercycle.parameter_sets package
- ewatercycle.parameter_sets.available_parameter_sets(target_model: Optional[str] = None) Tuple[str, ...]
List available parameter sets on this machine.
- Parameters
target_model – Filter parameter sets on a model name
- Returns
Names of available parameter sets on current machine.
- ewatercycle.parameter_sets.get_parameter_set(name: str) ParameterSet
Get parameter set object available on this machine so it can be used in a model.
- Parameters
name – Name of parameter set
- Returns
Parameter set object that can be used in an ewatercycle model constructor.
- ewatercycle.parameter_sets.download_parameter_sets(zenodo_doi: str, target_model: str, config: str)
- ewatercycle.parameter_sets.example_parameter_sets() Dict[str, ExampleParameterSet]
Lists the available example parameter sets.
They can be downloaded with download_example_parameter_sets().
- ewatercycle.parameter_sets.download_example_parameter_sets(skip_existing=True)
Downloads all of the example parameter sets and adds them to the config_file.
Downloads to the parameterset_dir directory defined in ewatercycle.config.CFG.
- Parameters
skip_existing – When true, any parameter set that already has a local directory is not downloaded. When false, a ValueError exception is raised when a parameter set already exists.
Submodules
- class ewatercycle.parameter_sets.default.ParameterSet(name: str, directory: str, config: str, doi='N/A', target_model='generic', supported_model_versions: Optional[Set[str]] = None)
Bases: object
Container object for parameter set options.
- directory
Location on disk where files of parameter set are stored. If Path is relative then relative to CFG[‘parameterset_dir’].
- Type
Path
- config
Model configuration file which uses files from directory. If Path is relative then relative to CFG[‘parameterset_dir’].
- Type
Path
ewatercycle.parametersetdb package
Documentation about ewatercycle_parametersetdb
- class ewatercycle.parametersetdb.ParameterSet(df: AbstractCopier, cfg: AbstractConfig)
Bases: object
- save_datafiles(target)
Saves datafiles to target directory
- Parameters
target – Path of target directory
- save_config(target)
Saves config file as target filename
- Parameters
target – filename of config file
- ewatercycle.parametersetdb.build_from_urls(config_format, config_url, datafiles_format, datafiles_url) ParameterSet
Construct ParameterSet based on urls
- Parameters
config_format – Format of file found at config url
config_url – Url of config file
datafiles_format – Method to stage datafiles url
datafiles_url – Source url of datafiles
Submodules
- class ewatercycle.parametersetdb.config.CaseConfigParser(defaults=None, dict_type=<class 'collections.OrderedDict'>, allow_no_value=False, *, delimiters=('=', ':'), comment_prefixes=('#', ';'), inline_comment_prefixes=None, strict=True, empty_lines_in_values=True, default_section='DEFAULT', interpolation=<object object>, converters=<object object>)
Bases: ConfigParser
Case-sensitive config parser. See https://stackoverflow.com/questions/1611799/preserve-case-in-configparser
- optionxform(optionstr)
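By default the standard library ConfigParser lower-cases option names, which would alter case-sensitive keys in a model config file. The sketch below shows the usual way to preserve case by overriding optionxform. The class name CaseSensitiveParser and the sample option are hypothetical; this illustrates the technique and is not a copy of the ewatercycle source.

```python
from configparser import ConfigParser


class CaseSensitiveParser(ConfigParser):
    """ConfigParser that keeps option names exactly as written."""

    def optionxform(self, optionstr: str) -> str:
        # The default implementation returns optionstr.lower().
        return optionstr


text = "[model]\nMaskMap = mask.nc\n"

default = ConfigParser()
default.read_string(text)

preserving = CaseSensitiveParser()
preserving.read_string(text)

default_keys = list(default["model"])        # option name is lower-cased
preserving_keys = list(preserving["model"])  # option name keeps its case
print(default_keys, preserving_keys)
```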
- ewatercycle.parametersetdb.config.fetch(url)
Fetches text of url
- class ewatercycle.parametersetdb.config.IniConfig(source)
Bases: AbstractConfig
Config container where config is read/saved in ini format.
- save(target)
- Parameters
target – File path to save config to
Returns:
- class ewatercycle.parametersetdb.config.YamlConfig(source)
Bases: AbstractConfig
Config container where config is read/saved in yaml format.
- yaml = <ruamel.yaml.main.YAML object>
- save(target)
- Parameters
target – File path to save config to
Returns:
- class ewatercycle.parametersetdb.config.XmlConfig(source)
Bases: AbstractConfig
Config container where config is read/saved in xml format.
- save(target)
Save xml to file.
- Parameters
target – file to save to
- class ewatercycle.parametersetdb.datafiles.SubversionCopier(source: str)
Bases: AbstractCopier
Uses subversion export to copy files from source to target
- save(target)
Saves datafiles to target directory
- Parameters
target – Directory where to save the datafiles
Returns:
- class ewatercycle.parametersetdb.datafiles.SymlinkCopier(source: str)
Bases: AbstractCopier
Creates symlink from source to target
- save(target)
Saves datafiles to target directory
- Parameters
target – Directory where to save the datafiles
Returns:
Submodules
ewatercycle.util module
- ewatercycle.util.find_closest_point(grid_longitudes: Iterable[float], grid_latitudes: Iterable[float], point_longitude: float, point_latitude: float) Tuple[int, int]
Find closest grid cell to a point based on Geographical distances.
- Parameters
grid_longitudes – 1d array of model grid longitudes in degrees
grid_latitudes – 1d array of model grid latitudes in degrees
point_longitude – longitude in degrees of target coordinate
point_latitude – latitude in degrees of target coordinate
- Returns
idx_lon: index of closest grid point in the original longitude array.
idx_lat: index of closest grid point in the original latitude array.
- ewatercycle.util.geographical_distances(point_longitude: float, point_latitude: float, lon_vectors: numpy.ndarray, lat_vectors: numpy.ndarray, radius=6373.0) numpy.ndarray
It uses Spherical Earth projected to a plane formula: https://en.wikipedia.org/wiki/Geographical_distance
- Parameters
point_longitude – longitude in degrees of target coordinate
point_latitude – latitude in degrees of target coordinate
lon_vectors – 1d array of longitudes in degrees
lat_vectors – 1d array of latitudes in degrees
radius – Radius of a sphere in km. Default is Earth's approximate radius.
- Returns
distances: array of geographical distances from the point to all vector members
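The "spherical Earth projected to a plane" formula referenced above approximates the distance as D = R * sqrt(dlat^2 + (cos(mean_lat) * dlon)^2), with angles in radians. The following is a minimal pure-Python sketch of that formula and of a nearest-grid-cell search built on it. The function names plane_distance_km and closest_index are hypothetical, and the library itself works on numpy arrays rather than Python loops.

```python
import math
from typing import Sequence, Tuple


def plane_distance_km(lon1: float, lat1: float, lon2: float, lat2: float,
                      radius: float = 6373.0) -> float:
    """Distance in km using the spherical-Earth-projected-to-a-plane formula."""
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    # Longitude differences shrink with the cosine of the mean latitude.
    return radius * math.hypot(dlat, math.cos(mean_lat) * dlon)


def closest_index(grid_lons: Sequence[float], grid_lats: Sequence[float],
                  lon: float, lat: float) -> Tuple[int, int]:
    """Return (idx_lon, idx_lat) of the grid cell center nearest to the point."""
    _, idx_lon, idx_lat = min(
        (plane_distance_km(lon, lat, glon, glat), i, j)
        for i, glon in enumerate(grid_lons)
        for j, glat in enumerate(grid_lats)
    )
    return idx_lon, idx_lat


one_degree = plane_distance_km(5.0, 52.0, 5.0, 53.0)
nearest = closest_index([5.0, 5.1, 5.2, 5.3], [52.2, 52.3, 52.4], 5.21, 52.29)
print(round(one_degree, 2), nearest)
```

The approximation is intended for short distances; for points far apart a great-circle formula would be the more appropriate choice.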
- ewatercycle.util.get_time(time_iso: str) datetime
Return a datetime in UTC.
Convert a date string in ISO format to a datetime and check if it is in UTC.
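As a rough illustration of the check described above, the sketch below parses an ISO string with the standard library and rejects any value whose offset is not UTC. The function name parse_utc is hypothetical and the actual implementation in ewatercycle may use a different parser; this only assumes the ‘YYYY-MM-DDTHH:MM:SSZ’ format used throughout this documentation.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(time_iso: str) -> datetime:
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' and insist on a zero UTC offset."""
    # %z accepts both 'Z' and numeric offsets such as '+01:00' (Python 3.7+).
    parsed = datetime.strptime(time_iso, "%Y-%m-%dT%H:%M:%S%z")
    if parsed.utcoffset() != timedelta(0):
        raise ValueError(f"{time_iso!r} is not in UTC")
    return parsed.astimezone(timezone.utc)


start = parse_utc("2020-01-01T00:00:00Z")
print(start.isoformat())
```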
- ewatercycle.util.get_extents(shapefile: Any, pad=0) Dict[str, float]
Get lat/lon extents from shapefile and add padding.
- Parameters
shapefile – Path to shapefile
pad – Optional padding
- Returns
Dict with start_longitude, start_latitude, end_longitude, end_latitude
- ewatercycle.util.fit_extents_to_grid(extents, step=0.1, offset=0.05, ndigits=2) Dict[str, float]
Get lat/lon extents fitted to a grid.
- Parameters
extents – Dict with start_longitude, start_latitude, end_longitude, end_latitude
step – Distance between two grid cells
offset – Offset to pad with after rounding extent to step.
ndigits – Number of digits to return
- Returns
Dict with start_longitude, start_latitude, end_longitude, end_latitude
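One plausible reading of the parameters above is that start values are rounded down to a multiple of step, end values are rounded up, and both are then moved outward by offset. The sketch below implements that reading under those assumptions; the function name fit_to_grid and the sample extents are hypothetical, and the library function may differ in detail.

```python
from math import ceil, floor
from typing import Dict


def fit_to_grid(extents: Dict[str, float], step: float = 0.1,
                offset: float = 0.05, ndigits: int = 2) -> Dict[str, float]:
    """Snap extents outward to multiples of step, then pad by offset."""
    fitted = {}
    for key, value in extents.items():
        if key.startswith("start"):
            # Lower bounds: round down to the grid, then move outward.
            fitted[key] = round(floor(value / step) * step - offset, ndigits)
        else:
            # Upper bounds: round up to the grid, then move outward.
            fitted[key] = round(ceil(value / step) * step + offset, ndigits)
    return fitted


extents = {
    "start_longitude": 4.12,
    "start_latitude": 51.33,
    "end_longitude": 5.87,
    "end_latitude": 52.61,
}
fitted = fit_to_grid(extents)
print(fitted)
```

Because the bounds only ever move outward, the fitted extents always enclose the original ones.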
- ewatercycle.util.data_files_from_recipe_output(recipe_output: esmvalcore.experimental.recipe_output.RecipeOutput) Tuple[str, Dict[str, str]]
Get data files from an ESMValTool recipe output.
Expects first diagnostic task to produce files with single var each.
- Parameters
recipe_output – ESMValTool recipe output
- Returns
Tuple with directory of files and a dict where key is cmor short name and value is relative path to NetCDF file
- ewatercycle.util.to_absolute_path(input_path: str, parent: Optional[Path] = None, must_exist: bool = False, must_be_in_parent=True) Path
Parse input string as pathlib.Path object.
- Parameters
input_path – Input string path that can be a relative or absolute path.
parent – Optional parent path of the input path
must_exist – Optional argument to check if the input path exists.
must_be_in_parent – Optional argument to check if the input path is subpath of parent path
- Returns
The input path that is an absolute path and a pathlib.Path object.
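The behaviour described by these parameters can be approximated with pathlib alone. The sketch below is an assumption about how the checks fit together, not the library's code: the function name absolute_path is hypothetical, relative inputs are joined onto parent, must_exist maps onto a strict resolve, and must_be_in_parent is checked with relative_to.

```python
from pathlib import Path
from typing import Optional


def absolute_path(input_path: str, parent: Optional[Path] = None,
                  must_exist: bool = False,
                  must_be_in_parent: bool = True) -> Path:
    """Return input_path as an absolute Path, optionally validated against parent."""
    path = Path(input_path).expanduser()
    if parent is not None and not path.is_absolute():
        path = parent.expanduser() / path
    # strict=True raises FileNotFoundError when the path does not exist.
    path = path.resolve(strict=must_exist)
    if parent is not None and must_be_in_parent:
        try:
            path.relative_to(parent.expanduser().resolve())
        except ValueError:
            raise ValueError(f"{path} is not inside {parent}") from None
    return path


example = absolute_path("data/forcing.nc", parent=Path.cwd())
print(example)
```

Resolving before the containment check means that inputs such as "../outside.nc" are rejected even though they are written relative to the parent.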
- ewatercycle.util.reindex(source_file: str, var_name: str, mask_file: str, target_file: str)
Conform the input file onto the indexes of a mask file, writing the results to the target file.
- Parameters
source_file – Input string path of the file that needs to be reindexed.
var_name – Variable name in the source_file dataset.
mask_file – Input string path of the mask file.
target_file – Output string path of the file that is reindexed.