Welcome to ewatercycle’s documentation!

The eWaterCycle Python package brings together many components from the eWaterCycle project. An overall goal of this project is to make hydrological modelling fully reproducible, open, and FAIR.

Modelled after PyMT, it enables interactively running a model from a Python environment like so:

from ewatercycle.models import Wflow
model = Wflow(version="2020.1.1", parameter_set=example_parameter_set, forcing=example_forcing)
cfg_file, cfg_dir = model.setup(end_time="2020-01-01T00:00:00Z")
model.initialize(cfg_file)

output = []
while model.time < model.end_time:
    model.update()
    discharge = model.get_value_at_coords("RiverRunoff", lat=[52.3], lon=[5.2])
    output.append(discharge)

To learn how to use the package, see the User guide and example pages.

Typically the eWaterCycle platform is deployed on a system that can be accessed through the browser via JupyterHub, and comes preconfigured with readily available parameter sets, meteorological forcing data, model images, etcetera. This makes it possible for researchers to quickly run an experiment without the hassle of installing a model or creating suitable input data. To learn more about the system setup, read our System setup page.

In general eWaterCycle tries to strike a balance between making it easy to use the standard available elements of an experiment (datasets, models, analysis algorithms) and making it possible to supply custom elements. This does mean that a simple use case sometimes requires slightly more lines of code than strictly necessary, for the sake of making it easy to adapt this code to more complex and/or custom use cases.

Glossary

To avoid miscommunication, here we define explicitly what we mean by some terms that are commonly used throughout this documentation.

  • Experiment: A notebook running one or more hydrological models and producing a scientific result.

  • Model: Software implementation of an algorithm. Note this excludes data required for this model.

  • Forcing: All time-dependent data that is needed to run a model and that is not affected by the model.

  • Model Parameters: Fixed parameters (depth of river, land use, irrigation channels, dams). Considered constant during a model run.

  • Parameter Set: File-based collection of parameters for a certain model, resolution, and possibly area.

  • Model instance: A single running instance of a model, including all data required, and with a current state.

1         In:
# Suppress distracting outputs in these examples
import logging
import warnings

warnings.filterwarnings("ignore", category=UserWarning)
logger = logging.getLogger("esmvalcore")
logger.setLevel(logging.WARNING)

User guide

This user manual will explain how the eWaterCycle Python package can be used to perform hydrological experiments. We will walk through the following chapters:

  • parameter sets

  • forcing data

  • model instances

  • using observations

  • analysis

Each of these chapters corresponds to a so-called “subpackage” of the eWaterCycle Python package. Before we continue, however, we will briefly explain the configuration file.

Configuration

To be able to find all needed data and models, eWaterCycle comes with a configuration object. This configuration contains system settings for eWaterCycle (which container technology to use, where the data is located, etc.). In general these should not need to be changed by the user for a specific experiment, and ideally a user would never need to touch this configuration on a properly managed system. However, it is good to know that it is there.

You can see the default configuration on your system like so:

2         In:
from ewatercycle import CFG

CFG
2       Out:
Config({'container_engine': 'singularity',
        'ewatercycle_config': PosixPath('/home/fakhereh/.config/ewatercycle/ewatercycle.yaml'),
        'grdc_location': PosixPath('/projects/0/wtrcycle/comparison/GRDC/GRDC_GCOSGTN-H_27_03_2019'),
        'output_dir': PosixPath('/scratch/shared/ewatercycle/user_guide'),
        'parameter_sets': {'lisflood_fraser': {'config': 'lisflood_fraser/settings_lat_lon-Run.xml',
                                               'directory': 'lisflood_fraser',
                                               'doi': 'N/A',
                                               'supported_model_versions': {'20.10'},
                                               'target_model': 'lisflood'},
                           'pcrglobwb_rhinemeuse_30min': {'config': 'pcrglobwb_rhinemeuse_30min/setup_natural_test.ini',
                                                          'directory': 'pcrglobwb_rhinemeuse_30min',
                                                          'doi': 'https://doi.org/10.5281/zenodo.1045339',
                                                          'supported_model_versions': {'setters'},
                                                          'target_model': 'pcrglobwb'},
                           'wflow_rhine_sbm_nc': {'config': 'wflow_rhine_sbm_nc/wflow_sbm_NC.ini',
                                                  'directory': 'wflow_rhine_sbm_nc',
                                                  'doi': 'N/A',
                                                  'supported_model_versions': {'2020.1.1'},
                                                  'target_model': 'wflow'}},
        'parameterset_dir': PosixPath('/scratch/shared/ewatercycle/user_guide'),
        'singularity_dir': PosixPath('/scratch/shared/ewatercycle/user_guide')})

Note: a path on the local filesystem is always denoted as “dir” (short for directory), instead of folder, path, or location. Especially location can be confusing in the context of geospatial modeling.

It is also possible to store and load custom configuration files. For more information, see System setup.

Parameter sets

Parameter sets are an essential part of many hydrological models, and for the eWaterCycle package as well.

3         In:
import ewatercycle.parameter_sets

The default system setup includes a number of example parameter sets that can be used directly. System administrators can also add parameter sets that are globally available to all users. In the future, we’re hoping to add functionality to fetch new parameter sets using a DOI as well.

To see the available parameter sets:

4         In:
ewatercycle.parameter_sets.available_parameter_sets()
4       Out:
('lisflood_fraser', 'pcrglobwb_rhinemeuse_30min', 'wflow_rhine_sbm_nc')

Since most parameter sets are model specific, you can filter the results as well:

5         In:
ewatercycle.parameter_sets.available_parameter_sets(target_model="wflow")
5       Out:
('wflow_rhine_sbm_nc',)
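To picture what the filter does, here is a minimal sketch that applies the same selection to a plain dictionary shaped like the parameter_sets entry of the configuration shown earlier. The helper name filter_parameter_sets is hypothetical and is not part of the eWaterCycle API; it only illustrates the idea of selecting on target_model.

```python
from typing import Optional

# Subset of the 'parameter_sets' entry from the configuration printed earlier.
PARAMETER_SETS = {
    "lisflood_fraser": {"target_model": "lisflood"},
    "pcrglobwb_rhinemeuse_30min": {"target_model": "pcrglobwb"},
    "wflow_rhine_sbm_nc": {"target_model": "wflow"},
}


def filter_parameter_sets(parameter_sets: dict, target_model: Optional[str] = None) -> tuple:
    """Hypothetical helper: names of parameter sets, optionally limited to one model."""
    return tuple(
        name
        for name, info in parameter_sets.items()
        if target_model is None or info["target_model"] == target_model
    )


print(filter_parameter_sets(PARAMETER_SETS))
print(filter_parameter_sets(PARAMETER_SETS, target_model="wflow"))
```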

Once you have found a suitable parameter set, you can load it and see some more details:

6         In:
parameter_set = ewatercycle.parameter_sets.get_parameter_set("wflow_rhine_sbm_nc")
print(parameter_set)
Parameter set
-------------
name=wflow_rhine_sbm_nc
directory=/scratch/shared/ewatercycle/user_guide/wflow_rhine_sbm_nc
config=/scratch/shared/ewatercycle/user_guide/wflow_rhine_sbm_nc/wflow_sbm_NC.ini
doi=N/A
target_model=wflow
supported_model_versions={'2020.1.1'}

Or you can access individual attributes of the parameter set:

7         In:
parameter_set.supported_model_versions
7       Out:
{'2020.1.1'}

Should you wish to configure your own parameter set (e.g. for PCRGlobWB in this case), this is also possible:

8         In:
custom_parameter_set = ewatercycle.parameter_sets.ParameterSet(
    name="custom_parameter_set",
    directory="~/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min",
    config="~/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini",
    target_model="pcrglobwb",
    doi="https://doi.org/10.5281/zenodo.1045339",
    supported_model_versions={"setters"},
)

As you can see, an eWaterCycle parameter set is defined fully by a directory and a configuration file. The configuration file typically informs the model about the structure of the parameter set (e.g. “what is the filename of the land use data”). It is possible to change these settings later, when setting up the model.

Forcing data

eWaterCycle can load or generate forcing data for a model using the forcing module.

9         In:
import ewatercycle.forcing

Existing forcing from external source

We first show how existing forcing data can be loaded with eWaterCycle. The wflow example parameter set already includes forcing data that was generated manually by the scientists at Deltares.

10         In:
forcing = ewatercycle.forcing.load_foreign(
    directory=str(parameter_set.directory),
    target_model="wflow",
    start_time="1991-01-01T00:00:00Z",
    end_time="1991-12-31T00:00:00Z",
    shape=None,
    forcing_info=dict(
        # Additional information about the external forcing data needed for the model configuration
        netcdfinput="inmaps.nc",
        Precipitation="/P",
        EvapoTranspiration="/PET",
        Temperature="/TEMP",
    ),
)
print(forcing)
Forcing data for Wflow
----------------------
Directory: /scratch/shared/ewatercycle/user_guide/wflow_rhine_sbm_nc
Start time: 1991-01-01T00:00:00Z
End time: 1991-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
  - netcdfinput: inmaps.nc
  - Precipitation: /P
  - Temperature: /TEMP
  - EvapoTranspiration: /PET
  - Inflow: None

As you can see, the forcing consists of a generic part which is the same for all eWaterCycle models, and a model-specific part (forcing_info). If you’re familiar with wflow, you might recognize that the model-specific settings map directly to wflow configuration settings.

Generating forcing data

In most cases, you will not have access to tailor-made forcing data, and manually pre-processing existing datasets can be quite a pain. eWaterCycle includes a forcing generator that can do all the required steps to go from the available datasets (ERA5, ERA-Interim, etc) to whatever format the models require. This is done through ESMValTool recipes. For some models (e.g. lisflood) additional computations are done, as some steps require data and/or code that is not available to ESMValTool.

Apart from some standard parameters (start time, datasets, etc.), the forcing generator sometimes requires additional model-specific options. For our wflow example case, we need to pass the DEM file to the ESMValTool recipe as well. All model-specific options are listed in the API documentation.

ESMValTool configuration

As eWaterCycle relies on ESMValTool for processing forcing data, configuration for forcing is mostly deferred to the ESMValTool configuration file. Which ESMValTool configuration file to use can be specified in the system setup.

11         In:
forcing = ewatercycle.forcing.generate(
    target_model="wflow",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-01-31T00:00:00Z",
    shape="~/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp",
    model_specific_options={
        "dem_file": "/scratch-shared/ewatercycle/user_guide/wflow_rhine_sbm_nc/staticmaps/wflow_dem.map",
    },
)
print(forcing)
Forcing data for Wflow
----------------------
Directory: /scratch/shared/ewatercycle/user_guide/recipe_wflow_20210720_122543/work/wflow_daily/script
Start time: 1990-01-01T00:00:00Z
End time: 1990-01-31T00:00:00Z
Shapefile: /nfs/home2/fakhereh/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp
Additional information for model config:
  - netcdfinput: wflow_ERA5_Rhine_1990_1990.nc
  - Precipitation: /pr
  - Temperature: /tas
  - EvapoTranspiration: /pet
  - Inflow: None

Generated forcing is automatically saved to the ESMValTool output directory. A yaml file is stored there as well, such that you can easily reload the forcing later without having to generate it again.

ewatercycle_forcing.yaml:

!WflowForcing
start_time: '1990-01-01T00:00:00Z'
end_time: '1990-01-31T00:00:00Z'
shape:
netcdfinput: wflow_ERA5_Rhine_1990_1990.nc
Precipitation: /pr
EvapoTranspiration: /pet
Temperature: /tas
Inflow:

12         In:
reloaded_forcing = ewatercycle.forcing.load(
    directory="/scratch/shared/ewatercycle/user_guide/recipe_wflow_20210720_122543/work/wflow_daily/script"
)

Models

13         In:
import ewatercycle.models

eWaterCycle currently integrates LISFLOOD, MARRMoT, PCR-GLOBWB, and wflow (the models for which container images are listed in System setup), and we’re expecting to add more models soon. The process for adding new models is documented in Adding a model.

Model versions

To help with reproducibility the version of a model must always be specified when creating a model instance. The available versions can be seen like so:

14         In:
import ewatercycle.models

ewatercycle.models.Wflow.available_versions
14       Out:
('2020.1.1',)

Creating, setting up, and initializing a model instance

The way models are created, set up, and initialized matches PyMT as much as possible. There are three steps:

  • instantiate (create a python object that represents the model)

  • setup (create a container with the right model, directories, and configuration files)

  • initialize (start the model inside the container)

To a new user, these steps can be confusing as they seem to be related to “starting a model”. However, you will see that there are some useful things that we can do in between each of these steps. As a side effect, splitting these steps also makes it easier to run a lot of models in parallel (e.g. for calibration). Experience tells us that you will quickly get used to it.
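The three steps can be pictured with a toy class. This is only a sketch of the ordering, assuming a made-up ToyModel class with hard-coded defaults; it does not start containers and is not how eWaterCycle models are implemented.

```python
class ToyModel:
    """Hypothetical stand-in that mimics the instantiate/setup/initialize order."""

    def __init__(self, version: str):
        self.version = version
        self.state = "instantiated"
        self.config = None

    def setup(self, **overrides) -> dict:
        # Merge user overrides into the defaults; nothing is started yet.
        defaults = {
            "start_time": "1990-01-01T00:00:00Z",
            "end_time": "1990-01-31T00:00:00Z",
        }
        unknown = set(overrides) - set(defaults)
        if unknown:
            raise ValueError(f"Unknown parameters: {sorted(unknown)}")
        self.config = {**defaults, **overrides}
        self.state = "set up"
        return self.config

    def initialize(self, config: dict) -> None:
        # Only a model that has been set up can be started.
        if self.state != "set up":
            raise RuntimeError("Call setup() before initialize()")
        self.config = config
        self.state = "initialized"


# Several instances can be prepared first and started later, e.g. for calibration.
models = [ToyModel(version="2020.1.1") for _ in range(3)]
configs = [m.setup(end_time=f"1990-01-{10 + i}T00:00:00Z") for i, m in enumerate(models)]
for m, cfg in zip(models, configs):
    m.initialize(cfg)
print([m.state for m in models])
```

Keeping setup and initialize apart is what leaves room to inspect or adjust each configuration before anything is started.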

When a model instance is created, we have to specify the version and pass in a suitable parameter set and forcing.

15         In:
model_instance = ewatercycle.models.Wflow(
    version="2020.1.1", parameter_set=parameter_set, forcing=forcing
)
WARNING:ewatercycle.models.wflow:Config file from parameter set is missing API section, adding section
WARNING:ewatercycle.models.wflow:Config file from parameter set is missing RiverRunoff option in API section, added it with value '2, m/s option'

In some specific cases the parameter set (e.g. for marrmot) or the forcing (e.g. when it is already included in the parameter set) is not needed.

Most models have a variety of parameters that can be set. An opinionated subset of these parameters is exposed through the eWaterCycle API. We focus on those settings that are relevant from a scientific point of view and prefer to hide technical settings. These parameters and their default values can be inspected as follows:

16         In:
model_instance.parameters
16       Out:
[('start_time', '1990-01-01T00:00:00Z'), ('end_time', '1990-01-31T00:00:00Z')]

The start date and end date are automatically set based on the forcing data.

Alternative values for each of these parameters can be passed on to the setup function:

17         In:
cfg_file, cfg_dir = model_instance.setup(end_time="1990-12-15T00:00:00Z")

The setup function does the following:

  • Creates a config directory which serves as the current working directory for the model instance

  • Creates a configuration file in this directory based on the settings

  • Starts a container with the requested model version and access to the forcing and parameter sets.

  • Input is mounted read-only, the working directory is mounted read-write (if a model cannot cope with inputs outside the working directory, the input will be copied).

  • Setup will complain about incompatible model version, parameter_set, and forcing.

After setup but before initialize everything is good-to-go, but nothing has been done yet. This is an opportunity to inspect the generated configuration file, and make any changes manually that could not be done through the setup method.

To modify the config file: print the path, open it in an editor, and save:

18         In:
print(cfg_file)
/scratch/shared/ewatercycle/user_guide/wflow_20210720_122650/wflow_ewatercycle.ini
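If you prefer to script the change rather than use an editor, an ini-style file such as the one above can also be edited with Python's standard configparser module. The sketch below works on a temporary stand-in file; the section name run and the options starttime and endtime are assumptions for illustration, so check your own generated file for the real names.

```python
import configparser
import tempfile
from pathlib import Path

# Stand-in for the cfg_file returned by setup(); section and option names are made up.
cfg_file = Path(tempfile.mkdtemp()) / "wflow_ewatercycle.ini"
cfg_file.write_text("[run]\nstarttime = 1990-01-01 00:00:00\n")

config = configparser.ConfigParser()
config.optionxform = str  # keep option names case-sensitive
config.read(cfg_file)
config["run"]["endtime"] = "1990-12-15 00:00:00"  # add or change a setting

# Write the modified configuration back to the same file.
with cfg_file.open("w") as handle:
    config.write(handle)

print(cfg_file.read_text())
```

Note that configparser does not preserve comments when it rewrites a file, so an editor is the safer choice for heavily commented configuration files.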

Once you’re happy with the setup, it is time to initialize the model. You’ll have to pass in the config file, even if you’ve not made any changes:

19         In:
model_instance.initialize(cfg_file)  # for some models, this step can take some time

Running (and interacting with) a model

A model instance can be controlled by calling functions for running a single timestep (update), setting variables, and getting variables. Besides the rather low-level BMI functions like get_value and set_value, we also added convenience functions such as get_value_as_xarray, get_value_at_coords, time_as_datetime, and time_as_isostr. These make it even more pleasant to interact with the model.

For example, to run our model instance from start to finish, fetching the value of the discharge variable at the location of a GRDC station:

20         In:
grdc_latitude = 51.756918
grdc_longitude = 6.395395
21         In:
output = []
while model_instance.time < model_instance.end_time:
    model_instance.update()

    discharge = model_instance.get_value_at_coords(
        "RiverRunoff", lon=[grdc_longitude], lat=[grdc_latitude]
    )[0]
    output.append(discharge)

    # Here you could do whatever you like, e.g. update soil moisture values before doing the next timestep.

    print(
        model_instance.time_as_isostr, end="\r"
    )  # "\r" returns the cursor to the start of the line, so the next timestamp overwrites this one
1990-12-15T00:00:00Z
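A coordinate lookup like the one in the loop above can be thought of as a nearest-cell search on the model grid. The sketch below illustrates that idea on a toy grid using plain Python; the function names and the grid are hypothetical, and the actual eWaterCycle implementation may work differently.

```python
def nearest_index(axis_values, target):
    """Index of the axis value closest to target."""
    return min(range(len(axis_values)), key=lambda i: abs(axis_values[i] - target))


def value_at_coords(grid, lats, lons, lat, lon):
    """Hypothetical nearest-neighbour lookup in a grid stored as grid[row][column]."""
    return grid[nearest_index(lats, lat)][nearest_index(lons, lon)]


# Toy 3 x 3 field; rows follow lats, columns follow lons.
lats = [52.0, 51.5, 51.0]
lons = [6.0, 6.5, 7.0]
grid = [
    [10.0, 11.0, 12.0],
    [20.0, 21.0, 22.0],
    [30.0, 31.0, 32.0],
]

# Coordinates of the GRDC station used in this guide.
print(value_at_coords(grid, lats, lons, lat=51.756918, lon=6.395395))
```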

We can also get the entire model field at a single time step. To simply plot it:

22         In:
model_instance.get_value_as_xarray("RiverRunoff").plot()
22       Out:
<matplotlib.collections.QuadMesh at 0x2b3c1beff520>
_images/user_guide_47_1.png

If you want to know which variables are available, you can use:

23         In:
model_instance.output_var_names
23       Out:
('RiverRunoff',)

Destroying the model

A model instance running in a container can take up quite a bit of resources on the system. When you’re done with an experiment, it is good practice to always finalize the model. This will make sure the model properly performs any tear-down tasks and eventually the container will be destroyed.

24         In:
model_instance.finalize()

Observations

eWaterCycle also includes utilities to easily load observations. Currently, eWaterCycle systems provide access to GRDC and USGS data, and we’re hoping to expand this in the future.

25         In:
import ewatercycle.observation.grdc

To load GRDC station data:

26         In:
grdc_station_id = "6335020"

observations, metadata = ewatercycle.observation.grdc.get_grdc_data(
    station_id=grdc_station_id,
    start_time="1990-01-01T00:00:00Z",  # or: model_instance.start_time_as_isostr
    end_time="1990-12-15T00:00:00Z",
    column="GRDC",
)

observations.head()
GRDC station 6335020 is selected. The river name is: RHINE RIVER.The coordinates are: (51.756918, 6.395395).The catchment area in km2 is: 159300.0. There are 0 missing values during 1990-01-01T00:00:00Z_1990-12-15T00:00:00Z at this station. See the metadata for more information.
26       Out:
              GRDC
time
1990-01-01  2200.0
1990-01-02  1990.0
1990-01-03  1840.0
1990-01-04  1720.0
1990-01-05  1620.0

Since not all GRDC stations are complete, some information is stored in metadata to inform you about the data.

27         In:
print(metadata)
{'grdc_file_name': '/lustre1/0/wtrcycle/comparison/GRDC/GRDC_GCOSGTN-H_27_03_2019/6335020_Q_Day.Cmd.txt', 'id_from_grdc': 6335020, 'file_generation_date': '2019-03-27', 'river_name': 'RHINE RIVER', 'station_name': 'REES', 'country_code': 'DE', 'grdc_latitude_in_arc_degree': 51.756918, 'grdc_longitude_in_arc_degree': 6.395395, 'grdc_catchment_area_in_km2': 159300.0, 'altitude_masl': 8.0, 'dataSetContent': 'MEAN DAILY DISCHARGE (Q)', 'units': 'm³/s', 'time_series': '1814-11 - 2016-12', 'no_of_years': 203, 'last_update': '2018-05-24', 'nrMeasurements': 'NA', 'UserStartTime': '1990-01-01T00:00:00Z', 'UserEndTime': '1990-12-15T00:00:00Z', 'nrMissingData': 0}
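The nrMissingData entry is essentially a count of gaps in the requested period. As a rough sketch of such a check, assuming that gaps are marked with a sentinel value of -999.0 (an assumption; verify it against the header of your own data files), one could count them as follows. The function name is hypothetical.

```python
MISSING_VALUE = -999.0  # assumed sentinel; check the header of your own data files


def count_missing(values, missing_value=MISSING_VALUE):
    """Count entries that equal the missing-value sentinel."""
    return sum(1 for value in values if value == missing_value)


# First five observations from the output above: no gaps.
complete = [2200.0, 1990.0, 1840.0, 1720.0, 1620.0]
# Hypothetical series with two gaps, for illustration only.
gappy = [2200.0, MISSING_VALUE, 1840.0, MISSING_VALUE, 1620.0]

print(count_missing(complete), count_missing(gappy))
```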

Analysis

To easily analyse model output, eWaterCycle also includes an analysis module.

28         In:
import ewatercycle.analysis

For example, we will plot a hydrograph of the model run and GRDC observations. To this end, we combine the two time series in a single dataframe:

29         In:
combined_discharge = observations
combined_discharge["wflow"] = output
30         In:
ewatercycle.analysis.hydrograph(
    discharge=combined_discharge,
    reference="GRDC",
)
30       Out:
(<Figure size 720x720 with 2 Axes>,
 (<AxesSubplot:title={'center':'Hydrograph'}, xlabel='time', ylabel='Discharge (m$^3$ s$^{-1}$)'>,
  <AxesSubplot:>))
_images/user_guide_62_1.png
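Besides a visual comparison, a model run is often summarised with a goodness-of-fit metric. The sketch below computes the Nash-Sutcliffe efficiency in plain Python. It is a generic illustration and makes no claim about which metrics ewatercycle.analysis provides; the simulated values are invented, and the observed values are the first five GRDC observations shown earlier.

```python
def nash_sutcliffe(simulated, observed):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is as good as the observed mean."""
    if len(simulated) != len(observed):
        raise ValueError("simulated and observed must have the same length")
    mean_observed = sum(observed) / len(observed)
    squared_error = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    variance = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - squared_error / variance


observed = [2200.0, 1990.0, 1840.0, 1720.0, 1620.0]   # from the GRDC output above
simulated = [2100.0, 2000.0, 1800.0, 1750.0, 1600.0]  # hypothetical model output

print(round(nash_sutcliffe(simulated, observed), 3))
```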

System setup

To use the eWaterCycle package you need to set up the system with software and data.

This chapter is for system administrators or Research Software Engineers who need to set up a system for the eWaterCycle platform.

The setup steps:

  1. Conda environment

  2. Install ewatercycle package

  3. Configure ESMValTool

  4. Download climate data

  5. Install container engine

  6. Configure ewatercycle

  7. Model container images

  8. Download example parameter sets

  9. Prepare other parameter sets

  10. Download example forcing

  11. Download observation data

Conda environment

The eWaterCycle Python package uses a lot of geospatial dependencies, which can be installed using the Conda package management system.

Install Conda by using the miniconda installer.

After conda is installed you can install the software dependencies with a conda environment file.

wget https://raw.githubusercontent.com/eWaterCycle/ewatercycle/main/environment.yml
conda install mamba -n base -c conda-forge -y
mamba env create --file environment.yml
conda activate ewatercycle

Do not forget that any terminal or Jupyter kernel should activate the conda environment before the eWaterCycle Python package can be used.

Install eWaterCycle package

The Python package can be installed using pip

pip install ewatercycle

Configure ESMValTool

ESMValTool is used to generate forcing (temperature, precipitation, etc.) files from climate data for hydrological models. ESMValTool is installed as a dependency of the package.

See https://docs.esmvaltool.org/en/latest/quickstart/configuration.html for how to configure ESMValTool.

Download climate data

The ERA5 and ERA-Interim data can be used to generate forcings.

ERA5

To download ERA5 data files you can use the era5cli tool.

pip install era5cli

Follow the era5cli instructions to get access to the data.

As an example, the hourly ERA5 data for the years 1990 and 1991 and for variables pr, psl, tas, tasmin, tasmax, tdps, uas, vas, rsds, rsdt and fx orog can be downloaded as follows:

cd <ESMValTool ERA5 raw directory for example /projects/0/wtrcycle/comparison/rawobs/Tier3/ERA5/1>
era5cli hourly --startyear 1990 --endyear 1991 --variables total_precipitation
era5cli hourly --startyear 1990 --endyear 1991 --variables mean_sea_level_pressure
era5cli hourly --startyear 1990 --endyear 1991 --variables 2m_temperature
era5cli hourly --startyear 1990 --endyear 1991 --variables minimum_2m_temperature_since_previous_post_processing
era5cli hourly --startyear 1990 --endyear 1991 --variables maximum_2m_temperature_since_previous_post_processing
era5cli hourly --startyear 1990 --endyear 1991 --variables 2m_dewpoint_temperature
era5cli hourly --startyear 1990 --endyear 1991 --variables 10m_u_component_of_wind
era5cli hourly --startyear 1990 --endyear 1991 --variables 10m_v_component_of_wind
era5cli hourly --startyear 1990 --endyear 1991 --variables surface_solar_radiation_downwards
era5cli hourly --startyear 1990 --endyear 1991 --variables toa_incident_solar_radiation
era5cli hourly --startyear 1990 --endyear 1991 --variables orography
cd -
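Since the commands above differ only in the variable name, they can also be generated in a loop. The sketch below only builds and prints the command strings, using exactly the flags shown above; the helper name is hypothetical, and nothing is downloaded.

```python
# ERA5 variable names taken from the commands listed above.
ERA5_VARIABLES = [
    "total_precipitation",
    "mean_sea_level_pressure",
    "2m_temperature",
    "minimum_2m_temperature_since_previous_post_processing",
    "maximum_2m_temperature_since_previous_post_processing",
    "2m_dewpoint_temperature",
    "10m_u_component_of_wind",
    "10m_v_component_of_wind",
    "surface_solar_radiation_downwards",
    "toa_incident_solar_radiation",
    "orography",
]


def era5cli_commands(start_year, end_year, variables):
    """Hypothetical helper: one 'era5cli hourly' command line per variable."""
    return [
        f"era5cli hourly --startyear {start_year} --endyear {end_year} --variables {name}"
        for name in variables
    ]


for command in era5cli_commands(1990, 1991, ERA5_VARIABLES):
    print(command)
```

The printed lines can be pasted into a shell script and run from the ESMValTool ERA5 raw directory.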

The hourly data needs to be converted to daily data using an ESMValTool recipe:

esmvaltool run cmorizers/recipe_era5.yml

ERA-Interim

ERA-Interim has been superseded by ERA5, but could still be useful for reproduction studies and because of its smaller size. The ERA-Interim data files can be downloaded at https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era-interim

Or you can use the download_era_interim.py script to download ERA-Interim data files. See first lines of script for documentation. The files should be downloaded to the ESMValTool ERA-Interim raw directory for example /projects/0/wtrcycle/comparison/rawobs/Tier3/ERA-Interim.

The ERA-Interim raw data files need to be cmorized using the cmorize_obs script:

cmorize_obs -o ERA-Interim

Install container engine

In the eWaterCycle package, the hydrological models are run in containers with engines like Singularity or Docker. At least one of Singularity or Docker should be installed.

Installing a container engine requires root permission on the machine.

Singularity

Install Singularity by following its installation instructions.

Docker

Install Docker by following its installation instructions. Docker should be configured so that it can be called without sudo.
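A quick way to see which engines are on the PATH is Python's standard shutil.which. The sketch below is an illustration only: the helper name is hypothetical, and finding an executable does not prove that the engine is configured correctly (for example, that Docker can be called without sudo).

```python
import shutil


def installed_engines(candidates=("singularity", "docker"), which=shutil.which):
    """Hypothetical helper: names of container engines whose executable is on the PATH."""
    return [name for name in candidates if which(name) is not None]


print(installed_engines())
```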

Configure eWaterCycle

The eWaterCycle package simplifies the API by reading some of the directories and settings from a configuration file.

The configuration can be set in Python with

import logging
logging.basicConfig(level=logging.INFO)
import ewatercycle
import ewatercycle.parameter_sets
# Which container engine is used to run the hydrological models
ewatercycle.CFG['container_engine'] = 'singularity'  # or 'docker'
# If container_engine is 'singularity': directory that holds the Singularity image files (*.sif)
ewatercycle.CFG['singularity_dir'] = './singularity-images'
# Directory in which output of model runs is stored. Each model run generates a subdirectory inside output_dir
ewatercycle.CFG['output_dir'] = './'
# Directory that holds the GRDC observation files (<station identifier>_Q_Day.Cmd.txt)
ewatercycle.CFG['grdc_location'] = './grdc-observations'
# Directory that holds the parameter sets prepared by the system administrator
ewatercycle.CFG['parameterset_dir'] = './parameter-sets'
# File the configuration is saved to or loaded from
ewatercycle.CFG['ewatercycle_config'] = './ewatercycle.yaml'

and then written to disk with

ewatercycle.CFG.save_to_file()

Later it can be loaded by using:

ewatercycle.CFG.load_from_file('./ewatercycle.yaml')

To have the ewatercycle configuration loaded by default for the current user, copy it to ~/.config/ewatercycle/ewatercycle.yaml.

To make the ewatercycle configuration available to all users on the system, copy it to /etc/ewatercycle.yaml.
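One way to picture the two locations is as a list of candidates that is searched in order. The sketch below assumes that the user file takes precedence over the system file, which is an assumption and not something stated here, and it uses temporary stand-in paths so that it runs anywhere. The helper name is hypothetical.

```python
import tempfile
from pathlib import Path


def first_existing(candidates):
    """Hypothetical helper: first candidate that exists as a file, or None."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None


demo_dir = Path(tempfile.mkdtemp())
# Stands in for ~/.config/ewatercycle/ewatercycle.yaml (deliberately not created).
user_file = demo_dir / "user" / "ewatercycle.yaml"
# Stands in for /etc/ewatercycle.yaml.
system_file = demo_dir / "etc" / "ewatercycle.yaml"
system_file.parent.mkdir(parents=True)
system_file.write_text("container_engine: singularity\n")

found = first_existing([user_file, system_file])
print(found)
```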

Configuration file for Cartesius system

Users who are part of the eWaterCycle project can use the following configuration on the Cartesius system of SURFsara:

container_engine: singularity
singularity_dir: /projects/0/wtrcycle/singularity-images
output_dir: /scratch/shared/ewatercycle
grdc_location: /projects/0/wtrcycle/GRDC
parameterset_dir: /projects/0/wtrcycle/parameter-sets

Configuration file for ewatercycle Jupyter machine

Users can use the following configuration on systems constructed with the eWaterCycle application on SURF Research Cloud:

container_engine: singularity
singularity_dir: /mnt/data/singularity-images
output_dir: /scratch
grdc_location: /mnt/data/GRDC
parameterset_dir: /mnt/data/parameter-sets

Model container images

As hydrological models run in containers, their container images should be made available on the system.

The names of the images can be found in the ewatercycle.models.* classes.

Docker

Docker images can be downloaded with docker pull:

docker pull ewatercycle/lisflood-grpc4bmi:20.10
docker pull ewatercycle/marrmot-grpc4bmi:2020.11
docker pull ewatercycle/pcrg-grpc4bmi:setters
docker pull ewatercycle/wflow-grpc4bmi:2020.1.1

Singularity

Singularity images should be stored in the configured directory (ewatercycle.CFG['singularity_dir']) and can be built from the Docker images with:

cd {ewatercycle.CFG['singularity_dir']}
singularity build ewatercycle-lisflood-grpc4bmi_20.10.sif docker://ewatercycle/lisflood-grpc4bmi:20.10
singularity build ewatercycle-marrmot-grpc4bmi_2020.11.sif docker://ewatercycle/marrmot-grpc4bmi:2020.11
singularity build ewatercycle-pcrg-grpc4bmi_setters.sif docker://ewatercycle/pcrg-grpc4bmi:setters
singularity build ewatercycle-wflow-grpc4bmi_2020.1.1.sif docker://ewatercycle/wflow-grpc4bmi:2020.1.1
cd -
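The file names in the commands above follow a visible pattern: the slash in the Docker image name becomes a hyphen, and the tag is appended after an underscore. The sketch below derives the build commands from the image names under that assumption; the helper names are hypothetical.

```python
DOCKER_IMAGES = [
    "ewatercycle/lisflood-grpc4bmi:20.10",
    "ewatercycle/marrmot-grpc4bmi:2020.11",
    "ewatercycle/pcrg-grpc4bmi:setters",
    "ewatercycle/wflow-grpc4bmi:2020.1.1",
]


def sif_filename(docker_image: str) -> str:
    """Map 'org/name:tag' to 'org-name_tag.sif', the pattern used in the commands above."""
    repository, _, tag = docker_image.partition(":")
    return f"{repository.replace('/', '-')}_{tag}.sif"


def build_command(docker_image: str) -> str:
    """Hypothetical helper: the 'singularity build' command line for one image."""
    return f"singularity build {sif_filename(docker_image)} docker://{docker_image}"


for image in DOCKER_IMAGES:
    print(build_command(image))
```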

Download example parameter sets

To quickly run the models, it is advised to set up an example parameter set for each model.

ewatercycle.parameter_sets.download_example_parameter_sets()
INFO:ewatercycle.parameter_sets._example:Downloading example parameter set wflow_rhine_sbm_nc to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc...
INFO:ewatercycle.parameter_sets._example:Download complete.
INFO:ewatercycle.parameter_sets._example:Adding parameterset wflow_rhine_sbm_nc to ewatercycle.CFG...
INFO:ewatercycle.parameter_sets._example:Downloading example parameter set pcrglobwb_rhinemeuse_30min to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min...
INFO:ewatercycle.parameter_sets._example:Download complete.
INFO:ewatercycle.parameter_sets._example:Adding parameterset pcrglobwb_rhinemeuse_30min to ewatercycle.CFG...
INFO:ewatercycle.parameter_sets._example:Downloading example parameter set lisflood_fraser to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/lisflood_fraser...
INFO:ewatercycle.parameter_sets._example:Download complete.
INFO:ewatercycle.parameter_sets._example:Adding parameterset lisflood_fraser to ewatercycle.CFG...
INFO:ewatercycle.parameter_sets:3 example parameter sets were downloaded
INFO:ewatercycle.config._config_object:Config written to /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/ewatercycle.yaml
INFO:ewatercycle.parameter_sets:Saved parameter sets to configuration file /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/ewatercycle.yaml

Example parameter sets have been downloaded and added to the configuration file.

cat ./ewatercycle.yaml
container_engine: null
grdc_location: None
output_dir: None
parameter_sets:
  lisflood_fraser:
    config: lisflood_fraser/settings_lat_lon-Run.xml
    directory: lisflood_fraser
    doi: N/A
    supported_model_versions: !!set {'20.10': null}
    target_model: lisflood
  pcrglobwb_rhinemeuse_30min:
    config: pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
    directory: pcrglobwb_rhinemeuse_30min
    doi: N/A
    supported_model_versions: !!set {setters: null}
    target_model: pcrglobwb
  wflow_rhine_sbm_nc:
    config: wflow_rhine_sbm_nc/wflow_sbm_NC.ini
    directory: wflow_rhine_sbm_nc
    doi: N/A
    supported_model_versions: !!set {2020.1.1: null}
    target_model: wflow
parameterset_dir: /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets
singularity_dir: None
ewatercycle.parameter_sets.available_parameter_sets()
('lisflood_fraser', 'pcrglobwb_rhinemeuse_30min', 'wflow_rhine_sbm_nc')
parameter_set = ewatercycle.parameter_sets.get_parameter_set('pcrglobwb_rhinemeuse_30min')
print(parameter_set)
Parameter set
-------------
name=pcrglobwb_rhinemeuse_30min
directory=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min
config=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
doi=N/A
target_model=pcrglobwb
supported_model_versions={'setters'}

The parameter_set variable can be passed to a model class constructor.

Prepare other parameter sets

The example parameter sets downloaded in the previous section are nice to show off the platform features but are a bit small. To perform more advanced experiments, additional parameter sets are needed. Users could use ewatercycle.parameter_sets.ParameterSet to construct parameter sets themselves. Or they can be made available via ewatercycle.parameter_sets.available_parameter_sets() and ewatercycle.parameter_sets.get_parameter_set() by extending the configuration file (ewatercycle.yaml).

A new parameter set should be added as a key/value pair in the parameter_sets map of the configuration file. The key should be a unique string on the current system. The value is a dictionary with the following items:

  • directory: Location on disk where the files of the parameter set are stored. A relative path is resolved relative to ewatercycle.CFG['parameterset_dir'].

  • config: Model configuration file that uses files from directory. A relative path is resolved relative to ewatercycle.CFG['parameterset_dir'].

  • doi: Persistent identifier of the parameter set, for example a DOI for a Zenodo record.

  • target_model: Name of the model that the parameter set works with.

  • supported_model_versions: Set of model versions that are supported by this parameter set. If not set, the parameter set is treated as supported by all versions of the model.
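
The relative-path rule for directory and config can be pictured with a small sketch. The helper name resolve_against and the value of parameterset_dir are illustrative assumptions, not part of the ewatercycle API; only the rule itself (relative paths are taken relative to ewatercycle.CFG['parameterset_dir']) comes from the list above.

```python
from pathlib import PurePosixPath


def resolve_against(base_dir: str, path: str) -> PurePosixPath:
    """Return path unchanged if it is absolute, else join it onto base_dir."""
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        return candidate
    return PurePosixPath(base_dir) / candidate


# Hypothetical value standing in for ewatercycle.CFG['parameterset_dir'].
parameterset_dir = "/data/parameter-sets"

relative = resolve_against(parameterset_dir, "wflow_rhine_sbm_nc/wflow_sbm_NC.ini")
absolute = resolve_against(parameterset_dir, "/data/pcrglobwb2_input/global_30min/")
print(relative)  # joined onto parameterset_dir
print(absolute)  # kept as given, because it is already absolute
```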

For example, the parameter set for PCR-GLOBWB from https://doi.org/10.5281/zenodo.1045339, after downloading and unpacking to /data/pcrglobwb2_input/, could be added with the following configuration:

pcrglobwb_rhinemeuse_30min:
    directory: /data/pcrglobwb2_input/global_30min/
    config: /data/pcrglobwb2_input/global_30min/iniFileExample/setup_30min_non-natural.ini
    doi: https://doi.org/10.5281/zenodo.1045339
    target_model: pcrglobwb
    supported_model_versions: !!set {setters: null}

Download example forcing

To run the MARRMoT example notebooks you need a forcing file. You can generate one with ewatercycle.forcing.generate() or download an already prepared forcing file:

cd docs/examples
wget https://github.com/wknoben/MARRMoT/raw/master/BMI/Config/BMI_testcase_m01_BuffaloRiver_TN_USA.mat
cd -
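
The same download can be scripted from Python, which is convenient inside a notebook. This is a minimal sketch using only the standard library; the helper name download_if_missing is hypothetical, and the URL is the one from the wget command above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL taken from the wget command above.
FORCING_URL = (
    "https://github.com/wknoben/MARRMoT/raw/master/BMI/Config/"
    "BMI_testcase_m01_BuffaloRiver_TN_USA.mat"
)


def download_if_missing(url: str, target_dir: str) -> Path:
    """Download url into target_dir unless a file with that name already exists."""
    target = Path(target_dir) / url.rsplit("/", 1)[-1]
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(target))
    return target


# Example call (needs network access), run from the repository root:
# forcing_file = download_if_missing(FORCING_URL, "docs/examples")
```

Skipping the download when the file is already present keeps repeated notebook runs from fetching the same file again.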

Download observation data

Observation data is needed to calculate metrics of the model performance or to plot a hydrograph. The ewatercycle package can use Global Runoff Data Centre (GRDC) or U.S. Geological Survey Water Services (USGS) data.

The GRDC daily data files can be ordered at https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/riverdischarge_node.html.

The GRDC files should be stored in ewatercycle.CFG['grdc_location'] directory.
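
As an illustration of a performance metric that compares simulated with observed discharge, the sketch below computes the Nash-Sutcliffe efficiency, which is widely used in hydrology. Whether and how the ewatercycle package provides this metric is not stated in this section, so the function is a standalone sketch and the numbers are made up.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(
    observed: Sequence[float], simulated: Sequence[float]
) -> float:
    """Return 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equally long")
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_observed) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed values are constant; the metric is undefined")
    return 1.0 - residual / variance


# Made-up discharge values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe_efficiency(observed, observed)
mean_only = nash_sutcliffe_efficiency(observed, [2.5, 2.5, 2.5, 2.5])
print(perfect, mean_only)
```

A value of 1 means a perfect match, and a value of 0 means the simulation is no better than always predicting the observed mean.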

Adding a model

Integrating a new model into the eWaterCycle system involves the following steps:

  • Create model as subclass of AbstractModel (ewatercycle/models/abstract.py)

  • Import model in ewatercycle/models/__init__.py

  • Add ewatercycle/forcing/<model>.py

  • Register model in ewatercycle/forcing/__init__.py:FORCING_CLASSES

  • Add model to docs/conf.py

  • Write example notebook

  • Write tests

  • If model needs custom parameter set class add it in ewatercycle/parameter_sets/_<model name>.py

  • Add example parameter set in ewatercycle/parameter_sets/__init__.py

  • Add container image to setup guide

We will expand this documentation in due time.
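
To give a feel for what a model class has to offer to the run loops in the example notebooks, here is a self-contained toy. It is only a stand-in: ToyBucketModel does not subclass AbstractModel, and the real base class in ewatercycle/models/abstract.py may require different or additional members. The method and attribute names (update, get_value, time, end_time) simply mirror the ones used in the examples of this documentation.

```python
from datetime import datetime, timedelta
from typing import List


class ToyBucketModel:
    """Toy linear reservoir; NOT the real AbstractModel interface."""

    def __init__(self, storage: float = 10.0, outflow_fraction: float = 0.1):
        self._storage = storage
        self._fraction = outflow_fraction
        self._discharge = 0.0
        self.time = datetime(1990, 1, 1)
        self.end_time = datetime(1990, 1, 4)

    def update(self) -> None:
        """Advance one day: release a fixed fraction of the stored water."""
        self._discharge = self._storage * self._fraction
        self._storage -= self._discharge
        self.time += timedelta(days=1)

    def get_value(self, name: str) -> List[float]:
        """Return the current value of an output variable as a one-element list."""
        if name != "discharge":
            raise KeyError(name)
        return [self._discharge]


# The same loop shape as in the example notebooks.
model = ToyBucketModel()
output = []
while model.time < model.end_time:
    model.update()
    output.append(model.get_value("discharge")[0])
print(output)
```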

Examples

Generate forcing in eWaterCycle with ESMValTool

This notebook shows how to generate forcing data using ERA5 data and ESMValTool hydrological recipes. More information about data, configuration and installation instructions can be found in the System setup section of the eWaterCycle documentation.

          In:
import logging
import warnings

warnings.filterwarnings("ignore", category=UserWarning)

logger = logging.getLogger("esmvalcore")
logger.setLevel(logging.WARNING)
2         In:
import xarray as xr

import ewatercycle.forcing
Wflow
Generate forcing

Forcing for Wflow is created using the ESMValTool recipe. It produces one file that contains three variables: temperature, precipitation, and potential evapotranspiration. You can set the start and end date, and the region. See the eWaterCycle documentation for more information.

To download wflow_dem.map, see the instructions.

3         In:
wflow_forcing = ewatercycle.forcing.generate(
    target_model="wflow",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Rhine/Rhine.shp",
    model_specific_options={
        "dem_file": "./wflow_rhine_sbm_nc/staticmaps/wflow_dem.map",
    },
)
{'auxiliary_data_dir': PosixPath('/home/sarah/GitHub/ewatercycle/docs/examples'),
 'compress_netcdf': False,
 'config_developer_file': None,
 'config_file': PosixPath('/home/sarah/.esmvaltool/config-user.yml'),
 'drs': {'CMIP5': 'default', 'CMIP6': 'default'},
 'exit_on_warning': False,
 'log_level': 'debug',
 'max_parallel_tasks': 1,
 'output_dir': PosixPath('/home/sarah/temp/output'),
 'output_file_type': 'png',
 'plot_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/plots'),
 'preproc_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/preproc'),
 'profile_diagnostic': False,
 'remove_preproc_dir': True,
 'rootpath': {'OBS6': [PosixPath('/home/sarah/temp/ForRecipe')]},
 'run_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/run'),
 'save_intermediary_cubes': False,
 'work_dir': PosixPath('/home/sarah/temp/output/recipe_wflow_20210713_095838/work'),
 'write_netcdf': True,
 'write_plots': True}
7         In:
print(wflow_forcing)
Forcing data for Wflow
----------------------
Directory: /home/sarah/temp/output/recipe_wflow_20210713_095838/work/wflow_daily/script
Start time: 1990-01-01T00:00:00Z
End time: 1990-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
  - netcdfinput: wflow_ERA5_Rhine_1990_1990.nc
  - Precipitation: /pr
  - Temperature: /tas
  - EvapoTranspiration: /pet
  - Inflow: None
Plot forcing
8         In:
dataset = xr.load_dataset(f"{wflow_forcing.directory}/{wflow_forcing.netcdfinput}")
print(dataset)
for var in ["pr", "tas", "pet"]:
    dataset[var].isel(time=1).plot(cmap="coolwarm", robust=True, size=5)
<xarray.Dataset>
Dimensions:    (bnds: 2, lat: 169, lon: 187, time: 365)
Coordinates:
  * time       (time) datetime64[ns] 1990-01-01T12:00:00 ... 1990-12-31T12:00:00
  * lat        (lat) float64 52.05 52.02 51.98 51.94 ... 46.0 45.97 45.93 45.89
  * lon        (lon) float64 5.227 5.264 5.3 5.337 ... 11.94 11.97 12.01 12.05
    height     float64 2.0
Dimensions without coordinates: bnds
Data variables:
    pr         (time, lat, lon) float32 0.2794 0.2794 0.2794 ... nan nan nan
    time_bnds  (time, bnds) datetime64[ns] 1990-01-01 1990-01-02 ... 1991-01-01
    lat_bnds   (lat, bnds) float64 52.07 52.04 52.04 52.0 ... 45.91 45.91 45.88
    lon_bnds   (lon, bnds) float64 5.209 5.245 5.245 5.282 ... 12.03 12.03 12.07
    tas        (time, lat, lon) float32 0.09246 0.07101 0.03317 ... nan nan nan
    pet        (time, lat, lon) float32 0.5102 0.5103 0.5106 ... nan nan nan
Attributes:
    Conventions:  CF-1.7
    provenance:   <?xml version='1.0' encoding='ASCII'?>\n<prov:document xmln...
    software:     Created with ESMValTool v2.2.0
    caption:      Forcings for the wflow hydrological model.
_images/examples_generate_forcing_9_1.png
_images/examples_generate_forcing_9_2.png
_images/examples_generate_forcing_9_3.png
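
The start_time and end_time arguments are UTC timestamps in ISO 8601 form. As a quick sanity check on the dataset printed above, the sketch below counts the calendar days covered by the requested period. The helper names are hypothetical, and the check assumes one time step per day with both endpoints included.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 1990-01-01T00:00:00Z into an aware datetime."""
    return datetime.strptime(timestamp, TIME_FORMAT).replace(tzinfo=timezone.utc)


def count_daily_steps(start_time: str, end_time: str) -> int:
    """Number of calendar days from start to end, counting both endpoints."""
    delta = parse_utc(end_time).date() - parse_utc(start_time).date()
    return delta.days + 1


n_days = count_daily_steps("1990-01-01T00:00:00Z", "1990-12-31T00:00:00Z")
print(n_days)  # 365, the length of the time dimension in the Wflow dataset
```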
PCRGlobWB
Generate forcing

Forcing for PCRGlobWB is created using the ESMValTool recipe. It produces one file per variable: temperature and precipitation. You can set the start and end date, and the region. See the eWaterCycle documentation for more information.

3         In:
pcrglobwb_forcing = ewatercycle.forcing.generate(
    target_model="pcrglobwb",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Rhine/Rhine.shp",
    model_specific_options={
        "start_time_climatology": "1990-01-01T00:00:00Z",
        "end_time_climatology": "1990-01-01T00:00:00Z",
    },
)
{'auxiliary_data_dir': PosixPath('/home/sarah/GitHub/ewatercycle/docs/examples'),
 'compress_netcdf': False,
 'config_developer_file': None,
 'config_file': PosixPath('/home/sarah/.esmvaltool/config-user.yml'),
 'drs': {'CMIP5': 'default', 'CMIP6': 'default'},
 'exit_on_warning': False,
 'log_level': 'debug',
 'max_parallel_tasks': 1,
 'output_dir': PosixPath('/home/sarah/temp/output'),
 'output_file_type': 'png',
 'plot_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/plots'),
 'preproc_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/preproc'),
 'profile_diagnostic': False,
 'remove_preproc_dir': True,
 'rootpath': {'OBS6': [PosixPath('/home/sarah/temp/ForRecipe')]},
 'run_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/run'),
 'save_intermediary_cubes': False,
 'work_dir': PosixPath('/home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work'),
 'write_netcdf': True,
 'write_plots': True}
Shapefile /home/sarah/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp is not in forcing directory /home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work/diagnostic_daily/script. So, it won't be saved in /home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work/diagnostic_daily/script/ewatercycle_forcing.yaml.
4         In:
print(pcrglobwb_forcing)
Forcing data for PCRGlobWB
--------------------------
Directory: /home/sarah/temp/output/recipe_pcrglobwb_20210714_152509/work/diagnostic_daily/script
Start time: 1990-01-01T00:00:00Z
End time: 1990-12-31T00:00:00Z
Shapefile: /home/sarah/GitHub/ewatercycle/docs/examples/data/Rhine/Rhine.shp
Additional information for model config:
  - temperatureNC: pcrglobwb_OBS6_ERA5_reanaly_1_day_tas_1990-1990_Rhine.nc
  - precipitationNC: pcrglobwb_OBS6_ERA5_reanaly_1_day_pr_1990-1990_Rhine.nc
Plot forcing
8         In:
for file_name in [pcrglobwb_forcing.temperatureNC, pcrglobwb_forcing.precipitationNC]:
    dataset = xr.load_dataset(f"{pcrglobwb_forcing.directory}/{file_name}")
    print(dataset)
    print("------------------------")
    var = list(dataset.data_vars.keys())[0]
    dataset[var].isel(time=-1).plot(cmap="coolwarm", robust=True, size=5)
<xarray.Dataset>
Dimensions:    (bnds: 2, lat: 23, lon: 31, time: 730)
Coordinates:
  * time       (time) datetime64[ns] 1989-01-01 1989-01-02 ... 1990-12-31
  * lat        (lat) float32 52.0 51.75 51.5 51.25 ... 47.25 47.0 46.75 46.5
  * lon        (lon) float32 4.251 4.501 4.751 5.001 ... 11.0 11.25 11.5 11.75
    height     float64 2.0
Dimensions without coordinates: bnds
Data variables:
    tas        (time, lat, lon) float32 273.6 273.2 273.0 ... 271.6 268.9 267.0
    time_bnds  (time, bnds) datetime64[ns] 1988-12-31T12:00:00 ... 1990-12-31...
    lat_bnds   (lat, bnds) float32 51.88 52.12 51.62 51.88 ... 46.88 46.38 46.62
    lon_bnds   (lon, bnds) float32 4.125 4.375 4.375 4.625 ... 11.62 11.62 11.88
Attributes:
    comment:      Contains modified Copernicus Climate Change Service Informa...
    Conventions:  CF-1.7
    provenance:   <?xml version='1.0' encoding='ASCII'?>\n<prov:document xmln...
    software:     Created with ESMValTool v2.2.0
    caption:      Forcings for the PCR-GLOBWB hydrological model.
------------------------
<xarray.Dataset>
Dimensions:    (bnds: 2, lat: 23, lon: 31, time: 730)
Coordinates:
  * time       (time) datetime64[ns] 1989-01-01 1989-01-02 ... 1990-12-31
  * lat        (lat) float32 52.0 51.75 51.5 51.25 ... 47.25 47.0 46.75 46.5
  * lon        (lon) float32 4.251 4.501 4.751 5.001 ... 11.0 11.25 11.5 11.75
Dimensions without coordinates: bnds
Data variables:
    pr         (time, lat, lon) float32 9.197e-06 2.069e-05 ... 0.0002843
    time_bnds  (time, bnds) datetime64[ns] 1988-12-31T12:00:00 ... 1990-12-31...
    lat_bnds   (lat, bnds) float32 51.88 52.12 51.62 51.88 ... 46.88 46.38 46.62
    lon_bnds   (lon, bnds) float32 4.125 4.375 4.375 4.625 ... 11.62 11.62 11.88
Attributes:
    comment:      Contains modified Copernicus Climate Change Service Informa...
    Conventions:  CF-1.7
    provenance:   <?xml version='1.0' encoding='ASCII'?>\n<prov:document xmln...
    software:     Created with ESMValTool v2.2.0
    caption:      Forcings for the PCR-GLOBWB hydrological model.
------------------------
_images/examples_generate_forcing_14_1.png
_images/examples_generate_forcing_14_2.png
LISFLOOD
Generate forcing

Forcing for LISFLOOD is created using the ESMValTool recipe. It produces one file per variable: temperature, precipitation, maximum temperature, minimum temperature, u component of wind, v component of wind, surface solar radiation downwards, and dewpoint temperature. Running LISVAP is not implemented yet, so the LISFLOOD forcing data ‘e0’, ‘es0’ and ‘et0’ are not generated. However, the recipe creates LISVAP input data that can be found in lisflood_forcing.directory. You can set the start and end date, and the region. See the eWaterCycle documentation for more information.

4         In:
lisflood_forcing = ewatercycle.forcing.generate(
    target_model="lisflood",
    dataset="ERA5",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    shape="./data/Rhine/Rhine.shp",
)
{'auxiliary_data_dir': PosixPath('/home/sarah/GitHub/ewatercycle/docs/examples'),
 'compress_netcdf': False,
 'config_developer_file': None,
 'config_file': PosixPath('/home/sarah/.esmvaltool/config-user.yml'),
 'drs': {'CMIP5': 'default', 'CMIP6': 'default'},
 'exit_on_warning': False,
 'log_level': 'debug',
 'max_parallel_tasks': 1,
 'output_dir': PosixPath('/home/sarah/temp/output'),
 'output_file_type': 'png',
 'plot_dir': PosixPath('/home/sarah/temp/output/recipe_lisflood_20210713_095903/plots'),
 'preproc_dir': PosixPath('/home/sarah/temp/output/recipe_lisflood_20210713_095903/preproc'),
 'profile_diagnostic': False,
 'remove_preproc_dir': True,
 'rootpath': {'OBS6': [PosixPath('/home/sarah/temp/ForRecipe')]},
 'run_dir': PosixPath('/home/sarah/temp/output/recipe_lisflood_20210713_095903/run'),
 'save_intermediary_cubes': False,
 'work_dir': PosixPath('/home/sarah/temp/output/recipe_lisflood_20210713_095903/work'),
 'write_netcdf': True,
 'write_plots': True}
The run_lisvap is False. Therefore, LISFLOOD forcing data 'e0', 'es0' and 'et0' are not generated. However, the recipe creates LISVAP input data that can be found in /home/sarah/temp/output/recipe_lisflood_20210713_095903/work/diagnostic_daily/script.
5         In:
print(lisflood_forcing)
eWaterCycle forcing
-------------------
start_time=1990-01-01T00:00:00Z
end_time=1990-12-31T00:00:00Z
directory=/home/sarah/temp/output/recipe_lisflood_20210713_095903/work/diagnostic_daily/script
shape=None
PrefixPrecipitation=lisflood_ERA5_Rhine_pr_1990_1990.nc
PrefixTavg=lisflood_ERA5_Rhine_tas_1990_1990.nc
PrefixE0=e0.nc
PrefixES0=es0.nc
PrefixET0=et0.nc
Plot forcing
6         In:
for file_name in [lisflood_forcing.PrefixTavg, lisflood_forcing.PrefixPrecipitation]:
    dataset = xr.load_dataset(f"{lisflood_forcing.directory}/{file_name}")
    var = list(dataset.data_vars.keys())[0]
    dataset[var].isel(time=1).plot(cmap="coolwarm", robust=True, size=5)
_images/examples_generate_forcing_19_0.png
_images/examples_generate_forcing_19_1.png
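
Because LISVAP is not run, the forcing object above names e0.nc, es0.nc and et0.nc even though those files are not generated. A small check like the following can list which named files are actually absent. The helper missing_forcing_files is hypothetical, and it assumes the file names are relative to the forcing directory, as in the plotting cell above.

```python
from pathlib import Path
from typing import Dict, List


def missing_forcing_files(directory: str, file_names: Dict[str, str]) -> List[str]:
    """Return the keys whose file does not exist inside directory."""
    base = Path(directory)
    return [key for key, name in file_names.items() if not (base / name).is_file()]


# File names copied from the printed forcing object above.
file_names = {
    "PrefixPrecipitation": "lisflood_ERA5_Rhine_pr_1990_1990.nc",
    "PrefixTavg": "lisflood_ERA5_Rhine_tas_1990_1990.nc",
    "PrefixE0": "e0.nc",
    "PrefixES0": "es0.nc",
    "PrefixET0": "et0.nc",
}

# Replace "." with lisflood_forcing.directory in a real session.
print(missing_forcing_files(".", file_names))
```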

Running LISFLOOD model using eWaterCycle package (on Cartesius machine of SURFsara)

This notebook shows how to run the LISFLOOD model. Please note that the lisflood-grpc4bmi Docker image in eWaterCycle is compatible only with the forcing data and parameter set on the Cartesius machine of SURFsara. More information about data, configuration and installation instructions can be found in the System setup section of the eWaterCycle documentation.

1         In:
import logging
import warnings

logger = logging.getLogger("grpc4bmi")
logger.setLevel(logging.WARNING)

warnings.filterwarnings("ignore", category=UserWarning)
1         In:
import pandas as pd

import ewatercycle.forcing
import ewatercycle.models
import ewatercycle.parameter_sets
Load forcing data

For this example notebook, the lisflood_ERA-Interim_*_1990_1990.nc files were copied from /projects/0/wtrcycle/comparison/forcing/lisflood to /scratch/shared/ewatercycle/lisflood_example/lisflood_forcing_data. The LISVAP output files ‘e0’, ‘es0’ and ‘et0’ were also generated and stored in the same directory. These data were produced by running the ESMValTool recipe and LISVAP. We can now use those files to run the LISFLOOD model.

2         In:
forcing = ewatercycle.forcing.load_foreign(
    target_model="lisflood",
    directory="/scratch/shared/ewatercycle/lisflood_example/lisflood_forcing_data/",
    start_time="1990-01-01T00:00:00Z",
    end_time="1990-12-31T00:00:00Z",
    forcing_info={
        "PrefixPrecipitation": "lisflood_ERA-Interim_pr_1990_1990.nc",
        "PrefixTavg": "lisflood_ERA-Interim_tas_1990_1990.nc",
        "PrefixE0": "lisflood_e0_1990_1990.nc",
        "PrefixES0": "lisflood_es0_1990_1990.nc",
        "PrefixET0": "lisflood_et0_1990_1990.nc",
    },
)
print(forcing)
eWaterCycle forcing
-------------------
start_time=1990-01-01T00:00:00Z
end_time=1990-12-31T00:00:00Z
directory=/scratch/shared/ewatercycle/lisflood_example/lisflood_forcing_data
shape=None
PrefixPrecipitation=lisflood_ERA-Interim_pr_1990_1990.nc
PrefixTavg=lisflood_ERA-Interim_tas_1990_1990.nc
PrefixE0=lisflood_e0_1990_1990.nc
PrefixES0=lisflood_es0_1990_1990.nc
PrefixET0=lisflood_et0_1990_1990.nc
Load parameter set

This example uses a parameter set on the Cartesius machine of SURFsara.

3         In:
parameterset = ewatercycle.parameter_sets.ParameterSet(
    name="Lisflood01degree_masked",
    directory="/projects/0/wtrcycle/comparison/lisflood_input/Lisflood01degree_masked",
    config="/projects/0/wtrcycle/comparison/lisflood_input/settings_templates/settings_lisflood.xml",
    target_model="lisflood",
)
print(parameterset)
Parameter set
-------------
name=Lisflood01degree_masked
directory=/lustre1/0/wtrcycle/comparison/lisflood_input/Lisflood01degree_masked
config=/lustre1/0/wtrcycle/comparison/lisflood_input/settings_templates/settings_lisflood.xml
doi=N/A
target_model=lisflood
supported_model_versions=set()
Set up the model

To create the model object, we need to select a version.

4         In:
ewatercycle.models.Lisflood.available_versions
4       Out:
('20.10',)
5         In:
model = ewatercycle.models.Lisflood(
    version="20.10", parameter_set=parameterset, forcing=forcing
)
print(model)
Model version 20.10 is not explicitly listed in the supported model versions of this parameter set. This can lead to compatibility issues.
eWaterCycle Lisflood
-------------------
Version = 20.10
Parameter set =
  Parameter set
  -------------
  name=Lisflood01degree_masked
  directory=/lustre1/0/wtrcycle/comparison/lisflood_input/Lisflood01degree_masked
  config=/lustre1/0/wtrcycle/comparison/lisflood_input/settings_templates/settings_lisflood.xml
  doi=N/A
  target_model=lisflood
  supported_model_versions=set()
Forcing =
  eWaterCycle forcing
  -------------------
  start_time=1990-01-01T00:00:00Z
  end_time=1990-12-31T00:00:00Z
  directory=/scratch/shared/ewatercycle/lisflood_example/lisflood_forcing_data
  shape=None
  PrefixPrecipitation=lisflood_ERA-Interim_pr_1990_1990.nc
  PrefixTavg=lisflood_ERA-Interim_tas_1990_1990.nc
  PrefixE0=lisflood_e0_1990_1990.nc
  PrefixES0=lisflood_es0_1990_1990.nc
  PrefixET0=lisflood_et0_1990_1990.nc
6         In:
model.parameters
6       Out:
[('IrrigationEfficiency', '0.75'),
 ('MaskMap', '/data/input/areamaps/model_mask'),
 ('start_time', '1990-01-01T00:00:00Z'),
 ('end_time', '1990-12-31T00:00:00Z')]

Set up the model with a custom mask map (model_mask), an IrrigationEfficiency of 0.8 instead of 0.75, and an earlier end time, making the total model time just 1 month.

7         In:
model_mask = (
    "/projects/0/wtrcycle/comparison/recipes_auxiliary_datasets/LISFLOOD/model_mask.nc"
)

config_file, config_dir = model.setup(
    IrrigationEfficiency="0.8", end_time="1990-1-31T00:00:00Z", MaskMap=model_mask
)
print(config_file)
print(config_dir)
Running /scratch/shared/ewatercycle/lisflood_example/ewatercycle-lisflood-grpc4bmi_20.10.sif singularity container on port 58991
/scratch/shared/ewatercycle/lisflood_example/lisflood_20210713_121949/lisflood_setting.xml
/scratch/shared/ewatercycle/lisflood_example/lisflood_20210713_121949
8         In:
model.parameters
8       Out:
[('IrrigationEfficiency', '0.8'),
 ('MaskMap',
  '/lustre1/0/wtrcycle/comparison/recipes_auxiliary_datasets/LISFLOOD/model_mask'),
 ('start_time', '1990-01-01T00:00:00Z'),
 ('end_time', '1990-01-31T00:00:00Z')]
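
The before and after listings of model.parameters suggest a simple mental model: setup() keeps the defaults and replaces only the values that were passed as keyword arguments. The sketch below mimics that behaviour on a plain list of tuples. It is not how setup() is implemented; apply_overrides is a hypothetical helper, and rejecting unknown names is an assumption made for the illustration.

```python
from typing import List, Tuple

Parameters = List[Tuple[str, str]]


def apply_overrides(defaults: Parameters, **overrides: str) -> Parameters:
    """Return a copy of defaults with the given names replaced, order preserved."""
    known = {name for name, _ in defaults}
    unknown = set(overrides) - known
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    return [(name, overrides.get(name, value)) for name, value in defaults]


# Default values copied from the first model.parameters listing above.
defaults = [
    ("IrrigationEfficiency", "0.75"),
    ("MaskMap", "/data/input/areamaps/model_mask"),
    ("start_time", "1990-01-01T00:00:00Z"),
    ("end_time", "1990-12-31T00:00:00Z"),
]

updated = apply_overrides(
    defaults, IrrigationEfficiency="0.8", end_time="1990-01-31T00:00:00Z"
)
for name, value in updated:
    print(name, value)
```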

Initialize the model with the config file:

9         In:
model.initialize(config_file)

Get model variable names

10         In:
model.output_var_names
10       Out:
('Discharge',)
Run the model

Store simulated values at one target location until the model end time. In this example, we use the coordinates of the Merrimack observation station as the target coordinates.

11         In:
target_longitude = [-71.35]
target_latitude = [42.64]
target_discharge = []
time_range = []
end_time = model.end_time

while model.time < end_time:
    model.update()
    target_discharge.append(
        model.get_value_at_coords(
            "Discharge", lon=target_longitude, lat=target_latitude
        )[0]
    )
    time_range.append(model.time_as_datetime.date())
    print(model.time_as_isostr)
.Simulation started on 2021-07-13 14:20
1    1990-01-03T00:00:00Z
2    (estimated simulation end: 2021-07-13 14:28)1990-01-04T00:00:00Z
3    (estimated simulation end: 2021-07-13 14:25)1990-01-05T00:00:00Z
4    (estimated simulation end: 2021-07-13 14:24)1990-01-06T00:00:00Z
5    (estimated simulation end: 2021-07-13 14:23)1990-01-07T00:00:00Z
6    (estimated simulation end: 2021-07-13 14:23)1990-01-08T00:00:00Z
7    (estimated simulation end: 2021-07-13 14:23)1990-01-09T00:00:00Z
8    (estimated simulation end: 2021-07-13 14:23)1990-01-10T00:00:00Z
9    (estimated simulation end: 2021-07-13 14:22)1990-01-11T00:00:00Z
10    (estimated simulation end: 2021-07-13 14:22)1990-01-12T00:00:00Z
11    (estimated simulation end: 2021-07-13 14:22)1990-01-13T00:00:00Z
12    (estimated simulation end: 2021-07-13 14:22)1990-01-14T00:00:00Z
13    (estimated simulation end: 2021-07-13 14:22)1990-01-15T00:00:00Z
14    (estimated simulation end: 2021-07-13 14:22)1990-01-16T00:00:00Z
15    (estimated simulation end: 2021-07-13 14:22)1990-01-17T00:00:00Z
16    (estimated simulation end: 2021-07-13 14:22)1990-01-18T00:00:00Z
17    (estimated simulation end: 2021-07-13 14:22)1990-01-19T00:00:00Z
18    (estimated simulation end: 2021-07-13 14:22)1990-01-20T00:00:00Z
19    (estimated simulation end: 2021-07-13 14:22)1990-01-21T00:00:00Z
20    (estimated simulation end: 2021-07-13 14:22)1990-01-22T00:00:00Z
21    (estimated simulation end: 2021-07-13 14:22)1990-01-23T00:00:00Z
22    (estimated simulation end: 2021-07-13 14:22)1990-01-24T00:00:00Z
23    (estimated simulation end: 2021-07-13 14:22)1990-01-25T00:00:00Z
24    (estimated simulation end: 2021-07-13 14:22)1990-01-26T00:00:00Z
25    (estimated simulation end: 2021-07-13 14:22)1990-01-27T00:00:00Z
26    (estimated simulation end: 2021-07-13 14:22)1990-01-28T00:00:00Z
27    (estimated simulation end: 2021-07-13 14:22)1990-01-29T00:00:00Z
28    (estimated simulation end: 2021-07-13 14:22)1990-01-30T00:00:00Z
29    (estimated simulation end: 2021-07-13 14:22)1990-01-31T00:00:00Z
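
Conceptually, reading a value at given coordinates means mapping each longitude and latitude to a grid cell. How get_value_at_coords does this internally is not described here. The sketch below assumes a regular grid and picks the cell whose centre is closest to the target; the grid values are made up for the illustration.

```python
from typing import Sequence


def nearest_index(axis_values: Sequence[float], target: float) -> int:
    """Return the index of the axis value closest to target."""
    if len(axis_values) == 0:
        raise ValueError("axis_values must not be empty")
    return min(range(len(axis_values)), key=lambda i: abs(axis_values[i] - target))


# Hypothetical cell centres around the target location used above.
longitudes = [-71.55, -71.45, -71.35, -71.25]
latitudes = [42.85, 42.75, 42.65, 42.55]

lon_index = nearest_index(longitudes, -71.35)
lat_index = nearest_index(latitudes, 42.64)
print(lon_index, lat_index)
```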

Store simulated values for all locations of the model grid at the end time.

12         In:
discharge = model.get_value_as_xarray("Discharge")
23         In:
model.finalize()
Inspect the results

The discharge time series at the Merrimack observation station:

13         In:
simulated_target_discharge = pd.DataFrame(
    {"simulation": target_discharge}, index=pd.to_datetime(time_range)
)
simulated_target_discharge.plot(figsize=(12, 8))
13       Out:
<AxesSubplot:>
_images/examples_lisflood_28_1.png

The LISFLOOD output has a global extent. In this example, we plot the discharge values in the Merrimack catchment at the last time step.

22         In:
lc = discharge.coords["longitude"]
la = discharge.coords["latitude"]
discharge_map = discharge.loc[
    dict(longitude=lc[(lc > -73) & (lc < -70)], latitude=la[(la > 42) & (la < 45)])
].plot(robust=True, cmap="GnBu", figsize=(12, 8))
discharge_map.axes.scatter(
    target_longitude, target_latitude, s=250, c="r", marker="x", lw=2
)
22       Out:
<matplotlib.collections.PathCollection at 0x2ab893830370>
_images/examples_lisflood_30_1.png

Running MARRMoT M01 model using eWaterCycle package

This notebook shows how to run the MARRMoT M01 model using an example use case. More information about data, configuration and installation instructions can be found in the System setup section of the eWaterCycle documentation.

1         In:
import warnings

warnings.filterwarnings("ignore", category=UserWarning)
1         In:
import pandas as pd

import ewatercycle.forcing
import ewatercycle.models
Load forcing data

To download the example forcing file BMI_testcase_m01_BuffaloRiver_TN_USA.mat, see this instruction.

2         In:
forcing = ewatercycle.forcing.load_foreign(
    "marrmot",
    directory=".",
    start_time="1989-01-01T00:00:00Z",
    end_time="1992-12-31T00:00:00Z",
    forcing_info={"forcing_file": "BMI_testcase_m01_BuffaloRiver_TN_USA.mat"},
)
print(forcing)
eWaterCycle forcing
-------------------
start_time=1989-01-01T00:00:00Z
end_time=1992-12-31T00:00:00Z
directory=/home/sarah/GitHub/ewatercycle/docs/examples
shape=None
forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
Set up the model

To create the model object, we need to select a version.

3         In:
ewatercycle.models.MarrmotM01.available_versions
3       Out:
('2020.11',)
4         In:
model = ewatercycle.models.MarrmotM01(version="2020.11", forcing=forcing)
print(model)
eWaterCycle MarrmotM01
-------------------
Version = 2020.11
Parameter set =
  None
Forcing =
  eWaterCycle forcing
  -------------------
  start_time=1989-01-01T00:00:00Z
  end_time=1992-12-31T00:00:00Z
  directory=/home/sarah/GitHub/ewatercycle/docs/examples
  shape=None
  forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
5         In:
model.parameters
5       Out:
[('maximum_soil_moisture_storage', 10.0),
 ('initial_soil_moisture_storage', 5.0),
 ('solver',
  Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
 ('start time', '1989-01-01T00:00:00Z'),
 ('end time', '1992-12-31T00:00:00Z')]

Set up the model with a maximum soil moisture storage of 12.0 instead of 10.0 and an earlier end time, making the total model time just 1 month.

6         In:
cfg_file, cfg_dir = model.setup(
    maximum_soil_moisture_storage=12.0,
    end_time="1989-02-01T00:00:00Z",
)
print(cfg_file)
print(cfg_dir)
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135130/marrmot-m01_config.mat
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135130
7         In:
model.parameters
7       Out:
[('maximum_soil_moisture_storage', 12.0),
 ('initial_soil_moisture_storage', 5.0),
 ('solver',
  Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
 ('start time', '1989-01-01T00:00:00Z'),
 ('end time', '1989-02-01T00:00:00Z')]

Initialize the model with the config file:

8         In:
model.initialize(cfg_file)

Get the model variable names. Only flux_out_Q is supported for now.

9         In:
model.output_var_names
9       Out:
('P',
 'T',
 'Ep',
 'S(t)',
 'par',
 'sol_resnorm_tolerance',
 'sol_resnorm_maxiter',
 'flux_out_Q',
 'flux_out_Ea',
 'wb')
Run the model
10         In:
discharge = []
time_range = []
end_time = model.end_time

while model.time < end_time:
    model.update()
    discharge.append(model.get_value("flux_out_Q")[0])
    time_range.append(model.time_as_datetime.date())
    print(model.time_as_isostr)
1989-01-02T00:00:00Z
1989-01-03T00:00:00Z
1989-01-04T00:00:00Z
1989-01-05T00:00:00Z
1989-01-06T00:00:00Z
1989-01-07T00:00:00Z
1989-01-08T00:00:00Z
1989-01-09T00:00:00Z
1989-01-10T00:00:00Z
1989-01-11T00:00:00Z
1989-01-12T00:00:00Z
1989-01-13T00:00:00Z
1989-01-14T00:00:00Z
1989-01-15T00:00:00Z
1989-01-16T00:00:00Z
1989-01-17T00:00:00Z
1989-01-18T00:00:00Z
1989-01-19T00:00:00Z
1989-01-20T00:00:00Z
1989-01-21T00:00:00Z
1989-01-22T00:00:00Z
1989-01-23T00:00:00Z
1989-01-24T00:00:00Z
1989-01-25T00:00:00Z
1989-01-26T00:00:00Z
1989-01-27T00:00:00Z
1989-01-28T00:00:00Z
1989-01-29T00:00:00Z
1989-01-30T00:00:00Z
1989-01-31T00:00:00Z
1989-02-01T00:00:00Z
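
The printed timestamps follow directly from the loop condition: the model starts at the start time, and every update() advances the clock before the time is printed. The standalone sketch below reproduces that bookkeeping without a model. It assumes a time step of exactly one day, and the function name daily_update_times is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def daily_update_times(start_time: str, end_time: str) -> List[str]:
    """Return the timestamps printed by a loop that steps one day per update."""
    current = datetime.strptime(start_time, TIME_FORMAT)
    end = datetime.strptime(end_time, TIME_FORMAT)
    stamps = []
    while current < end:
        current += timedelta(days=1)  # stands in for model.update()
        stamps.append(current.strftime(TIME_FORMAT))
    return stamps


stamps = daily_update_times("1989-01-01T00:00:00Z", "1989-02-01T00:00:00Z")
print(len(stamps), stamps[0], stamps[-1])
```

With these settings the loop runs 31 times, from 1989-01-02 up to and including 1989-02-01, which matches the output above.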
11         In:
model.finalize()
Inspect the results
12         In:
simulated_discharge = pd.DataFrame(
    {"simulation": discharge}, index=pd.to_datetime(time_range)
)
13         In:
simulated_discharge.plot(figsize=(12, 8))
13       Out:
<AxesSubplot:>
_images/examples_MarrmotM01_24_1.png

Running MARRMoT M14 model using eWaterCycle package

This notebook shows how to run the MARRMoT M14 model using an example use case. More information about data, configuration and installation instructions can be found in the System setup section of the eWaterCycle documentation.

1         In:
import warnings

warnings.filterwarnings("ignore", category=UserWarning)
1         In:
import pandas as pd

import ewatercycle.forcing
import ewatercycle.models
Load forcing data

To download the example forcing file BMI_testcase_m01_BuffaloRiver_TN_USA.mat, see this instruction.

2         In:
forcing = ewatercycle.forcing.load_foreign(
    "marrmot",
    directory=".",
    start_time="1989-01-01T00:00:00Z",
    end_time="1992-12-31T00:00:00Z",
    forcing_info={"forcing_file": "BMI_testcase_m01_BuffaloRiver_TN_USA.mat"},
)
print(forcing)
eWaterCycle forcing
-------------------
start_time=1989-01-01T00:00:00Z
end_time=1992-12-31T00:00:00Z
directory=/home/sarah/GitHub/ewatercycle/docs/examples
shape=None
forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
Set up the model

To create the model object, we need to select a version.

3         In:
ewatercycle.models.MarrmotM14.available_versions
3       Out:
('2020.11',)
4         In:
model = ewatercycle.models.MarrmotM14(version="2020.11", forcing=forcing)
print(model)
The length of parameters in forcing /home/sarah/GitHub/ewatercycle/docs/examples/BMI_testcase_m01_BuffaloRiver_TN_USA.mat does not match the length of M14 parameters that is seven.
The length of initial stores in forcing /home/sarah/GitHub/ewatercycle/docs/examples/BMI_testcase_m01_BuffaloRiver_TN_USA.mat does not match the length of M14 iniatial stores that is two.
eWaterCycle MarrmotM14
-------------------
Version = 2020.11
Parameter set =
  None
Forcing =
  eWaterCycle forcing
  -------------------
  start_time=1989-01-01T00:00:00Z
  end_time=1992-12-31T00:00:00Z
  directory=/home/sarah/GitHub/ewatercycle/docs/examples
  shape=None
  forcing_file=BMI_testcase_m01_BuffaloRiver_TN_USA.mat
5         In:
model.parameters
5       Out:
[('maximum_soil_moisture_storage', 1000.0),
 ('threshold_flow_generation_evap_change', 0.5),
 ('leakage_saturated_zone_flow_coefficient', 0.5),
 ('zero_deficit_base_flow_speed', 100.0),
 ('baseflow_coefficient', 0.5),
 ('gamma_distribution_chi_parameter', 4.25),
 ('gamma_distribution_phi_parameter', 2.5),
 ('initial_upper_zone_storage', 900.0),
 ('initial_saturated_zone_storage', 900.0),
 ('solver',
  Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
 ('start time', '1989-01-01T00:00:00Z'),
 ('end time', '1992-12-31T00:00:00Z')]

Set up the model with a maximum soil moisture storage of 12.0 instead of the default 1000.0 and an earlier end time, making the total model time just 1 month.

6         In:
cfg_file, cfg_dir = model.setup(
    maximum_soil_moisture_storage=12.0,
    end_time="1989-02-01T00:00:00Z",
)
print(cfg_file)
print(cfg_dir)
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135152/marrmot-m14_config.mat
/home/sarah/GitHub/ewatercycle/docs/examples/marrmot_20210712_135152
7         In:
model.parameters
7       Out:
[('maximum_soil_moisture_storage', 12.0),
 ('threshold_flow_generation_evap_change', 0.5),
 ('leakage_saturated_zone_flow_coefficient', 0.5),
 ('zero_deficit_base_flow_speed', 100.0),
 ('baseflow_coefficient', 0.5),
 ('gamma_distribution_chi_parameter', 4.25),
 ('gamma_distribution_phi_parameter', 2.5),
 ('initial_upper_zone_storage', 900.0),
 ('initial_saturated_zone_storage', 900.0),
 ('solver',
  Solver(name='createOdeApprox_IE', resnorm_tolerance=array([0.1]), resnorm_maxiter=array([6.]))),
 ('start time', '1989-01-01T00:00:00Z'),
 ('end time', '1989-02-01T00:00:00Z')]

Initialize the model with the config file:

8         In:
model.initialize(cfg_file)

Get the model variable names. Only flux_out_Q is supported for now.

9         In:
model.output_var_names
9       Out:
('P',
 'T',
 'Ep',
 'S(t)',
 'par',
 'sol_resnorm_tolerance',
 'sol_resnorm_maxiter',
 'flux_out_Q',
 'flux_out_Ea',
 'wb')
Run the model
10         In:
discharge = []
time_range = []
end_time = model.end_time

while model.time < end_time:
    model.update()
    discharge.append(model.get_value("flux_out_Q")[0])
    time_range.append(model.time_as_datetime.date())
    print(model.time_as_isostr)
1989-01-02T00:00:00Z
1989-01-03T00:00:00Z
1989-01-04T00:00:00Z
1989-01-05T00:00:00Z
1989-01-06T00:00:00Z
1989-01-07T00:00:00Z
1989-01-08T00:00:00Z
1989-01-09T00:00:00Z
1989-01-10T00:00:00Z
1989-01-11T00:00:00Z
1989-01-12T00:00:00Z
1989-01-13T00:00:00Z
1989-01-14T00:00:00Z
1989-01-15T00:00:00Z
1989-01-16T00:00:00Z
1989-01-17T00:00:00Z
1989-01-18T00:00:00Z
1989-01-19T00:00:00Z
1989-01-20T00:00:00Z
1989-01-21T00:00:00Z
1989-01-22T00:00:00Z
1989-01-23T00:00:00Z
1989-01-24T00:00:00Z
1989-01-25T00:00:00Z
1989-01-26T00:00:00Z
1989-01-27T00:00:00Z
1989-01-28T00:00:00Z
1989-01-29T00:00:00Z
1989-01-30T00:00:00Z
1989-01-31T00:00:00Z
1989-02-01T00:00:00Z
11         In:
model.finalize()
Inspect the results
12         In:
simulated_discharge = pd.DataFrame(
    {"simulation": discharge}, index=pd.to_datetime(time_range)
)
13         In:
simulated_discharge.plot(figsize=(12, 8))
13       Out:
<AxesSubplot:>
_images/examples_MarrmotM14_24_1.png


PCRGlobWB example use case

This example shows how the PCRGlobWB model can be used within the eWaterCycle system. It is based on the example use case from https://github.com/UU-Hydro/PCR-GLOBWB_input_example.

This example use case assumes that the ewatercycle platform has been installed and configured on your system. See our system setup documentation for instructions if this is not the case.

1         In:
# This cell is only used to suppress some distracting output messages
import warnings

warnings.filterwarnings("ignore", category=UserWarning)
2         In:
import matplotlib.pyplot as plt
from cartopy import crs
from cartopy import feature as cfeature

import ewatercycle.forcing
import ewatercycle.models
import ewatercycle.parameter_sets
Loading a parameter set

A number of (example) parameter sets come pre-installed on the eWaterCycle system (see system setup if this is not the case).

3         In:
ewatercycle.parameter_sets.available_parameter_sets()
3       Out:
('lisflood_fraser', 'pcrglobwb_rhinemeuse_30min', 'wflow_rhine_sbm_nc')

Existing parameter sets can easily be loaded:

4         In:
parameter_set = ewatercycle.parameter_sets.get_parameter_set(
    "pcrglobwb_rhinemeuse_30min"
)
print(parameter_set)
Parameter set
-------------
name=pcrglobwb_rhinemeuse_30min
directory=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min
config=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
doi=N/A
target_model=pcrglobwb
supported_model_versions={'setters'}

It is also possible to load a custom parameter set by passing in the relevant parameters directly:

5         In:
custom_parameter_set = ewatercycle.parameter_sets.ParameterSet(
    name="custom_parameter_set",
    directory="/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min",
    config="/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini",
    target_model="pcrglobwb",
    supported_model_versions={"setters"},
)
Load forcing data

For this example case, the forcing is already included in the parameter set and configured correctly. Therefore in principle this step can be skipped. However, for the purpose of illustration, we show how the forcing would be loaded using the ewatercycle.forcing module, as if it came from another source. To learn about forcing generation, see our preprocessing examples.

6         In:
forcing = ewatercycle.forcing.load_foreign(
    target_model="pcrglobwb",
    start_time="2001-01-01T00:00:00Z",
    end_time="2010-12-31T00:00:00Z",
    directory="./parameter-sets/pcrglobwb_rhinemeuse_30min/forcing",
    shape=None,  # if available, it can be used e.g. for plotting
    forcing_info=dict(
        # model-specific options
        precipitationNC="precipitation_2001to2010.nc",
        temperatureNC="temperature_2001to2010.nc",
    ),
)
print(forcing)
Forcing data for PCRGlobWB
--------------------------
Directory: /home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/forcing
Start time: 2001-01-01T00:00:00Z
End time: 2010-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
  - temperatureNC: temperature_2001to2010.nc
  - precipitationNC: precipitation_2001to2010.nc
Setting up the model

Note that the model version and the parameter set version should be compatible.

7         In:
ewatercycle.models.PCRGlobWB.available_versions
7       Out:
('setters',)
8         In:
pcrglob = ewatercycle.models.PCRGlobWB(
    version="setters", parameter_set=parameter_set, forcing=forcing
)
print(pcrglob)
eWaterCycle PCRGlobWB
-------------------
Version = setters
Parameter set =
  Parameter set
  -------------
  name=pcrglobwb_rhinemeuse_30min
  directory=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min
  config=/home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/setup_natural_test.ini
  doi=N/A
  target_model=pcrglobwb
  supported_model_versions={'setters'}
Forcing =
  Forcing data for PCRGlobWB
  --------------------------
  Directory: /home/peter/ewatercycle/ewatercycle/docs/examples/parameter-sets/pcrglobwb_rhinemeuse_30min/forcing
  Start time: 2001-01-01T00:00:00Z
  End time: 2010-12-31T00:00:00Z
  Shapefile: None
  Additional information for model config:
    - temperatureNC: temperature_2001to2010.nc
    - precipitationNC: precipitation_2001to2010.nc

eWaterCycle exposes a selected set of configurable parameters. These can be modified in the setup() method.

9         In:
pcrglob.parameters
9       Out:
[('start_time', '2001-01-01T00:00:00Z'),
 ('end_time', '2001-01-01T00:00:00Z'),
 ('routing_method', 'accuTravelTime'),
 ('max_spinups_in_years', '20')]

Calling setup() will start up a Docker or Singularity container. Be careful with calling it multiple times!

10         In:
cfg_file, cfg_dir = pcrglob.setup(
    end_time="2001-02-28T00:00:00Z", max_spinups_in_years=5
)
cfg_file, cfg_dir
Running /home/peter/ewatercycle/ewatercycle/ewatercycle-pcrg-grpc4bmi-setters.sif singularity container on port 50639
10       Out:
('/home/peter/ewatercycle/ewatercycle/docs/examples/pcrglobwb_20210714_141432/pcrglobwb_ewatercycle.ini',
 '/home/peter/ewatercycle/ewatercycle/docs/examples/pcrglobwb_20210714_141432')
11         In:
pcrglob.parameters
11       Out:
[('start_time', '2001-01-01T00:00:00Z'),
 ('end_time', '2001-02-28T00:00:00Z'),
 ('routing_method', 'accuTravelTime'),
 ('max_spinups_in_years', '5')]

Note that the parameters have been changed. A new config file which incorporates these updated parameters has been generated as well. If you want to see or modify any additional model settings, you can access this file directly. When you’re ready, pass the path to the config file to initialize().

12         In:
pcrglob.initialize(cfg_file)
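
Before initialize() is called, the generated config file can be inspected with Python's standard configparser module. The following is only a sketch: it assumes the file is plain INI that configparser can read, summarize_config is a hypothetical helper name, and the section and option names in the demo file are invented for illustration rather than taken from a real PCRGlobWB configuration.

```python
import configparser
import tempfile
from pathlib import Path


def summarize_config(path):
    """Return {section: {option: value}} for an INI-style config file."""
    # interpolation=None keeps '%' characters in values untouched
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the case of option names
    with open(path) as handle:
        parser.read_file(handle)
    return {section: dict(parser[section]) for section in parser.sections()}


# Demo on a throw-away file; the contents are hypothetical
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo.ini"
    demo_file.write_text(
        "[globalOptions]\nstartTime = 2001-01-01\nendTime = 2001-02-28\n"
    )
    summary = summarize_config(demo_file)

for section, options in summary.items():
    print(section, options)
```

In the notebook above, the corresponding call would be summarize_config(cfg_file).
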
Running the model

Simply running the model from start to end is straightforward. At each time step we can retrieve information from the model.

13         In:
while pcrglob.time < pcrglob.end_time:
    print(pcrglob.time_as_isostr, end="\r")
    pcrglob.update()
2001-02-27T00:00:00Z
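
The loop above follows a pattern that recurs in every example: advance one step with update(), then read the state. The sketch below imitates that pattern with a hypothetical ToyModel class (not part of ewatercycle) so that it runs without a container. Only the names time, end_time, update() and get_value() mirror the calls used in this documentation; the state variable and its behaviour are made up.

```python
class ToyModel:
    """Hypothetical stand-in with a unit time step and a made-up state."""

    def __init__(self, n_steps):
        self.time = 0
        self.end_time = n_steps
        self._storage = 10.0

    def update(self):
        # Advance one step: drain 10% of the storage
        self._storage *= 0.9
        self.time += 1

    def get_value(self, name):
        if name != "storage":
            raise KeyError(name)
        return [self._storage]


model = ToyModel(n_steps=3)
series = []
while model.time < model.end_time:
    model.update()
    # Read the state after each step
    series.append((model.time, model.get_value("storage")[0]))

print(series)
```

Because the state is read after update(), each stored value belongs to the end of a step. Printing before update(), as the PCRGlobWB cell does, shows the time at the start of each step, which is why its last printed timestamp is one day before the end time.
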
Interacting with the model

PCRGlobWB exposes many variables. Just a few of them are shown here:

14         In:
list(pcrglob.output_var_names)[-15:-5]
14       Out:
('total_abstraction',
 'livestockWaterWithdrawalVolume',
 'desalination_source_abstraction',
 'discharge',
 'temperature',
 'upper_soil_transpiration',
 'snow_water_equivalent',
 'total_runoff',
 'transpiration_from_irrigation',
 'fraction_of_surface_water')

Model fields can be fetched as xarray objects (or as flat numpy arrays using get_value()):

15         In:
da = pcrglob.get_value_as_xarray("discharge")
da.thin(5)  # only show every 5th value in each dim
15       Out:
<xarray.DataArray 'discharge' (latitude: 3, longitude: 4)>
array([[         nan,          nan,          nan,          nan],
       [         nan,  74.54685211,  10.38944435,          nan],
       [         nan, 188.07923889,          nan,          nan]])
Coordinates:
  * longitude  (longitude) float64 3.75 6.25 8.75 11.25
  * latitude   (latitude) float64 46.25 48.75 51.25
    time       object 2001-02-28 00:00:00
Attributes:
    units:    m3.s-1

Xarray makes it very easy to plot the data. In the figure below, we add three points that we will use to illustrate that we can also access individual grid cells.

16         In:
fig = plt.figure(dpi=120)
ax = fig.add_subplot(111, projection=crs.PlateCarree())
da.plot(ax=ax, cmap="GnBu")

# Overlay ocean and coastlines
ax.add_feature(cfeature.OCEAN)
ax.add_feature(cfeature.RIVERS, color="k")
ax.coastlines()

# Add some verification points
target_longitudes = [7.8, 10.2, 11]
target_latitudes = [50.3, 49.8, 47]
ax.scatter(target_longitudes, target_latitudes, s=250, c="r", marker="x", lw=2)
16       Out:
<matplotlib.collections.PathCollection at 0x7f636aa5c4f0>
_images/examples_pcrglobwb_30_1.png

We can get (or set) the values at custom points as well:

17         In:
pcrglob.get_value_at_coords("discharge", lon=target_longitudes, lat=target_latitudes)
17       Out:
array([713.2911377 ,  84.76369476,          nan])
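
Conceptually, looking up a value at a coordinate means mapping the coordinate to a grid cell. The sketch below shows one way to do that: a nearest-neighbour match on the cell-centre coordinates. This is an assumption for illustration only; how get_value_at_coords selects cells internally is not described here, and nearest_index and value_at_coords are hypothetical helper names. The demo grid reuses the thinned coordinates and values printed earlier, so it is much coarser than the real model grid.

```python
import math

latitudes = [46.25, 48.75, 51.25]
longitudes = [3.75, 6.25, 8.75, 11.25]
nan = math.nan
discharge = [
    [nan, nan, nan, nan],
    [nan, 74.54685211, 10.38944435, nan],
    [nan, 188.07923889, nan, nan],
]


def nearest_index(coords, target):
    """Index of the coordinate closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


def value_at_coords(grid, lats, lons, lat, lon):
    """Value of the grid cell whose centre is nearest to (lat, lon)."""
    return grid[nearest_index(lats, lat)][nearest_index(lons, lon)]


value = value_at_coords(discharge, latitudes, longitudes, lat=49.0, lon=6.0)
print(value)
```

A coordinate that lands on a cell without data yields nan, as the third target point in the output above also does.
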
Cleaning up

Models usually perform some “wrap up tasks” at the end of a model run, such as writing the last outputs to disk and releasing memory. In the case of eWaterCycle, another important teardown task is destroying the Docker or Singularity container in which the model was running. This can free up a lot of resources on your system. Therefore it is good practice to always call finalize() when you’re done with an experiment.

18         In:
pcrglob.finalize()
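
One way to make sure finalize() always runs, even when a step in the experiment raises an error, is a try/finally block. The sketch uses a hypothetical FailingModel class (not part of ewatercycle) that only records whether it was finalized.

```python
class FailingModel:
    """Hypothetical stand-in that tracks whether cleanup happened."""

    def __init__(self):
        self.finalized = False

    def update(self):
        raise RuntimeError("simulated failure during a model step")

    def finalize(self):
        # In eWaterCycle this is where the container would be destroyed
        self.finalized = True


model = FailingModel()
error_message = None
try:
    try:
        model.update()
    finally:
        # Runs whether or not update() raised
        model.finalize()
except RuntimeError as err:
    error_message = str(err)

print(model.finalized, error_message)
```

The outer try/except is only there so the demo can report the error; in a notebook the inner try/finally is the relevant part.
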


Running Wflow using the ewatercycle system

This notebook shows how to run the Wflow model using an example use case. More information about data, configuration and installation instructions can be found in the System setup chapter in the eWaterCycle documentation.

1         In:
import logging
import warnings

warnings.filterwarnings("ignore", category=UserWarning)
logging.basicConfig(level=logging.WARN)
1         In:
import ewatercycle.forcing
import ewatercycle.models
import ewatercycle.parameter_sets
Setting up the model

The model needs a parameter set and forcing. The parameter set can be retrieved from the parameter sets available on the system, and the forcing can be derived from the parameter set.

2         In:
parameter_set = ewatercycle.parameter_sets.get_parameter_set("wflow_rhine_sbm_nc")
print(parameter_set)
Parameter set
-------------
name=wflow_rhine_sbm_nc
directory=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
config=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc/wflow_sbm_NC.ini
doi=N/A
target_model=wflow
supported_model_versions={'2020.1.1'}
3         In:
forcing = ewatercycle.forcing.load_foreign(
    directory=str(parameter_set.directory),
    target_model=parameter_set.target_model,
    start_time="1991-01-01T00:00:00Z",
    end_time="1991-12-31T00:00:00Z",
    forcing_info=dict(
        # Additional information about the external forcing data needed for the model configuration
        netcdfinput="inmaps.nc",
        Precipitation="/P",
        EvapoTranspiration="/PET",
        Temperature="/TEMP",
    ),
)
print(forcing)
Forcing data for Wflow
----------------------
Directory: /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
Start time: 1991-01-01T00:00:00Z
End time: 1991-12-31T00:00:00Z
Shapefile: None
Additional information for model config:
  - netcdfinput: inmaps.nc
  - Precipitation: /P
  - Temperature: /TEMP
  - EvapoTranspiration: /PET
  - Inflow: None

Pick a version of the Wflow model, so that the model code that is executed understands the parameter set and forcing.

4         In:
ewatercycle.models.Wflow.available_versions
4       Out:
('2020.1.1',)
5         In:
model = ewatercycle.models.Wflow(
    version="2020.1.1", parameter_set=parameter_set, forcing=forcing
)
WARNING:ewatercycle.models.wflow:Config file from parameter set is missing API section, adding section
WARNING:ewatercycle.models.wflow:Config file from parameter set is missing RiverRunoff option in API section, added it with value '2, m/s option'
19         In:
print(model)
eWaterCycle Wflow
-------------------
Version = 2020.1.1
Parameter set =
  Parameter set
  -------------
  name=wflow_rhine_sbm_nc
  directory=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
  config=/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc/wflow_sbm_NC.ini
  doi=N/A
  target_model=wflow
  supported_model_versions={'2020.1.1'}
Forcing =
  Forcing data for Wflow
  ----------------------
  Directory: /home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/parameter-sets/wflow_rhine_sbm_nc
  Start time: 1991-01-01T00:00:00Z
  End time: 1991-12-31T00:00:00Z
  Shapefile: None
  Additional information for model config:
    - netcdfinput: inmaps.nc
    - Precipitation: /P
    - Temperature: /TEMP
    - EvapoTranspiration: /PET
    - Inflow: None

The pre-configured parameters are shown below and can be overwritten with setup()

6         In:
model.parameters
6       Out:
[('start_time', '1991-01-01T00:00:00Z'), ('end_time', '1991-12-31T00:00:00Z')]
7         In:
cfg_file, cfg_dir = model.setup(end_time="1991-02-28T00:00:00Z")
8         In:
print(cfg_file)
print(cfg_dir)
/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/output/wflow_20210714_073455/wflow_ewatercycle.ini
/home/verhoes/git/eWaterCycle/ewatercycle/docs/examples/output/wflow_20210714_073455

The config file can be edited, but for now we will initialize the model with the config file as is.

9         In:
model.initialize(cfg_file)
Running the model
10         In:
while model.time < model.end_time:
    model.update()
    print(model.time_as_isostr)
1991-01-01T00:00:00Z
1991-01-02T00:00:00Z
1991-01-03T00:00:00Z
1991-01-04T00:00:00Z
1991-01-05T00:00:00Z
1991-01-06T00:00:00Z
1991-01-07T00:00:00Z
1991-01-08T00:00:00Z
1991-01-09T00:00:00Z
1991-01-10T00:00:00Z
1991-01-11T00:00:00Z
1991-01-12T00:00:00Z
1991-01-13T00:00:00Z
1991-01-14T00:00:00Z
1991-01-15T00:00:00Z
1991-01-16T00:00:00Z
1991-01-17T00:00:00Z
1991-01-18T00:00:00Z
1991-01-19T00:00:00Z
1991-01-20T00:00:00Z
1991-01-21T00:00:00Z
1991-01-22T00:00:00Z
1991-01-23T00:00:00Z
1991-01-24T00:00:00Z
1991-01-25T00:00:00Z
1991-01-26T00:00:00Z
1991-01-27T00:00:00Z
1991-01-28T00:00:00Z
1991-01-29T00:00:00Z
1991-01-30T00:00:00Z
1991-01-31T00:00:00Z
1991-02-01T00:00:00Z
1991-02-02T00:00:00Z
1991-02-03T00:00:00Z
1991-02-04T00:00:00Z
1991-02-05T00:00:00Z
1991-02-06T00:00:00Z
1991-02-07T00:00:00Z
1991-02-08T00:00:00Z
1991-02-09T00:00:00Z
1991-02-10T00:00:00Z
1991-02-11T00:00:00Z
1991-02-12T00:00:00Z
1991-02-13T00:00:00Z
1991-02-14T00:00:00Z
1991-02-15T00:00:00Z
1991-02-16T00:00:00Z
1991-02-17T00:00:00Z
1991-02-18T00:00:00Z
1991-02-19T00:00:00Z
1991-02-20T00:00:00Z
1991-02-21T00:00:00Z
1991-02-22T00:00:00Z
1991-02-23T00:00:00Z
1991-02-24T00:00:00Z
1991-02-25T00:00:00Z
1991-02-26T00:00:00Z
1991-02-27T00:00:00Z
1991-02-28T00:00:00Z
Inspect the results

The RiverRunoff values of the current model state can be fetched as an xarray DataArray.

14         In:
da = model.get_value_as_xarray("RiverRunoff")
da
14       Out:
<xarray.DataArray 'RiverRunoff' (latitude: 169, longitude: 187)>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
Coordinates:
  * longitude  (longitude) float64 5.227 5.264 5.3 5.337 ... 11.97 12.01 12.05
  * latitude   (latitude) float64 45.89 45.93 45.97 46.0 ... 51.98 52.02 52.05
    time       object 1991-02-28 00:00:00
Attributes:
    units:     m/s
18         In:
print(da)
<xarray.DataArray 'RiverRunoff' (latitude: 169, longitude: 187)>
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
Coordinates:
  * longitude  (longitude) float64 5.227 5.264 5.3 5.337 ... 11.97 12.01 12.05
  * latitude   (latitude) float64 45.89 45.93 45.97 46.0 ... 51.98 52.02 52.05
    time       object 1991-02-28 00:00:00
Attributes:
    units:     m/s
15         In:
qm = da.plot(robust=True, cmap="GnBu", figsize=(10, 8))

# Add some verification points
target_longitudes = [8.4, 10, 11]
target_latitudes = [50, 50.15, 49]
# Add some crosses to check that 'get_value_at_coords' works correctly below
qm.axes.scatter(target_longitudes, target_latitudes, s=250, c="r", marker="x", lw=2)
15       Out:
<matplotlib.collections.PathCollection at 0x7fae85f616d0>
_images/examples_wflow_21_1.png

Instead of getting the whole spatial grid, you can also get RiverRunoff values at some coordinates (red crosses in above plot).

15         In:
model.get_value_at_coords("RiverRunoff", lon=target_longitudes, lat=target_latitudes)
15       Out:
array([200.80599976,  44.62182236,   0.        ])

We are done with the model so let’s clean it up.

16         In:
model.finalize()


Brute force irrigation experiment

This example notebook shows how the eWaterCycle system can be used to quickly assess the impact of irrigation on river discharge. We will manually overwrite the soil moisture values in an experiment with the PCRGlobWB model, to mimic the effect of irrigation. Obviously, this is not a realistic scenario - the eWaterCycle developers are not accountable for any consequences of implementing a real irrigation system after this example.

1         In:
# This cell is only used to suppress some distracting output messages
import warnings

warnings.filterwarnings("ignore", category=UserWarning)
2         In:
import matplotlib.pyplot as plt
import pandas as pd
from cartopy import crs
from cartopy import feature as cfeature

import ewatercycle.models
import ewatercycle.parameter_sets

We will run 2 versions of the same model:

  1. A reference run with the default setup

  2. An irrigation experiment where we will overwrite soil moisture values

We will set up the models with identical parameters and settings. We will use a standard dataset with global parameters on 5 and 30 minutes resolution. The example parameter sets also include forcing data.

3         In:
merrimack_parameterset = ewatercycle.parameter_sets.ParameterSet(
    name="custom_parameter_set",
    directory="/mnt/data/examples/technical_paper/pcr-globwb/input",
    config="./pcrglobwb_merrimack.ini",
    target_model="pcrglobwb",
    doi="10.5281/zenodo.1045339",
    supported_model_versions={"setters"},
)

print(merrimack_parameterset)
Parameter set
-------------
name=custom_parameter_set
directory=/mnt/data/examples/technical_paper/pcr-globwb/input
config=/mnt/home/user37/ewatercycle/docs/examples/pcrglobwb_merrimack.ini
doi=10.5281/zenodo.1045339
target_model=pcrglobwb
supported_model_versions={'setters'}

We’ll track a grid cell near a GRDC station with the following coordinates:

4         In:
grdc_latitude = 42.6459
grdc_longitude = -71.2984
Reference experiment

For the purpose of illustration, we start by running the reference experiment. Then, in the irrigation experiment, we can focus on the differences with respect to the reference experiment.

5         In:
# Instantiate the model instance
reference = ewatercycle.models.PCRGlobWB(
    version="setters", parameter_set=merrimack_parameterset
)

# Create experiment folder, set up the model configuration,
# and start the container in which the model will run
reference_config, reference_dir = reference.setup()

Initialize the model inside the container. Depending on your system this may take a few minutes. Log messages will start to appear in the output directory.

6         In:
reference.initialize(reference_config)

Create an empty dataframe to store the modelled discharge

7         In:
time = pd.date_range(reference.start_time_as_isostr, reference.end_time_as_isostr)
timeseries = pd.DataFrame(
    index=pd.Index(time, name="time"), columns=["reference", "experiment"]
)
timeseries.head()
7       Out:
reference experiment
time
2002-01-01 00:00:00+00:00 NaN NaN
2002-01-02 00:00:00+00:00 NaN NaN
2002-01-03 00:00:00+00:00 NaN NaN
2002-01-04 00:00:00+00:00 NaN NaN
2002-01-05 00:00:00+00:00 NaN NaN
8         In:
while reference.time < reference.end_time:

    reference.update()

    # Track discharge at station location
    discharge_at_station = reference.get_value_at_coords(
        "discharge", lat=[grdc_latitude], lon=[grdc_longitude]
    )
    time = reference.time_as_isostr
    timeseries["reference"][time] = discharge_at_station[0]

    # Show progress
    print(time, end="\r")  # "\r" returns to the start of the line, so the next timestamp overwrites this one
2002-12-31T00:00:00Z
Intermediate insights

Before we continue with the experiment, let’s have a look at the intermediate results. First of all, notice that the reference column in our timeseries dataframe has been filled.

9         In:
timeseries.head()
9       Out:
reference experiment
time
2002-01-01 00:00:00+00:00 71.991348 NaN
2002-01-02 00:00:00+00:00 78.788757 NaN
2002-01-03 00:00:00+00:00 79.178329 NaN
2002-01-04 00:00:00+00:00 79.046112 NaN
2002-01-05 00:00:00+00:00 78.232491 NaN

We can also make a map of discharge at the last model step

10         In:
# Use matplotlib to make the figure slightly nicer
fig = plt.figure(dpi=120)
ax = fig.add_subplot(111, projection=crs.PlateCarree())

# Plotting the model field is a one-liner
reference.get_value_as_xarray("discharge").plot(ax=ax, cmap="GnBu")

# Also plot the station location
ax.scatter(grdc_longitude, grdc_latitude, s=25, c="r")

# Overlay ocean and coastlines
ax.add_feature(cfeature.OCEAN, zorder=2)
ax.add_feature(cfeature.RIVERS, zorder=2, color="k")
ax.coastlines(zorder=3)
10       Out:
<cartopy.mpl.feature_artist.FeatureArtist at 0x7fadd0837cd0>
_images/examples_Irrigation_18_1.png

You can see that the GRDC location indeed represents a cell that we would identify as a river.

We can also have a quick look at the discharge timeseries we have tracked, to see if it makes any sense.

11         In:
timeseries.plot()
11       Out:
<AxesSubplot:xlabel='time'>
_images/examples_Irrigation_21_1.png
Running the irrigation experiment

Before we initialize the experiment, let’s use the reference model to illustrate the concept of what we will do.

We will fetch the soil moisture field and overwrite a part of it so that the soil will be fully saturated.

12         In:
soil_moisture = reference.get_value_as_xarray("upper_soil_saturation_degree")

# Copy the field and manually overwrite a random part of the domain
irrigated_soil_moisture = soil_moisture.copy()
irrigated_soil_moisture[31:41, 18:28] = 1

Let’s visualize the difference

13         In:
fig = plt.figure(figsize=(10, 5), dpi=120)
left_axes = fig.add_subplot(121, projection=crs.PlateCarree())
right_axes = fig.add_subplot(122, projection=crs.PlateCarree())

soil_moisture.plot(ax=left_axes, cmap="GnBu", vmin=0.3, vmax=1)
irrigated_soil_moisture.plot(ax=right_axes, cmap="GnBu", vmin=0.3, vmax=1)

# Decoration
left_axes.set_title("Reference")
right_axes.set_title("Irrigated patch")

for axes in [left_axes, right_axes]:
    axes.add_feature(cfeature.OCEAN, zorder=2)
    axes.add_feature(cfeature.RIVERS, zorder=2, color="k")
    axes.coastlines(zorder=3)
_images/examples_Irrigation_25_0.png

From here on we will do exactly the same as before, except that we’ll add three extra lines to overwrite soil moisture at every time step.

14         In:
experiment = ewatercycle.models.PCRGlobWB(
    version="setters", parameter_set=merrimack_parameterset
)
experiment_config, experiment_dir = experiment.setup()
15         In:
experiment.initialize(experiment_config)
# this may take a few minutes, log messages will start to appear in the output directory.
16         In:
while experiment.time < experiment.end_time:

    experiment.update()

    # Overwrite soil moisture field
    soil_moisture = experiment.get_value_as_xarray(
        "upper_soil_saturation_degree",
    )
    soil_moisture[31:41, 18:28] = 1
    experiment.set_value("upper_soil_saturation_degree", soil_moisture.values.flatten())

    # Track discharge at station location
    discharge_at_station = experiment.get_value_at_coords(
        "discharge", lat=[grdc_latitude], lon=[grdc_longitude]
    )
    time = experiment.time_as_isostr
    timeseries["experiment"][time] = discharge_at_station[0]

    # Show progress
    print(time, end="\r")  # "\r" returns to the start of the line, so the next timestamp overwrites this one
2002-12-31T00:00:00Z
Final analysis
17         In:
fig, ax = plt.subplots(dpi=120)
timeseries.plot(ax=ax)
ax.set_title("Increased discharge due to irrigation")
17       Out:
Text(0.5, 1.0, 'Increased discharge due to irrigation')
_images/examples_Irrigation_31_1.png
Clean up

It is good practice to remove model instances once you’re done with an experiment. This will free up resources on the system.

18         In:
reference.finalize()
experiment.finalize()

Migrate from HPC to Cluster (Cartesius) guide

The HPC node jupyter.ewatercycle.org can be used for small test experiments. To do actual work you will need to run your notebook/script on the cluster (Cartesius). On Cartesius the forcing data is already present and many users can run jobs at the same time without interfering with each other.

Familiarize yourself with Linux by reading this simple guide:

Migration Preparation

1. Create Github repository

Start by creating a Github repository to store (only) your code by following these guides:

2. Create Conda environment.yml (not required)

For ease of transfer it can be helpful to create an environment.yml file. This file contains a list of all the packages you use for running your code. This is good practice because it allows users of your Github repository to quickly install the required packages.

3. Copy files from HPC to Cartesius

To copy files from the eWaterCycle HPC to Cartesius the following command example can be used:

  • scp -r {YourUserNameOnTheHPC}@jupyter.ewatercycle.org:/mnt/{YourUserNameOnTheHPC}/{PathToFolder}/ /home/{YourUserNameOnTheCartesius}/{PathToFolder}/

When prompted, enter your eWaterCycle HPC password.

Login to Cartesius

1. VPN Connection

Institutes that host cluster computers have a strict policy on which IP addresses are allowed to connect to the cluster (Cartesius). For this reason you first need to establish a VPN connection to your university or research institute, which has a whitelisted IP address.

2. MobaXterm

To connect to Cartesius, an SSH client is required. One such free client is MobaXterm, which can be downloaded here: https://mobaxterm.mobatek.net/.

  • After installation open the client and click on the session tab (top left), click on SSH, at remote host fill in “cartesius.surfsara.nl”, tick the specify username box, fill in your Cartesius username and click OK (bottom). Fill in the Cartesius password when prompted.

3. Login Node & Compute Node

Once you are logged in you are on the login node. This node should not be used to run scripts, as it is only a portal for communicating with the compute nodes running in the background (the actual computers). The compute nodes are where you will do the calculations. We communicate with compute nodes using Bash (.sh) scripts. This will be explained later.

4. Home Directory & Scratch Directory

When you login you are directed to your Home Directory:

  • /home/{YourUserNameOnTheCartesius}/

The Home Directory has slower disk speeds than the Scratch Directory. The Scratch Directory needs to be created using the following commands:

  • cd /scratch-shared/

  • mkdir {YourUserNameOnTheCartesius}

You can now access the Scratch Directory at /scratch-shared/{YourUserNameOnTheCartesius}/. Best practice is to modify your code so that it first copies all required files (excluding code) to the Scratch Directory, then runs the code, and after completion copies the output back to the Home Directory and cleans up the Scratch Directory.

First Run preparations

1. Clone Github repository

Clone Github repository containing scripts using:

  • git clone https://github.com/example_user/example_repo

2. Install MiniConda

Go to home directory:

  • cd /home/username/

Download MiniConda:

  • wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Install MiniConda:

  • bash Miniconda3-latest-Linux-x86_64.sh

Restart the connection with Cartesius and update Conda:

  • conda update conda

3. Create Conda environment

Create a Conda environment and install the required packages following the description:

https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands

Make sure that Jupyter Lab is installed in the Conda environment:

  • wget https://raw.githubusercontent.com/eWaterCycle/ewatercycle/main/environment.yml

  • conda install mamba -n base -c conda-forge -y

  • mamba env create --file environment.yml

  • conda activate ewatercycle

  • conda install -c conda-forge jupyterlab

Install the eWaterCycle package:

  • pip install ewatercycle

4. Create Singularity Container

On Cartesius, Docker requires root access and can therefore not be used. Singularity is similar to, and integrates well with, Docker. It also requires root access, but it is pre-installed on the compute nodes on Cartesius.

The first step to run the model on a compute node is thus to use singularity to create a Singularity image (.sif file) based on the Docker image. This is done with (note the srun command to access the compute node):

  • srun -N 1 -t 40 -p short singularity build --disable-cache ewatercycle-wflow-grpc4bmi.sif docker://ewatercycle/wflow-grpc4bmi:latest

This is an example for the wflow_sbm model. For other models, change the command to point to the correct Docker container:

  • docker://ewatercycle/{model}-grpc4bmi:{version}
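
The build command can also be assembled programmatically for a given model and version. The helper below is a sketch: build_command is a hypothetical name, the output file name pattern is copied from the Wflow example above, and the model and version you pass must correspond to a Docker image that actually exists.

```python
import shlex


def build_command(model, version="latest", minutes=40):
    """Return the srun/singularity command line as a single string."""
    image_file = f"ewatercycle-{model}-grpc4bmi.sif"
    docker_uri = f"docker://ewatercycle/{model}-grpc4bmi:{version}"
    args = [
        "srun", "-N", "1", "-t", str(minutes), "-p", "short",
        "singularity", "build", "--disable-cache", image_file, docker_uri,
    ]
    # shlex.join quotes arguments where needed (Python 3.8+)
    return shlex.join(args)


command = build_command("wflow")
print(command)
```

With the defaults, this reproduces the Wflow command shown above.
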

5. Adjust code to run Singularity container

Code should be adjusted to run Singularity instead of Docker following:

from grpc4bmi.bmi_client_singularity import BmiClientSingularity

model = BmiClientSingularity(image='ewatercycle-wflow-grpc4bmi.sif', input_dirs=[input_dir], work_dir=work_dir)
...

6. Adjust code to use Scratch directory

Before running the model copy the model instance to the scratch directory:

/scratch-shared/{YourUsernameOnTheCartesius}/

Run the model from this directory and copy the output back to the home directory:

/home/{YourUsernameOnTheCartesius}/

Cleanup files in the scratch directory.
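
The copy, run, copy back, and clean up sequence can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: run_on_scratch and fake_run are hypothetical names, the demo uses temporary directories in place of /home and /scratch-shared, and fake_run stands in for the actual model run.

```python
import shutil
import tempfile
from pathlib import Path


def run_on_scratch(input_dir, scratch_root, output_dir, run):
    """Copy inputs to scratch, run there, copy results back, clean up."""
    work_dir = Path(scratch_root) / "work"
    shutil.copytree(input_dir, work_dir)
    try:
        run(work_dir)  # the run writes its output into work_dir
        shutil.copytree(work_dir, output_dir, dirs_exist_ok=True)
    finally:
        # Always remove the scratch copy, even if the run failed
        shutil.rmtree(work_dir, ignore_errors=True)


def fake_run(work_dir):
    """Hypothetical stand-in for a model run."""
    text = (work_dir / "input.txt").read_text()
    (work_dir / "result.txt").write_text(text.upper())


# Demo with temporary directories instead of the real file systems
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    scratch = Path(tmp) / "scratch"
    (home / "inputs").mkdir(parents=True)
    scratch.mkdir()
    (home / "inputs" / "input.txt").write_text("discharge")
    run_on_scratch(home / "inputs", scratch, home / "outputs", fake_run)
    result = (home / "outputs" / "result.txt").read_text()
    scratch_is_clean = not (scratch / "work").exists()

print(result, scratch_is_clean)
```

Placing the cleanup in a finally block means the scratch copy is removed even when the run raises an error.
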

Submitting Jupyter Job on Cluster node

Here we briefly explain general SBATCH parameters and how to launch a Jupyter Lab environment on Cartesius. Start by opening a text editor on Cartesius (e.g. nano) or (easier) your local machine (e.g. notepad). Copy the following text inside your text editor, edit the Conda environment name, and save as run_jupyter_on_cartesius.sh (make sure the extension is .sh):

#!/bin/bash

# Serve a jupyter lab environment from a compute node on Cartesius
# usage: sbatch run_jupyter_on_cartesius.sh

# SLURM settings
#SBATCH -J jupyter_lab
#SBATCH -t 09:00:00
#SBATCH -N 1
#SBATCH -p normal
#SBATCH --output=slurm_%j.out
#SBATCH --error=slurm_%j.out

# Use an appropriate conda environment
. ~/miniconda3/etc/profile.d/conda.sh
conda activate {YourEnvironmentName}

# Some security: stop script on error and undefined variables
set -euo pipefail

# Specify (random) port to serve the notebook
port=8123
host=$(hostname -s)

# Print command to create ssh tunnel in log file
echo -e "

Command to create ssh tunnel (run from another terminal session on your local machine):
ssh -L ${port}:${host}:${port} $(whoami)@cartesius.surfsara.nl
Below, jupyter will print a number of addresses at which the notebook is served.
Due to the way the tunnel is set up, only the last option will work.
It's the one that looks like
http://127.0.0.1:${port}/?token=<long_access_token_very_important_to_copy_as_well>
Copy this address in your local browser and you're good to go

Starting notebooks server
**************************************************
"

# Start the jupyter lab session

jupyter lab --no-browser --port ${port} --ip=${host}

Explanation of SBATCH Parameters

  • #SBATCH -J jupyter_lab

Here you can set the job name.

  • #SBATCH -t 09:00:00

Here you specify the job runtime. On Cartesius we have a budget: each half hour of CPU runtime costs 1 point. A node consists of 24 cores, so the specified runtime (9 hours) costs 24*2*9 = 432 points on the budget.
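The budget arithmetic can be written as a small calculation. This sketch uses only the numbers stated above (24 cores per node, 1 point per half hour of CPU runtime); the function name `budget_cost` is hypothetical.

```python
CORES_PER_NODE = 24        # cores in one node, as stated above
POINTS_PER_CORE_HOUR = 2   # 1 point per half hour of CPU runtime


def budget_cost(hours: float, nodes: int = 1) -> float:
    """Budget points charged for reserving whole nodes for a number of hours."""
    return nodes * CORES_PER_NODE * POINTS_PER_CORE_HOUR * hours


cost = budget_cost(9)
print(cost)
```

Note that the cost depends on the reserved runtime and node count, so a shorter `-t` value directly reduces the points spent.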

  • #SBATCH -N 1

Specifies the number of nodes used by the run; keep at the default value of 1.

  • #SBATCH -p normal

Specifies the type of node; keep at the default value of “normal”.

  • #SBATCH --output=slurm_%j.out

Specifies the location and name of the job log file.

Specifying job runtime

A good practice for estimating job runtime is to first run the model for a short period, for example 1 year, and measure how long it takes. Multiply that time by the total number of years in your study, then add a time buffer of around 10-20 percent.

  • For example: 1 year takes 2 hours, total run is 10 years, 20 hours total, add time buffer, estimated runtime equals 22-24 hours.
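The same estimate can be computed in a few lines. This is a sketch with hypothetical helper names (`estimate_runtime`, `as_hms`); it reproduces the example above and formats the result as an HH:MM:SS string like the `-t` value in the script.

```python
from datetime import timedelta


def estimate_runtime(hours_per_year: float, n_years: int,
                     buffer_fraction: float) -> timedelta:
    """Scale a measured one-year runtime to the full study and add a buffer."""
    return timedelta(hours=hours_per_year * n_years * (1 + buffer_fraction))


def as_hms(duration: timedelta) -> str:
    """Format a duration as HH:MM:SS."""
    total_seconds = int(round(duration.total_seconds()))
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


low = estimate_runtime(2, 10, 0.10)   # 10 percent buffer
high = estimate_runtime(2, 10, 0.20)  # 20 percent buffer
print(as_hms(low), as_hms(high))
```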

Running the bash (.sh) script

Enter this command to run the bash script:

  • sbatch run_jupyter_on_cartesius.sh

(If you get DOS and UNIX linebreak errors, run the following command:)

  • dos2unix run_jupyter_on_cartesius.sh

Job control

To view which jobs are running you can enter:

  • squeue -u {YourUserNameOnTheCartesius}

To cancel a running job you can enter:

  • scancel {jobID}

More information on job control can be found here: https://userinfo.surfsara.nl/systems/lisa/user-guide/creating-and-running-jobs#interacting

Launching Jupyter Lab on Cluster Node

1. Open Slurm output log file

  • Open slurm output log file by double clicking in the file browser or by using a text editor (nano) and read the output carefully.

2. Create ssh tunnel between local machine and cluster

To create an SSH connection between your local machine and the cluster, open a command prompt on your local machine, for example PowerShell or cmd on Windows.

  • copy the line ssh -L ${port}:${host}:${port} $(whoami)@cartesius.surfsara.nl from the slurm log file (not the bash script) into the command prompt and run.

3. Connect through browser

  • Open a browser (e.g. Chrome) and go to the url: localhost:8123/lab

4. Enter the access token

  • Copy the access token from the slurm output log file and paste it in the browser in the access token or password field.

You have now successfully launched a Jupyter Lab environment on a cluster node.

Observations

The eWaterCycle platform supports observations relevant for calibrating and validating models. We currently support USGS and GRDC river discharge observations.

USGS

The U.S. Geological Survey Water Services provides public discharge data for a large number of US based stations. In eWaterCycle we make use of the USGS web service to automatically retrieve this data. The discharge timestamps are converted to the UTC timezone, and units are converted from cubic feet per second to cubic meters per second.
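The two conversions can be illustrated with the standard library. This is a sketch of the arithmetic only, not the package's implementation; the function names and the sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# 1 ft is exactly 0.3048 m, so 1 cubic foot is 0.3048**3 cubic meters.
CUBIC_FEET_TO_CUBIC_METERS = 0.3048 ** 3


def to_utc(timestamp: datetime) -> datetime:
    """Express a timezone-aware timestamp in UTC."""
    return timestamp.astimezone(timezone.utc)


def cfs_to_cms(discharge_cfs: float) -> float:
    """Convert discharge from cubic feet per second to cubic meters per second."""
    return discharge_cfs * CUBIC_FEET_TO_CUBIC_METERS


# Hypothetical observation: midnight at UTC-5 with a discharge of 100 ft3/s.
local = datetime(2000, 1, 4, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
utc = to_utc(local)
discharge = cfs_to_cms(100.0)
print(utc.isoformat(), round(discharge, 4))
```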

GRDC

The Global Runoff Data Centre provides discharge data for a large number of stations around the world. eWaterCycle supports GRDC data, but does not download it automatically: the data must already be present on the infrastructure where the eWaterCycle platform is deployed. By special permission from GRDC our own instance contains data from the ArcticHYCOS and GCOS/GTN-H, GTN-R projects.

ewatercycle package

Subpackages

ewatercycle.analysis package
ewatercycle.analysis.hydrograph(discharge: pandas.DataFrame, *, reference: str, precipitation: Optional[pandas.DataFrame] = None, dpi: Optional[int] = None, title: str = 'Hydrograph', discharge_units: str = 'm$^3$ s$^{-1}$', precipitation_units: str = 'mm day$^{-1}$', figsize: Tuple[float, float] = (10, 10), filename: Optional[Union[os.PathLike, str]] = None, **kwargs) Tuple[matplotlib.pyplot.Figure, Tuple[matplotlib.pyplot.Axes, matplotlib.pyplot.Axes]]

Plot a hydrograph.

This utility function makes it convenient to create a hydrograph from a set of discharge data from a pandas.DataFrame. A column must be marked as the reference, so that the agreement metrics can be calculated.

Optionally, the corresponding precipitation data can be plotted for comparison.

Parameters
  • discharge (pd.DataFrame) – Dataframe containing time series of discharge data to be plotted.

  • reference (str) – Name of the reference data, must correspond to a column in the discharge dataframe. Metrics are calculated between the reference column and each of the other columns.

  • precipitation (pd.DataFrame, optional) – Optional dataframe containing time series of precipitation data to be plotted from the top of the hydrograph.

  • dpi (int, optional) – DPI for the plot.

  • title (str, optional) – Title of the hydrograph.

  • discharge_units (str, optional) – Units for the discharge data.

  • precipitation_units (str, optional) – Units for the precipitation data.

  • figsize ((float, float), optional) – Width, height of the plot in inches.

  • filename (str or Path, optional) – If specified, a copy of the plot will be saved to this path.

  • **kwargs – Options to pass to the matplotlib plotting function

Returns

  • fig (matplotlib.figure.Figure)

  • ax, ax_tbl (tuple of matplotlib.axes.Axes)

ewatercycle.config package
Config

Configuration of eWaterCycle is done via the Config object. The global configuration can be imported from the eWaterCycle module as CFG:

>>> from ewatercycle import CFG
>>> CFG
Config({'container_engine': None,
        'grdc_location': None,
        'output_dir': None,
        'singularity_dir': None,
        'wflow.docker_image': None,
        'wflow.singularity_image': None})

By default all values are initialized as None.

CFG is essentially a python dictionary with a few extra functions, similar to matplotlib.rcParams. This means that values can be updated like this:

>>> CFG['output_dir'] = '~/output'
>>> CFG['output_dir']
PosixPath('/home/user/output')

Notice that CFG automatically converts the path to an instance of pathlib.Path and expands the home directory. All values entered into the config are validated to prevent mistakes. For example, it will warn you if you make a typo in the key:

>>> CFG['output_directory'] = '~/output'
InvalidConfigParameter: `output_directory` is not a valid config parameter.

Or, if the value entered cannot be converted to the expected type:

>>> CFG['output_dir'] = 123
InvalidConfigParameter: Key `output_dir`: Expected a path, but got 123
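To make the behaviour above concrete, here is a minimal sketch of a dictionary that validates keys and converts values on assignment. It is not the eWaterCycle implementation: `ValidatedDict`, `InvalidParameter`, and `validate_path` are hypothetical names, and only one key is registered.

```python
from pathlib import Path


class InvalidParameter(Exception):
    """Hypothetical error type used only in this sketch."""


def validate_path(value):
    """Convert a string or Path to a Path with the home directory expanded."""
    if not isinstance(value, (str, Path)):
        raise ValueError(f"Expected a path, but got {value!r}")
    return Path(value).expanduser()


class ValidatedDict(dict):
    """Dictionary that validates keys and converts values on assignment."""

    validators = {"output_dir": validate_path}

    def __setitem__(self, key, value):
        if key not in self.validators:
            raise InvalidParameter(f"`{key}` is not a valid config parameter.")
        try:
            converted = self.validators[key](value)
        except ValueError as err:
            raise InvalidParameter(f"Key `{key}`: {err}") from None
        super().__setitem__(key, converted)


cfg = ValidatedDict()
cfg["output_dir"] = "~/output"
print(cfg["output_dir"])

# Collect the messages raised for a misspelled key and a wrongly typed value.
messages = []
for bad_key, bad_value in [("output_directory", "~/output"), ("output_dir", 123)]:
    try:
        cfg[bad_key] = bad_value
    except InvalidParameter as err:
        messages.append(str(err))
print(messages)
```

Validating in `__setitem__` catches mistakes at the moment of assignment rather than later, when a model is set up.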

By default, the config is loaded from the default location (i.e. ~/.config/ewatercycle/ewatercycle.yaml). If it does not exist, it falls back to the default values. To load a different file:

>>> CFG.load_from_file('~/my-config.yml')

Or to reload the current config:

>>> CFG.reload()
ewatercycle.config.CFG

eWaterCycle configuration object.

The configuration is loaded from:

  1. ~/$XDG_CONFIG_HOME/ewatercycle/ewatercycle.yaml

  2. ~/.config/ewatercycle/ewatercycle.yaml

  3. /etc/ewatercycle.yaml

  4. Fall back to empty configuration
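The lookup order above amounts to "the first existing file wins, otherwise use an empty configuration". A minimal sketch of that pattern follows; the function `first_existing` is hypothetical, and temporary paths stand in for the real locations.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def first_existing(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate that is an existing file, or None."""
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # caller falls back to an empty configuration


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for a per-user location and a system-wide location.
    user_file = Path(tmp) / "user" / "ewatercycle.yaml"
    system_file = Path(tmp) / "etc" / "ewatercycle.yaml"
    system_file.parent.mkdir()
    system_file.write_text("output_dir: /scratch\n")

    found = first_existing([user_file, system_file])
    found_is_system = found == system_file
    nothing_found = first_existing([user_file])

print(found_is_system, nothing_found)
```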

The ewatercycle.yaml is formatted in YAML and could for example look like:

grdc_location: /data/grdc
container_engine: singularity
singularity_dir: /data/singularity-images
output_dir: /scratch
# Created with cd  /data/singularity-images &&
# singularity pull docker://ewatercycle/wflow-grpc4bmi:2020.1.1
wflow.singularity_image: wflow-grpc4bmi_2020.1.1.sif
wflow.docker_image: ewatercycle/wflow-grpc4bmi:2020.1.1
class ewatercycle.config.Config(*args, **kwargs)

Bases: ewatercycle.config._validated_config.ValidatedConfig

Configuration object.

Do not instantiate this class directly, but use ewatercycle.CFG instead.

load_from_file(filename: Union[os.PathLike, str]) None

Load user configuration from the given file.

reload() None

Reload the config file.

dump_to_yaml() str

Dumps YAML formatted string of Config object

save_to_file(config_file: Optional[Union[os.PathLike, str]] = None)

Write conf object to a file.

Parameters

config_file – File to write the configuration object to. If not given, the location in CFG[‘ewatercycle_config’] is used; if that is not set, a location in the user's home directory is used.

ewatercycle.forcing package
ewatercycle.forcing.load(directory: str)

Load previously generated or imported forcing data.

Parameters

directory – forcing data directory; must contain ewatercycle_forcing.yaml file

Returns: Forcing object

ewatercycle.forcing.load_foreign(target_model, start_time: str, end_time: str, directory: str = '.', shape: Optional[str] = None, forcing_info: Optional[Dict] = None)

Load existing forcing data generated from an external source.

Parameters
  • target_model – Name of the hydrological model for which the forcing will be used

  • start_time – Start time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

  • end_time – End time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

  • directory – forcing data directory

  • shape – Path to a shape file. Used for spatial selection.

  • forcing_info – Dictionary with model-specific information about forcing data. See below for the available options for each model.

Returns

Forcing object

Examples

For Marrmot

from ewatercycle.forcing import load_foreign

forcing = load_foreign('marrmot',
                       directory='/data/marrmot-forcings-case1',
                       start_time='1989-01-02T00:00:00Z',
                       end_time='1999-01-02T00:00:00Z',
                       forcing_info={
                           'forcing_file': 'marrmot-1989-1999.mat'
                       })

For LisFlood

from ewatercycle.forcing import load_foreign

forcing = load_foreign(target_model='lisflood',
                       directory='/data/lisflood-forcings-case1',
                       start_time='1989-01-02T00:00:00Z',
                       end_time='1999-01-02T00:00:00Z',
                       forcing_info={
                           'PrefixPrecipitation': 'tp.nc',
                           'PrefixTavg': 'ta.nc',
                           'PrefixE0': 'e.nc',
                           'PrefixES0': 'es.nc',
                           'PrefixET0': 'et.nc'
                       })

Model-specific forcing info:

Hype

None – Hype does not have model-specific load options.

Lisflood
  • PrefixPrecipitation – Path to a NetCDF or pcraster file with precipitation data

  • PrefixTavg – Path to a NetCDF or pcraster file with average temperature data

  • PrefixE0 – Path to a NetCDF or pcraster file with potential evaporation rate from open water surface data

  • PrefixES0 – Path to a NetCDF or pcraster file with potential evaporation rate from bare soil surface data

  • PrefixET0 – Path to a NetCDF or pcraster file with potential (reference) evapotranspiration rate data

Marrmot

forcing_file – Matlab file that contains forcings for Marrmot models. See the forcing file format in the model implementation.

Pcrglobwb
  • precipitationNC (str) – Input file for precipitation data.

  • temperatureNC (str) – Input file for temperature data.

Wflow
  • netcdfinput (str) – Path to forcing file.

  • Precipitation (str) – Variable name of precipitation data in input file.

  • EvapoTranspiration (str) – Variable name of evapotranspiration data in input file.

  • Temperature (str) – Variable name of temperature data in input file.

  • Inflow (str) – Variable name of inflow data in input file.

ewatercycle.forcing.generate(target_model: str, dataset: str, start_time: str, end_time: str, shape: str, model_specific_options: Optional[Dict] = None)

Generate forcing data with ESMValTool.

Parameters
  • target_model – Name of the model

  • dataset – Name of the source dataset. See datasets.

  • start_time – Start time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

  • end_time – End time of forcing in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

  • shape – Path to a shape file. Used for spatial selection.

  • model_specific_options – Dictionary with model-specific recipe settings. See below for the available options for each model.

Returns

Forcing object

Model-specific options that can be passed to generate:

Hype

None – Hype does not have model-specific generate options.

Lisflood
  • extract_region (dict) – Region specification, dictionary must contain start_longitude, end_longitude, start_latitude, end_latitude

  • run_lisvap (bool) – if lisvap should be run. Default is False. Running lisvap is not supported yet.

  • TODO: add regrid options so forcing can be generated for a parameter set that is not on a 0.1x0.1 grid

Marrmot

None – Marrmot does not have model-specific generate options.

Pcrglobwb
  • start_time_climatology (str) – Start time for the climatology data

  • end_time_climatology (str) – End time for the climatology data

  • extract_region (dict) – Region specification, dictionary must contain start_longitude, end_longitude, start_latitude, end_latitude

Wflow
  • dem_file (str) – Name of the dem_file to use. Also defines the basin param.

  • extract_region (dict) – Region specification, dictionary must contain start_longitude, end_longitude, start_latitude, end_latitude
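Several models above take an extract_region dictionary that must contain four keys. The sketch below builds such a dictionary and checks it before it would be passed as model_specific_options; the helper `missing_region_keys` and the coordinate values are hypothetical.

```python
REQUIRED_REGION_KEYS = (
    "start_longitude",
    "end_longitude",
    "start_latitude",
    "end_latitude",
)


def missing_region_keys(extract_region: dict) -> list:
    """List the required keys that are absent from an extract_region dict."""
    return [key for key in REQUIRED_REGION_KEYS if key not in extract_region]


# Hypothetical bounding box, for illustration only.
region = {
    "start_longitude": 4.1,
    "end_longitude": 8.0,
    "start_latitude": 46.3,
    "end_latitude": 52.2,
}
model_specific_options = {"extract_region": region}

print(missing_region_keys(region))
print(missing_region_keys({"start_longitude": 4.1}))
```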

Submodules
ewatercycle.forcing.datasets module

Supported datasets for ESMValTool recipes.

Currently supported: ERA5 and ERA-Interim.

ewatercycle.models package
Submodules
ewatercycle.models.abstract module
class ewatercycle.models.abstract.AbstractModel(version: str, parameter_set: Optional[ewatercycle.parameter_sets.default.ParameterSet] = None, forcing: Optional[ewatercycle.models.abstract.ForcingT] = None)

Bases: Generic[ewatercycle.models.abstract.ForcingT]

Abstract class of an eWaterCycle model.

available_versions: ClassVar[Tuple[str, ...]] = ()

Versions of model that are available in this class

bmi: basic_modeling_interface.Bmi

Basic Modeling Interface object

abstract setup(*args, **kwargs) Tuple[str, str]

Performs model setup.

  1. Creates config file and config directory

  2. Start bmi container and store as self.bmi

Parameters
  • *args – Positional arguments. Sub class should specify each arg.

  • **kwargs – Named arguments. Sub class should specify each arg.

Returns

Path to config file and path to config directory

initialize(config_file: str) None

Initialize the model.

Parameters

config_file – Name of initialization file.

finalize() None

Perform tear-down tasks for the model.

update() None

Advance model state by one time step.

get_value(name: str) numpy.ndarray

Get a copy of values of the given variable.

Parameters

name – Name of variable

get_value_at_coords(name, lat: Iterable[float], lon: Iterable[float]) numpy.ndarray

Get a copy of values of the given variable at lat/lon coordinates.

Parameters
  • name – Name of variable

  • lat – Latitudinal value

  • lon – Longitudinal value

set_value(name: str, value: numpy.ndarray) None

Specify a new value for a model variable.

Parameters
  • name – Name of variable

  • value – The new value for the specified variable.

set_value_at_coords(name: str, lat: Iterable[float], lon: Iterable[float], values: numpy.ndarray) None

Specify a new value for a model variable at lat/lon coordinates.

Parameters
  • name – Name of variable

  • lat – Latitudinal value

  • lon – Longitudinal value

  • values – The new value for the specified variable.

abstract get_value_as_xarray(name: str) xarray.DataArray

Get a copy of the values of the given variable as an xarray DataArray.

The xarray object also contains coordinate information and additional attributes such as the units.

Parameters

name – Name of the variable

abstract property parameters: Iterable[Tuple[str, Any]]

Default values for the setup() inputs

property start_time: float

Start time of the model.

property end_time: float

End time of the model.

property time: float

Current time of the model.

property time_units: str

Time units of the model. Formatted using UDUNITS standard from Unidata.

property time_step: float

Current time step of the model.

property output_var_names: Iterable[str]

List of a model’s output variables.

property start_time_as_isostr: str

Start time of the model.

In UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

property end_time_as_isostr: str

End time of the model.

In UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

property time_as_isostr: str

Current time of the model.

In UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

property start_time_as_datetime: datetime.datetime

Start time of the model as a datetime object.

property end_time_as_datetime: datetime.datetime

End time of the model as a datetime object.

property time_as_datetime: datetime.datetime

Current time of the model as a datetime object.
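The ISO string and datetime forms of the time properties carry the same information. As an illustration only, the following sketch converts between a 'YYYY-MM-DDTHH:MM:SSZ' string and a timezone-aware datetime using the standard library; `parse_isostr` and `to_isostr` are hypothetical helpers, not methods of the model classes.

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # matches 'YYYY-MM-DDTHH:MM:SSZ'


def parse_isostr(value: str) -> datetime:
    """Parse a UTC ISO string into a timezone-aware datetime."""
    return datetime.strptime(value, ISO_FORMAT).replace(tzinfo=timezone.utc)


def to_isostr(value: datetime) -> str:
    """Format a timezone-aware datetime as a UTC ISO string."""
    return value.astimezone(timezone.utc).strftime(ISO_FORMAT)


end_time = parse_isostr("2020-01-01T00:00:00Z")
round_trip = to_isostr(end_time)
print(end_time, round_trip)
```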

ewatercycle.models.lisflood module

eWaterCycle wrapper around Lisflood BMI.

class ewatercycle.models.lisflood.Lisflood(version: str, parameter_set: ewatercycle.parameter_sets.default.ParameterSet, forcing: ewatercycle.forcing._lisflood.LisfloodForcing)

Bases: ewatercycle.models.abstract.AbstractModel[ewatercycle.forcing._lisflood.LisfloodForcing]

eWaterCycle implementation of Lisflood hydrological model.

Parameters
  • version – pick a version for which a grpc4bmi docker image is available.

  • parameter_set – LISFLOOD input files. Any included forcing data will be ignored.

  • forcing – a LisfloodForcing object.

Example

See examples/lisflood.ipynb in ewatercycle repository

available_versions: ClassVar[Tuple[str, ...]] = ('20.10',)

Versions for which ewatercycle grpc4bmi docker images are available.

setup(IrrigationEfficiency: Optional[str] = None, start_time: Optional[str] = None, end_time: Optional[str] = None, MaskMap: Optional[str] = None, cfg_dir: Optional[str] = None) Tuple[str, str]

Configure model run.

  1. Creates config file and config directory based on the forcing variables and time range.

  2. Start bmi container and store as bmi

Parameters
  • IrrigationEfficiency – Field application irrigation efficiency. max 1, ~0.90 drip irrigation, ~0.75 sprinkling

  • start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing start time is used.

  • end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing end time is used.

  • MaskMap – Mask map to use instead of one supplied in parameter set. Path to a NetCDF or pcraster file with same dimensions as parameter set map files and a boolean variable.

  • cfg_dir – a run directory given by user or created for user.

Returns

Path to config file and path to config directory

get_value_as_xarray(name: str) xarray.DataArray

Return the value as xarray object.

property parameters: Iterable[Tuple[str, Any]]

List the parameters for this model.

finalize() None

Perform tear-down tasks for the model.

forcing: Optional[ForcingT]
bmi: Bmi

Basic Modeling Interface object

class ewatercycle.models.lisflood.XmlConfig(source)

Bases: ewatercycle.parametersetdb.config.AbstractConfig

Config container where config is read/saved in xml format.

config: xml.etree.ElementTree.Element

XML element used to make changes to the config

save(target)

Save xml to file.

Parameters

target – file to save to

ewatercycle.models.marrmot module

eWaterCycle wrapper around Marrmot BMI.

class ewatercycle.models.marrmot.Solver(name: str = 'createOdeApprox_IE', resnorm_tolerance: float = 0.1, resnorm_maxiter: float = 6.0)

Bases: object

Container for properties of the solver.

For current implementations see here.

name: str = 'createOdeApprox_IE'
resnorm_tolerance: float = 0.1
resnorm_maxiter: float = 6.0
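The three fields and their defaults behave like a simple record of settings. The sketch below mirrors them with a standard-library dataclass to show how a single setting can be overridden while the others keep their defaults; `SolverSketch` is a hypothetical stand-in, and whether the real Solver is implemented this way is an assumption.

```python
from dataclasses import asdict, dataclass


@dataclass
class SolverSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    name: str = "createOdeApprox_IE"
    resnorm_tolerance: float = 0.1
    resnorm_maxiter: float = 6.0


default_solver = SolverSketch()
# Override one setting; the other fields keep their default values.
strict_solver = SolverSketch(resnorm_tolerance=0.01)

print(asdict(default_solver))
print(asdict(strict_solver))
```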
class ewatercycle.models.marrmot.MarrmotM01(version: str, forcing: ewatercycle.forcing._marrmot.MarrmotForcing)

Bases: ewatercycle.models.abstract.AbstractModel[ewatercycle.forcing._marrmot.MarrmotForcing]

eWaterCycle implementation of Marrmot Collie River 1 (traditional bucket) model.

It sets the MarrmotM01 parameter to an initial value that is the mean of the range specified in the model parameter range file.

Parameters

  • version – pick a version for which an ewatercycle grpc4bmi docker image is available.

  • forcing – a MarrmotForcing object. If forcing file contains parameter and other settings, those are used and can be changed in setup().

Example

See examples/marrmotM01.ipynb in ewatercycle repository

model_name = 'm_01_collie1_1p_1s'

Name of model in Matlab code.

available_versions: ClassVar[Tuple[str, ...]] = ('2020.11',)

Versions for which ewatercycle grpc4bmi docker images are available.

setup(maximum_soil_moisture_storage: Optional[float] = None, initial_soil_moisture_storage: Optional[float] = None, start_time: Optional[str] = None, end_time: Optional[str] = None, solver: Optional[ewatercycle.models.marrmot.Solver] = None, cfg_dir: Optional[str] = None) Tuple[str, str]

Configure model run.

  1. Creates config file and config directory based on the forcing variables and time range

  2. Start bmi container and store as bmi

Parameters
  • maximum_soil_moisture_storage – in mm. Range is specified in model parameter range file.

  • initial_soil_moisture_storage – in mm.

  • start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing start time is used.

  • end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing end time is used.

  • solver – Solver settings

  • cfg_dir – a run directory given by user or created for user.

Returns

Path to config file and path to config directory

get_value_as_xarray(name: str) xarray.DataArray

Return the value as xarray object.

property parameters: Iterable[Tuple[str, Any]]

List the parameters for this model.

forcing: Optional[ForcingT]
bmi: Bmi

Basic Modeling Interface object

class ewatercycle.models.marrmot.MarrmotM14(version: str, forcing: ewatercycle.forcing._marrmot.MarrmotForcing)

Bases: ewatercycle.models.abstract.AbstractModel[ewatercycle.forcing._marrmot.MarrmotForcing]

eWaterCycle implementation of Marrmot Top Model hydrological model.

It sets each MarrmotM14 parameter to an initial value that is the mean of the range specified in the model parameter range file.

Parameters
  • version – pick a version for which an ewatercycle grpc4bmi docker image is available.

  • forcing – a MarrmotForcing object. If forcing file contains parameter and other settings, those are used and can be changed in setup().

Example

See examples/marrmotM14.ipynb in ewatercycle repository

model_name = 'm_14_topmodel_7p_2s'

Name of model in Matlab code.

available_versions: ClassVar[Tuple[str, ...]] = ('2020.11',)

Versions for which ewatercycle grpc4bmi docker images are available.

setup(maximum_soil_moisture_storage: Optional[float] = None, threshold_flow_generation_evap_change: Optional[float] = None, leakage_saturated_zone_flow_coefficient: Optional[float] = None, zero_deficit_base_flow_speed: Optional[float] = None, baseflow_coefficient: Optional[float] = None, gamma_distribution_chi_parameter: Optional[float] = None, gamma_distribution_phi_parameter: Optional[float] = None, initial_upper_zone_storage: Optional[float] = None, initial_saturated_zone_storage: Optional[float] = None, start_time: Optional[str] = None, end_time: Optional[str] = None, solver: Optional[ewatercycle.models.marrmot.Solver] = None, cfg_dir: Optional[str] = None) Tuple[str, str]

Configure model run.

  1. Creates config file and config directory based on the forcing variables and time range

  2. Start bmi container and store as bmi

Parameters
  • maximum_soil_moisture_storage – in mm. Range is specified in model parameter range file.

  • threshold_flow_generation_evap_change.

  • leakage_saturated_zone_flow_coefficient – in mm/d.

  • zero_deficit_base_flow_speed – in mm/d.

  • baseflow_coefficient – in mm-1.

  • gamma_distribution_chi_parameter.

  • gamma_distribution_phi_parameter.

  • initial_upper_zone_storage – in mm.

  • initial_saturated_zone_storage – in mm.

  • start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing start time is used.

  • end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’. If not given then forcing end time is used.

  • solver – Solver settings

  • cfg_dir – a run directory given by user or created for user.

Returns

Path to config file and path to config directory

forcing: Optional[ForcingT]
bmi: Bmi

Basic Modeling Interface object

get_value_as_xarray(name: str) xarray.DataArray

Return the value as xarray object.

property parameters: Iterable[Tuple[str, Any]]

List the parameters for this model.

ewatercycle.models.pcrglobwb module

eWaterCycle wrapper around PCRGlobWB BMI.

class ewatercycle.models.pcrglobwb.PCRGlobWB(version: str, parameter_set: ewatercycle.parameter_sets.default.ParameterSet, forcing: Optional[ewatercycle.forcing._pcrglobwb.PCRGlobWBForcing] = None)

Bases: ewatercycle.models.abstract.AbstractModel[ewatercycle.forcing._pcrglobwb.PCRGlobWBForcing]

eWaterCycle implementation of PCRGlobWB hydrological model.

Parameters
available_versions: ClassVar[Tuple[str, ...]] = ('setters',)

Versions of model that are available in this class

setup(cfg_dir: Optional[str] = None, **kwargs) Tuple[str, str]

Start model inside container and return config file and work dir.

Parameters
  • cfg_dir – a run directory given by user or created for user.

  • **kwargs – Use parameters to see the current values of the configurable options for this model.

Returns: Path to config file and work dir

get_value_as_xarray(name: str) xarray.DataArray

Return the value as xarray object.

property parameters: Iterable[Tuple[str, Any]]

List the configurable parameters for this model.

forcing: Optional[ForcingT]
bmi: Bmi

Basic Modeling Interface object

ewatercycle.models.wflow module

eWaterCycle wrapper around WFlow BMI.

class ewatercycle.models.wflow.Wflow(version: str, parameter_set: ewatercycle.parameter_sets.default.ParameterSet, forcing: Optional[ewatercycle.forcing._wflow.WflowForcing] = None)

Bases: ewatercycle.models.abstract.AbstractModel[ewatercycle.forcing._wflow.WflowForcing]

Create an instance of the Wflow model class.

Parameters
  • version – pick a version from available_versions

  • parameter_set – instance of ParameterSet.

  • forcing – instance of WflowForcing or None. If None, it is assumed that forcing is included with the parameter_set.

available_versions: ClassVar[Tuple[str, ...]] = ('2020.1.1',)

Show supported WFlow versions in eWaterCycle

setup(cfg_dir: Optional[str] = None, **kwargs) Tuple[str, str]

Start the model inside a container and return a valid config file.

Parameters
  • cfg_dir – a run directory given by user or created for user.

  • **kwargs (optional, dict) – see parameters for all configurable model parameters.

Returns

Path to config file and working directory

get_value_as_xarray(name: str) xarray.DataArray

Return the value as xarray object.

property parameters: Iterable[Tuple[str, Any]]

List the configurable parameters for this model.

forcing: Optional[ForcingT]
bmi: Bmi

Basic Modeling Interface object

ewatercycle.observation package
Submodules
ewatercycle.observation.grdc module

Global Runoff Data Centre module.

ewatercycle.observation.grdc.get_grdc_data(station_id: str, start_time: str, end_time: str, parameter: str = 'Q', data_home: Optional[str] = None, column: str = 'streamflow') Tuple[pandas.core.frame.DataFrame, Dict[str, Union[str, int, float]]]

Get river discharge data from Global Runoff Data Centre (GRDC).

Requires the GRDC daily data files in a local directory. The GRDC daily data files can be ordered at https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/riverdischarge_node.html

Parameters
  • station_id – The station id to get. The station id can be found in the catalogues at https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/212_prjctlgs/project_catalogue_node.html

  • start_time – Start time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

  • end_time – End time of model in UTC and ISO format string e.g. ‘YYYY-MM-DDTHH:MM:SSZ’.

  • parameter – optional. The parameter code to get, e.g. (‘Q’) discharge, cubic meters per second.

  • data_home – optional. The directory where the daily grdc data is located. If left out will use the grdc_location in the eWaterCycle configuration file.

  • column – optional. Name of column in dataframe. Default: “streamflow”.

Returns

grdc data in a dataframe and metadata.

Examples

from ewatercycle.observation.grdc import get_grdc_data

df, meta = get_grdc_data('6335020',
                        '2000-01-01T00:00Z',
                        '2001-01-01T00:00Z')
df.describe()
         streamflow
count   4382.000000
mean    2328.992469
std     1190.181058
min      881.000000
25%     1550.000000
50%     2000.000000
75%     2730.000000
max    11300.000000

meta
{'grdc_file_name': '/home/myusername/git/eWaterCycle/ewatercycle/6335020_Q_Day.Cmd.txt',
'id_from_grdc': 6335020,
'file_generation_date': '2019-03-27',
'river_name': 'RHINE RIVER',
'station_name': 'REES',
'country_code': 'DE',
'grdc_latitude_in_arc_degree': 51.756918,
'grdc_longitude_in_arc_degree': 6.395395,
'grdc_catchment_area_in_km2': 159300.0,
'altitude_masl': 8.0,
'dataSetContent': 'MEAN DAILY DISCHARGE (Q)',
'units': 'm³/s',
'time_series': '1814-11 - 2016-12',
'no_of_years': 203,
'last_update': '2018-05-24',
'nrMeasurements': 'NA',
'UserStartTime': '2000-01-01T00:00Z',
'UserEndTime': '2001-01-01T00:00Z',
'nrMissingData': 0}
ewatercycle.observation.usgs module
ewatercycle.observation.usgs.get_usgs_data(station_id, start_date, end_date, parameter='00060', cache_dir=None)

Get river discharge data from the USGS REST web service.

See U.S. Geological Survey Water Services (USGS)

Parameters
  • station_id (str) – The station id to get

  • start_date (str) – String for start date in the format: ‘YYYY-MM-dd’, e.g. ‘1980-01-01’

  • end_date (str) – String for end date in the format: ‘YYYY-MM-dd’, e.g. ‘2018-12-31’

  • parameter (str) – The parameter code to get, e.g. (‘00060’) discharge, cubic feet per second

  • cache_dir (str) – Directory where files retrieved from the web service are cached. If set to None then USGS_DATA_HOME env var will be used as cache directory.
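The cache_dir fallback described above can be sketched as follows. This is an illustration of the documented rule only; `resolve_cache_dir` is a hypothetical helper, and the example path is made up.

```python
import os
from typing import Optional


def resolve_cache_dir(cache_dir: Optional[str] = None) -> Optional[str]:
    """Use the explicit cache_dir, else the USGS_DATA_HOME environment variable."""
    if cache_dir is not None:
        return cache_dir
    return os.environ.get("USGS_DATA_HOME")


# Hypothetical cache location set through the environment.
os.environ["USGS_DATA_HOME"] = "/tmp/usgs-cache"

explicit = resolve_cache_dir(".")
from_env = resolve_cache_dir()
print(explicit, from_env)
```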

Examples

>>> from ewatercycle.observation.usgs import get_usgs_data
>>> data = get_usgs_data('03109500', '2000-01-01', '2000-12-31', cache_dir='.')
>>> data
    <xarray.Dataset>
    Dimensions:     (time: 8032)
    Coordinates:
      * time        (time) datetime64[ns] 2000-01-04T05:00:00 ... 2000-12-23T04:00:00
    Data variables:
        Streamflow  (time) float32 8.296758 10.420501 ... 10.647034 11.694747
    Attributes:
        title:      USGS Data from streamflow data
        station:    Little Beaver Creek near East Liverpool OH
        stationid:  03109500
        location:   (40.6758974, -80.5406244)
ewatercycle.parameter_sets package
ewatercycle.parameter_sets.available_parameter_sets(target_model: Optional[str] = None) Tuple[str, ...]

List available parameter sets on this machine.

Parameters

target_model – Filter parameter sets on a model name

Returns: Names of available parameter sets on current machine.

ewatercycle.parameter_sets.get_parameter_set(name: str) ewatercycle.parameter_sets.default.ParameterSet

Get parameter set object available on this machine so it can be used in a model.

Parameters

name – Name of parameter set

Returns: Parameter set object that can be used in an ewatercycle model constructor.

ewatercycle.parameter_sets.download_parameter_sets(zenodo_doi: str, target_model: str, config: str)
ewatercycle.parameter_sets.example_parameter_sets() → Dict[str, ewatercycle.parameter_sets._example.ExampleParameterSet]

Lists the available example parameter sets.

They can be downloaded with download_example_parameter_sets().

ewatercycle.parameter_sets.download_example_parameter_sets(skip_existing=True)

Downloads all of the example parameter sets and adds them to the config_file.

Downloads to parameterset_dir directory defined in ewatercycle.config.CFG.

Parameters

skip_existing – When True, skips any parameter set that already has a local directory. When False, raises a ValueError if a parameter set already exists.
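
The skip_existing behaviour can be pictured with a small decision function. This is only a sketch of the rule as documented above, not the package's implementation; the name needs_download is hypothetical.

```python
import tempfile
from pathlib import Path


def needs_download(target_dir: Path, skip_existing: bool = True) -> bool:
    """Decide whether a parameter set directory still has to be downloaded."""
    if not target_dir.exists():
        return True
    if skip_existing:
        # The directory is already there, so leave it alone.
        return False
    raise ValueError(f"Directory {target_dir} for parameter set already exists")


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp)
    fresh = existing / "new_set"
    # A missing directory is downloaded, an existing one is skipped.
    results = (needs_download(fresh), needs_download(existing))
```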

Submodules
ewatercycle.parameter_sets.default module
class ewatercycle.parameter_sets.default.ParameterSet(name: str, directory: str, config: str, doi='N/A', target_model='generic', supported_model_versions: Optional[Set[str]] = None)

Bases: object

Container object for parameter set options.

name

Name of parameter set

Type

str

directory

Location on disk where the files of the parameter set are stored. A relative path is interpreted relative to CFG[‘parameterset_dir’].

Type

Path

config

Model configuration file which uses files from directory. A relative path is interpreted relative to CFG[‘parameterset_dir’].

Type

Path

doi

Persistent identifier of the parameter set. For example, a DOI for a Zenodo record.

Type

str

target_model

Name of model that parameter set can work with

Type

str

supported_model_versions

Set of model versions that are supported by this parameter set. If not set, the parameter set is taken to support all versions of the model.

Type

Set[str]

property is_available: bool

Tests if the directory and config file are available on this machine.
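
To make the attribute semantics concrete, here is a minimal sketch of a container with the same fields. ParameterSetSketch, its base_dir field (standing in for CFG[‘parameterset_dir’]) and the supports helper are hypothetical names introduced for illustration; the real class may resolve paths and check versions differently.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Set


@dataclass
class ParameterSetSketch:
    """Hypothetical stand-in for ParameterSet, for illustration only."""

    name: str
    directory: Path
    config: Path
    base_dir: Path  # assumption: plays the role of CFG['parameterset_dir']
    supported_model_versions: Optional[Set[str]] = None

    def _resolve(self, path: Path) -> Path:
        # Relative paths are interpreted relative to base_dir.
        path = Path(path)
        return path if path.is_absolute() else self.base_dir / path

    @property
    def is_available(self) -> bool:
        """True when both the directory and the config file exist on disk."""
        return self._resolve(self.directory).is_dir() and self._resolve(self.config).is_file()

    def supports(self, version: str) -> bool:
        """An unset version set means every model version is supported."""
        if self.supported_model_versions is None:
            return True
        return version in self.supported_model_versions


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "example_set").mkdir()
    (base / "example_set" / "model.ini").write_text("[model]\n")
    present = ParameterSetSketch("example_set", Path("example_set"), Path("example_set/model.ini"), base)
    absent = ParameterSetSketch("missing_set", Path("missing_set"), Path("missing_set/model.ini"), base)
    availability = (present.is_available, absent.is_available)
```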

ewatercycle.parametersetdb package

Documentation about ewatercycle_parametersetdb

class ewatercycle.parametersetdb.ParameterSet(df: ewatercycle.parametersetdb.datafiles.AbstractCopier, cfg: ewatercycle.parametersetdb.config.AbstractConfig)

Bases: object

save_datafiles(target)

Saves datafiles to target directory

Parameters

target – Path of target directory

save_config(target)

Saves config file as target filename

Parameters

target – filename of config file

property config: Any

Configuration as dictionary.

To make changes to configuration before saving set the config keys and/or values.

Can be a nested dict.

ewatercycle.parametersetdb.build_from_urls(config_format, config_url, datafiles_format, datafiles_url) → ewatercycle.parametersetdb.ParameterSet

Construct ParameterSet based on urls

Parameters
  • config_format – Format of file found at config url

  • config_url – Url of config file

  • datafiles_format – Method to stage datafiles url

  • datafiles_url – Source url of datafiles

Submodules
ewatercycle.parametersetdb.config module
class ewatercycle.parametersetdb.config.CaseConfigParser(defaults=None, dict_type=<class 'collections.OrderedDict'>, allow_no_value=False, *, delimiters=('=', ':'), comment_prefixes=('#', ';'), inline_comment_prefixes=None, strict=True, empty_lines_in_values=True, default_section='DEFAULT', interpolation=<object object>, converters=<object object>)

Bases: configparser.ConfigParser

Case-sensitive config parser. See https://stackoverflow.com/questions/1611799/preserve-case-in-configparser

optionxform(optionstr)
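
The standard way to preserve case in configparser is to override optionxform so that it returns the option name unchanged. The sketch below shows that technique with a hypothetical class name; it is an illustration of the approach, not a copy of CaseConfigParser.

```python
from configparser import ConfigParser


class CaseSensitiveParser(ConfigParser):
    """Hypothetical example: keep option names exactly as written."""

    def optionxform(self, optionstr: str) -> str:
        # The default implementation lower-cases option names; return them unchanged.
        return optionstr


text = "[model]\nModelName = wflow\n"

preserving = CaseSensitiveParser()
preserving.read_string(text)

default = ConfigParser()
default.read_string(text)

preserved_keys = list(preserving["model"])  # option names as written in the file
lowered_keys = list(default["model"])  # option names lower-cased by the default parser
```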
ewatercycle.parametersetdb.config.fetch(url)

Fetches text of url

class ewatercycle.parametersetdb.config.AbstractConfig(source: str)

Bases: abc.ABC

config: Any

Dict like content of config

abstract save(target: str)
Parameters

target – File path to save config to


class ewatercycle.parametersetdb.config.IniConfig(source)

Bases: ewatercycle.parametersetdb.config.AbstractConfig

Config container where config is read/saved in ini format.

config: Any

Dict like content of config

save(target)
Parameters

target – File path to save config to


class ewatercycle.parametersetdb.config.YamlConfig(source)

Bases: ewatercycle.parametersetdb.config.AbstractConfig

Config container where config is read/saved in yaml format

yaml = <ruamel.yaml.main.YAML object>
config: Any

Dict like content of config

save(target)
Parameters

target – File path to save config to


ewatercycle.parametersetdb.datafiles module
class ewatercycle.parametersetdb.datafiles.AbstractCopier(source: str)

Bases: abc.ABC

abstract save(target: str)

Saves datafiles to target directory

Parameters

target – Directory where to save the datafiles


class ewatercycle.parametersetdb.datafiles.SubversionCopier(source: str)

Bases: ewatercycle.parametersetdb.datafiles.AbstractCopier

Uses subversion export to copy files from source to target

save(target)

Saves datafiles to target directory

Parameters

target – Directory where to save the datafiles


class ewatercycle.parametersetdb.datafiles.SymlinkCopier(source: str)

Bases: ewatercycle.parametersetdb.datafiles.AbstractCopier

Creates symlink from source to target

save(target)

Saves datafiles to target directory

Parameters

target – Directory where to save the datafiles

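
The copier classes share one interface: construct with a source, then call save(target). A minimal sketch of that pattern, using hypothetical class names and only the standard library, might look like this; the real classes may handle existing targets or errors differently.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class CopierSketch(ABC):
    """Hypothetical base class mirroring the AbstractCopier interface."""

    def __init__(self, source: str):
        self.source = source

    @abstractmethod
    def save(self, target: str) -> None:
        """Save datafiles to the target directory."""


class SymlinkSketch(CopierSketch):
    """Hypothetical copier that links instead of copying."""

    def save(self, target: str) -> None:
        # Make target a symbolic link pointing at the source directory.
        os.symlink(self.source, target)


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source")
    os.mkdir(source)
    target = os.path.join(tmp, "target")
    SymlinkSketch(source).save(target)
    is_link = os.path.islink(target)
    points_to = os.readlink(target)
```

Because every copier exposes the same save method, the calling code does not need to know whether the files are linked or exported.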

Submodules

ewatercycle.util module
ewatercycle.util.find_closest_point(grid_longitudes: Iterable[float], grid_latitudes: Iterable[float], point_longitude: float, point_latitude: float) → Tuple[int, int]

Find the grid cell closest to a point, based on geographical distance.

It uses the “spherical Earth projected to a plane” formula: https://en.wikipedia.org/wiki/Geographical_distance

Parameters
  • grid_longitudes – 1d array of model grid longitudes in degrees

  • grid_latitudes – 1d array of model grid latitudes in degrees

  • point_longitude – longitude in degrees of target coordinate

  • point_latitude – latitude in degrees of target coordinate

Returns

Tuple (idx_lon, idx_lat), where idx_lon is the index of the closest grid point in the original longitude array and idx_lat is the index of the closest grid point in the original latitude array.

Return type

Tuple[int, int]
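
The “spherical Earth projected to a plane” distance is D = R * sqrt(dlat^2 + (cos(mean_lat) * dlon)^2), with angles in radians. The following pure-Python sketch applies it to every combination of grid longitude and latitude and returns the indices of the nearest one. The function names and the Earth radius value are assumptions made for illustration; the package's own implementation may differ.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius in kilometres


def plane_distance(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Distance in km using the spherical-Earth-projected-to-a-plane formula."""
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    mean_lat = math.radians((lat1 + lat2) / 2)
    return EARTH_RADIUS_KM * math.sqrt(dlat**2 + (math.cos(mean_lat) * dlon) ** 2)


def closest_point_sketch(
    grid_longitudes: Iterable[float],
    grid_latitudes: Iterable[float],
    point_longitude: float,
    point_latitude: float,
) -> Tuple[int, int]:
    """Return (idx_lon, idx_lat) of the grid cell nearest to the point."""
    lons = list(grid_longitudes)
    lats = list(grid_latitudes)
    candidates = ((i, j) for i in range(len(lons)) for j in range(len(lats)))
    return min(
        candidates,
        key=lambda ij: plane_distance(lons[ij[0]], lats[ij[1]], point_longitude, point_latitude),
    )


idx_lon, idx_lat = closest_point_sketch([4.0, 5.0, 6.0], [51.0, 52.0, 53.0], 5.2, 52.3)
```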

ewatercycle.util.get_time(time_iso: str) → datetime.datetime

Return a datetime in UTC.

Convert a date string in ISO format to a datetime and check if it is in UTC.
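
A rough standard-library equivalent of this check is sketched below. The name get_time_sketch is hypothetical, and rejecting strings without any timezone information is an assumption; the package may parse ISO strings with a different library and apply a different UTC test.

```python
from datetime import datetime, timedelta, timezone


def get_time_sketch(time_iso: str) -> datetime:
    """Parse an ISO 8601 string and insist that it is expressed in UTC."""
    # datetime.fromisoformat only accepts a trailing 'Z' from Python 3.11 on,
    # so rewrite it as an explicit zero offset first.
    normalized = time_iso[:-1] + "+00:00" if time_iso.endswith("Z") else time_iso
    time = datetime.fromisoformat(normalized)
    # Assumption: naive datetimes (utcoffset() is None) are rejected as well.
    if time.utcoffset() != timedelta(0):
        raise ValueError(f"The time is not in UTC: {time_iso}")
    return time


example = get_time_sketch("2020-01-01T00:00:00Z")  # datetime(2020, 1, 1, tzinfo=timezone.utc)
```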

ewatercycle.util.get_extents(shapefile: Any, pad=0) → Dict[str, float]

Get lat/lon extents from shapefile and add padding.

Parameters
  • shapefile – Path to shapefile

  • pad – Optional padding

Returns

Dict with start_longitude, start_latitude, end_longitude, end_latitude
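
The shape of the returned dictionary can be illustrated without reading a shapefile. The sketch below starts from a bounding box given as (min_lon, min_lat, max_lon, max_lat) and assumes that padding widens the box on every side; both the function name extents_from_bounds and that padding rule are assumptions, and the real function may also round or otherwise adjust the values.

```python
from typing import Dict, Tuple


def extents_from_bounds(
    bounds: Tuple[float, float, float, float], pad: float = 0
) -> Dict[str, float]:
    """Turn a (min_lon, min_lat, max_lon, max_lat) box into an extents dict."""
    min_lon, min_lat, max_lon, max_lat = bounds
    return {
        # Assumption: padding is subtracted from the start and added to the end.
        "start_longitude": min_lon - pad,
        "start_latitude": min_lat - pad,
        "end_longitude": max_lon + pad,
        "end_latitude": max_lat + pad,
    }


extents = extents_from_bounds((4.0, 50.0, 6.0, 53.0), pad=0.5)
```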

ewatercycle.util.data_files_from_recipe_output(recipe_output: esmvalcore.experimental.recipe_output.RecipeOutput) → Tuple[str, Dict[str, str]]

Get data files from an ESMValTool recipe output.

Expects the first diagnostic task to produce files with a single variable each.

Parameters

recipe_output – ESMValTool recipe output

Returns

Tuple with directory of files and a dict where key is cmor short name and value is relative path to NetCDF file

ewatercycle.util.to_absolute_path(input_path: str, parent: Optional[pathlib.Path] = None, must_exist: bool = False, must_be_in_parent=True) → pathlib.Path

Parse input string as pathlib.Path object.

Parameters
  • input_path – Input string path that can be a relative or absolute path.

  • parent – Optional parent path of the input path

  • must_exist – Optional argument to check if the input path exists.

  • must_be_in_parent – Optional argument to check if the input path is subpath of parent path

Returns

The input path that is an absolute path and a pathlib.Path object.
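
How the four arguments interact can be sketched with pathlib alone. The function below is a hypothetical re-implementation for illustration (note the _sketch suffix); in particular, the exception types and the order of the checks are assumptions rather than documented behaviour.

```python
import tempfile
from pathlib import Path
from typing import Optional


def to_absolute_path_sketch(
    input_path: str,
    parent: Optional[Path] = None,
    must_exist: bool = False,
    must_be_in_parent: bool = True,
) -> Path:
    """Resolve input_path to an absolute Path, optionally anchored in parent."""
    path = Path(input_path).expanduser()
    if parent is not None and not path.is_absolute():
        path = Path(parent) / path
    # With strict=True, resolve raises FileNotFoundError for a missing path.
    path = path.resolve(strict=must_exist)
    if parent is not None and must_be_in_parent:
        try:
            path.relative_to(Path(parent).resolve())
        except ValueError:
            raise ValueError(f"{path} is not a subpath of {parent}") from None
    return path


with tempfile.TemporaryDirectory() as tmp:
    inside = to_absolute_path_sketch("data/model.ini", parent=Path(tmp))
    inside_ok = inside == Path(tmp).resolve() / "data" / "model.ini"
    try:
        to_absolute_path_sketch("../outside.ini", parent=Path(tmp))
        escaped = True
    except ValueError:
        escaped = False
```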

ewatercycle.version module