Getting Started
Floodwater uses Anaconda as its environment manager. This allows you to keep multiple installations on one system without them interfering with each other.
Installation
The installation of Floodwater uses a single Python script to manage the environment
creation within Anaconda. The script is called install.py and is located in the root directory of the repository.
Pre-requisites
The following are pre-requisites for installing Floodwater:
- A system with a Linux distribution (e.g. Ubuntu, Debian, Rocky, etc.)
- Anaconda (https://www.anaconda.com/distribution/)
- A job manager (e.g. SLURM, PBS, etc.)
- Access to a MetGet instance.
Note
If you do not have access to an instance, you can contact zcobell@thewaterinstitute.org or begin running your own instance by following the instructions on the metget-server GitHub page.
If your system is based on ARM processors, you will need to build the ecFlow package from source. This is recommended only for experienced users. We have shown that the system works well on both ARM and x86_64 architectures; however, ECMWF only provides pre-built binaries for x86_64.
Installation Steps
The first step to installing Floodwater is to clone the repository:
git clone https://github.com/waterinstitute/floodwater.git
This will create a directory called floodwater in the current working directory.
You should then change into the floodwater directory:
cd floodwater
The installation is performed with the install.py script. The script allows the user
to enable or disable certain features of Floodwater. For example, you may only want
to have the code of Floodwater installed on your local system so that you can run
the GUI.
Full Installation
python install.py --conda-name my-floodwater-env
Warning
For all the good things about Anaconda, the default solver is extraordinarily slow. We highly recommend using the Mamba solver. This can be done by following the instructions here.
Partial Installation
A common use case is to install the full package on a remote system and then install only the essential packages on the local system. The local system can then access the remote system over SSH, which gives a much more responsive experience.
python install.py --conda-name my-floodwater-env --minimal
Warning
It is also possible to forward an X11 connection over SSH so that the GUI can be used remotely. However, this will be slow, particularly if the remote system is not on the same network.
Note
Developers will need to re-install the Floodwater package (not the full Anaconda environment)
if they make changes to the code. This can be done by running pip install . from the
root directory. This helps to maintain environment isolation during development and allows the
user to maintain multiple active Floodwater environments.
Checking the Installation
To confirm that the installation was successful, you should activate the newly created conda environment:
conda activate my-floodwater-env
You can then check that the Floodwater package is installed by running:
floodwater --help
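As a complement to the manual check above, the following Python sketch tests whether a floodwater executable is visible on the current PATH. The helper name cli_available is hypothetical and not part of Floodwater. It only assumes that activating the environment places the floodwater command on the PATH.

```python
import shutil


def cli_available(name: str = "floodwater") -> bool:
    """Return True if an executable called `name` is found on the PATH.

    Hypothetical helper for illustration; not part of Floodwater.
    """
    return shutil.which(name) is not None


# Report the result for the default command name.
if cli_available():
    print("floodwater found on PATH")
else:
    print("floodwater not found; is the conda environment activated?")
```

If the command is not found, the most likely cause is that the conda environment has not been activated in the current shell.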
ecFlow Server Setup
Systems based on the ecFlow workflow manager require a running server daemon. The server is responsible for managing the workflows and the jobs that are submitted. Note that many jobs within the system that are not submitted to the job scheduler (e.g. SLURM or PBS) run directly on the machine hosting the daemon. Users on shared systems should therefore be considerate and work with system administrators if there are too many concurrent processes on the system. In the past, we have worked with system administrators on large NSF-funded resources to provide a virtual machine dedicated to running the server daemon.
Environment Variables
To avoid manually specifying the server host and port number during each Floodwater command, we recommend setting the following environment variables:
export ECF_HOST=hostname
export ECF_PORT=port_number
The ECF_HOST variable should be set to the hostname of the system which is running the
server daemon. The ECF_PORT variable should be set to the port number on which the server listens.
Note that the ECF_HOST variable should be a fully qualified domain name (FQDN) which
can be resolved by the client system.
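The two requirements on ECF_HOST (fully qualified, and resolvable from the client) can be checked before running any Floodwater command. The sketch below is one way to do that in Python. The function names are hypothetical, and the "contains at least two dot-separated labels" test is only a heuristic for a fully qualified name, not a formal validation.

```python
import os
import socket


def looks_fully_qualified(host: str) -> bool:
    """Heuristic: at least two non-empty labels separated by dots."""
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def can_resolve(host: str) -> bool:
    """Return True if the local resolver can look up `host`."""
    try:
        socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    return True


# Check the value currently exported in the environment, if any.
host = os.environ.get("ECF_HOST")
if host is None:
    print("ECF_HOST is not set")
elif not looks_fully_qualified(host):
    print(f"ECF_HOST={host} does not look like an FQDN")
elif not can_resolve(host):
    print(f"ECF_HOST={host} cannot be resolved from this machine")
else:
    print(f"ECF_HOST={host} looks usable")
```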
Starting the Server
The server can be started by running the following command:
floodwater server start [--port xxxx]
The port number is optional. If it is not supplied on the command line, the system will search for an
environment variable named ECF_PORT. If this variable is not set, you will receive an error.
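The lookup order described above (command-line value first, then ECF_PORT, otherwise an error) can be sketched in a few lines of Python. This illustrates the precedence only and is not Floodwater's actual implementation. The function name resolve_port and the exception type are assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_port(cli_port: Optional[int],
                 env: Mapping[str, str] = os.environ) -> int:
    """Pick the server port: explicit value first, then ECF_PORT, else fail."""
    if cli_port is not None:
        return cli_port
    value = env.get("ECF_PORT")
    if value is None:
        raise RuntimeError("No port given and ECF_PORT is not set")
    return int(value)


# An explicit port wins over the environment variable.
print(resolve_port(3121, {"ECF_PORT": "4121"}))  # prints 3121
# Without an explicit port, ECF_PORT is used.
print(resolve_port(None, {"ECF_PORT": "4121"}))  # prints 4121
```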
You can check that the server has become active by running:
floodwater ping
This will ping the server and return the time it took to respond. This command uses the environment variables above to establish the connection to the server.
The system will return something that looks like the following:
2023-10-13T08:56:07CDT :: INFO :: floodwater_cli_funcs.py :: floodwater_ping_ecflow :: Success! Ping ecFlow server (myhost.local:3121) succeeded in ~4 ms
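If you want to monitor the server from a script, the host, port, and response time can be extracted from a line like the one above. The Python sketch below assumes the log format matches that single example. The format may differ between Floodwater versions, and the function name parse_ping is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches "(host:port) succeeded in ~N ms" as seen in the example line.
PING_PATTERN = re.compile(
    r"\((?P<host>[^():]+):(?P<port>\d+)\)\s+succeeded in ~(?P<ms>\d+(?:\.\d+)?) ms"
)


def parse_ping(line: str) -> Optional[Tuple[str, int, float]]:
    """Return (host, port, milliseconds), or None if the line does not match."""
    match = PING_PATTERN.search(line)
    if match is None:
        return None
    return match["host"], int(match["port"]), float(match["ms"])


sample = (
    "2023-10-13T08:56:07CDT :: INFO :: floodwater_cli_funcs.py :: "
    "floodwater_ping_ecflow :: Success! Ping ecFlow server "
    "(myhost.local:3121) succeeded in ~4 ms"
)
print(parse_ping(sample))  # prints ('myhost.local', 3121, 4.0)
```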
Stopping the Server
The server can be stopped by running the following command:
floodwater server stop
Running Your First Job
The system comes packaged with a simple example which runs the ec95d ADCIRC model. This is a very coarse model and should only be used for testing.
The example files are contained within: floodwater/examples/ec95d
You will need to make minor modifications to these files, however, they are designed to be as generic as possible for the purposes of getting started.
Make sure an ecFlow server instance is started on your system. Then, the first step is to copy the example files to a new directory:
mkdir my_first_job
cp -r floodwater/examples/ec95d/* my_first_job
Then, modify the yaml and .h files as necessary to fit your system's specifications.
Then, to load the job, you can run:
floodwater load adcirc_gfs.yaml
This will load the job into the server. You can check that the job has been loaded by running:
floodwater status
Then, to start the job, you can run:
floodwater start adcirc_gfs
Note that you can load and start the job in a single command by running:
floodwater run adcirc_gfs.yaml
The job will appear in the GUI and you can monitor the progress of the job using the GUI or the status command on the command line.
Tips and Tricks
The ecFlow client can show running instances from a remote system. The best performance is usually achieved by running the GUI on your local machine and allowing the client to connect to the remote server over SSH. This is only possible for Mac and Linux systems. For Windows, you will need to use X11 forwarding to view the GUI. This can be done using MobaXterm or PuTTY coupled with an X11 server such as Xming. Note that you will need to be on the same network (virtual or physical) as the remote system, which may require running a VPN client.
Although it is not strictly necessary, the simplest way to use a local ecFlow GUI is to install the Anaconda package for Floodwater and use the command:
floodwater gui
Note
The GUI is not required to run Floodwater. It is only used for monitoring the status of the jobs.
Sometimes, it is necessary to connect to a machine which is behind a load balancer or distribution framework of some kind. This is often the case with HPC systems which have offered to run a virtual machine to manage our ecFlow server instances. The login node of the HPC system will be accessible from the outside world; however, the virtual machine will only be accessible from the login node. There are some tricks that can be done using SSH to forward the connection to the virtual machine.
To open an SSH tunnel to the virtual machine, you can run the following command:
ssh -J username@login.hpc.edu username@vmname -C -N -L 4121:vmname:3121
This will forward port 4121 on your local machine to port 3121 on the virtual machine. You can
then use the GUI to connect to the server by setting the host to localhost and the port to 4121.
Note that by forwarding to different ports, in this case 4121, you can connect to multiple servers at the same time.
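When tunnelling to several servers, it can help to generate the ssh commands from a small table of local ports. The Python sketch below builds the same argument list as the command shown earlier. The function name tunnel_command is hypothetical, and the second virtual machine name (othervm) is a made-up placeholder used only to illustrate a second tunnel.

```python
import shlex
from typing import List


def tunnel_command(jump: str, user: str, vm: str,
                   local_port: int, remote_port: int = 3121) -> List[str]:
    """Build an ssh command forwarding local_port to vm:remote_port via a jump host."""
    return [
        "ssh", "-J", jump, f"{user}@{vm}",
        "-C", "-N",
        "-L", f"{local_port}:{vm}:{remote_port}",
    ]


# One distinct local port per server; "othervm" is a placeholder name.
servers = {"vmname": 4121, "othervm": 4122}
for vm, port in servers.items():
    cmd = tunnel_command("username@login.hpc.edu", "username", vm, port)
    print(shlex.join(cmd))
```

Each printed command can be run in its own terminal. The GUI then reaches each server at localhost on the corresponding local port.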
Similar techniques can be used for servers that require an SSH key to connect. You can similarly open an SSH tunnel to the server and then connect to it using the GUI.
ssh -i /path/to/key.pem -N -L 3122:login.hpc.edu:3121 username@login.hpc.edu