===============
Getting Started
===============

Floodwater uses Anaconda as its environment manager. This allows multiple installations to coexist on one system without interfering with each other.

************
Installation
************

Floodwater is installed with a single Python script that manages the creation of the Anaconda environment. The script is called :code:`install.py` and is located in the root directory of the repository.

Pre-requisites
==============

The following are pre-requisites for installing Floodwater:

1. A system with a Linux distribution (e.g. Ubuntu, Debian, Rocky, etc.)
2. Anaconda (https://www.anaconda.com/distribution/)
3. Git (https://git-scm.com/downloads)
4. A job manager (e.g. SLURM, PBS, etc.)
5. Access to a MetGet instance.

.. note::
    If you do not have access to an instance, you can contact zcobell@thewaterinstitute.org or begin running your own instance by following the instructions on the metget-server GitHub page.

If your system is based on ARM processors, you will need to build the ecFlow package from source. This is recommended only for experienced users. We have found that the system works well on both ARM and x86_64 architectures; however, ECMWF only provides pre-built binaries for x86_64.

Installation Steps
==================

The first step in installing Floodwater is to clone the repository:

.. code-block:: bash

    git clone https://github.com/waterinstitute/floodwater.git

This will create a directory called :code:`floodwater` in the current working directory. You should then change into the :code:`floodwater` directory:

.. code-block:: bash

    cd floodwater

The installation is driven by the :code:`install.py` script, which allows you to enable or disable certain features of Floodwater. For example, you may only want the Floodwater code installed on your local system so that you can run the GUI.

Full Installation
-----------------

.. code-block:: bash

    python install.py --conda-name my-floodwater-env

.. warning::
    For all the good things about Anaconda, the default solver is extraordinarily slow. We highly recommend using the Mamba solver. This can be done by following the Mamba solver installation instructions.

Partial Installation
--------------------

A common use case is to install the full package on a remote system and only the essential packages on the local system. The local system can then access the remote system over SSH, which gives a much more responsive experience.

.. code-block:: bash

    python install.py --conda-name my-floodwater-env --minimal

.. warning::
    It is also possible to forward an X11 connection over SSH so that the GUI can be used remotely; however, this will be slow, particularly if the remote system is not on the same network.

.. note::
    Developers will need to re-install the Floodwater package (not the full Anaconda environment) if they make changes to the code. This can be done by running :code:`pip install .` from the root directory. This helps to maintain environment isolation during development and allows the user to maintain multiple active Floodwater environments.

Checking the Installation
-------------------------

To confirm that the installation was successful, activate the newly created conda environment:

.. code-block:: bash

    conda activate my-floodwater-env

You can then check that the Floodwater package is installed by running:

.. code-block:: bash

    floodwater --help

*******************
ecFlow Server Setup
*******************

Systems based on the ecFlow workflow manager require a running server daemon. The server is responsible for managing the workflows and the jobs that are submitted. Note that many of the jobs within the system which are not submitted to the job scheduler (i.e. SLURM or PBS) are run directly on the machine which is running the daemon.
This means that users should be considerate on shared systems and work with system administrators in the event that there are too many concurrent processes on the system. In the past, we have worked with system administrators on large NSF-funded resources to provide a virtual machine which is dedicated to running the server daemon.

Environment Variables
=====================

To avoid manually specifying the server host and port number with each Floodwater command, we recommend setting the following environment variables:

.. code-block:: bash

    export ECF_HOST=hostname
    export ECF_PORT=port_number

The :code:`ECF_HOST` variable should be set to the hostname of the system which is running the server daemon. The :code:`ECF_PORT` variable should be set to the port number on which the server listens. Note that the :code:`ECF_HOST` variable should be a fully qualified domain name (FQDN) which can be resolved by the client system.

Starting the Server
===================

The server can be started by running the following command:

.. code-block:: bash

    floodwater server start [--port xxxx]

The port number is optional. If it is not supplied on the command line, the system will look for an environment variable named :code:`ECF_PORT`. If this variable is not set, you will receive an error.

You can check that the server has become active by running:

.. code-block:: bash

    floodwater ping

This will ping the server and return the time it took to respond. This command uses the environment variables above to establish the connection to the server. The system will return something that looks like the following:

.. code-block:: bash

    2023-10-13T08:56:07CDT :: INFO :: floodwater_cli_funcs.py :: floodwater_ping_ecflow :: Success! Ping ecFlow server (myhost.local:3121) succeeded in ~4 ms

Stopping the Server
===================

The server can be stopped by running the following command:

.. code-block:: bash

    floodwater server stop

**********************
Running Your First Job
**********************

The system comes packaged with a simple example which runs the ec95d ADCIRC model. This is a very coarse model and should only be used for testing. The example files are contained within: :code:`floodwater/examples/ec95d`

You will need to make minor modifications to these files; however, they are designed to be as generic as possible for the purposes of getting started.

Make sure an ecFlow server instance is running on your system. Then, the first step is to copy the example files to a new directory:

.. code-block:: bash

    mkdir my_first_job
    cp -r floodwater/examples/ec95d/* my_first_job

Next, modify the yaml and ``.h`` files as necessary to fit your system's specification. To load the job, you can run:

.. code-block:: bash

    floodwater load adcirc_gfs.yaml

This will load the job into the server. You can check that the job has been loaded by running:

.. code-block:: bash

    floodwater status

Then, to start the job, you can run:

.. code-block:: bash

    floodwater start adcirc_gfs

Note that you can load and start the job in a single command by running:

.. code-block:: bash

    floodwater run adcirc_gfs.yaml

The job will appear in the GUI, and you can monitor its progress using the GUI or the status command on the command line.

***************
Tips and Tricks
***************

ecFlow GUI
==========

The ecFlow client can show running instances from a remote system. The best performance is usually achieved by running the GUI on your local machine and allowing the client to connect to the remote server over SSH. This is only possible on Mac and Linux systems. On Windows, you will need to use X11 forwarding to view the GUI. This can be done using MobaXterm, or PuTTY coupled with an X11 server such as Xming. Note that you will need to be on the same network (virtual or physical) as the remote system, which may require running a VPN client.
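
Because the GUI needs a network path to the remote system, it can help to confirm that the server's port is reachable from your machine before launching it. The following is a minimal sketch and is not part of Floodwater: the helper name ``port_is_reachable`` is hypothetical, the fallback values ``localhost`` and ``3121`` are assumptions taken from the ping example, and the script expects the :code:`ECF_HOST` and :code:`ECF_PORT` environment variables described in the server setup section. It only checks that a TCP connection can be opened; it does not speak the ecFlow protocol.

.. code-block:: python

    import os
    import socket


    def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
        """Return True if a TCP connection to host:port can be opened."""
        try:
            # create_connection resolves the host name and attempts a TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Covers refused connections, timeouts, and name resolution failures
            return False


    if __name__ == "__main__":
        # Fallback values are assumptions for illustration only
        host = os.environ.get("ECF_HOST", "localhost")
        port = int(os.environ.get("ECF_PORT", "3121"))
        state = "reachable" if port_is_reachable(host, port) else "not reachable"
        print(f"{host}:{port} is {state}")

A negative result usually points to a network issue (for example, a VPN that is not connected) rather than a problem with the GUI itself.
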

Although a GUI is not strictly necessary, you can run one locally by installing the Anaconda package for Floodwater and using the command:

.. code-block:: bash

    floodwater gui

.. note::
    The GUI is not required to run Floodwater. It is only used for monitoring the status of the jobs.

Sometimes, it is necessary to connect to a machine which is behind a load balancer or distribution framework of some kind. This is often the case with HPC systems which have offered to run a virtual machine to manage our ecFlow server instances. The login node of the HPC system will be accessible from the outside world; however, the virtual machine will only be accessible from the login node. SSH can be used to forward the connection to the virtual machine through the login node.

To open an SSH tunnel to the virtual machine, you can run the following command:

.. code-block:: bash

    ssh -J username@login.hpc.edu username@vmname -C -N -L 4121:vmname:3121

This forwards port 3121 on the virtual machine to port 4121 on your local machine. You can then use the GUI to connect to the server by setting the host to :code:`localhost` and the port to :code:`4121`. Note that by forwarding to different local ports, in this case 4121, you can connect to multiple servers at the same time.

Similar techniques can be used for servers that require an SSH key to connect. You can similarly open an SSH tunnel to the server and then connect to the server using the GUI.

.. code-block:: bash

    ssh -i /path/to/key.pem -N -L 3122:login.hpc.edu:3121 username@login.hpc.edu
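
When several servers are monitored at once, the tunnel commands differ only in their hosts and ports. As an illustration, the sketch below assembles the jump-host command shown above as an argument list suitable for :code:`subprocess`. The function ``build_tunnel_command`` and its parameters are hypothetical names introduced here and are not part of Floodwater; the default remote port of 3121 is an assumption taken from the example, and only the OpenSSH options already used above (``-J``, ``-C``, ``-N``, ``-L``) appear.

.. code-block:: python

    import shlex


    def build_tunnel_command(user, login_host, vm_host, local_port, remote_port=3121):
        """Build an ssh argument list that forwards local_port to vm_host:remote_port."""
        # -L takes local_port:destination_host:destination_port
        forward = f"{local_port}:{vm_host}:{remote_port}"
        return [
            "ssh",
            "-J", f"{user}@{login_host}",  # jump through the login node
            f"{user}@{vm_host}",           # final destination (the virtual machine)
            "-C",                          # compress traffic
            "-N",                          # do not run a remote command
            "-L", forward,                 # local port forward
        ]


    if __name__ == "__main__":
        cmd = build_tunnel_command("username", "login.hpc.edu", "vmname", 4121)
        # Print the command for review; pass cmd to subprocess.run to open the tunnel
        print(shlex.join(cmd))

Choosing a different ``local_port`` for each server keeps the tunnels from colliding, which mirrors the advice above about forwarding to different ports.
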