===============
Getting Started
===============

Floodwater uses Anaconda as its environment manager. This allows multiple installations
to coexist on one system without interfering with each other.

************
Installation
************

The installation of Floodwater uses a single Python script to manage the environment
creation within Anaconda. The script is called :code:`install.py` and is located in the
root directory of the repository.

Pre-requisites
==============

The following are pre-requisites for installing Floodwater:

1. A system with a Linux distribution (e.g. Ubuntu, Debian, Rocky)
2. Anaconda (https://www.anaconda.com/distribution/)
3. Git (https://git-scm.com/downloads)
4. A job manager (e.g. SLURM, PBS)
5. Access to a MetGet instance

.. note::
     If you do not have access to a MetGet instance, you can contact
     zcobell@thewaterinstitute.org or run your own instance by following the
     instructions on the `metget-server <https://github.com/waterinstitute/metget-server>`_
     GitHub page.

If your system is based on ARM processors, you will need to build the ecFlow
package from source, which is recommended only for experienced users. We have
shown that the system works well on both ARM and x86_64 architectures, but
ECMWF only provides pre-built binaries for x86_64.
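
A quick way to confirm that the command-line pre-requisites are visible on your
system is sketched below. It is not part of Floodwater. It assumes that
:code:`sbatch` (SLURM) or :code:`qsub` (PBS) is the submit command for your job
manager, and it reports the processor architecture so you know whether a source
build of ecFlow is likely to be needed. Access to a MetGet instance is not checked.

.. code-block:: bash

    # Report whether the command named in $1 is available on PATH.
    check_tool() {
        if command -v "$1" >/dev/null 2>&1; then
            echo "$1: found"
        else
            echo "$1: missing"
        fi
    }

    # Anaconda and Git are required.
    for tool in conda git; do
        check_tool "$tool"
    done

    # Assumption: one of these submit commands exists if a job manager is installed.
    for tool in sbatch qsub; do
        check_tool "$tool"
    done

    # Pre-built ecFlow binaries are only provided for x86_64.
    arch="$(uname -m)"
    case "$arch" in
        x86_64) echo "architecture: $arch (pre-built ecFlow available)" ;;
        aarch64|arm64) echo "architecture: $arch (ecFlow must be built from source)" ;;
        *) echo "architecture: $arch (not covered by this guide)" ;;
    esac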

Installation Steps
==================

The first step to installing Floodwater is to clone the repository:

.. code-block:: bash

    git clone https://github.com/waterinstitute/floodwater.git

This will create a directory called :code:`floodwater` in the current working directory.
You should then change into the :code:`floodwater` directory:

.. code-block:: bash

    cd floodwater

Run the installation with the :code:`install.py` script. The script allows you
to enable or disable certain features of Floodwater. For example, you may only want
the Floodwater code installed on your local system so that you can run
the GUI.

Full Installation
-----------------

.. code-block:: bash

    python install.py --conda-name my-floodwater-env


.. warning::
    For all the good things about Anaconda, the default solver is extraordinarily slow. We highly
    recommend using the Mamba solver. This can be done by following the instructions
    `here <https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community>`_.

Partial Installation
--------------------

A common use case is to install the full package on a remote system and then install only
the essential packages on the local system. The local system can access the remote system
over SSH, which gives a much more responsive experience.

.. code-block:: bash

    python install.py --conda-name my-floodwater-env --minimal

.. warning::
    It is also possible to forward an X11 connection over SSH so that the GUI
    can be used remotely. However, this will be slow, particularly if the
    remote system is not on the same network.

.. note::
   Developers will need to re-install the Floodwater package (not the full Anaconda environment)
   if they make changes to the code. This can be done by running :code:`pip install .` from the
   repository root directory. This helps to maintain environment isolation during development and
   allows the user to maintain multiple active Floodwater environments.

Checking the Installation
-------------------------
To confirm that the installation was successful, activate the newly created
conda environment:

.. code-block:: bash

    conda activate my-floodwater-env

You can then check that the Floodwater package is installed by running:

.. code-block:: bash

    floodwater --help

*******************
ecFlow Server Setup
*******************

Systems based on the ecFlow workflow manager require that a server daemon is running.
The server is responsible for managing the workflows and the jobs that are submitted.
Note that many jobs within the system are not submitted to the job scheduler
(e.g. SLURM or PBS) and instead run directly on the machine hosting the server daemon.
Users should therefore be considerate on shared systems and work with system administrators
if too many concurrent processes are running. In the past, we have worked with system
administrators on large NSF-funded resources to provide a dedicated virtual machine
for running the server daemon.

Environment Variables
=====================

To avoid manually specifying the server host and port number during each Floodwater command,
we recommend setting the following environment variables:

.. code-block:: bash

    export ECF_HOST=hostname
    export ECF_PORT=port_number

The :code:`ECF_HOST` variable should be set to the hostname of the system which is running the
server daemon. The :code:`ECF_PORT` variable should be set to the port number on which the
server listens. Note that the :code:`ECF_HOST` variable should be a fully qualified domain
name (FQDN) which can be resolved by the client system.
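
As an illustration, the sketch below sets both variables and performs a basic sanity
check on the port. The host name :code:`ecflow.example.org`, the port :code:`3121`,
and the helper :code:`is_valid_port` are placeholders introduced here, not values or
functions provided by Floodwater.

.. code-block:: bash

    # Placeholder values; replace with the host and port of your own server.
    export ECF_HOST=ecflow.example.org
    export ECF_PORT=3121

    # Hypothetical helper: succeeds only for an integer between 1 and 65535.
    is_valid_port() {
        case "$1" in
            ''|*[!0-9]*) return 1 ;;
        esac
        [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
    }

    if is_valid_port "$ECF_PORT"; then
        echo "ecFlow server: ${ECF_HOST}:${ECF_PORT}"
    else
        echo "ECF_PORT is not a valid port number: ${ECF_PORT}" >&2
    fi

Adding the two :code:`export` lines to a shell startup file such as :code:`~/.bashrc`
makes the settings available in every new session.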

Starting the Server
===================

The server can be started by running the following command:

.. code-block:: bash

    floodwater server start [--port xxxx]

The port number is optional. If it is not supplied on the command line, the system will search for an
environment variable named :code:`ECF_PORT`. If this variable is not set, you will receive an error.
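
The lookup order described above can be sketched in a few lines of shell. This is
only an illustration of the documented behavior, not Floodwater's actual implementation;
the function name :code:`resolve_port` and the port numbers are hypothetical.

.. code-block:: bash

    # Illustrative only: mirrors the documented lookup order.
    resolve_port() {
        # $1 is the optional value that would be passed to --port.
        if [ -n "${1:-}" ]; then
            echo "$1"
        elif [ -n "${ECF_PORT:-}" ]; then
            echo "$ECF_PORT"
        else
            echo "error: no port given and ECF_PORT is not set" >&2
            return 1
        fi
    }

    ECF_PORT=3121
    resolve_port 4000    # the command-line value is used
    resolve_port         # falls back to ECF_PORT
    unset ECF_PORT
    resolve_port || echo "no port available"    # neither source is set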

You can check that the server has become active by running:

.. code-block:: bash

    floodwater ping

This will ping the server and return the time it took to respond. This command uses
the environment variables described above to establish the connection to the server.

The system will return something that looks like the following:

.. code-block:: text

    2023-10-13T08:56:07CDT :: INFO :: floodwater_cli_funcs.py :: floodwater_ping_ecflow :: Success! Ping ecFlow server (myhost.local:3121) succeeded in ~4 ms

Stopping the Server
===================

The server can be stopped by running the following command:

.. code-block:: bash

    floodwater server stop

**********************
Running Your First Job
**********************

The system comes packaged with a simple example which runs the ec95d ADCIRC
model. This is a very coarse model and should only be used for testing.

The example files are contained within: :code:`floodwater/examples/ec95d`

You will need to make minor modifications to these files, however, they are
designed to be as generic as possible for the purposes of getting started.

Make sure an ecFlow server instance is running on your system. The first
step is then to copy the example files to a new directory:

.. code-block:: bash

    mkdir my_first_job
    cp -r floodwater/examples/ec95d/* my_first_job

Next, modify the YAML and ``.h`` files as necessary to match your system's specifications.

To load the job, run:

.. code-block:: bash

    floodwater load adcirc_gfs.yaml

This will load the job into the server. You can check that the job has been loaded by running:

.. code-block:: bash

    floodwater status

Then, to start the job, you can run:

.. code-block:: bash

    floodwater start adcirc_gfs

Note that you can load and start the job in a single command by running:

.. code-block:: bash

    floodwater run adcirc_gfs.yaml

The job will appear in the GUI and you can monitor the progress of the job using the GUI or
the status command on the command line.

***************
Tips and Tricks
***************

ecFlow GUI
==========

The ecFlow client can show running instances on a remote system. The best performance is usually
achieved by running the GUI on your local machine and allowing the client to connect to the remote server
over SSH. This is only possible on Mac and Linux systems. On Windows, you will need to use X11 forwarding
to view the GUI, for example with MobaXterm, or with PuTTY coupled with an X11 server such as Xming. Note that
you will need to be on the same network (virtual or physical) as the remote system, which may require running
a VPN client.

Installing Floodwater locally is not strictly necessary to use a local ecFlow GUI, but the simplest
approach is to install the Anaconda package for Floodwater and run:

.. code-block:: bash

    floodwater gui

.. note::
    The GUI is not required to run Floodwater. It is only used for monitoring the status of the jobs.

Sometimes it is necessary to connect to a machine which is behind a load balancer or distribution framework
of some kind. This is often the case with HPC systems which have offered to run a virtual machine to manage our
ecFlow server instances. The login node of the HPC system will be accessible from the outside world, but
the virtual machine will only be accessible from the login node. In this case, SSH can be used
to forward the connection to the virtual machine.

To open an SSH tunnel to the virtual machine, you can run the following command:

.. code-block:: bash

    ssh -J username@login.hpc.edu username@vmname -C -N -L 4121:vmname:3121

This forwards port 4121 on your local machine to port 3121 on the virtual machine. You can
then use the GUI to connect to the server by setting the host to :code:`localhost` and the port to :code:`4121`.
Note that by choosing a different local port for each tunnel, in this case 4121, you can connect to multiple
servers at the same time.
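
When several servers are involved, it can help to generate the tunnel commands from a
small helper so that each server gets its own local port. The sketch below only prints
the commands and does not open any connection. The function name :code:`tunnel_cmd`,
the second virtual machine :code:`othervm`, and the default remote port of 3121 are
assumptions made for illustration.

.. code-block:: bash

    # Print the tunnel command for one server. All host names are placeholders.
    # $1: local port, $2: virtual machine name, $3: remote port (defaults to 3121)
    tunnel_cmd() {
        echo "ssh -J username@login.hpc.edu username@$2 -C -N -L $1:$2:${3:-3121}"
    }

    tunnel_cmd 4121 vmname
    tunnel_cmd 4122 othervm

Each printed command can be run in its own terminal. Because :code:`floodwater ping`
reads :code:`ECF_HOST` and :code:`ECF_PORT`, a tunnel can then be checked with, for
example, :code:`ECF_HOST=localhost ECF_PORT=4121 floodwater ping`.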

Similar techniques can be used for servers that require an SSH key to connect. You can similarly open an SSH tunnel
to the server and then connect to it using the GUI.

.. code-block:: bash

    ssh -i /path/to/key.pem -N -L 3122:login.hpc.edu:3121 username@login.hpc.edu