Running Rapthor
Starting a Rapthor run
Note
For runs on a single machine, the recommended method of running Rapthor is to run everything within a container (see Using a (u)Docker/Singularity image for details).
Rapthor can be run from the command line as follows:
$ rapthor rapthor.parset
where rapthor.parset is the parset described in The Rapthor parset. A number of options are available and are described below:
Usage: rapthor parset
Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -q                    enable quiet mode
  -r RESET, --reset=RESET
                        reset one or more operations so that they will be rerun
  -v                    enable verbose mode
Rapthor begins a run by checking the input measurement sets. Next, Rapthor will determine the DDE calibrators from the input sky model and begin self calibration and imaging. Rapthor uses Toil+CWL to handle the distribution of jobs and to keep track of the state of a reduction. Each Rapthor operation is done in a separate workflow. See General structure for an overview of the various operations that Rapthor performs and their relation to one another, and see Operations for details of each operation and their primary data products.
Using a (u)Docker/Singularity image
Rapthor can use containers in two ways: by running Rapthor completely within a container (for use on a single machine) or by installing Rapthor locally and running only the operations within the container (for use with multiple nodes of a compute cluster).
Running everything in a container (single-machine mode)
For runs on a single machine, the recommended method of running Rapthor is to run everything within a container. To use this method, first obtain the container image as follows:
For Docker:
$ docker pull astronrd/rapthor
For uDocker:
$ udocker pull astronrd/rapthor
For Singularity:
$ singularity pull docker://astronrd/rapthor
Then start the run, making sure that all necessary volumes are accessible from inside the container, e.g.:
$ docker run --rm <docker_options> -v <mount_points>:<mount_points> -w $PWD astronrd/rapthor rapthor rapthor.parset
$ udocker run --rm <docker_options> -v <mount_points>:<mount_points> -w $PWD astronrd/rapthor rapthor rapthor.parset
$ singularity exec --bind <mount_points>:<mount_points> <rapthor.sif> rapthor rapthor.parset
In this mode, since Rapthor is running fully inside a container, the use_container parameter should not be set, as activating this option instructs Rapthor to run the operations inside another, additional container (resulting in it running a container inside a container).
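As a rough illustration of the placeholders in the commands above, the following sketch prints (rather than executes) a Docker invocation in which a single directory is both mounted and used as the working directory. The function name `rapthor_docker_cmd` and the `WORKDIR` variable are hypothetical helpers introduced only for this example; it also assumes the path contains no spaces and that no extra Docker options are needed.

```shell
#!/bin/sh
# Hypothetical working directory holding the parset and the input data;
# defaults to the current directory if WORKDIR is not set.
WORKDIR="${WORKDIR:-$PWD}"

# Print the Docker command so that it can be inspected before it is run.
#   $1: directory to mount in the container and to use as working directory
#   $2: parset file name, relative to that directory
rapthor_docker_cmd() {
    printf 'docker run --rm -v %s:%s -w %s astronrd/rapthor rapthor %s\n' \
        "$1" "$1" "$1" "$2"
}

rapthor_docker_cmd "$WORKDIR" rapthor.parset
```

Mounting the directory at the same path inside the container keeps any absolute paths in the parset valid in both places.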
Running only the operations in a container (multinode mode)
For runs that use multiple nodes of a compute cluster (i.e., when batch_system = slurm), the recommended method of running Rapthor is to run the operations (CWL workflows) inside containers, with the parent Rapthor process, which controls the submission of Slurm jobs, running outside of a container. Therefore, the use of this mode requires a minimal local installation of Rapthor on the cluster head node (for details, see the installation instructions on the Rapthor GitLab page). Other, non-Python dependencies (such as DP3 and WSClean) do not need to be installed locally. To use this mode, activate the use_container parameter in the parset. No further configuration should be necessary, as the CWL runner will handle the pulling and running of the containers.
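A minimal sketch of the two settings mentioned above, written as a shell function that prints a parset fragment. The `[cluster]` section name and the value `True` are assumptions made for this illustration, and `multinode_cluster_fragment` is a hypothetical helper; consult The Rapthor parset for the authoritative section names and accepted values.

```shell
#!/bin/sh
# Print a parset fragment for multinode mode.
# Assumption: both parameters belong to a [cluster] section and
# use_container is activated with the value True.
multinode_cluster_fragment() {
    cat <<'EOF'
[cluster]
batch_system = slurm
use_container = True
EOF
}

multinode_cluster_fragment
```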
Troubleshooting a run
See the Installation FAQ for tips on troubleshooting Rapthor.
Resuming an interrupted run
Because run times can be long, there is a non-negligible chance that some unforeseen failure occurs during a run. Rapthor is therefore designed to resume a reduction from a saved state, skipping any steps that were completed successfully. A reduction that was halted (either by the user or due to some problem) can be resumed by simply re-running Rapthor with the same parset.
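Since resuming amounts to issuing the same command again, the re-run can be wrapped in a small retry loop. The sketch below is not part of Rapthor; `run_with_resume` is a hypothetical helper, and automatic retries are only sensible for transient failures (a persistent error still needs troubleshooting).

```shell
#!/bin/sh
# Re-run a command until it succeeds or the attempt limit is reached.
#   $1: maximum number of attempts
#   remaining arguments: the command to run, unchanged on every attempt
run_with_resume() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "Attempt $attempt failed; re-running with the same arguments" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (not executed here):
#   run_with_resume 3 rapthor rapthor.parset
```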
Resetting an operation
Rapthor allows for the processing of an operation to be reset:
$ rapthor -r rapthor.parset
Running this command displays a prompt asking the user to select an operation to reset:
INFO - rapthor:state - Reading parset and checking state...
Current strategy: selfcal
Operations:
1) calibrate_1
2) predict_1
3) image_1
4) mosaic_1
5) calibrate_2
6) image_2
7) mosaic_2
8) calibrate_3
9) image_3
Enter number of operation to reset or "q" to quit:
All operations after the selected one will also be reset.
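The cascading behaviour can be pictured with a short sketch that lists which entries of the example above are affected by a given choice. This only illustrates the rule as stated here, assuming "after" refers to the order shown in the prompt; it is not how Rapthor implements the reset, and `operations_to_reset` is a hypothetical helper.

```shell
#!/bin/sh
# Print the selected operation and every operation listed after it.
#   $1: number of the selected operation (1-based, as in the prompt)
#   remaining arguments: the operations in the order shown by the prompt
operations_to_reset() {
    selected=$1
    shift
    index=1
    for op in "$@"; do
        if [ "$index" -ge "$selected" ]; then
            echo "$op"
        fi
        index=$((index + 1))
    done
}

# Selecting operation 6 from the example prompt:
operations_to_reset 6 calibrate_1 predict_1 image_1 mosaic_1 \
    calibrate_2 image_2 mosaic_2 calibrate_3 image_3
```

For this choice the sketch prints image_2, mosaic_2, calibrate_3 and image_3, one per line.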