The Rapthor parset
Before Rapthor can be run, a parset describing the reduction must be made. The parset is a simple text file that defines the parameters of a run in a number of sections. For example, a minimal parset for a basic reduction on a single machine could look like the following (see Tips for running Rapthor for advice on setting up an optimal parset):
[global]
dir_working = /path/to/rapthor/working/dir
input_ms = /path/to/input/dir/input.ms
The available options are described below under their respective sections.
Note
An example parset is available here.
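As a rough illustration of how several sections fit together, the sketch below extends the minimal parset with one option from each of three sections described on this page. The paths are placeholders, the values are the documented defaults, and comment lines starting with # are assumed to be accepted as in standard INI files.

```ini
[global]
dir_working = /path/to/rapthor/working/dir
input_ms = /path/to/input/dir/input.ms
# Process 20% of the data during selfcal (the documented default)
selfcal_data_fraction = 0.2

[imaging]
# Correct direction-dependent effects with Voronoi facets (the documented default)
dde_method = facets

[cluster]
# Run everything on a single machine (the documented default)
batch_system = single_machine
```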
[global]
- dir_working
Full path to the working directory where Rapthor will run (required). All output will be placed in this directory. E.g., dir_working = /data/rapthor.
- input_ms
Full path to the input MS files (required). Wildcards can be used (e.g., input_ms = /path/to/data/*.ms). The paths can also be given as a list (e.g., input_ms = [/path/to/data1/*.ms, /path/to/data2/*.ms, /path/to/data3/obs3.ms]). Note that Rapthor works on a copy of these files and does not modify the originals in any way.
Note
The MS files output by the LINC pipeline can be directly used with Rapthor. See Data preparation for details.
- download_initial_skymodel
Download the initial sky model automatically instead of using a user-provided one (default is True). This option is ignored if a file is specified with the input_skymodel option.
- download_initial_skymodel_radius
The radius in degrees out to which a sky model should be downloaded (default is 5.0).
- download_initial_skymodel_server
Place to download the initial sky model from (default is TGSS). This can be TGSS to use the TIFR GMRT Sky Survey, LOTSS to use the LOFAR Two-metre Sky Survey, or GSM to use the Global Sky Model.
- download_overwrite_skymodel
Overwrite any existing sky model with a downloaded one (default is False).
- input_skymodel
Full path to the input sky model file, with true-sky fluxes (required if automatic download is disabled). If you also have a sky model with apparent flux densities, specify it with the apparent_skymodel option.
See Data preparation for more info on preparing the sky model.
- apparent_skymodel
Full path to the input sky model file, with apparent-sky fluxes (optional). Note that the source names must be identical to those in the input_skymodel.
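A user-provided sky model might be specified as in the sketch below. The file paths are hypothetical placeholders; the apparent-sky model is optional and, as noted above, must use the same source names as the true-sky model.

```ini
[global]
# True-sky model supplied by the user; download_initial_skymodel is then ignored
input_skymodel = /path/to/skymodels/field_true_sky.txt
# Optional apparent-sky model with identical source names
apparent_skymodel = /path/to/skymodels/field_apparent_sky.txt
```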
- regroup_input_skymodel
Regroup the input sky model as needed to meet the target flux (default = True). If False, the existing patches are used for the calibration.
- strategy
Name of the processing strategy to use (default = selfcal). A custom strategy can be used by giving instead the full path to the strategy file. See Defining a custom processing strategy for details on making a custom strategy file.
- selfcal_data_fraction
Fraction of data to use (default = 0.2). If less than one, the input data are divided by time into chunks that sum to the requested fraction, spaced out evenly over the full time range. Using a low value (0.2 or so) is strongly recommended for typical 8-hour, full-bandwidth observations.
- final_data_fraction
A final data fraction can be specified (default = 1.0) such that a final processing pass (i.e., after selfcal finishes) is done with a different fraction.
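To make the two data fractions concrete: with selfcal_data_fraction = 0.2 and a typical 8-hour observation, roughly 0.2 * 8 h = 1.6 h of data, split into chunks spread evenly over the full time range, would be used during selfcal. A sketch of such a setup, using the documented default values, is:

```ini
[global]
# Use about 20% of the data during the selfcal cycles
selfcal_data_fraction = 0.2
# Use all of the data in the final pass after selfcal
final_data_fraction = 1.0
```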
- flag_abstime
Range of times to flag (default = no flagging). The syntax is that of the preflagger abstime parameter (see the DP3 documentation for details of the syntax). E.g., [12-Mar-2010/11:31:00.0..12-Mar-2010/11:50:00.0].
- flag_baseline
Range of baselines to flag (default = no flagging). The syntax is that of the preflagger baseline parameter (see the DP3 documentation for details of the syntax). E.g., flag_baseline = [CS013HBA*].
- flag_freqrange
Range of frequencies to flag (default = no flagging). The syntax is that of the preflagger freqrange parameter (see the DP3 documentation for details of the syntax). E.g., flag_freqrange = [125.2..126.4MHz].
- flag_expr
Expression that defines how the above flagging ranges are combined to produce the final flags (default = all ranges are AND-ed). The syntax is that of the preflagger expr parameter (see the DP3 documentation for details of the syntax). E.g., flag_freqrange or flag_baseline.
- input_h5parm
Full path to an H5parm file with direction-dependent solutions (default = None). This file is used if no calibration is to be done.
Note
The directions in the H5parm file must match the patches in the input sky model, and the time and frequency coverage must be sufficient to cover the duration and bandwidth of the input dataset.
- input_fulljones_h5parm
Full path to an H5parm file with full-Jones solutions (default = None). This file is used if no calibration is to be done.
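The flagging options described earlier in this section can be combined in a single parset. The sketch below reuses the example values given for two of those options; the choice of combining expression is illustrative only.

```ini
[global]
flag_baseline = [CS013HBA*]
flag_freqrange = [125.2..126.4MHz]
# Flag data matching either range, instead of AND-ing the ranges (the default)
flag_expr = flag_freqrange or flag_baseline
```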
[calibration]
- llssolver
The linear least-squares solver to use (one of qr, svd, or lsmr; default = qr).
- maxiter
Maximum number of iterations to perform during calibration (default = 150).
- propagatesolutions
Propagate solutions to next time slot as initial guess (default = True)?
- solveralgorithm
The algorithm used for solving (one of directionsolve, directioniterative, lbfgs, or hybrid; default = hybrid). When using lbfgs, the stepsize should be set to a small value like 0.001.
- onebeamperpatch
Calculate the beam correction once per calibration patch (default = False)? If False, the beam correction is calculated separately for each source in the patch. Setting this to True can speed up calibration and prediction, but can also reduce the quality when the patches are large.
- parallelbaselines
Parallelize model calculation over baselines, instead of parallelizing over directions (default = False).
- sagecalpredict
Use SAGECal for model calculation, both in predict and calibration (default = False).
- stepsize
Size of steps used during calibration (default = 0.02). When using solveralgorithm = lbfgs, the stepsize should be set to a small value like 0.001.
- stepsigma
In order to stop solving iterations when no further improvement is seen, the mean of the step reduction is compared to the standard deviation multiplied by the stepsigma factor (default = 0.1). If the mean of the step reduction is lower than this value (noise dominated), solver iterations are stopped since no further improvement can be gained.
- tolerance
Tolerance used to check convergence during calibration (default = 1e-3).
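The solver controls described above might be set together as in the following sketch. All values shown are the documented defaults, so this fragment only makes the default behavior explicit.

```ini
[calibration]
solveralgorithm = hybrid
# Stop after at most 150 iterations
maxiter = 150
stepsize = 0.02
# Stop early when the mean step reduction falls below 0.1 times its standard deviation
stepsigma = 0.1
# Convergence tolerance
tolerance = 1e-3
```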
- fast_freqstep_hz
Frequency step used during fast phase calibration, in Hz (default = 1e6).
- fast_smoothnessconstraint
Smoothness constraint bandwidth used during fast phase calibration, in Hz (default = 3e6).
- fast_smoothnessreffrequency
Smoothness constraint reference frequency used during fast phase calibration, in Hz. If not specified, this will automatically be set to 144 MHz for HBA or the midpoint of the frequency coverage for LBA.
- fast_smoothnessrefdistance
Smoothness constraint reference distance used during fast phase calibration, in m (default = 0).
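A sketch of the fast-phase settings, using the documented defaults where they exist. The explicit reference frequency is an illustrative choice that matches the value set automatically for HBA data.

```ini
[calibration]
# Solve in 1 MHz frequency steps
fast_freqstep_hz = 1e6
# Smoothness constraint bandwidth of 3 MHz
fast_smoothnessconstraint = 3e6
# 144 MHz, given in Hz
fast_smoothnessreffrequency = 144e6
fast_smoothnessrefdistance = 0
```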
- slow_freqstep_hz
Frequency step used during slow amplitude calibration, in Hz (default = 1e6).
- slow_smoothnessconstraint_joint
Smoothness constraint bandwidth used during the first slow gain calibration, where a joint solution is found for all stations, in Hz (default = 3e6).
- slow_smoothnessconstraint_separate
Smoothness constraint bandwidth used during the second slow gain calibration, where separate solutions are found for each station, in Hz (default = 3e6).
- fulljones_timestep_sec
Time step used during the full-Jones gain calibration, in seconds (default = 600).
- fulljones_freqstep_hz
Frequency step used during full-Jones amplitude calibration, in Hz (default = 1e6).
- fulljones_smoothnessconstraint
Smoothness constraint bandwidth used during the full-Jones gain calibration, in Hz (default = 0).
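The slow-gain and full-Jones settings could be written out as in the sketch below, which simply restates the documented defaults.

```ini
[calibration]
slow_freqstep_hz = 1e6
# First slow solve: joint solution for all stations
slow_smoothnessconstraint_joint = 3e6
# Second slow solve: separate solutions for each station
slow_smoothnessconstraint_separate = 3e6
# Full-Jones solve: time step of 600 s (10 minutes)
fulljones_timestep_sec = 600
fulljones_freqstep_hz = 1e6
fulljones_smoothnessconstraint = 0
```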
- dd_interval_factor
Maximum factor by which the direction-dependent solution intervals can be increased, so that fainter calibrators get longer intervals (in the fast and slow solves only; default = 1 = disabled). For a given direction, the adjustment is calculated from the ratio of the apparent flux density of the calibrator to the target flux density of the cycle (set in the strategy) or, if a target flux density is not defined, to that of the faintest calibrator in the sky model. A value of 1 disables the use of direction-dependent solution intervals; a value greater than 1 enables them.
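As a minimal sketch, direction-dependent solution intervals could be enabled as follows. The factor of 3 is an arbitrary illustrative value, and the solver choice reflects the restriction described in the note for this option.

```ini
[calibration]
# Allow intervals for faint calibrators to grow by up to a factor of 3
dd_interval_factor = 3
# Only this solver currently supports direction-dependent intervals
solveralgorithm = directioniterative
```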
Note
Currently, only solveralgorithm = directioniterative is supported when using direction-dependent solution intervals. The directioniterative solver is typically less accurate than the other directional solvers and therefore may result in lower-quality solutions for a given solution interval. However, the use of direction-dependent intervals will often outweigh this effect, depending on the field and the settings chosen.
- solverlbfgs_dof
Degrees of freedom for the LBFGS solver (only used when solveralgorithm = lbfgs; default = 200.0).
- solverlbfgs_minibatches
Number of minibatches for the LBFGS solver (only used when solveralgorithm = lbfgs; default = 1).
- solverlbfgs_iter
Number of iterations per minibatch in the LBFGS solver (only used when solveralgorithm = lbfgs; default = 4).
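Putting the LBFGS-related options together, a parset that selects this solver might look like the sketch below. The step size follows the recommendation given for solveralgorithm and stepsize; the remaining values are the documented defaults.

```ini
[calibration]
solveralgorithm = lbfgs
# Small step size, as recommended when using lbfgs
stepsize = 0.001
solverlbfgs_dof = 200.0
solverlbfgs_minibatches = 1
solverlbfgs_iter = 4
```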
[imaging]
- cellsize_arcsec
Pixel size in arcsec (default = 1.25).
- robust
Briggs robust parameter (default = -0.5).
- min_uv_lambda
Minimum uv distance in lambda to use in imaging (default = 0).
- max_uv_lambda
Maximum uv distance in lambda to use in imaging (default = 0).
- taper_arcsec
Taper to apply when imaging, in arcsec (default = 0).
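The basic imaging parameters above could be set as in the following sketch. The pixel size, robust value, and taper are the documented defaults; the minimum uv distance is an arbitrary illustrative value.

```ini
[imaging]
cellsize_arcsec = 1.25
robust = -0.5
# Exclude uv distances shorter than 150 lambda (illustrative value)
min_uv_lambda = 150
# 0 is the documented default for the taper
taper_arcsec = 0
```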
- do_multiscale_clean
Use multiscale cleaning (default = True)?
- dde_method
Method to use to correct for direction-dependent effects during imaging: none, facets, or screens (default = facets). If none, the solutions closest to the image centers will be used. If facets, Voronoi faceting is used. If screens, smooth 2-D screens are used.
- screen_type
Type of screen to use (default = tessellated), if dde_method = screens: tessellated (simple, smoothed Voronoi tessellated screens) or kl (Karhunen-Loève screens).
- save_visibilities
Save visibilities used for imaging (default = False). If True, the imaging MS files will be saved, with the direction-independent full-Jones solutions, if available, applied. Note, however, that the direction-dependent solutions will not be applied unless dde_method = none, in which case the solutions closest to the image centers are used.
- idg_mode
IDG (image domain gridder) mode to use in WSClean (default = hybrid). The mode can be cpu or hybrid.
- mem_gb
Maximum memory in GB (per node) to use for WSClean jobs (default = 0 = all available memory).
Note
If the mem_per_node_gb parameter is set, then the maximum memory for WSClean jobs will be set to the smaller of mem_gb and mem_per_node_gb.
- apply_diagonal_solutions
Apply separate XX and YY corrections during facet-based imaging (default = True). If False, scalar solutions (the average of the XX and YY solutions) are applied instead. (Separate XX and YY corrections are always applied when using non-facet-based imaging methods.)
- make_quv_images
Make Stokes QUV images in addition to the Stokes I image (default = False). If True, Stokes QUV images are made during the final imaging step, once self calibration has been completed.
- pol_combine_method
The method used to combine the polarizations during deconvolution (default = link). This method can be “link” to use linked polarization cleaning or “join” to use joined polarization cleaning. When using linked cleaning, the Stokes I image is used for cleaning and its clean components are subtracted from all polarizations.
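For polarization imaging, the two options above might be combined as in this sketch, which switches from the default linked cleaning to joined cleaning purely for illustration.

```ini
[imaging]
# Also make Stokes Q, U, and V images in the final imaging step
make_quv_images = True
# Use joined polarization cleaning instead of the default linked cleaning
pol_combine_method = join
```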
- dd_psf_grid
The number of direction-dependent PSFs which should be fit horizontally and vertically in the image (default = [1, 1] = direction-independent PSF).
- use_mpi
Use MPI to distribute WSClean jobs over multiple nodes (default = False)? If True and more than one node can be allocated to each WSClean job (i.e., max_nodes / num_images >= 2), then distributed imaging will be used (only available if batch_system = slurm).
Note
If MPI is activated, dir_local (under the [cluster] section below) must not be set unless it is on a shared filesystem.
Note
Currently, Toil does not fully support openmpi. Because of this, imaging can only use the worker nodes, and the master node will be idle.
- reweight
Reweight the visibility data before imaging (default = False). If True, data with high residuals (compared to the predicted model visibilities) are down-weighted. This feature is experimental and should be used with caution.
- grid_width_ra_deg
Width in RA, in degrees, of the area to image when using a grid (default = 1.7 * mean FWHM of the primary beam).
- grid_width_dec_deg
Width in Dec, in degrees, of the area to image when using a grid (default = 1.7 * mean FWHM of the primary beam).
- grid_center_ra
RA of the center of the area to image when using a grid (default = phase center).
- grid_center_dec
Dec of the center of the area to image when using a grid (default = phase center).
- grid_nsectors_ra
Number of sectors along the RA axis (default = 0). The number of sectors in Dec will be determined automatically to ensure the whole area specified with grid_center_ra, grid_center_dec, grid_width_ra_deg, and grid_width_dec_deg is imaged. Set to 0 to force a single sector for the full area. A grid of sectors can be useful for computers with limited memory but generally will give inferior results compared to an equivalent single sector.
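A grid of imaging sectors could be requested as in the sketch below. The widths and the number of sectors are arbitrary illustrative values; the grid center is left unset so that the phase center is used.

```ini
[imaging]
# Image a 4 x 4 degree area (illustrative), centered on the phase center by default
grid_width_ra_deg = 4.0
grid_width_dec_deg = 4.0
# Use 2 sectors along RA; the number in Dec is determined automatically
grid_nsectors_ra = 2
```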
- sector_center_ra_list
List of the RAs of the image centers (default = []). Instead of a grid, imaging sectors can be defined individually by specifying their centers and widths.
- sector_center_dec_list
List of the Decs of the image centers (default = []).
- sector_width_ra_deg_list
List of image widths in RA, in degrees (default = []).
- sector_width_dec_deg_list
List of image widths in Dec, in degrees (default = []).
- max_peak_smearing
Maximum desired peak flux density reduction at the center of the image edges due to bandwidth smearing (at the mean frequency) and time smearing (default = 0.15 = 15% reduction in peak flux). Higher values result in shorter run times but more smearing away from the image centers.
- skip_corner_sectors
Skip corner sectors defined by the imaging grid (default = False)? If True and a grid is used (defined by the grid_* parameters above), the four corner sectors are not processed (if possible for the given grid).
[cluster]
- batch_system
Cluster batch system (only used when Toil is the CWL runner; default = single_machine). Use single_machine when running on a single machine and slurm to use multiple nodes of a SLURM-based cluster.
- max_nodes
When batch_system = slurm, the maximum number of nodes of the cluster to use at once (default = 12).
- cpus_per_task
When batch_system = slurm, the number of processors per task to request (default = 0 = all). By setting this value to the number of processors per node, one can ensure that each task gets the entire node to itself, which is the recommended way of running Rapthor.
- mem_per_node_gb
When batch_system = slurm, the amount of memory per node in GB to request (default = 0 = all).
- max_cores
Maximum number of cores per task to use on each node (default = 0 = all).
- max_threads
Maximum number of threads per task to use on each node (default = 0 = all).
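For a SLURM-based cluster, the resource options above might be combined as in the following sketch. The node count and the 32 processors per node describe a hypothetical cluster and should be adapted to the actual hardware.

```ini
[cluster]
batch_system = slurm
# Use at most 4 nodes at once (illustrative)
max_nodes = 4
# Assumes hypothetical nodes with 32 processors, so each task gets a whole node
cpus_per_task = 32
# 0 requests all of the memory on a node
mem_per_node_gb = 0
# 0 uses all cores and threads
max_cores = 0
max_threads = 0
```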
- deconvolution_threads
Number of threads for WSClean to use during deconvolution (default = 0 = 2/5 of max_threads).
- parallel_gridding_threads
Number of threads for WSClean to use during parallel gridding (default = 0 = 2/5 of max_threads).
- dir_local
Full path to a local disk on the nodes for IO-intensive processing (default = not used). The path must exist on all nodes (but does not have to be on a shared filesystem). This parameter is useful if you have a fast local disk (e.g., an SSD) that is not the one used for dir_working. If this parameter is not set, IO-intensive processing (e.g., WSClean) will use a default path in dir_working instead.
Note
This parameter should not be set in the following situations:
when batch_system = single_machine and multiple imaging sectors are used (as each sector will overwrite files from the other sectors).
when use_mpi = True under the [imaging] section and dir_local is not on a shared filesystem.
- use_container
Run the workflows inside a container (default = False)? If True, the CWL workflow for each operation (such as calibrate or image) will be run inside a container. The type of container can be specified with the container_type parameter.
Note
This option should not be used when Rapthor itself is being run inside a container. See Using a (u)Docker/Singularity image for details.
- container_type
The type of container to use when use_container = True. The supported types are: docker (the default), udocker, or singularity.
- cwl_runner
CWL runner to use. Currently supported runners are: cwltool and toil (default). Toil is the recommended runner, since it provides much more fine-grained control over the execution of a workflow. For example, Toil can use Slurm to automatically distribute workflow steps over different compute nodes, whereas CWLTool can only execute workflows on a single node. With CWLTool you also run the risk of overloading your machine when too many jobs are run in parallel. For debugging purposes CWLTool outshines Toil, because its logs are easier to understand.
- dir_coordination
Set Toil’s coordination directory (only used when Toil is the CWL runner; default = selected automatically by Toil). In most cases, it should not be necessary to set this parameter. However, if errors relating to Toil’s jobStateFile are encountered, they may be fixed by setting the coordination directory explicitly.
Note
This directory must be on a 100% POSIX-compatible file system, because Toil heavily depends on POSIX file locking to work reliably. For many shared file systems, this criterion is not met.
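If the coordination directory does need to be set explicitly, the parset entry might look like the sketch below. The path is a hypothetical placeholder for a directory on a local, POSIX-compatible file system.

```ini
[cluster]
cwl_runner = toil
# Hypothetical path on a local, POSIX-compatible file system
dir_coordination = /local/scratch/toil_coordination
```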
- debug_workflow
Debug workflow related issues. Enabling this will require significantly more disk space. The working directory will never be cleaned up, stdout and stderr will not be redirected, and the log level of the CWL runner will be set to DEBUG. Additionally, when using Toil as the CWL runner, some tasks will run using only a single thread (to make debugging easier). Use this option with care!
Note
If Toil is the CWL runner, this option will only work when batch_system = single_machine (the default).
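A debugging setup consistent with the note above could be sketched as follows, with the runner and batch system written out explicitly even though both are the documented defaults.

```ini
[cluster]
cwl_runner = toil
# Debugging with Toil only works on a single machine
batch_system = single_machine
debug_workflow = True
```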