Example job

The following job script, job.sh, is an example from Riley et al. 2019 (ApJL, 887, L21), written for the Slurm batch system on the Cartesius system (see SURFsara systems).

#!/bin/bash
#SBATCH -N 5                  # number of nodes
#SBATCH --tasks-per-node=32   # one MPI process per physical core
#SBATCH -t 1-00:00:00         # wall-clock limit of one day
#SBATCH -p broadwell
#SBATCH --job-name=run1

echo start of job in directory $SLURM_SUBMIT_DIR
echo number of nodes is $SLURM_JOB_NUM_NODES
echo the allocated nodes are:
echo $SLURM_JOB_NODELIST

module load intel/2017b
module load python/2.7.9

# stage the analysis directory to scratch space
cp -r $HOME/NICER_analyses/J0030_ST_PST $TMPDIR

cd $TMPDIR/J0030_ST_PST

export PYTHONPATH=$HOME/.local/lib/python2.7/site-packages/:$PYTHONPATH

# restrict every linear-algebra library to one thread per MPI process
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export GOTO_NUM_THREADS=1
export MKL_NUM_THREADS=1
export LD_LIBRARY_PATH=$HOME/MultiNest_v3.11_CMake/multinest/lib:$LD_LIBRARY_PATH

srun python main_run1.py > out_run1 2> err_run1

# copy samples, checkpoint files, and logs back to the home file system
cp run1* out_run1 err_run1 $HOME/NICER_analyses/J0030_ST_PST/.
#end of job file
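
The script can be submitted and monitored with standard Slurm commands; for example:

sbatch job.sh
squeue -u $USER       # list your queued and running jobs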

Because the final cp in job.sh copies the run1* checkpoint files written by MultiNest back to $HOME, a resume job can stage the whole analysis directory (including those checkpoint files) back to scratch space and continue sampling. A corresponding resume script would look like:

#!/bin/bash
#SBATCH -N 30                 # far more nodes than the initial run
#SBATCH --tasks-per-node=32   # one MPI process per physical core
#SBATCH -t 2-00:00:00         # wall-clock limit of two days
#SBATCH -p broadwell
#SBATCH --job-name=run1_r1

echo start of job in directory $SLURM_SUBMIT_DIR
echo number of nodes is $SLURM_JOB_NUM_NODES
echo the allocated nodes are:
echo $SLURM_JOB_NODELIST

module load intel/2017b
module load python/2.7.9

cp -r $HOME/NICER_analyses/J0030_ST_PST $TMPDIR

cd $TMPDIR/J0030_ST_PST

export PYTHONPATH=$HOME/.local/lib/python2.7/site-packages/:$PYTHONPATH

export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export GOTO_NUM_THREADS=1
export MKL_NUM_THREADS=1
export LD_LIBRARY_PATH=$HOME/MultiNest_v3.11_CMake/multinest/lib:$LD_LIBRARY_PATH

srun python main_run1_resume1.py > out_run1_resume1 2> err_run1_resume1

cp run1* out_run1_resume1 err_run1_resume1 $HOME/NICER_analyses/J0030_ST_PST/.
#end of job file
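
If desired, the resume job can be queued before the initial job finishes by using Slurm's job-dependency mechanism. A minimal sketch, in which the file name resume.sh and the job ID 123456 are placeholders:

# start the resume job only once job 123456 has terminated
sbatch --dependency=afterany:123456 resume.sh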

Note how srun is aware of the MPI World, so there is no need to specify the number of processes to spawn as a flag argument. Also note that the total number of processes requested in the top directives (the number per node is set equal to the number of physical cores) is far higher than for the initial run. This is because parallelisation efficiency scales with the local rejection fraction during a nested sampling iteration, and the rejection fraction typically grows as sampling proceeds, so a resumed run can make efficient use of many more cores.
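
A quick way to confirm the task geometry that srun derives from the directives (this uses standard Slurm environment variables and is not specific to these scripts) is to have each task report itself:

# each task prints its rank, the total task count, and its host
srun bash -c 'echo task $SLURM_PROCID of $SLURM_NTASKS on $(hostname)'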

Finally, note that only the root process will generate output for inspection.
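
Once the job has finished and the final cp has run, the results can be inspected from the login node; for example:

cd $HOME/NICER_analyses/J0030_ST_PST
ls -l run1*      # MultiNest sample and checkpoint files
tail out_run1    # captured stdout, written by the root process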