Commit 10d97cd

more detailed examples for Slurm and HTCondor
1 parent 66ee6d6 commit 10d97cd

File tree: 1 file changed (+89 −11 lines)

src/using/batch.md
@@ -37,8 +37,8 @@

Additionally, most clusters share your `${HOME}` directory with the worker nodes and so you don't even need to bother copying `denv` to where the jobs are being run.

## Preparing for Batch Running
The above instructions have you set up to run `denv` on the cluster just like you run `denv` on your own computer;
however, doing a few more steps is helpful to ensure that the batch jobs run reliably and efficiently.

### Pre-Building SIF Images
Under-the-hood, `apptainer` runs images from SIF files.
@@ -54,11 +54,21 @@

```shell
cd path/to/big/dir
apptainer build ldmx_pro_v4.2.3.sif docker://ldmx/pro:v4.2.3 # just an example, name the SIF file appropriately
```

## Running the SIF Image
How we run the image during the jobs depends on how the jobs are configured.
For the clusters I have access to (UMN and SLAC), there are two different ways for jobs to be configured
that mainly change _where_ the job is run.

~~~admonish success title="Check Where Jobs are Run"
A good way to check this (and learn about the batch job system that you want to use)
is to figure out how to run a job that just runs `pwd`.
This command prints out the "present working directory" and so you can see where
the job is being run from.

Refer to your cluster's IT staff, its documentation, and the batch job system's documentation to
learn how to do this; a minimal sketch is given below.
~~~
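
For example, on a Slurm cluster a one-off `pwd` job can be submitted with `sbatch --wrap`, and on an HTCondor cluster a tiny submit file does the same. These are minimal sketches: the file names (`where.log`, `pwd.sub`, and so on) are placeholders, and your cluster may require extra options (for example an account or partition).
```shell
# Slurm: run `pwd` as a one-off job and write its output to where.log
sbatch --output=where.log --wrap="pwd"
```
On HTCondor, the equivalent is a small submit file:
```
# pwd.sub : a minimal submit file that just runs `pwd`
executable = /bin/pwd
output = where.out
error = where.err
log = where.log
queue
```
```shell
condor_submit pwd.sub
```
Once the job finishes, the `where.*` files show the directory the job ran from.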

#### Jobs Run In Submitted Directory
At SLAC S3DF, the jobs submitted with `sbatch` are run from the directory where `sbatch` was run.
This makes it rather easy to run jobs.
@@ -67,34 +77,102 @@

We can create a denv and then submit a job running `denv` from within that directory.
```shell
cd batch/submit/dir
denv init /full/path/to/big/dir/ldmx_pro_v4.2.3.sif
```

For example, submitting jobs for a range of run numbers would look like
```shell
mkdir log # the #SBATCH options in submit.sh put the log files here
sbatch --array=0-10 submit.sh
```
with `submit.sh` being
```bash
#!/bin/bash
#SBATCH --job-name my-job
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2g
#SBATCH --time=04:00:00 # time limit for jobs
#SBATCH --output=log/%A-%a.log
#SBATCH --error=log/%A-%a.log

set -o errexit
set -o nounset

# assume the configuration script config.py takes one argument:
# the run number it should use for the simulation,
# from which it also uniquely constructs the path of the output file
denv fire config.py ${SLURM_ARRAY_TASK_ID}
# fire is run inside ldmx/pro:v4.2.3 IF SUBMITTED FROM batch/submit/dir
```
Look at the SLAC S3DF and Slurm documentation to learn more about configuring the batch jobs themselves.

~~~admonish note title="Comments"
- _Technically_, since SLAC S3DF's `${SCRATCH}` directory is also shared across the worker nodes, you do not need to pre-build the image. However, this is not advised: if the `${SCRATCH}` directory is periodically cleaned during your jobs, the cached SIF image would be lost and your jobs could fail in confusing ways.
- Some clusters configure Slurm to limit the number of jobs you can submit at once with `--array`. This means you might need to submit the jobs in "chunks" and add an offset to `SLURM_ARRAY_TASK_ID` so that the different "chunks" have different run numbers. This can be done with bash's math syntax, e.g. `$(( SLURM_ARRAY_TASK_ID + 100 ))`; see the sketch below.
~~~
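
To make the second comment concrete, here is one possible sketch; the chunk size of 100 and the use of a positional offset argument are illustrative assumptions, not part of the original `submit.sh`.
```shell
# submit two "chunks" of 100 jobs each, with different run-number offsets
sbatch --array=0-99 submit.sh 0
sbatch --array=0-99 submit.sh 100
```
with the end of `submit.sh` adjusted to
```bash
# offset passed as the first argument (defaults to 0 if not given)
run_number=$(( SLURM_ARRAY_TASK_ID + ${1:-0} ))
denv fire config.py ${run_number}
```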

#### Jobs Run in Scratch Directory
At UMN's CMS cluster, the jobs submitted with `condor_submit` are run from a newly-created scratch directory.
This makes it slightly difficult to inform `denv` of the configuration we want to use.
`denv` has an experimental shebang syntax that could be helpful for this purpose.

`prod.sh`
```bash
#!/full/path/to/denv shebang
#!denv_image=/full/path/to/ldmx_pro_v4.2.3.sif
#!bash

set -o nounset
set -o errexit

# everything here is run in `bash` inside ldmx/pro:v4.2.3
# assume the run number is provided as an argument
fire config.py ${1}
```

with the submit file `submit.sub` in the same directory:
```
# run prod.sh and transfer it to the scratch area
executable = prod.sh
transfer_executable = yes

# terminal and condor output log files
# helpful for debugging at a slight performance cost
# (make sure the logs/ directory exists before submitting)
output = logs/$(Cluster)-$(Process).out
error = $(output)
log = $(Cluster)-condor.log

# "hold" the job if there is a non-zero exit code
# and store the exit code in the hold reason subcode
on_exit_hold = ExitCode != 0
on_exit_hold_subcode = ExitCode
on_exit_hold_reason = "Program exited with non-zero exit code"

# the 'Process' variable is an index for the job in the submission cluster
arguments = "$(Process)"
```
And then you would `condor_submit` this with
```shell
condor_submit submit.sub -queue 10
```

~~~admonish note collapsible=true title="Alternative Script Design"
Alternatively, one could write a script _around_ `denv` like
```shell
#!/bin/bash

set -o nounset
set -o errexit

# stuff here is run outside ldmx/pro:v4.2.3
# need to call `denv` to go into the image
denv init /full/path/to/ldmx_pro_v4.2.3.sif
denv fire config.py ${1}
```
The `denv init` call writes a few small files which shouldn't have a large impact on performance
(but could if the directory in which the job is being run has a slow filesystem).
This is helpful if your configuration of HTCondor does not do the file transfer for you and
your job is responsible for copying in/out any input/output files that are necessary; a sketch is given below.
~~~
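
If you are in that situation (no automatic file transfer), the wrapper could also copy the results back to a shared area itself. The sketch below is only an illustration: `/full/path/to/shared/output` and the `*.root` output pattern are placeholders for whatever your `config.py` actually produces.
```bash
#!/bin/bash

set -o nounset
set -o errexit

# placeholder: a directory shared between the worker nodes and the submit host
output_dir=/full/path/to/shared/output

denv init /full/path/to/ldmx_pro_v4.2.3.sif
denv fire config.py ${1}

# copy whatever the job produced from the scratch directory to the shared area
# (adjust the pattern to match the output file name config.py constructs)
cp *.root "${output_dir}/"
```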

~~~admonish note title="Comments"
- Similar to Slurm's `--array`, we are relying on HTCondor's `-queue` command to decide what run numbers to use. Look at HTCondor's documentation (for example [Submitting many similar jobs with one queue command](https://htcondor.readthedocs.io/en/latest/users-manual/submitting-a-job.html#submitting-many-similar-jobs-with-one-queue-command)) for more information; one variant is sketched below.
~~~
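
As one concrete variant, the run numbers could be listed explicitly with an in-file `queue ... from` statement instead of `-queue 10` on the command line; `run_numbers.txt` here is a made-up file with one run number per line.
```
# at the end of submit.sub, replacing the -queue command-line option
arguments = "$(run_number)"
queue run_number from run_numbers.txt
```
Defining `run_number` this way also lets you use `$(run_number)` in the `output` file name.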
