Machine/Job Features Scripts 
============================

See https://twiki.cern.ch/twiki/bin/view/LCG/MachineJobFeaturesImplementations
for more about the mjf-scripts implementations and Machine/Job Features.

These files can either be used directly, or the torque-rpm, htcondor-rpm,
gridengine-rpm, and onlymf-rpm Makefile targets can be used to build RPMs for
SL 6.x. This README assumes you have built the RPM yourself or downloaded the
pre-built RPM from
  https://repo.gridpp.ac.uk/machinejobfeatures/mjf-scripts/

1. Common configuration
2. Torque/PBS
3. HTCondor
4. Grid Engine
5. Only Machine Features 
6. DIRAC Benchmark (DB12)

1. Common configuration
-----------------------

In the simplest case, just install either the Torque, HTCondor, or Grid
Engine RPM and the /etc/rc.d/init.d/mjf script will be run to create 
/etc/machinefeatures. $MACHINEFEATURES is set to this value by 
/etc/profile.d/mjf.sh and /etc/profile.d/mjf.csh which are (likely to be)
sourced by new logins/jobs.

The files /etc/sysconfig/mjf and /var/run/mjf are read when creating the
$MACHINEFEATURES (and $JOBFEATURES) directories, and can provide default 
values for Machine/Job Features key/value pairs. /var/run/mjf values take
precedence. Note that files in /var/run are deleted at system boot time. 
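
The precedence rule above can be sketched as a small lookup helper. This is
only an illustration, not code from mjf-scripts: the function name mjf_lookup
is hypothetical, and it assumes both files hold plain key=value lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of mjf-scripts): print the value of KEY,
# checking the override file first so that /var/run/mjf takes precedence.
# ASSUMPTION: both files contain plain key=value lines.
mjf_lookup() {
    key="$1"
    base="${2:-/etc/sysconfig/mjf}"
    override="${3:-/var/run/mjf}"
    for f in "$override" "$base"; do
        [ -r "$f" ] || continue
        # Take the last matching line if the key appears more than once.
        value=$(sed -n "s/^${key}=//p" "$f" | tail -n 1)
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1
}
```

For example,  mjf_lookup hs06  prints the value from /var/run/mjf if that file
defines hs06, and otherwise falls back to /etc/sysconfig/mjf.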

These files can contain the following $MACHINEFEATURES keys:
  total_cpu hs06 grace_secs shutdowntime

These files can contain the following $JOBFEATURES keys:
  allocated_cpu hs06_job wall_limit_secs cpu_limit_secs 
  max_rss_bytes max_swap_bytes scratch_limit_bytes 

Values given this way override values obtained from the system (eg the
total number of logical processors), but are overridden in turn when 
per-job values can be determined from the batch system (eg the number 
of logical processors allocated to this job.)

The values cpu_limit_secs_per_cpu, max_rss_bytes_per_cpu, 
max_swap_bytes_per_cpu, and scratch_limit_bytes_per_cpu can be set in
either mjf file to cause the scripts to calculate the corresponding 
per-job value (eg cpu_limit_secs) by multiplying by $JOBFEATURES/allocated_cpu
(which will be determined from the batch system if available, otherwise 1.)
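
As a rough sketch of that multiplication (the function name per_job_limit is
hypothetical, and the real scripts may implement this differently):

```shell
#!/bin/sh
# Hypothetical helper: derive a per-job limit from a *_per_cpu value.
# $1 = per-CPU value (e.g. cpu_limit_secs_per_cpu)
# $2 = allocated_cpu, defaulting to 1 as described above
per_job_limit() {
    per_cpu="$1"
    allocated_cpu="${2:-1}"
    echo $(( per_cpu * allocated_cpu ))
}
```

With cpu_limit_secs_per_cpu=172800 and a job allocated 4 logical processors,
per_job_limit 172800 4  gives a cpu_limit_secs of 691200.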

If you know the HS06 of the worker node, you can include a line like
hs06=99.99  in either mjf file, which will be picked up when populating
/etc/machinefeatures/ (you can force an update after changing the file with
service mjf start  as the mjf script looks like a SysV service.) This is then
used to create $MACHINEFEATURES/hs06 for the whole WN.

By default, the per-job $JOBFEATURES directories will be created under
/tmp/mjf-$USER but you can use a directory other than /tmp by setting 
mjf_tmp_dir=/DESIRED/PATH in either mjf file.
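
Putting the settings from this section together, an mjf file might look like
the one written by the sketch below. All values are illustrative only, and
the helper name write_example_mjf_conf is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write an example mjf configuration to the given path.
# The values are placeholders, not recommendations for any real worker node.
write_example_mjf_conf() {
    cat > "$1" <<'EOF'
total_cpu=16
hs06=99.99
cpu_limit_secs_per_cpu=172800
max_rss_bytes_per_cpu=2147483648
mjf_tmp_dir=/var/tmp
EOF
}
```

After writing such a file to /etc/sysconfig/mjf, running  service mjf start
refreshes /etc/machinefeatures as described above.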

2. Torque/PBS
-------------

The mjf-torque RPM installs /var/lib/torque/mom_priv/prologue.user which is 
run by Torque at the start of each job to create 
$JOBFEATURES=/tmp/mjf-$USER/jobfeatures-$PBS_JOBID (by default), and installs 
/var/lib/torque/mom_priv/epilogue.user which runs at the end of the job to
clean up that directory. 

$JOBFEATURES/hs06_job is calculated from $MACHINEFEATURES/hs06 with a pro-rata
share for the job in question, based on $JOBFEATURES/allocated_cpu which is in
turn taken from the Torque ppn for the job (default 1.)
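
The pro-rata calculation can be sketched as follows. This assumes that
"pro-rata" means hs06 * allocated_cpu / total_cpu; that formula and the
function name hs06_job are assumptions for illustration, not taken from the
scripts themselves.

```shell
#!/bin/sh
# Hypothetical helper: pro-rata HS06 share for one job.
# ASSUMPTION: share = hs06 * allocated_cpu / total_cpu.
# $1 = $MACHINEFEATURES/hs06, $2 = $MACHINEFEATURES/total_cpu,
# $3 = $JOBFEATURES/allocated_cpu (default 1)
hs06_job() {
    awk -v hs06="$1" -v total="$2" -v alloc="${3:-1}" \
        'BEGIN { if (total <= 0) exit 1; printf "%.2f\n", hs06 * alloc / total }'
}
```

For a node with hs06=160 and total_cpu=16, a job with ppn=4 would get
hs06_job 160 16 4 , which is 40.00.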

When creating $MACHINEFEATURES/total_cpu, the /usr/sbin/mjf-get-total-cpu 
script uses the value obtained by running the pbsnodes command for the node.
This can be overridden by setting total_cpu in either mjf file. If the value
cannot otherwise be found, it is obtained by counting 'processor' lines in 
/proc/cpuinfo.
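
The /proc/cpuinfo fallback amounts to something like the following sketch
(the function name count_logical_cpus is hypothetical; the optional file
argument exists only to make the example easy to try out):

```shell
#!/bin/sh
# Hypothetical helper: count logical processors by counting the lines
# that start with "processor" in /proc/cpuinfo (or in the file given as $1).
count_logical_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}
```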

3. HTCondor
-----------

The mjf-htcondor RPM installs the /usr/sbin/make-jobfeatures script which must
be run as part of the HTCondor user job wrapper. If a job wrapper is not
already defined, then this can simply be done by setting
USER_JOB_WRAPPER = /usr/sbin/mjf-job-wrapper in the HTCondor configuration.
If a job wrapper is already being used, then it must be modified to run
/usr/sbin/make-jobfeatures in the way mjf-job-wrapper does,
including exporting $JOBFEATURES and $MACHINEFEATURES to the job itself.
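
If an existing job wrapper has to be adapted, the addition might look roughly
like the sketch below. It is not the mjf-job-wrapper shipped in the RPM, and
it ASSUMES that make-jobfeatures prints the path of the $JOBFEATURES directory
it creates on standard output; check the installed mjf-job-wrapper before
relying on this. The function name and the MAKE_JOBFEATURES variable are
hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of the MJF part of an HTCondor job wrapper.
# ASSUMPTION: make-jobfeatures prints the created directory on stdout.
MAKE_JOBFEATURES="${MAKE_JOBFEATURES:-/usr/sbin/make-jobfeatures}"

run_job_with_mjf() {
    MACHINEFEATURES="${MACHINEFEATURES:-/etc/machinefeatures}"
    JOBFEATURES=$("$MAKE_JOBFEATURES") || return 1
    # Export both variables so the job itself can see them.
    export MACHINEFEATURES JOBFEATURES
    # Replace this shell with the job, as a job wrapper is expected to do.
    exec "$@"
}
```

A wrapper script would then end with  run_job_with_mjf "$@"  so that the job
command line passed in by HTCondor is executed with both variables set.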

$JOBFEATURES/hs06_job is calculated from $MACHINEFEATURES/hs06 with a pro-rata
share for the job in question, based on $JOBFEATURES/allocated_cpu which is in
turn taken from the CpusProvisioned value in the job ad (default 1.)

When creating $MACHINEFEATURES/total_cpu, the /usr/sbin/mjf-get-total-cpu 
script uses the value obtained by running  condor_config_val NUM_CPUS  to
discover the number of logical processors HTCondor can allocate to jobs.
This can be overridden by setting total_cpu in either mjf file. If the value
cannot otherwise be found, it is obtained by counting 'processor' lines in 
/proc/cpuinfo.
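
The order of preference described here (configured total_cpu, then HTCondor,
then /proc/cpuinfo) can be sketched as below. The function name get_total_cpu
is hypothetical and this is not the code of /usr/sbin/mjf-get-total-cpu; only
the  condor_config_val NUM_CPUS  call is taken from the text above.

```shell
#!/bin/sh
# Hypothetical sketch of the total_cpu fallback chain.
# $1 = total_cpu from the mjf files (empty if not set)
# $2 = cpuinfo file (default /proc/cpuinfo)
get_total_cpu() {
    override="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    if [ -n "$override" ]; then
        echo "$override"
        return 0
    fi
    if command -v condor_config_val >/dev/null 2>&1; then
        n=$(condor_config_val NUM_CPUS 2>/dev/null)
        case "$n" in
            ''|*[!0-9]*) ;;              # undefined or not an integer: fall through
            *) echo "$n"; return 0 ;;
        esac
    fi
    grep -c '^processor' "$cpuinfo"
}
```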

4. Grid Engine (in early development!)
--------------------------------------

The mjf-gridengine RPM installs the /usr/sbin/make-jobfeatures script which
must be run as part of the user environment setup and creates the $JOBFEATURES
directory. The files /etc/profile.d/mjf.sh and mjf.csh are installed to do this.

NOTE that the value of wall_limit_secs MUST be set in either /etc/sysconfig/mjf
or /var/run/mjf as this value is not supplied to jobs by Grid Engine.

$JOBFEATURES/hs06_job is calculated from $MACHINEFEATURES/hs06 with a pro-rata
share for the job in question, based on $JOBFEATURES/allocated_cpu which is in
turn taken from $NSLOTS set by Grid Engine for the job (default 1.)

Setting total_cpu in either mjf file will set the value to use for
$MACHINEFEATURES/total_cpu . Otherwise it is obtained by counting 'processor'
lines in /proc/cpuinfo.

5. Only Machine Features
------------------------

The mjf-onlymf RPM only installs the common scripts to create
$MACHINEFEATURES/hs06 (if hs06 is defined) and $MACHINEFEATURES/total_cpu.
The value can also be overridden by setting total_cpu in either mjf file. 
If the value cannot otherwise be found, it is obtained by counting 'processor'
lines in /proc/cpuinfo.

$JOBFEATURES is not defined and its files are not created. The mjf-onlymf RPM
should only be used on systems that do not run Torque/PBS, HTCondor, or Grid
Engine, so that at least $MACHINEFEATURES is available.

6. DIRAC Benchmark (DB12)
-------------------------

Support for the DIRAC fast benchmark (DB12) is also included, which is
implemented by analogy with HEPSPEC06: $MACHINEFEATURES/db12 and
$JOBFEATURES/db12_job are created if the DB12 measurements are available.
The key/value pairs db12 and db12_job can be included in /etc/sysconfig/mjf
or /var/run/mjf as with hs06 and hs06_job as described above.

However, it will normally be more convenient to create the file /etc/db12/db12
by simply installing the mjf-db12 RPM which runs the DB12 benchmark early in the
boot process when the machine is otherwise idle. The /etc/rc.d/init.d/db12 script 
stores the result in /etc/db12/db12 along with /etc/db12/total_cpu, equal to the
number of DB12 benchmark instances run in parallel to make the measurement.

If /etc/db12/total_cpu exists before the db12 script is run, then it is used
to determine the number of instances to run. Otherwise the number of logical
processors is counted from the operating system and /etc/db12/total_cpu is
created. 
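
That decision can be sketched as follows. The function name db12_instances is
hypothetical and the real db12 script may differ in detail; the directory and
cpuinfo arguments exist only so the example can be tried outside /etc.

```shell
#!/bin/sh
# Hypothetical sketch: decide how many DB12 instances to run.
# If DIR/total_cpu already exists it is used as-is; otherwise it is
# created from the number of 'processor' lines in the cpuinfo file.
db12_instances() {
    dir="${1:-/etc/db12}"
    cpuinfo="${2:-/proc/cpuinfo}"
    if [ ! -f "$dir/total_cpu" ]; then
        grep -c '^processor' "$cpuinfo" > "$dir/total_cpu"
    fi
    cat "$dir/total_cpu"
}
```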

Since /etc/rc.d/init.d/db12 is run very early in the boot process,
if /etc/db12/total_cpu is different from the number of logical processors,
then it must be created during the original installation (typically by Kickstart)
and not by subsequent configuration by a system such as Puppet which will be
started after db12 has run. 

/etc/db12/total_cpu should match $MACHINEFEATURES/total_cpu so that the number of
DB12 instances run matches the number of processors available to be allocated to
jobs.
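
A simple consistency check along these lines could be run after installation.
It is only a sketch: the function name check_db12_total_cpu is hypothetical,
and it assumes each file contains a single integer.

```shell
#!/bin/sh
# Hypothetical check: do /etc/db12/total_cpu and $MACHINEFEATURES/total_cpu agree?
# Returns 0 if equal, 1 on mismatch, 2 if either file cannot be read.
check_db12_total_cpu() {
    db12_file="${1:-/etc/db12/total_cpu}"
    mf_file="${2:-/etc/machinefeatures/total_cpu}"
    db12_n=$(cat "$db12_file") || return 2
    mf_n=$(cat "$mf_file") || return 2
    if [ "$db12_n" -ne "$mf_n" ]; then
        echo "mismatch: db12 total_cpu=$db12_n, machinefeatures total_cpu=$mf_n" >&2
        return 1
    fi
    return 0
}
```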