HEP-Workloads

HEP-Workloads is a collection of HEP applications for benchmarking purposes. The applications are packaged in self-contained Docker and Apptainer images that include the needed libraries from CVMFS.

HEP-Workloads are the components of the HEP benchmark HEPscore. Although HEP-Workloads can be run separately, a more straightforward way to execute them is via the HEPscore application. For that, see the documentation in the HEPscore project.

For the impatient: would you like to add a new workload to the repository? Check the following section of this README.

Summary of currently supported HEP workloads

The list of HEP workloads is maintained here, with pointers to each workload's code and to the images available in the Singularity/Apptainer format (SIF) and the Docker format.

Execute a HEP workload container

Given a HEP workload Docker image (DIMAGE) from the [docker registry](https://gitlab.cern.ch/hep-benchmarks/hep-workloads/container_registry), or a SIF image (SIMAGE) from the sif registry, run the benchmark with one of the following:

  • docker run --rm -v /tmp/results:/results $DIMAGE
    • In order to access the command line options: docker run --rm -v /tmp/results:/results $DIMAGE -h
  • apptainer run -C -B /tmp:/tmp -B /tmp/results:/results oras://$SIMAGE
    • In order to access the command line options: apptainer run -C -B /tmp:/tmp -B /tmp/results:/results oras://$SIMAGE -h

The HEP workload Docker images are also compatible with Apptainer. It is possible to run the containers with Apptainer by prefixing the image name with the docker:// URI:

  • apptainer run -C -B /tmp:/tmp -B /tmp/results:/results docker://$DIMAGE
    • The /tmp/results directory must exist. Note that the overlay or underlay option must be enabled in your apptainer.conf file. On systems with older Linux kernels that do not support OverlayFS, such as Scientific Linux 6, underlay must be used. Also note that full containment ('-C') is recommended when invoking Apptainer, in order to avoid potential conflicts with your environment settings and the configuration files in your home directory, which would otherwise be inherited by the container.
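
For reference, the relevant directives in apptainer.conf look like the following minimal excerpt; this assumes a standard installation under /etc/apptainer/ and defaults may differ between Apptainer versions, so check your local configuration:

    # /etc/apptainer/apptainer.conf (excerpt)
    # let Apptainer use kernel OverlayFS to assemble the container filesystem
    enable overlay = try
    # fall back to bind-mount based underlay on kernels without OverlayFS support
    enable underlay = yes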

Example of a fast benchmark: Run Athena KV

  • Run Athena v17.8.0.9 (legacy version of KV)
    • DIMAGE=gitlab-registry.cern.ch/hep-benchmarks/hep-workloads/atlas-kv-bmk:ci1.2
    • docker run --rm -v /<some_path>:/results $DIMAGE
    • apptainer run -C -B /tmp:/tmp -B /<some_path>:/results docker://$DIMAGE
    • command line options: -h (for help)
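
Putting the pieces together, a complete copy-and-paste run of this workload, assuming /tmp/results is used as the host results directory, could look like:

    # create the host directory that will receive the JSON report and the logs
    mkdir -p /tmp/results
    DIMAGE=gitlab-registry.cern.ch/hep-benchmarks/hep-workloads/atlas-kv-bmk:ci1.2
    # run with Docker ...
    docker run --rm -v /tmp/results:/results $DIMAGE
    # ... or with Apptainer, pulling the same Docker image
    apptainer run -C -B /tmp:/tmp -B /tmp/results:/results docker://$DIMAGE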

Example: Run CMS ttbar GEN-SIM Run3 Multi Architecture

DIMAGE=gitlab-registry.cern.ch/hep-benchmarks/hep-workloads/cms-gen-sim-run3-ma-bmk:latest

SIMAGE=gitlab-registry.cern.ch/hep-benchmarks/hep-workloads-sif/cms-gen-sim-run3-ma-bmk:latest_x86_64

(or use the tag latest_aarch64 for ARM nodes. Apptainer cannot resolve a multi-architecture image manifest as Docker does, therefore the architecture is explicitly appended to the tag)

Any combination of the following options can be used:

  • command line options: -h (for help), -t xx (number of threads, default: 4), -e xx (number of events per thread, default: depends on the WL), -c xx (number of copies, default: saturate the server), -d (debug verbosity)

    • The script will spawn a number of parallel processes equal to (number of available cores) / (number of threads per process)
  • In order to retrieve the JSON benchmark results and logs in a local directory, mount a host directory (/some_path) as the container volume /results

    • docker run --rm -v /some_path:/results $DIMAGE
    • apptainer run -C -B /tmp:/tmp -B /some_path:/results oras://$SIMAGE
  • In order to set the number of events per thread (default: 20) and the number of threads per CMSSW process (default: 4)

    • docker run --rm -v /some_path:/results $DIMAGE -t 10 -e 50
    • apptainer run -C -B /tmp:/tmp -B /some_path:/results oras://$SIMAGE -t 10 -e 50
  • In order to restrict the run to a fixed set of cores

    • docker run --rm --cpuset-cpus=0-7 $DIMAGE
    • This is not directly possible with Apptainer
  • In order to inspect information about the Docker image, including the label describing the HEP workload (see the example after this list)

    • docker inspect $DIMAGE
    • This is not directly possible with Apptainer
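
For illustration, the workload description label mentioned above can be extracted with the standard Docker formatting syntax (the --format template below is a generic Docker feature, not specific to this project):

    # print only the image labels, including the HEP workload description
    docker inspect --format '{{json .Config.Labels}}' $DIMAGE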

Add a new workload

In order to add a new workload, and before continuing with the next section, please check the following conditions:

  1. Contact the developers to get support via the HEP Benchmarks project Discourse Forum, or create an issue on GitLab.
  2. Is your code in cvmfs?
    • YES: in that case the framework developed in this repository will snapshot what is used by the application.
    • NO: please prepare a Dockerfile with all packages to be installed
  3. Does the application require input data?
    • NO: Ok, no need to worry
    • YES: please make sure that these data are declared OpenAccess by the collaboration. In order to include the input data in the container, look at this example
  4. When data, conditions and configuration files are copied locally, does the application require additional network connectivity to remote services?
    • NO: good, everything is OK.
    • YES: please clarify with the developers your use case and what cannot be made local in the container. The hep-workload container to be built should be fully standalone and independent of network connectivity.

HEP application in a standalone container

A standalone container running a HEP workload consists mainly of:

  1. A cvmfs area exposing the software libraries (if available)
  2. A set of input data (root files and condition data)
  3. An orchestrator script

In order to automate the build and testing of the workloads in this repository, a number of auxiliary files are also present.

Workloads are organized in sub-directories experiment/workload/experiment-workload; see as an example cms/gen-sim-run3-ma. A workload directory contains a .spec file, a Dockerfile.append and a sub-directory with the experiment-specific scripts, as sketched below. The experiment-specific folder includes the following components.
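
A simplified sketch of this layout for the CMS example (only the .spec and Dockerfile.append names are taken from the repository; the content of the experiment-specific folder is illustrative):

    cms/
      gen-sim-run3-ma/
        cms-gen-sim.spec         # build configuration used by the CI and by run_build.sh
        Dockerfile.append        # extra Docker instructions appended at image build time
        cms-gen-sim-run3-ma/     # experiment-specific scripts (orchestrator, parser, ...)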

The orchestrator script

The orchestrator script (example here for KV) takes care of configuring the application environment, running the application, parsing the produced output and creating the score results. The script has a few utility functions to start a configurable number of parallel (and independent) copies of the same application, so that all the cores of the machine under test receive a workload. Typically the number of copies depends on the number of available cores and the number of threads each copy will spawn.

The orchestrator script resides in a directory (example here for KV) and expects all the other components (namely /cvmfs and the input files) to be already available. How those components are made available depends on the approach followed.
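
As a rough illustration of the orchestration logic described above (this is a sketch, not the actual script; run_the_application is a hypothetical placeholder for the experiment payload):

    #!/bin/bash
    # Sketch: spawn independent copies of the payload and wait for all of them
    NTHREADS=${NTHREADS:-4}                  # threads per copy (the -t option)
    NCOPIES=$(( $(nproc) / NTHREADS ))       # default: saturate all available cores
    [ "$NCOPIES" -lt 1 ] && NCOPIES=1
    for i in $(seq 1 "$NCOPIES"); do
        # each copy runs in its own working directory and writes its own log
        ( mkdir -p "proc_$i" && cd "proc_$i" && run_the_application > out.log 2>&1 ) &
    done
    wait  # block until all copies finish, then the logs are parsed into the report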

The parser script

At the end of the run, the application must produce a performance result, in general built from data extracted from the logs. In the case of event-based applications, the performance result is the number of processed events per second.

When this performance metric is produced, the application, via a parser, produces this kind of JSON report, where the mandatory components are:

  • run_info: the configuration parameters
  • report.wl-scores: the dictionary of performance metrics

In addition, the report.wl-stats dictionary can be populated with additional metrics. An example of a parser used to generate the JSON report is here. In some cases the developers have preferred to write a Python parser and to use parseResults.sh just as a wrapper; that was the case for LHCb here. In both cases the script must assign to the env variable resJSON the string containing the key:value pairs for wl-scores, as in here:

resJSON='''
    "wl-scores": {"gen-sim": 18.8476 , "gen": 95.9261 , "sim" : 24.2931 } , 
    "wl-stats": { "gen-sim": { "avg" : 4.7119 , "median" : 4.711 , "min" : 4.699 , "max" : 4.7265, "count" : 4 } } ,
    "another_key": "another_value" 
    '''   
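
As a heavily simplified sketch of how such a value can be produced, assuming a hypothetical per-copy log proc_*/out.log where the application prints lines like "Events per second: <value>" (the real parsers are workload specific):

    # sum the per-copy throughputs (hypothetical log location and message format)
    throughput=$(grep -h "Events per second:" proc_*/out.log | awk '{sum+=$NF} END {print sum}')
    # assign the mandatory wl-scores entry to the resJSON variable read by the framework
    resJSON='"wl-scores": {"gen-sim": '"${throughput:-0}"'}'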

Build a standalone Workload container

The standalone container can be built in two complementary ways:

  1. interactive approach on a local machine
    • this is the option developers will follow at first
  2. automatic approach on a gitlab CI runner
    • this is the option that builds the official versions of the benchmarks. The CI procedure takes care of checking the versions, tagging the containers on the basis of the git commit hashes, uploading the standalone images to the GitLab registry, testing them and announcing the release.

In both approaches something happens behind the scenes: the snapshot of CVMFS. Let's look at this briefly.

Snapshot the CVMFS repository using the CVMFS Shrinkwrap utility

This is achieved by using this Dockerfile, where the local cvmfs area (empty in this repo) is populated with the snapshot of CVMFS obtained by using the CVMFS Shrinkwrap utility.

This utility assumes that:

  • a recent version of CVMFS, including shrinkwrap, is installed
  • an appropriate configuration of the shrinkwrap utility is present
  • the HEP application runs once, in order to open the cvmfs files that will then be extracted to build the snapshot.

When this is done, after running the shrinkwrap utility, a local archive of the cvmfs snapshot is built. The process goes through the creation of a trace file, which records what has been accessed, followed by the copy of those files to the local archive.
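
For orientation, the tracing and snapshotting steps with the upstream CVMFS tools look roughly as follows. The repository name, paths and parallelism below are illustrative, and the configuration actually used by this project is driven by the build scripts; check the CVMFS shrinkwrap documentation for the exact options:

    # 1) enable the CVMFS tracer, e.g. in /etc/cvmfs/default.local, so that every
    #    file accessed by the application is recorded in a trace log
    CVMFS_TRACEFILE=/tmp/cvmfs-trace-@fqrn@.log

    # 2) run the HEP application once, so that the trace lists all the files it needs

    # 3) export only the traced files into a local archive with cvmfs_shrinkwrap
    cvmfs_shrinkwrap --repo cms.cern.ch \
                     --src-config /etc/cvmfs/shrinkwrap.config \
                     --spec-file /tmp/cms.cern.ch.spec \
                     --dest-base /tmp/cvmfs-export \
                     -j 8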

Back to the build procedure

In order to automate the build procedure, a bash script main.sh is available in this repo. The script takes care of:

  • Installing the appropriate version of cvmfs (if needed)
  • Configuring the cvmfs client for the tracer
  • Triggering the execution of the HEP application
  • Creating the cvmfs local archive
  • Creating the standalone Docker container with the HEP application and the local copy of the cvmfs archive

The GitLab CI of this project is configured to run all these steps in an isolated environment using Docker containers (see .gitlab-ci.yml).

To run the same steps interactively on a local machine, the script run_build.sh should be used. The following example shows how.

Example: Run interactively the CVMFS Shrinkwrap procedure to build a standalone LHCb container

  1. Install docker (doc from official site)

    • yum remove docker docker-common docker-selinux docker-engine
      yum install -y yum-utils device-mapper-persistent-data lvm2
      yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
      yum install -y docker-ce
      systemctl start docker
      systemctl enable docker
  2. Clone the repository

    • git clone https://:@gitlab.cern.ch:8443/hep-benchmarks/hep-workloads.git
    • HEPWL=`readlink -f hep-workloads`
  3. Run the automatic procedure

    • $HEPWL/build-executor/run_build.sh -s $HEPWL/lhcb/gen-sim/lhcb-bmk.spec
    • The script creates a set of directories in /tmp/$USER
      • cvmfs_hep: will contain the mount points for the cvmfs bind mounts
      • logs: will contain the build logs, organized in the following manner
        • build-wl: where the source files (dockerfile, orchestrator, etc) are copied before the build
        • cvmfs-traces: where the traces of the cvmfs accessed files are stored
        • results: where the results of the intermediate runs of the application are stored
      • singularity: will contain the singularity cache
    • Each directory will then include a working directory associated with the build, named noCI-date-time-uid
  4. Tips

    • run $HEPWL/build-executor/run_build.sh -h to know all options
    • In case only cvmfs needs to be exposed, run run_build.sh with the option -e
      • $HEPWL/build-executor/run_build.sh -s $HEPWL/cms/gen-sim-run3-ma/cms-gen-sim.spec -e sleep
        • This will start a privileged container that will mount the cvmfs repositories listed in $HEPWL/cms/gen-sim-run3-ma/cms-gen-sim.spec and will stay alive so that a second container can bind mount the exposed cvmfs mount point
          • docker run -v /tmp/root/cvmfs_hep/MOUNTPOINT:/cvmfs:shared ...
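
For illustration, a second container reusing the exposed CVMFS area could then look like this (MOUNTPOINT is a placeholder for the mount point created under the cvmfs_hep directory by run_build.sh, and centos:7 is just an arbitrary image used here to inspect the content):

    # bind mount the CVMFS area exposed by the privileged build container
    docker run --rm -it \
      -v /tmp/root/cvmfs_hep/MOUNTPOINT:/cvmfs:shared \
      centos:7 ls /cvmfs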