
How to Write a Handler

Each handler works like an ordinary parser: it parses whatever data it needs and saves the results. The only differences are the way you save the numbers or objects you extract, and that your handler class must extend BaseHandler. For example, suppose you want to parse some logs and extract a number, e.g. my_number = file.readline()

To save my_number, call self.saveInt, self.saveFloat, etc., depending on the type of my_number.

The BaseHandler class provides the following methods:

  • saveInt
  • saveFloat
  • saveString
  • saveJSON
  • saveFile

Each of the above methods takes four arguments, of which 'description' and 'group' are optional. For example, saveInt(name, data, description, group):

  • name: the name of the attribute you save (example: 'cpu_execute_time')
  • data: the value of the attribute (example: 12.45)
  • description (optional): any description you want to add for the attribute (example: 'this number represents...')
  • group (optional): the group, if any, to which the attribute belongs (example: 'timing measures')

Suppose you have an attribute execution_time = 12.45. You can save it with a description and a group: self.saveFloat('execution_time', 12.45, 'the total execution time', 'timing results'), or with just its name and value: self.saveFloat('execution_time', 12.45)

Remember to use the method that matches the type: use saveString to save a string, saveInt to save an integer, and so on.
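As a minimal sketch of matching the method to the type, the helper below picks a save method from the Python type of a value. Both save_typed and the RecordingHandler stand-in are hypothetical and exist only so the example runs on its own; in a real handler you would call the self.save... methods inherited from BaseHandler directly.

```python
class RecordingHandler:
    """Hypothetical stand-in for BaseHandler: records calls instead of saving."""

    def __init__(self):
        self.saved = []

    def _record(self, kind, name, data, description, group):
        self.saved.append((kind, name, data, description, group))

    def saveInt(self, name, data, description="", group=""):
        self._record("Int", name, data, description, group)

    def saveFloat(self, name, data, description="", group=""):
        self._record("Float", name, data, description, group)

    def saveString(self, name, data, description="", group=""):
        self._record("String", name, data, description, group)


def save_typed(handler, name, value, description="", group=""):
    """Hypothetical helper: choose the save method from the value's type."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("no save method chosen for bool: %r" % (value,))
    if isinstance(value, int):
        handler.saveInt(name, value, description, group)
    elif isinstance(value, float):
        handler.saveFloat(name, value, description, group)
    elif isinstance(value, str):
        handler.saveString(name, value, description, group)
    else:
        raise TypeError("unsupported type for %s: %s" % (name, type(value).__name__))


handler = RecordingHandler()
# Values read from a file arrive as strings, so convert them before saving.
save_typed(handler, "n_events", int("1000"))
save_typed(handler, "execution_time", float("12.45"),
           "the total execution time", "timing results")
save_typed(handler, "platform", "x86_64")
```

The explicit conversion matters because a value such as file.readline() is always a string until you convert it.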

Commands available to BaseHandler

saveInt(name, an_int, description="",group="")

Saves an int in the database.

saveFloat(name, a_float, description="",group="")

Saves a float in the database.

saveJSON(name, python_object, description="",group="")

If you would like to save Python built-in types, compositions of them, or ROOT objects that inherit from TObject to the database, you can use the saveJSON method:

obj = {"name": "SomeTest", "values": [1.0, 2.9, 3.2, {"name": "test"}]}
self.saveJSON("my_obj", obj)

root_obj = ROOT.gDirectory.Get("myRootObject")
self.saveJSON("my_root_obj", root_obj)

Note that saving ROOT objects only works for ROOT versions >= 6.08. If your copy of ROOT is older than that, attempting to use saveJSON on a ROOT object will fail with NotImplementedError. (You can test using the most up to date ROOT with lb-run ROOT ./testHandlers.py ...; see How to Test Handlers)

In the database the attribute is stored as a string, so on the client side you need to decode the JSON.
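A sketch of that client-side decoding step, using only the standard json module. How the string is fetched from the database is not covered here, so stored_value is simulated with json.dumps; that is an assumption about the stored form, not a description of what saveJSON does internally.

```python
import json

# The object from the saveJSON example above.
obj = {"name": "SomeTest", "values": [1.0, 2.9, 3.2, {"name": "test"}]}

# Assumption: the database holds the JSON text of the object. Simulate it here.
stored_value = json.dumps(obj)

# Client side: the attribute arrives as a string and must be decoded.
decoded = json.loads(stored_value)
nested_name = decoded["values"][3]["name"]
```

After decoding, the value behaves like the original Python object, so nested entries can be accessed by key and index as usual.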

saveFile(name,filename,description="",group="")

To save a file, call saveFile with a 'name' for the file attribute and, in 'filename', the path to the file you want to save. For example, if you have a file my_results_file = '/afs/cern.ch/.../path/my_file', you can save it with:

    ##				attribute name 			path to file
    self.saveFile('my_results_file' , '/afs/cern.ch/.../path/my_file' )

You can also add a description or a group, as explained above.

Required structure of your handler

You must create a class which will extend the BaseHandler as shown here:

    from BaseHandler import BaseHandler
    
    class your_handler_name(BaseHandler):
        
        def __init__(self):
            super(self.__class__, self).__init__()
        
        def collectResults(self,directory):
            ...

First and most important, the handler Python file must have the same name as the handler class. For example, if your handler class is called 'TimingHandler', then the Python file must be named 'TimingHandler.py'.

Second, in __init__ you must call super(self.__class__, self).__init__() (as shown above; just copy and paste it).

Finally, you must override the method collectResults(self, directory). Inside it, each time you want to save something, call one of the save methods, e.g. self.saveInt(...) or self.saveFloat(...). You can add anything else you want to your handler class, such as helper functions.

Any file your handler needs must be found in the given directory (the directory argument of collectResults).
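Putting these rules together, a complete handler might look like the sketch below. The file name timing.log and its "name = value" line format are hypothetical, and the fallback BaseHandler class is only a stand-in so the example runs outside the repository.

```python
import os
import re
import tempfile

try:
    from BaseHandler import BaseHandler
except ImportError:
    # Hypothetical stand-in so the sketch runs outside the repository.
    class BaseHandler(object):
        def __init__(self):
            self.saved = []

        def saveFloat(self, name, data, description="", group=""):
            self.saved.append((name, data, description, group))


class TimingHandler(BaseHandler):
    """Would live in TimingHandler.py, matching the class name."""

    # Matches lines such as "execution_time = 12.45" (hypothetical log format).
    LINE_RE = re.compile(r"^\s*(\w+)\s*=\s*([-+]?\d+(?:\.\d+)?)\s*$")

    def __init__(self):
        super(self.__class__, self).__init__()

    def collectResults(self, directory):
        # Only look inside the directory passed to collectResults.
        log_path = os.path.join(directory, "timing.log")  # hypothetical file name
        with open(log_path) as log_file:
            for line in log_file:
                match = self.LINE_RE.match(line)
                if match:
                    name, value = match.groups()
                    self.saveFloat(name, float(value), group="timing results")


# Usage: run the handler on a throwaway directory with a small log file.
timing_handler = TimingHandler()
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "timing.log"), "w") as log_file:
        log_file.write("# header\nexecution_time = 12.45\nnot a measurement\n")
    timing_handler.collectResults(directory)
```

Lines that do not match the pattern are skipped silently here; a real handler may prefer to log or raise on unexpected input.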

How to Test Handlers

You can use the testHandlers.py script. Pass it the directory containing the job's output and a list of handlers.

usage: testHandlers.py [-h] [-r RESULTS] -l HANDLERS

Test handlers: you need to set a directory with job results and a list of
handlers

optional arguments:
  -h, --help            show this help message and exit
  -r RESULTS, --results RESULTS
                        Directory which contains results, default is the
                        current directory
  -l HANDLERS, --list-handlers HANDLERS
                        The list of handlers (comma separated.)

For example:

$> ./testHandlers.py -r /path/to/output -l GeantTestEm3Handler

The result is a zip file. It contains all files saved with the saveFile method, plus a json_results text file with all other attributes.
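To check what a run produced, the archive can be inspected with the standard zipfile module. This is a sketch only: the helper summarize_results_zip is hypothetical, it assumes the attributes file has "json_results" in its member name, and the archive built in the usage part holds placeholder content rather than the real schema.

```python
import io
import zipfile


def summarize_results_zip(zip_source):
    """Return (sorted member names, text of the json_results member or None)."""
    with zipfile.ZipFile(zip_source) as archive:
        names = sorted(archive.namelist())
        json_text = None
        for member in names:
            # Assumption: the attributes file has "json_results" in its name.
            if "json_results" in member:
                json_text = archive.read(member).decode("utf-8")
                break
    return names, json_text


# Usage with an in-memory archive standing in for the real output.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("json_results", '[{"name": "execution_time"}]')  # placeholder
    archive.writestr("my_file", "file saved with saveFile")
buffer.seek(0)
names, json_text = summarize_results_zip(buffer)
```

The same function accepts a path to the zip file on disk instead of the in-memory buffer.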

Environment

Here are instructions for setting up your environment for testing the handlers. It is potentially easier on lxplus - see the LXPLUS subsection below.

Obtain a local environment for running the handlers (as done centrally):

env -i bash -c "source /cvmfs/lhcb.cern.ch/lib/LbEnv ; lb-conda-dev virtual-env default hd-env"
./hd-env/run pip install \
  --index-url https://lhcb-repository.web.cern.ch/repository/pypi/simple \
  --use-feature=2020-resolver \
  -r https://gitlab.cern.ch/lhcb-core/nightlies-jenkins-scripts/raw/master/lhcbpr2hd-reqs/requirements.txt

If you need access to websites signed by CERN CA (lhcb-couchdb.cern.ch), install CERN certificates (if they are not already in /etc/pki/tls/certs), and set REQUESTS_CA_BUNDLE (as recent versions of requests use certifi rather than the certificates from the system).

sudo rpm -i https://linuxsoft.cern.ch/cern/centos/7/cern/x86_64/repoview/CERN-CA-certs.html
sudo update-ca-trust
export SSL_CERT_DIR=/etc/pki/tls/certs
export REQUESTS_CA_BUNDLE=$SSL_CERT_DIR/ca-bundle.crt

In order to propagate these environment variables, it is useful to wrap the usual ./hd-env/run wrapper (as it does not propagate all variables):

cat >run <<'EOF'
#!/bin/bash
certs_dir=/etc/pki/tls/certs
exec ./hd-env/run env SSL_CERT_DIR=$certs_dir REQUESTS_CA_BUNDLE=$certs_dir/ca-bundle.crt "$@"
EOF
chmod +x run

Any command that you run should then be prefixed with ./run.
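If a request to a CERN-signed site still fails, it can help to confirm that the variables actually reach the Python process. The check below is a hypothetical helper, not part of the repository; it only inspects the two variables named above.

```python
import os


def check_ca_environment(environ=None):
    """Return a list of problems with the CA-related environment variables."""
    environ = os.environ if environ is None else environ
    problems = []

    bundle = environ.get("REQUESTS_CA_BUNDLE")
    if not bundle:
        problems.append("REQUESTS_CA_BUNDLE is not set")
    elif not os.path.isfile(bundle):
        problems.append("REQUESTS_CA_BUNDLE does not point to a file: " + bundle)

    cert_dir = environ.get("SSL_CERT_DIR")
    if not cert_dir:
        problems.append("SSL_CERT_DIR is not set")
    elif not os.path.isdir(cert_dir):
        problems.append("SSL_CERT_DIR is not a directory: " + cert_dir)

    return problems


# Passing an explicit mapping makes the check easy to try without touching os.environ.
problems_when_unset = check_ca_environment({})
```

Saved to a file (say check_env.py, a hypothetical name), it could be run through the wrapper by calling check_ca_environment() with no arguments and printing the result.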

LXPLUS

To use lxplus, the following environment variables should be set:

SSL_CERT_DIR='/etc/pki/tls/certs/'
REQUESTS_CA_BUNDLE='/etc/pki/tls/cert.pem'
SSL_CERT_FILE='/etc/pki/tls/cert.pem'

As in the section above, these can either be exported or wrapped into a ./run command. Either method then makes it possible to follow the advice in the section below to run single automated tests.

NB: Some handlers can require extra environment variables to be set, like the BandwidthTestHandler. See the relevant TestHandler docstring for more information.

Semi-automated tests

In addition to the ./testHandlers.py script, there is a folder named tests that holds automated tests. Automated tests of this kind are not required for every handler, but can sometimes be helpful for tracking down bugs. All tests should use the unittest framework.

Here are some example commands for listing and running the tests:

./run python -m pytest --collect-only  # list
./run python -m pytest  # run all tests
./run python -m pytest --log-cli-level=DEBUG tests/test_ThroughputHandlers.py  # run a single test
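For orientation, a unittest-style test for a handler could look like the sketch below. ExampleHandler and its timing.log input are hypothetical; a real test would import a handler from the repository. The save method is replaced with a mock so the test makes no assumption about how BaseHandler stores results.

```python
import os
import tempfile
import unittest
from unittest import mock


class ExampleHandler(object):
    """Hypothetical handler; a real test would import one from the repository."""

    def saveFloat(self, name, data, description="", group=""):
        raise AssertionError("replaced by a mock in the test")

    def collectResults(self, directory):
        with open(os.path.join(directory, "timing.log")) as log_file:
            value = float(log_file.readline())
        self.saveFloat("execution_time", value)


class TestExampleHandler(unittest.TestCase):
    def test_saves_execution_time(self):
        handler = ExampleHandler()
        with tempfile.TemporaryDirectory() as directory:
            with open(os.path.join(directory, "timing.log"), "w") as log_file:
                log_file.write("12.45\n")
            # Patch the save method so only the call itself is checked.
            with mock.patch.object(handler, "saveFloat") as save_float:
                handler.collectResults(directory)
        save_float.assert_called_once_with("execution_time", 12.45)


# Run the test case programmatically; pytest also collects unittest.TestCase classes.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestExampleHandler)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Placed in a file under tests, the TestCase class alone is enough; the last two lines are only there so the sketch can be run directly.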