How to Write a Handler
A handler works like any ordinary parser: it reads whatever data it needs and saves the values it extracts. The only differences are how you save those values and that your handler class must extend BaseHandler. For example, say you want to parse some logs and extract a number, e.g. my_number = file.readline()
To save my_number, call self.saveInt, self.saveFloat, etc., depending on the type of my_number.
The BaseHandler class provides the following methods:
- saveInt
- saveFloat
- saveString
- saveJSON
- saveFile
Each of these methods takes four arguments, of which description and group are optional. For example, saveInt(name, data, description, group):
- name: the name of the attribute you save (example: 'cpu_execute_time')
- data: the value of the attribute (example: 12.45)
- description (optional): any description you want to add for the attribute (example: 'this number represents...')
- group (optional): the group, if any, to which the attribute belongs (example: 'timing measures')
Say you have an attribute:
execution_time = 12.45
To save this number, you can pass all four arguments:
self.saveFloat('execution_time', 12.45, 'the total execution time', 'timing results')
or omit the description and group and pass only its name and value:
self.saveFloat('execution_time', 12.45)
Remember to use the method that matches the type: saveString for a string, saveInt for an integer, and so on.
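As a minimal sketch of matching the save method to the type: values read from a text file arrive as strings, so they need converting before they are saved. The BaseHandler below is a stand-in defined only so the example runs on its own (it just records the calls); the "name = value" log format and the names ExampleHandler and parse_line are hypothetical.

```python
class BaseHandler(object):
    """Stand-in for the real BaseHandler (assumption): it only records calls."""

    def __init__(self):
        self.saved = []

    def saveInt(self, name, data, description="", group=""):
        self.saved.append(("Int", name, data, description, group))

    def saveFloat(self, name, data, description="", group=""):
        self.saved.append(("Float", name, data, description, group))

    def saveString(self, name, data, description="", group=""):
        self.saved.append(("String", name, data, description, group))


class ExampleHandler(BaseHandler):
    def __init__(self):
        super(self.__class__, self).__init__()

    def parse_line(self, line):
        # Hypothetical log format: "<name> = <value>"
        name, raw = [part.strip() for part in line.split("=", 1)]
        try:
            self.saveInt(name, int(raw))
        except ValueError:
            try:
                self.saveFloat(name, float(raw))
            except ValueError:
                # Neither an int nor a float: keep it as text.
                self.saveString(name, raw)


handler = ExampleHandler()
for line in ["n_events = 1000", "execution_time = 12.45", "status = OK"]:
    handler.parse_line(line)
```

In a real handler the stand-in class would be replaced by the import of BaseHandler, and the save calls would go to the database rather than to a list.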
BaseHandler
Commands available to handlers:
saveInt(name, an_int, description="", group="")
Saves an int in the database.
saveFloat(name, a_float, description="", group="")
Saves a float in the database.
saveJSON(name, python_object, description="", group="")
To save Python built-in types, compositions of them, or ROOT objects that inherit from TObject to the database, use the saveJSON method:
obj = {"name": "SomeTest", "values": [1.0, 2.9, 3.2, {"name": "test"}]}
self.saveJSON("my_obj", obj)
root_obj = ROOT.gDirectory.Get("myRootObject")
self.saveJSON("my_root_obj", root_obj)
Note that saving ROOT objects only works for ROOT versions >= 6.08. If your copy of ROOT is older, attempting to use saveJSON on a ROOT object will fail with NotImplementedError. (You can test with the most recent ROOT using lb-run ROOT ./testHandlers.py ...; see How to Test Handlers.)
In the database the attribute is stored as a string, so the client side needs to decode the JSON.
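On the client side, decoding can be done with json.loads from the standard library. This is a minimal sketch, assuming the stored value is the JSON text of the saved object; the string below is produced with json.dumps purely for illustration, and how the client fetches it from the database is not shown.

```python
import json

# Illustrative stand-in for the string read back from the database.
stored = json.dumps({"name": "SomeTest", "values": [1.0, 2.9, 3.2, {"name": "test"}]})

# Decode the string back into Python built-in types.
obj = json.loads(stored)
print(obj["name"])               # SomeTest
print(obj["values"][3]["name"])  # test
```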
saveFile(name, filename, description="", group="")
To save a file, call saveFile with a 'name' for the file attribute and, in 'filename', the path to the file you want to save. For example, given the file my_results_file = '/afs/cern.ch/.../path/my_file', you can save it with:
##            attribute name     path to file
self.saveFile('my_results_file', '/afs/cern.ch/.../path/my_file')
You can also add a description or a group, as explained above.
Required Structure of a Handler
You must create a class which will extend the BaseHandler as shown here:
from BaseHandler import BaseHandler

class your_handler_name(BaseHandler):
    def __init__(self):
        super(self.__class__, self).__init__()

    def collectResults(self, directory):
        ...
First and most important, the handler's Python file must have the same name as the handler class. For example, if your handler class is called 'TimingHandler', the Python file must be named 'TimingHandler.py'.
Second, in __init__ you must call super(self.__class__, self).__init__() (as shown above; just copy and paste it). Finally, you must override the method collectResults(self, directory). Inside it, each time you want to save something, call one of the save methods, e.g. self.saveInt(...). You can then add anything else you want (for example, other functions) to your handler class.
Any file your handler needs must be located in the given directory (the directory argument of collectResults).
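Putting these rules together, a complete handler might look like the sketch below. The BaseHandler class here is a stand-in that only records calls, so that the example runs on its own; a real TimingHandler.py would use from BaseHandler import BaseHandler instead. The input file name timing.log and its one-number format are hypothetical.

```python
import os
import tempfile


class BaseHandler(object):
    """Stand-in for the real BaseHandler (assumption): it only records calls."""

    def __init__(self):
        self.saved = []

    def saveFloat(self, name, data, description="", group=""):
        self.saved.append((name, data, description, group))


class TimingHandler(BaseHandler):  # would live in TimingHandler.py
    def __init__(self):
        super(self.__class__, self).__init__()

    def collectResults(self, directory):
        # Hypothetical input: timing.log holds a single number on its first line.
        log_path = os.path.join(directory, "timing.log")
        with open(log_path) as log_file:
            execution_time = float(log_file.readline())
        self.saveFloat("execution_time", execution_time,
                       "the total execution time", "timing results")


# Try the handler against a temporary directory that mimics a job's output.
job_dir = tempfile.mkdtemp()
with open(os.path.join(job_dir, "timing.log"), "w") as log_file:
    log_file.write("12.45\n")

handler = TimingHandler()
handler.collectResults(job_dir)
print(handler.saved)
```

Everything the handler reads is resolved relative to the directory argument, so the same class works on any job output directory passed to it.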
How to Test Handlers
You can use the testHandlers.py script. Pass it the directory with the job's output and a list of handlers:
usage: testHandlers.py [-h] [-r RESULTS] -l HANDLERS
Test handlers: you need to set a directory with job results and a list of
handlers
optional arguments:
  -h, --help            show this help message and exit
  -r RESULTS, --results RESULTS
                        Directory which contains results, default is the
                        current directory
  -l HANDLERS, --list-handlers HANDLERS
                        The list of handlers (comma separated.)
For example:
$> ./testHandlers.py -r /path/to/output -l GeantTestEm3Handler
The result is a zip file. It contains all files saved with the saveFile method and a json_results text file with all other attributes.
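To check what a handler produced, the zip file can be inspected with the standard zipfile module. The sketch below rests on assumptions: that json_results sits at the top level of the archive under exactly that name, and that its content is JSON (the function falls back to the raw text if it is not). The demo archive and the record layout inside it are made up for illustration and do not reflect the real output format.

```python
import json
import os
import tempfile
import zipfile


def read_results(zip_path):
    """Return (names of archive members, decoded json_results or its raw text)."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        # Assumption: the member is called exactly "json_results".
        text = archive.read("json_results").decode("utf-8")
    try:
        return names, json.loads(text)
    except ValueError:
        # Not valid JSON: hand back the text unchanged.
        return names, text


# Build a small stand-in archive so the sketch runs on its own.
demo_zip = os.path.join(tempfile.mkdtemp(), "demo_results.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("json_results",
                     json.dumps([{"name": "execution_time", "data": 12.45}]))
    archive.writestr("my_file", "file content")

names, results = read_results(demo_zip)
print(names)
print(results)
```

On a real run, pass the path of the zip file written by testHandlers.py instead of demo_zip.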
Environment
Here are instructions for setting up your environment for testing the handlers. It is potentially easier on lxplus - see the LXPLUS subsection below.
Obtain a local environment for running the handlers (as done centrally):
env -i bash -c "source /cvmfs/lhcb.cern.ch/lib/LbEnv ; lb-conda-dev virtual-env default hd-env"
./hd-env/run pip install \
--index-url https://lhcb-repository.web.cern.ch/repository/pypi/simple \
--use-feature=2020-resolver \
-r https://gitlab.cern.ch/lhcb-core/nightlies-jenkins-scripts/raw/master/lhcbpr2hd-reqs/requirements.txt
If you need access to websites signed by the CERN CA (lhcb-couchdb.cern.ch), install the CERN certificates (if they are not already in /etc/pki/tls/certs) and set REQUESTS_CA_BUNDLE (recent versions of requests use certifi rather than the system certificates):
sudo rpm -i https://linuxsoft.cern.ch/cern/centos/7/cern/x86_64/repoview/CERN-CA-certs.html
sudo update-ca-trust
export SSL_CERT_DIR=/etc/pki/tls/certs
export REQUESTS_CA_BUNDLE=$SSL_CERT_DIR/ca-bundle.crt
To propagate these environment variables, it is useful to wrap the usual ./hd-env/run wrapper (which does not propagate all variables):
cat >run <<'EOF'
#!/bin/bash
certs_dir=/etc/pki/tls/certs
exec ./hd-env/run env SSL_CERT_DIR=$certs_dir REQUESTS_CA_BUNDLE=$certs_dir/ca-bundle.crt "$@"
EOF
chmod +x run
Any command that you run should then be prefixed with ./run.
LXPLUS
To use lxplus, the following environment variables should be set:
SSL_CERT_DIR='/etc/pki/tls/certs/'
REQUESTS_CA_BUNDLE='/etc/pki/tls/cert.pem'
SSL_CERT_FILE='/etc/pki/tls/cert.pem'
As in the section above, these can either be exported or wrapped into a ./run command.
Either method then makes it possible to follow the advice from the section below to run single automated tests.
NB: Some handlers can require extra environment variables to be set, like the BandwidthTestHandler. See the relevant TestHandler docstring for more information.
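Before running anything, it can help to confirm that the certificate variables are actually visible to Python, for instance when the ./run wrapper is in use. The sketch below checks the three variable names from the LXPLUS list; the helper name missing_cert_vars is hypothetical, and the check only tests that the variables are non-empty, not that the paths are valid.

```python
import os

# Variable names taken from the LXPLUS list above.
REQUIRED_VARS = ("SSL_CERT_DIR", "REQUESTS_CA_BUNDLE", "SSL_CERT_FILE")


def missing_cert_vars(environ=os.environ):
    """Return the certificate variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_cert_vars()
if missing:
    print("Not set: " + ", ".join(missing))
else:
    print("All certificate variables are set")
```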
Semi-automated tests
In addition to the ./testHandlers.py script, there is a folder named tests for automated tests. Automated tests of this kind are not required for every handler, but can sometimes be helpful for tracking down bugs. All tests should use the unittest framework (pytest, used in the commands below, also collects and runs unittest test cases).
Here are some example commands for listing and running the tests:
./run python -m pytest --collect-only # list
./run python -m pytest # run all tests
./run python -m pytest --log-cli-level=DEBUG tests/test_ThroughputHandlers.py # run a single test
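A test file in the tests folder could be structured like the sketch below. It is not taken from the existing tests: the handler, its timing.log input, and the stand-in BaseHandler are all hypothetical and defined inline so the example runs on its own. It uses unittest.mock to replace saveFloat on the handler instance, so the test checks what the handler tries to save without needing a database.

```python
import os
import tempfile
import unittest
from unittest import mock


class BaseHandler(object):
    """Stand-in for the real BaseHandler (assumption)."""

    def saveFloat(self, name, data, description="", group=""):
        pass


class TimingHandler(BaseHandler):
    """Hypothetical handler under test."""

    def collectResults(self, directory):
        with open(os.path.join(directory, "timing.log")) as log_file:
            self.saveFloat("execution_time", float(log_file.readline()))


class TestTimingHandler(unittest.TestCase):
    def test_saves_execution_time(self):
        # Create a throwaway directory that mimics a job's output.
        job_dir = tempfile.mkdtemp()
        with open(os.path.join(job_dir, "timing.log"), "w") as log_file:
            log_file.write("12.45\n")

        handler = TimingHandler()
        # Replace saveFloat on this instance and record how it is called.
        with mock.patch.object(handler, "saveFloat") as save_float:
            handler.collectResults(job_dir)

        save_float.assert_called_once_with("execution_time", 12.45)
```

Because the class derives from unittest.TestCase, the pytest commands above would pick it up once the file is saved under tests with a test_ prefix.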