Commit c728ec09 authored by Christoph Hasse, committed by Maciej Pawel Szymanski

Rework benchmark to use new scheduler and new options

parent 3efa9021
__pycache__/
*.py[cod]
*.log
This repository provides a set of scripts to ease benchmarking of MiniBrunel.
You will find 7 scripts here, and below is a short description of each of them.
The first one to use is RunThroughputJobs.py, which runs a throughput test given a certain options file.
Note that this is only supported for options files which are part of `Brunel/Rec/Brunel/python/upgrade_options`
The first one to use is runTest.py; I let you guess what it does.
Check the help, but the main things you want to know are:
* The MiniBrunel config is hardcoded in there; you have to change it by hand at the top, in prepareInput
* The default config will run HLT1 for 1000 events per thread/job, with 1 thread, 1 job, in hive mode and without the fit
- -t allows giving the number of threads you want to use; for example -t 1,2,4 will run MiniBrunel 3 times, with 1, 2 and 4 threads
- -j is similar for the number of jobs
- all combinations of -t and -j are run: -j 1,2 -t 1,2 will run 4 tests; you can limit the global concurrency with -m
- -p allows giving an explicit list of points (i.e. nbThreads:nbJobs) to run the test for, comma separated; -j and -t are then ignored
- --bestPoints lets the tool run the 6 best combinations of threads and jobs for the node (based on its number of cores); -p, -j and -t are then ignored
- -n allows changing the number of events per thread/job
- -r allows running each config several times; by default each config is run only once
- -o allows running only in hive mode. By default, for nbThreads=1, both hive and non-hive versions are run
- --numa allows respecting numa nodes. Jobs will be launched using numactl on the different numa nodes in a round-robin fashion, so you probably want the number of jobs to be a multiple of the number of numa nodes
- it takes a single positional argument: the input file name
- finally, it internally uses lbsmaps, so you need to have that script in your PATH
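For reference, this mirrors the call made by measureThroughput.sh; the input file path below is only an example and should point to your own copy:

```
# run the 6 best thread/job combinations for this node, 10000 events per thread/job,
# launching the jobs on the numa nodes via numactl; the positional argument is the input file
runTest.py --bestPoints -n 10000 --numa /dev/shm/00067189.mdf
```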
- OptsFileName specifies the options file to be used from `Brunel/Rec/Brunel/python/upgrade_options`; give only the name, without the .py ending
- TestFileDBKey sets the input file one wishes to use; for actual testing please override the file path to use local files, see the -f option
- -j defines the number of independent jobs you want to launch. The following options are per job!
- -t allows giving the number of threads you want to use, for example -t 4 will use 4 threads for each job
- -e allows changing the number of event slots per job
- -n allows changing the number of events per job
- --FTDecoVersion allows you to specify the decoding version needed for the FTDecoding
- --nonuma disables numa handling; if you don't know what that means, please leave it on (default)
- -f specifies the file paths of the input files matching the TestFileDBKey
- --profile will automatically use VTune to profile a single job (no multi-job support) and produce a flamegraph of the result
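As a sketch (the options file name and TestFileDB key below are placeholders, not real values), a two-job run on a local input file could look like:

```
# 2 independent jobs with 4 threads each, 1000 events (-n) each, reading a local MDF file
./RunThroughputJobs.py -j 2 -t 4 -n 1000 <OptsFileName> <TestFileDBKey> -f /dev/shm/00067189.mdf
```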
Take care to use as input a file on a RAM filesystem (e.g. /dev/shm) when you run, if you do not want to be slowed down by I/O.
The output of runTest.py is a set of .log, .csv and .xml files, one of each per test run and per job of the test. They will appear in the current directory and are named after the parameters of the test.
Put them in a given directory before you use the other scripts.
extract.py is the next one you may use: you give it a directory full of log, csv and xml files and it parses all of them to extract the key numbers into a file called extractedData (yes, a hardcoded name, to be changed).
That file contains a single Python tuple of two dictionaries (timing and memory data, keyed by CMTCONFIG and then by the (nbThreads, nbJobs, hive) point). It is the one you need for all subsequent plots.
The output of `RunThroughputJobs.py` is a set of .log files, one per job. They will appear in the current directory and are named after the parameters of the test.
computeThroughput.py computes a single throughput number from the extractedData file. This number is the average of the 6 best throughputs achieved.
`SetupThroughputOpts.py`, `doprofile.sh`, `flamegraph.pl`, and `stackcollapse-vtune.pl` are helper scripts used internally by `RunThroughputJobs.py`; don't worry about them.
plotMem.py and plotSpeedup.py each take this extractedData file as input and plot memory usage and speedup, respectively, with respect to the parallelism level (nb threads * nb jobs)
`doScaling.py` is a script that enables you to easily run `RunThroughputJobs.py` for different numbers of jobs, threads, event slots, and events. The options are similar to those of `RunThroughputJobs.py` and are also documented in `./doScaling.py --help`.
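For example (again with placeholders for the options file and TestFileDB key), two of the default configurations can be run explicitly with:

```
# each -c entry is jobs:threads:eventslots:events per job
./doScaling.py -c "1:2:4:50000,2:2:4:50000" <OptsFileName> <TestFileDBKey> -f /dev/shm/00067189.mdf
```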
plotAlgoUsage.py and hivetimeline.py are different: they take as input the csv file of one single test and plot what happened during that test.
- plotAlgoUsage.py plots a pie chart of the relative time spent in the different algorithms during the test
- hivetimeline.py plots the timeline of the job, with the scheduling of each algo and event on each core.
Take care that this one will not work so well with many events; you probably want no more than a few tens of events per thread. One trick is to just take the head -1000 of the csv file and look at the start of the job only.
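For example, assuming a csv file produced by one of the tests (the file name here is a placeholder):

```
# keep only the first ~1000 scheduling records and plot just the start of the job
head -1000 <one-test>.csv > head.csv
python hivetimeline.py head.csv
```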
`plotScaling.py` is simply executed in the base folder and looks for files matching the naming convention of the produced log files of the `doScaling.py` test and then produces a plot `scalingTest.png` which shows the throughput of each tested configuration.
measureThroughput.sh wraps up all the rest to compute the reference throughput of a machine by running the tests, extracting the throughputs and averaging them. It is a light wrapper around runTest, extract and computeThroughput.
#!/usr/bin/env python
"""Runs a single job as configured via specified options file which is passed to SetupThroughputOpts.py """
import re
import os
import sys
import argparse
import subprocess
def runJob(OutputFileName, jobidx, nbNumaNodes, modifiedEnv, doprofile):
'''Run a test with the given options'''
# open log file
outputFile = open("{}.log".format(OutputFileName), 'w+')
# build command line
cmdLine = ["gaudirun.py", str(os.path.dirname(os.path.realpath(__file__))) + "/SetupThroughputOpts.py"]
# deal with numa if needed
if nbNumaNodes > 1:
node = jobidx % nbNumaNodes
cmdLine = ["numactl", "-N", str(node), "-m", str(node), "--"] + cmdLine
# run the test
print("Launching job with cmdLine: {}".format(cmdLine))
process = subprocess.Popen(cmdLine, stdout=outputFile, stderr=subprocess.STDOUT, env=modifiedEnv)
if doprofile:
cmdLine = ["./doprofile.sh" , str(process.pid)]
if nbNumaNodes > 1:
node = (jobidx % nbNumaNodes) + 1
cmdLine = ["numactl", "-N", str(node), "-m", str(node), "--"] + cmdLine
profprocess = subprocess.Popen(cmdLine)
return process, outputFile
def main():
'''Main method : parses options and calls runJob which configures and runs a single Brunel job'''
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('-j', '--jobs', type=int, required=True,
help='nb of jobs to launch')
parser.add_argument('-t', '--threads', type=int, required=True,
help='nb of threads per job')
parser.add_argument('-e', '--evtSlots', type=int, default=-1,
help='nb of event slots per job (default: nb of threads per job + 10%)')
parser.add_argument('-n', '--events', default=1000, type=int,
help='nb of events to process per thread (default: 1000)')
parser.add_argument('OptsFileName', type=str,
help='Option file name which defines the reconstruction sequence to run.')
parser.add_argument('TestFileDBKey', type=str,
help='TestFileDB key name which defines input files and tags.')
parser.add_argument('--FTDecoVersion', default=4, type=int,
help='SciFi Decoding Version must match used input files (default: 4)')
parser.add_argument('-f','--inputFileNames', type=str, nargs='+', default="",
help='Names of input files, multiple names possible')
parser.add_argument('--nonuma', action='store_true',
help='whether to disable usage of numa domains.')
parser.add_argument('--profile', action='store_true',
help='whether to enable vtune profiling - only supported for single job.')
args = parser.parse_args()
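# if not given, default the number of event slots to roughly 10% more than the number of threads, plus one spare slot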
if args.evtSlots == -1:
args.evtSlots = 1+int(11*args.threads/10)
if args.profile and args.jobs >1:
raise RuntimeError("VTune Profile only supported for single job")
try:
with open(os.devnull, 'w') as FNULL:
subprocess.check_call(['amplxe-cl', '--help'], stdout=FNULL )
except:
if args.profile:
raise RuntimeError("can't execute amplxe-cl, please make sure that you have intel tools setup correctly")
else:
print("Warning: can't execute amplxe-cl, please make sure that you have intel tools setup correctly")
try:
with open(os.devnull, 'w') as FNULL:
subprocess.check_call(['gaudirun.py', '--help'], stdout=FNULL )
except:
raise RuntimeError("can't execute gaudirun.py, please make sure that you are in a Brunel environment")
# deal with numa config
nbNumaNodes = 1
if not args.nonuma:
try:
output = subprocess.check_output(["numactl", "-show"])
nodeline = [line for line in output.split('\n') if line.startswith('nodebind')][0]
nbNumaNodes = len(nodeline.split()) - 1
if(nbNumaNodes != args.jobs) and not args.profile:
print("Warning: There are {} available numa nodes but you are launching {} jobs".format(nbNumaNodes, args.jobs))
except:
# numactl not existing
print('Warning : -- numactl not found, running without setting numa nodes --')
# check how many threads the cpu actually has
cputhreads = int(subprocess.check_output(["lscpu", "-e=CPU"]).split('\n')[-2]) + 1
# check total number of threads
if args.jobs*args.threads > cputhreads:
print("CPU seems to only have {} threads but you specified {} jobs with {} threads each. This will overcommit the CPU, do you know what you are doing?".format(cputhreads, args.jobs, args.threads))
myenv = os.environ.copy()
myenv['NUMEVTS'] = str(args.events)
myenv['NUMEVTSLOTS'] = str(args.evtSlots)
myenv['NUMTHREADS'] = str(args.threads)
myenv['TESTDBKEY'] = args.TestFileDBKey
myenv['FTDECOVER'] = str(args.FTDecoVersion)
myenv['FILE'] = ",".join(args.inputFileNames)
myenv['OPTSFILE'] = args.OptsFileName
runningJobs=[]
for job in range(args.jobs):
outputFileName = 'ThroughputTest.{:s}.{:d}t.{:d}j.{:d}e.{:d}'.format(os.environ['CMTCONFIG'], args.threads, args.jobs, args.events, job)
runningJobs.append(runJob(outputFileName, job, nbNumaNodes, myenv, args.profile))
throughputs = []
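# each job log is expected to contain a line of the form 'Evts/s = <number>'; search for it from the end of the file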
regex = re.compile("Evts\/s = ([\d.]+)")
for idx, (job, ofile) in enumerate(runningJobs):
retcode = job.wait()
if retcode != 0 :
print("WARNING: non-zero return code from job {} with output file {}".format(idx,ofile.name))
# set file index to top of file
ofile.seek(0)
# read all lines from file
lines = ofile.readlines()
for l in lines[::-1]:
tmp = regex.search(l)
if tmp:
throughputs.append(float(tmp.group(1)))
print("Throughput of job {} is {} Evts/s.".format(idx,throughputs[-1]))
break
ofile.close()
print("Throughput test is finished. Overall reached Throughput is {} Evts/s".format(sum(throughputs)))
if __name__ == '__main__':
sys.exit(main())
import os, importlib
from Gaudi.Configuration import importOptions
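# all parameters are handed over from RunThroughputJobs.py via environment variables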
num_events = int(os.environ['NUMEVTS'])
num_eventslots = int(os.environ['NUMEVTSLOTS'])
num_threads = int(os.environ['NUMTHREADS'])
testDBKey = str(os.environ['TESTDBKEY'])
FTDecoVer = int(os.environ['FTDECOVER'])
filepath = os.environ['FILE'].split(',')
optsfile = str(os.environ['OPTSFILE'])
opts=importlib.import_module("upgrade_options.{}".format(optsfile))
# fix for empty list coming from the env variables
if filepath == ['']:
filepath = []
print("###SETUPTHROUGHPUTOPTS###\nImporting {} and executing with numevents:{}, eventslots:{}, threads:{} , using TestFileDBKey {}, SciFi decoding version {}, and set files to: {}".format(optsfile,num_events,num_eventslots, num_threads, testDBKey, FTDecoVer, filepath))
opts.runTest(testDBKey, num_eventslots, num_threads, num_events, FTDecoVer, filepath)
#!/usr/bin/env python
"""Compute reference throughput from the extracted data of a MiniBrunel benchmark"""
__author__ = "Sebastien Ponce"
import sys
times, mems = eval(open(sys.argv[1]).read())
# Extract data from input
if len(times.keys()) > 1:
print "Not supporting mixed CMTCONFIG, giving up"
sys.exit(-1)
times = times[times.keys()[0]]
# go though full list of points and extract throughput for each
scores = []
for nt, nj, useHive in times:
nbEvents, duration = times[(nt,nj,useHive)]
throughput = nbEvents/duration
scores.append(throughput)
# find best scores and average them
sscores = sorted(scores)[::-1][:6]
print sum(sscores)/len(sscores)
#!/usr/bin/env python
"""Runs a various thread : job configurations for scaling test"""
import re
import os
import sys
import argparse
import subprocess
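# each config entry is jobs:threads:eventslots:events (see the --config help below)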
DEFAULT_CONFIG=("1:2:4:50000,2:2:4:50000,4:2:4:50000,10:2:4:50000,18:2:4:100000,20:2:4:100000,22:2:4:100000,"
"1:8:10:200000,2:8:10:200000,3:8:10:200000,4:8:10:400000,5:8:10:400000,6:8:10:400000,"
"1:10:12:250000,2:10:12:250000,3:10:12:250000,"
"1:18:20:400000,2:18:20:400000,"
"1:20:24:500000,2:20:24:500000,"
"1:22:28:600000,2:22:24:600000")
def main():
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('-c','--config', type=str, default=DEFAULT_CONFIG,
help='''configurations to run the test for, specified as a list:
j:t:e:n,j:t:e:n,... where j=jobs, t=threads, e=eventslots and n=number of events per job''')
parser.add_argument('OptsFileName', type=str,
help='Option file name which defines the reconstruction sequence to run.')
parser.add_argument('TestFileDBKey', type=str,
help='TestFileDB key name which defines input files and tags.')
parser.add_argument('--FTDecoVersion', default=4, type=int,
help='SciFi Decoding Version must match used input files')
parser.add_argument('-f','--inputFileNames', type=str, nargs='+', default="",
help='Names of input files, multiple names possible')
args = parser.parse_args()
configs = args.config.split(',')
for conf in configs:
jobs, threads, eventslots, events = conf.split(':')
cmdLine = ['python', './RunThroughputJobs.py', '-e', eventslots, '-t', threads,
'-j', jobs, '-n', events, args.OptsFileName, args.TestFileDBKey]
if args.inputFileNames:  # only forward -f when input files were actually given
cmdLine += ['-f'] + args.inputFileNames
print("Starting config " + conf)
process = subprocess.Popen(cmdLine)
process.wait()
if __name__ == '__main__':
sys.exit(main())
#! /usr/bin/bash
set -euo pipefail
echo "yeah yeah yeah yeah such profile much wow! "
sleep 60
amplxe-cl -collect hotspots -d 60 -target-pid=${1} -r profile_out > /dev/null
amplxe-cl -R top-down -column="CPU Time:Self","Module" -report-out result.csv -format csv -csv-delimiter comma -r profile_out
sed -i '1d' result.csv > /dev/null
./stackcollapse-vtune.pl result.csv | ./flamegraph.pl --title "HLT1 Flame Graph" --minwidth 2 --width 1600 > flamy.svg
echo "Done"
#!/usr/bin/env python
"""Extracts throughputs from a set of log/csv files resulting from running a MiniBrunel benchmark"""
__author__ = "Sebastien Ponce"
import sys, os, csv
from multiprocessing import Pool
sequences = ["BrunelSequencer", "PhysicsSeq", "Reco", "RecoDecodingSeq", "RecoTrFastSeq"]
def extractMem(fileName):
f = open(fileName, 'r')
maxi = 0
for l in f.readlines():
if not l.startswith('<process'): continue
rss = int(l.split()[3][5:-1])/1000.0
if rss > maxi:
maxi = rss
f.close()
return maxi
def extractEvents(fileName):
events = {}
lastEvt = 0
with open(fileName, 'r') as csvfile:
reader = csv.reader(csvfile, delimiter=' ', quotechar='|')
titles = reader.next()
for s, e, a, t, sl, ne in reader:
ne = int(ne)
s = int(s)
e = int(e)
if ne in events:
ps, pe = events[ne]
events[ne] = (min(s, ps), max(e, pe))
else:
events[ne] = (s, e)
if ne > lastEvt:
lastEvt = ne
return lastEvt, events
def extractTimes(allEvents):
'''find out latest event number 10 and earliest last event,
then drop all events outside these bounds and count remaining
ones plus measure the time spent'''
startTime = 0
endTime = 100000000000000000000000000
for lastEvt, events in allEvents:
startTime = max(startTime, events[10][0])
endTime = min(endTime, events[lastEvt][1])
nbEvts = 0
for lastEvt, events in allEvents:
for ne in events:
if ne < 10: continue
if events[ne][0] < startTime or events[ne][1] > endTime: continue
nbEvts += 1
return nbEvts, startTime, endTime
resTime = {}
# list files per type and key (for csv ones)
print 'Listing Files'
xmlFiles = []
csvFiles = {}
prevNbEvents = -1
nbRuns = 0
for d in sys.argv[1:]:
for fileName in os.listdir(d):
if fileName[0:10] == 'MiniBrunel':
if fileName[-3:] == 'xml' :
xmlFiles.append(fileName)
elif fileName[-3:] == 'csv':
print 'using file %s/%s \r' % (d, fileName),
sys.stdout.flush()
base, cmtconfig, threads, jobs, isHive, events, runNb, job, ext = fileName.split('.')
run = int(runNb)
if run > nbRuns:
nbRuns = run+1
nbEvents = int(events[:-1])
if prevNbEvents < 0:
prevNbEvents = nbEvents
else:
if prevNbEvents != nbEvents:
print 'Inconsistent nb of events in the different files. This is not supported. Giving up'
sys.exit(1)
key = (int(threads[:-1]), int(jobs[:-1]), isHive=='hive')
if cmtconfig not in csvFiles:
csvFiles[cmtconfig] = {}
if key not in csvFiles[cmtconfig]:
csvFiles[cmtconfig][key] = {}
if run not in csvFiles[cmtconfig][key]:
csvFiles[cmtconfig][key][run] = []
csvFiles[cmtconfig][key][run].append('%s/%s' % (d, fileName))
print
# dealing with csv files
resTime = {}
print 'Extracting timing data'
for cmtconfig in csvFiles:
resTime[cmtconfig] = {}
for key in csvFiles[cmtconfig]:
for run in csvFiles[cmtconfig][key]:
startTime = 0
endtime = 100000000000000000000000000
allEvents = []
if len(csvFiles[cmtconfig][key][run]) != key[1]:
print 'MISSING FILE for %s, nbthread=%d, nbjobs=%d : %d out of %d present' % (cmtconfig, key[0], key[1], len(csvFiles[cmtconfig][key][run]), key[1])
# parse files in parallel
pool = Pool()
results = []
for fileName in csvFiles[cmtconfig][key][run]:
print 'using file %s/%s \r' % (d, os.path.basename(fileName)),
sys.stdout.flush()
results.append(pool.apply_async(extractEvents, ["%s/%s" % (d, os.path.basename(fileName))]))
for result in results:
allEvents.append(result.get())
nbevts, startTime, endTime = extractTimes(allEvents)
if endTime < startTime or nbevts == 0:
sys.stderr.write('\nWarning, unable to get consistent data for run %d with %d threads and %d jobs\n' % (run, key[0], key[1]))
else:
if key in resTime[cmtconfig]:
curNbEvts, curTime = resTime[cmtconfig][key]
else:
curNbEvts, curTime = 0, 0
resTime[cmtconfig][key] = (nbevts + curNbEvts, (endTime-startTime)/1000000000.0 + curTime)
print
# handle xml files
resMem = {}
print 'Extracting memory data'
for fileName in xmlFiles:
print 'using file %s/%s \r' % (d, fileName),
sys.stdout.flush()
value = extractMem("%s/%s" % (d, fileName))
base, cmtconfig, threads, jobs, isHive, events, runNb, job, ext = fileName.split('.')
run = int(runNb)
key = (int(threads[:-1]), int(jobs[:-1]), isHive=='hive')
if cmtconfig not in resMem:
resMem[cmtconfig] = {}
if key not in resMem[cmtconfig]:
resMem[cmtconfig][key] = [0]*(nbRuns+1)
resMem[cmtconfig][key][run] += value # mem taken is sum of different jobs
print
f = open('extractedData', 'w')
f.write(str((resTime, resMem)))
f.close()
#!/bin/sh
# create a temporary directory for benchmark logs
rm -rf results
mkdir results
cd results
# run effectively the benchmark
runTest.py --bestPoints -n 10000 --numa $1
# extract throughput from the logs
extract.py . > /dev/null 2>&1
cd ..
# compute final number
throughput=$(computeThroughput.py results/extractedData 2>&1)
# tar and zip the logs
tar cjf results.tbz results
# cleanup
rm -rf results
# spit out the output
echo "Throughput : $throughput"
from HLT1BaseLine import setupHLT1Reconstruction
from SetupHelper import setupGaudiCore, setupInput
from GaudiKernel.SystemOfUnits import mm, GeV
def runTest(nbEventSlots=1, threadPoolSize=1, evtMax=50000, inputFiles=[]):
appMgr, hiveDataBroker = setupGaudiCore(topAlgs=['PrForwardTrackingFast'], nbEventSlots=nbEventSlots, threadPoolSize=threadPoolSize, evtMax=evtMax)
setupHLT1Reconstruction(appMgr, hiveDataBroker, GECCut=11000, IPCut=True, IPCutVal=0.1*mm, VeloMinPT=0.8*GeV, FTMinPT=1.0*GeV)
setupInput(inputFiles, fileType='MDF', dataType='Upgrade', DDDBTag="dddb-20171010", CONDDBTag="sim-20180530-vc-md100", Simulation=True)
if __name__ == "__builtin__":
runTest(nbEventSlots=1, threadPoolSize=1, evtMax=10000, inputFiles=['/dev/shm/00067189.mdf'])
from HLT1BaseLine import setupHLT1Reconstruction
from SetupHelper import setupGaudiCore, setupInput
from GaudiKernel.SystemOfUnits import mm, GeV
def runTest(nbEventSlots=1, threadPoolSize=1, evtMax=50000, inputFiles=[]):
appMgr, hiveDataBroker = setupGaudiCore(topAlgs=['ForwardFitterAlgParamFast'], nbEventSlots=nbEventSlots, threadPoolSize=threadPoolSize, evtMax=evtMax)
setupHLT1Reconstruction(appMgr, hiveDataBroker, GECCut=11000, IPCut=True, IPCutVal=0.1*mm, VeloMinPT=0.8*GeV, FTMinPT=1.0*GeV)
setupInput(inputFiles, fileType='MDF', dataType='Upgrade', DDDBTag="dddb-20171010", CONDDBTag="sim-20180530-vc-md100", Simulation=True)
if __name__ == "__builtin__":
runTest(nbEventSlots=1, threadPoolSize=1, evtMax=10000, inputFiles=['/dev/shm/00067189.mdf'])
from SetupHelper import setupComponent, setupAlgorithm
from GaudiKernel.SystemOfUnits import mm, GeV
def setupHLT1Reconstruction(appMgr, hiveDataBroker, GECCut = -1, IPCut = False, IPCutVal = 0.1*mm, VeloMinPT = 0.3*GeV, FTMinPT = 0.3 * GeV, Fit=False):
'''declare all algorithm used in HLT1'''
evtClockSvc = setupComponent('EventClockSvc', InitialTime=1433509200000000000)
setupAlgorithm('Gaudi__Hive__FetchDataFromFile', appMgr, hiveDataBroker, instanceName='FetchDataFromFile', iovLockDependency=False, DataKeys = ['/Event/DAQ/RawEvent'])
odinPath = '/Event/DAQ/DummyODIN'
UTTracksLocation='Rec/Track/Upstream'
VeloTracksLocation='Rec/Track/Velo'
setupAlgorithm('LHCb__Tests__FakeEventTimeProducer', appMgr, hiveDataBroker, instanceName='DummyEventTime', iovLockDependency=False, Start=evtClockSvc.InitialTime / 1E9, Step=0, ODIN=odinPath)
setupAlgorithm('LHCb__DetDesc__ReserveDetDescForEvent', appMgr, hiveDataBroker, instanceName='ReserveIOV', iovLockDependency=False, ODIN=odinPath)
setupAlgorithm('createODIN', appMgr, hiveDataBroker, iovLockDependency=False)
setupAlgorithm('PrPixelTracking', appMgr, hiveDataBroker, OutputTracksName=VeloTracksLocation, HardFlagging = True, SkipLoopSens = True, MaxMissedOnTrack = 2, MaxMissedConsecutive = 1, PhiWindow = 2.5, PhiWindowExtrapolation = 2.5, ModulesToSkip = [], EarlyKill3HitTracks = True, UsePhiPerRegionsForward = False, BoostPhysics = False, AlgoConfig="ForwardThenBackward")
setupAlgorithm('FTRawBankDecoder', appMgr, hiveDataBroker, instanceName='createFTClusters', RawEventLocations = "/Event/DAQ/RawEvent")
prFwdTracking = setupAlgorithm('PrForwardTracking', appMgr, hiveDataBroker, instanceName='PrForwardTrackingFast', InputName=UTTracksLocation)
from Configurables import PrForwardTool
prFwdTracking.addTool(PrForwardTool, "PrForwardTool")
prFwdTracking.PrForwardTool.MinPT = FTMinPT
setupAlgorithm('VSPClus', appMgr, hiveDataBroker, instanceName='VSPClustering')
if GECCut > 0:
setupAlgorithm('PrGECFilter', appMgr, hiveDataBroker, NumberFTUTClusters = GECCut)
else:
setupAlgorithm('PrGECFilter', appMgr, hiveDataBroker)
setupAlgorithm('PrStoreFTHit', appMgr, hiveDataBroker)
setupAlgorithm('PrStoreUTHit', appMgr, hiveDataBroker, skipBanksWithErrors = True)
setupAlgorithm('PatPV3DFuture', appMgr, hiveDataBroker, instanceName='PatPV3D', InputTracks=VeloTracksLocation)
setupAlgorithm('TrackBeamLineVertexFinder', appMgr, hiveDataBroker)
setupAlgorithm('PrVeloUT', appMgr, hiveDataBroker, instanceName='PrVeloUTFast', OutputTracksName=UTTracksLocation, doIPCut=IPCut, minIP=IPCutVal, minPT=VeloMinPT)
from Configurables import MeasurementProviderT_MeasurementProviderTypes__UTLite_
fitter = setupAlgorithm('ParameterizedKalmanFit', appMgr, hiveDataBroker, instanceName='ForwardFitterAlgParamFast', InputName="Rec/Track/Forward", OutputName="Rec/Track/FittedForward", MaxNumOutlier=2, MeasProvider_UT=MeasurementProviderT_MeasurementProviderTypes__UTLite_())
from Configurables import TrackMasterExtrapolator, SimplifiedMaterialLocator
fitter.addTool( TrackMasterExtrapolator, name="extr")
fitter.extr.ApplyMultScattCorr = True
fitter.extr.ApplyEnergyLossCorr = False
fitter.extr.ApplyElectronEnergyLossCorr = True
fitter.extr.addTool(SimplifiedMaterialLocator, name = "MaterialLocator")
from HLT1BaseLine import setupHLT1Reconstruction
from SetupHelper import setupGaudiCore, setupInput
from GaudiKernel.SystemOfUnits import mm, GeV
def runTest(nbEventSlots=1, threadPoolSize=1, evtMax=50000, inputFiles=[]):
appMgr, hiveDataBroker = setupGaudiCore(topAlgs=['PrForwardTrackingFast'], nbEventSlots=nbEventSlots, threadPoolSize=threadPoolSize, evtMax=evtMax)
setupHLT1Reconstruction(appMgr, hiveDataBroker, GECCut=11000)
setupInput(inputFiles, fileType='MDF', dataType='Upgrade', DDDBTag="dddb-20171010", CONDDBTag="sim-20180530-vc-md100", Simulation=True)
if __name__ == "__builtin__":
runTest(nbEventSlots=1, threadPoolSize=1, evtMax=10000, inputFiles=['/dev/shm/00067189.mdf'])
from HLT1BaseLine import setupHLT1Reconstruction
from SetupHelper import setupGaudiCore, setupInput
from GaudiKernel.SystemOfUnits import mm, GeV
def runTest(nbEventSlots=1, threadPoolSize=1, evtMax=50000, inputFiles=[]):
appMgr, hiveDataBroker = setupGaudiCore(topAlgs=['ForwardFitterAlgParamFast'], nbEventSlots=nbEventSlots, threadPoolSize=threadPoolSize, evtMax=evtMax)
setupHLT1Reconstruction(appMgr, hiveDataBroker, GECCut=11000)
setupInput(inputFiles, fileType='MDF', dataType='Upgrade', DDDBTag="dddb-20171010", CONDDBTag="sim-20180530-vc-md100", Simulation=True)
if __name__ == "__builtin__":
runTest(nbEventSlots=1, threadPoolSize=1, evtMax=10000, inputFiles=['/dev/shm/00067189.mdf'])
'''
this file provides helper functions simplifying the configuration of a Gaudi application
'''
def setupComponent(name, instanceName=None, packageName='Configurables', **kwargs):
if instanceName is None:
instanceName = name
imported = getattr(__import__(packageName, fromlist=[name]), name)
return imported(instanceName, **kwargs)
def _addIOVLockDependency(configurable):
if hasattr(configurable, 'ExtraInputs') and '/Event/IOVLock' not in configurable.ExtraInputs:
configurable.ExtraInputs.append('/Event/IOVLock')
def setupAlgorithm(name, appMgr, hiveDataBroker, instanceName=None, iovLockDependency=True, **kwargs):
# import and create configurable
configurable = setupComponent(name, instanceName=instanceName, **kwargs)
# work around limitations in IOVLock implementation
if iovLockDependency:
_addIOVLockDependency(configurable)
# register it to the applicationManager and hiveDataBroker
hiveDataBroker.DataProducers.append(configurable)
appMgr.TopAlg.append(configurable)
return configurable
def setupGaudiCore(topAlgs, nbEventSlots, threadPoolSize, evtMax):
whiteboard = setupComponent('HiveWhiteBoard', instanceName="EventDataSvc", EventSlots=nbEventSlots, ForceLeaves = True)
eventloopmgr = setupComponent('HLTControlFlowMgr',
CompositeCFNodes = [( 'moore', 'NONLAZY_AND', topAlgs, True ),],
ThreadPoolSize = threadPoolSize)
appMgr = setupComponent('ApplicationMgr', packageName='Gaudi.Configuration', EventLoop = eventloopmgr, EvtMax=evtMax)
appMgr.ExtSvc.insert(0, whiteboard)
setupComponent('UpdateManagerSvc', WithoutBeginEvent=True)
hiveDataBroker = setupComponent('HiveDataBrokerSvc')
return appMgr, hiveDataBroker
def setupInput(inputFiles, fileType, dataType, DDDBTag, CONDDBTag, Simulation):
from GaudiConf import IOHelper
iohelper = IOHelper(fileType, fileType)
iohelper.setupServices()
evtSel = iohelper.inputFiles(inputFiles)
inputs = []
for inp in evtSel.Input:
inputs.append(inp +" IgnoreChecksum='YES'")
evtSel.Input = inputs
evtSel.PrintFreq = 10000
dddb = setupComponent('DDDBConf', Simulation = True, DataType = dataType)
conddb = setupComponent('CondDB', Upgrade = True, Tags = { "DDDB": DDDBTag, "SIMCOND" : CONDDBTag })
import sys
import numpy
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.cm as cm
def extractData(fileName, skipEvts=10):
f = open(fileName, 'r')
res = {}
evtset = set([])
for l in f.readlines()[1:]:
s, e, a, t, sl, ne = l.split() #start end algorithm thread slot event
if a.find('Seq') >= 0 or a == 'Reco':
continue
evtset.add(ne)
s = int(s)
e = int(e)
if a not in res:
res[a] = 0
res[a] += e-s
f.close()
return len(evtset), res
nbevts, data = extractData(sys.argv[1])
print nbevts, data
# clean up the data by merging everything below 2.5% into 'other'
cdata = {}
s = sum(data.values())
other = 0
for alg in data:
if data[alg] < s*0.025:
other += data[alg]
else:
cdata[alg] = data[alg]
cdata['other'] = other
# compute time spent per event
throughput = 1000000000.0*nbevts/s
# Setup colors
colors = cm.rainbow(numpy.linspace(0., 1., len(cdata.keys())))
plt.axes(aspect=1)
plt.pie(cdata.values(), [0.02] * len(cdata.keys()), cdata.keys(), autopct='%1.1f%%', shadow=True, startangle=90, colors=colors)
#plt.title('Throughput : %.1f evts/s/thread' % throughput)
#plt.title('Time Usage in MiniBrunel')
plt.savefig('./algousage.png')
import sys, os, numpy
times, mems = eval(open(sys.argv[1]).read())
import matplotlib.pyplot as plt
plt.xlabel('Nb cores/threads')
plt.ylabel('Max resident memory usage (MB)')