
WIP: Common parser

Martina Javurkova requested to merge common_parser into master

This branch adds the common parser scripts. The changes touch several places, so they are described below:

  • common directory:
    • bmk-driver.sh
      • executes the bmkParser.py script if commonExpDir exists (for ATLAS, it is atlas/atlasCommon)
      • exports the EXPERIMENT variable, which is used by the common parser script
    • bmkParser.py
      • imports the parser of the experiment whose WL is running and calls its methods
      • methods that each experiment must implement in {EXPERIMENT}Common/{EXPERIMENT}Parser.py:
        • check(indir,app): checks that ALL input files are ready to be read, i.e. that they can be opened and the job(s) finished successfully; returns True/False
        • collect_values(indir,app): collects the information from each input file and fills it into a dictionary with self-explanatory keys and the following structure: {"digi-reco": {"wallScore":[...], "status":[...]}, "HITtoRDO":{...}}
          • mandatory keys: status (list of 0/1 containing the status of each copy) and wallScore (list of walltime scores of each copy)
      • optional methods, where each experiment can put anything it considers useful:
        • get_wl_custom(collectVars,calculateVars)
      • methods that have a default implementation:
        • calculate_stats(...): takes the dictionary from collect_values(indir,app) and calculates score, average, median, minimum and maximum
        • get_wl_inputs(): reads the input parameters from the variables exported in bmk-parser.sh (os.environ['NCOPIES'], int(os.environ['NTHREADS']) and os.environ['NEVENTS_THREAD']) and generates a "wl-inputs" key in the output dictionary
        • get_wl_scores(calculateVars): generates a "wl-scores" key
        • get_wl_stats(calculateVars): generates a "wl-stats" key
        • get_wl_info(app): generates a "wl-info" key with the information about WL from version.json file
        • save_output(app,jsonOverallSummary,summaryJSON): saves the output dictionary to a JSON file
      • the "wl-status" key in the output JSON file can have the following values:
        • number of failed copies, i.e. 0 if everything is fine
        • 901 if there is a problem with the input files, i.e. a file is missing or corrupted, or the job failed
        • 902 if no values were extracted from the input files and an empty dictionary is returned
        • 903 if there is no "status" key in the returned dictionary from collect_values(...)
        • 904 if there is no "wallScore" key in the returned dictionary from collect_values(...)
    • abstractParser.py
      • contains all predefined methods
    • Dockerfile.header
      • added python-importlib, which is used in bmkParser.py to dynamically import methods from each experiment according to the application running
  • atlas directory:
    • atlasCommon directory
      • contains the atlasParser.py script with methods specific to the ATLAS experiment
      • contains __init__.py to mark this directory as a Python package directory
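
The "wl-status" rules listed above can be sketched roughly as follows. This is only an illustration and not the code in bmkParser.py: the function name wl_status and its arguments are hypothetical, and it assumes that a "status" entry of 0 means the copy succeeded and that a copy counts as failed if any step reports a non-zero status for it.

```python
def wl_status(files_ok, collected):
    """Return a "wl-status" value following the rules described above.

    files_ok  -- result of the experiment's check(indir, app)
    collected -- dictionary returned by collect_values(indir, app)
    (hypothetical helper, not part of the merge request)
    """
    if not files_ok:
        return 901  # input files missing, corrupted, or job failed
    if not collected:
        return 902  # nothing was extracted from the input files
    for step_values in collected.values():
        if "status" not in step_values:
            return 903  # mandatory "status" key is missing
        if "wallScore" not in step_values:
            return 904  # mandatory "wallScore" key is missing
    # Assumption: 0 = successful copy, anything else = failed copy.
    # A copy is counted once even if it failed in several steps.
    failed_copies = {
        copy_index
        for step_values in collected.values()
        for copy_index, status in enumerate(step_values["status"])
        if status != 0
    }
    return len(failed_copies)
```

For example, wl_status(True, {"digi-reco": {"wallScore": [1.0, 2.0], "status": [0, 0]}}) gives 0, while an empty dictionary gives 902.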

The output JSON file is called ${APP}_summary_newParser.json and has the following keys:

  • wl-inputs: copies, threads_per_copy, events_per_thread
  • wl-scores: e.g. "wl-scores": {"ESDtoAOD": {"score": 9.4955}, "digi-reco": {"score": 0.58979}, ...}
  • wl-stats: median, avg, min and max
  • wl-status: 0: success, >0: number of failed copies or an error code (901-904)
  • wl-info: version, description, cvmfs_checksum, etc
  • wl-custom: each experiment can add any additional information
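
To make the layout concrete, here is a minimal sketch of how such a summary could be assembled. The helper names wall_stats and build_summary are hypothetical, all numbers and the "wl-info" content are placeholders, and nesting "wl-stats" per processing step is an assumption. The scores are passed in as-is because the score formula is not described here.

```python
import json
import statistics


def wall_stats(wall_scores):
    """Summary statistics of the per-copy walltime scores (hypothetical helper)."""
    return {
        "avg": statistics.mean(wall_scores),
        "median": statistics.median(wall_scores),
        "min": min(wall_scores),
        "max": max(wall_scores),
    }


def build_summary(inputs, scores, collected, status, info, custom=None):
    """Assemble the output dictionary with the six top-level keys."""
    return {
        "wl-inputs": inputs,
        "wl-scores": scores,
        # Assumption: one block of statistics per step found in collect_values()
        "wl-stats": {
            step: wall_stats(values["wallScore"])
            for step, values in collected.items()
        },
        "wl-status": status,
        "wl-info": info,
        "wl-custom": custom or {},
    }


# Illustrative values only, not real benchmark results
summary = build_summary(
    inputs={"copies": 2, "threads_per_copy": 4, "events_per_thread": 10},
    scores={"digi-reco": {"score": 0.58979}},
    collected={"digi-reco": {"wallScore": [0.25, 0.75], "status": [0, 0]}},
    status=0,
    info={"version": "placeholder", "description": "placeholder"},
)
print(json.dumps(summary, indent=2, sort_keys=True))
```

Keeping the assembly in one function means each get_wl_* method only has to return its own sub-dictionary.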

This should close several JIRA tickets: BMK-79, BMK-211, BMK-78, BMK-107, BMK-169.

It still needs to be tested.
