Faster prepare output file

Nico Harringer requested to merge faster-prepare-output-file into master

By default, files are now processed with condor to speed up the workflow. To process the files locally instead, use the --local flag.

When using condor to process data, one additional step is required compared to local processing:

Data processing with condor

Get Merged files (Data: Step 1)

python --input /absolute/input/path --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --merge --output /absolute/output/path --condor

Get Merged files (Data: Step 2)

python --input /absolute/input/path --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --merge --output /absolute/output/path --merge-data-only --condor

Get Root files (Data: Step 3)

python --input /absolute/input_path/to_folder_with_merged --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --root --output /absolute/input_path/to_folder_with_merged --condor

Once the parquet files are merged (after step 1), a folder "merged" is created in /absolute/output/path. To get the ROOT files, use /absolute/output/path (which now contains the "merged" subfolders) as the new input folder; this is the folder written as /absolute/input_path/to_folder_with_merged in the ROOT commands. MC samples are processed in a similar way.
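Since the ROOT step reads from the folder that the merge step wrote to, it can help to confirm that the "merged" folders exist before submitting it. The following is a minimal sketch, not part of this merge request: the helper names are hypothetical, and it assumes that the "merged" folders sit somewhere below the output path (directly or one level per sample).

```python
from pathlib import Path


def find_merged_dirs(output_path):
    """Return all directories named 'merged' anywhere below output_path, sorted.

    Assumption: the merge step creates folders literally named 'merged'
    below the output path, as described in the text above.
    """
    root = Path(output_path)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("merged") if p.is_dir())


def ready_for_root_step(output_path):
    """True if at least one 'merged' folder exists below output_path."""
    return len(find_merged_dirs(output_path)) > 0
```

If `ready_for_root_step("/absolute/output/path")` returns False, the merge jobs have most likely not produced their output yet, and the ROOT step would have nothing to read.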

MC processing with condor

Get Merged files

python --input /absolute/input/path --cats --catDict /absolute/path/to/cat_mc.json --varDict /absolute/path/to/varDict_mc.json --syst --merge --output /absolute/output/path --condor

Get Root files

python --input /absolute/input_path/to_folder_with_merged --cats --catDict /absolute/path/to/cat_mc.json --varDict /absolute/path/to/varDict_mc.json --syst --root --output /absolute/input_path/to_folder_with_merged --condor
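The data and MC sequences differ only in the extra --merge-data-only step, so the commands can be assembled from one helper. This is a rough sketch under stated assumptions: `build_steps` is a hypothetical helper, not part of this merge request, and the script name is passed in as the `script` parameter because it is not shown in the commands above. The flags are copied from those commands.

```python
def build_steps(script, input_dir, output_dir, cat_dict, var_dict, is_data):
    """Build the argument lists for the condor workflow, in submission order.

    `script` is a placeholder for the script name (an assumption; it is not
    given in the commands above). All flags are taken from those commands.
    """
    common = ["--cats", "--catDict", cat_dict, "--varDict", var_dict, "--syst"]

    # Merge step: reads the original input, writes into output_dir.
    merge = ["python", script, "--input", input_dir, *common,
             "--merge", "--output", output_dir, "--condor"]
    steps = [merge]

    # Data only: same command again with --merge-data-only before --condor.
    if is_data:
        steps.append(merge[:-1] + ["--merge-data-only", "--condor"])

    # ROOT step: the merge output folder is both the input and the output.
    root = ["python", script, "--input", output_dir, *common,
            "--root", "--output", output_dir, "--condor"]
    steps.append(root)
    return steps
```

Each returned list can be rendered as a shell line with `shlex.join(argv)` for inspection. With `is_data=True` the helper yields three commands, with `is_data=False` it yields two.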
Edited by Nico Harringer
