Faster prepare output file
By default, prepare_output_file.py now uses condor to process files, which speeds up the workflow. To process the files locally instead, use the --local flag.
When using condor, take care when processing data: one additional step is required compared with local processing:
Data processing with condor
Get Merged files (Data: Step 1)
python prepare_output_file.py --input /absolute/input/path --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --merge --output /absolute/output/path --condor
Get Merged files (Data: Step 2)
python prepare_output_file.py --input /absolute/input/path --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --merge --output /absolute/output/path --merge-data-only --condor
Get ROOT files (Data: Step 3)
python prepare_output_file.py --input /absolute/input_path/to_folder_with_merged --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --root --output /absolute/input_path/to_folder_with_merged --condor
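The three data steps above can be sketched as a small Python helper that only assembles and prints the command lines; it does not run anything. The function name `data_commands` is hypothetical and the paths are the placeholders used above. The sketch assumes that the folder passed as `--output` in steps 1 and 2 is the one passed as both `--input` and `--output` in step 3.

```python
import shlex

SCRIPT = "prepare_output_file.py"


def data_commands(input_dir, output_dir, cat_dict, var_dict):
    """Return the three data-processing commands as argument lists."""
    # Options shared by all three steps.
    common = ["--cats", "--catDict", cat_dict, "--varDict", var_dict, "--syst"]
    # Steps 1 and 2 share the same merge call; step 2 adds --merge-data-only.
    merge = ["python", SCRIPT, "--input", input_dir, *common,
             "--merge", "--output", output_dir]
    step1 = [*merge, "--condor"]
    step2 = [*merge, "--merge-data-only", "--condor"]
    # Step 3 reads from and writes to the folder that now holds "merged"
    # (assumption: this is the output folder of steps 1 and 2).
    step3 = ["python", SCRIPT, "--input", output_dir, *common,
             "--root", "--output", output_dir, "--condor"]
    return [step1, step2, step3]


commands = data_commands(
    "/absolute/input/path",
    "/absolute/output/path",
    "/absolute/path/to/cat_data.json",
    "/absolute/path/to/varDict_data.json",
)
for number, argv in enumerate(commands, start=1):
    print(f"# Data: Step {number}")
    print(shlex.join(argv))
```

The commands are printed rather than executed because each step presumably has to wait for the condor jobs of the previous one to finish, and how to detect that is not covered here.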
Once the parquet files have been merged (after step 1), a folder "merged" is created in /absolute/output/path. To produce the ROOT files, use /absolute/output/path (which now contains the "merged" subfolders) as the new input folder. The file processing for MC samples works in a similar way:
MC processing with condor
Get Merged files
python prepare_output_file.py --input /absolute/input/path --cats --catDict /absolute/path/to/cat_mc.json --varDict /absolute/path/to/varDict_mc.json --syst --merge --output /absolute/output/path --condor
Get ROOT files
python prepare_output_file.py --input /absolute/input_path/to_folder_with_merged --cats --catDict /absolute/path/to/cat_mc.json --varDict /absolute/path/to/varDict_mc.json --syst --root --output /absolute/input_path/to_folder_with_merged --condor
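As a rough sketch, the two MC steps could be wrapped in a Python function with a dry-run switch. The name `run_mc_steps` and the `dry_run` parameter are hypothetical, and the paths in the usage line are the placeholders from above. By default the function only prints the commands. With `dry_run=False` it calls the script through `subprocess.run`, which is only safe if the script returns after its condor jobs have finished; that behaviour is an assumption and is not confirmed here.

```python
import shlex
import subprocess


def run_mc_steps(input_dir, output_dir, cat_dict, var_dict, dry_run=True):
    """Print (and optionally run) the merge and ROOT steps for MC samples."""
    common = ["--cats", "--catDict", cat_dict, "--varDict", var_dict, "--syst"]
    steps = [
        # Merge step: writes the "merged" folder into output_dir.
        ["python", "prepare_output_file.py", "--input", input_dir, *common,
         "--merge", "--output", output_dir, "--condor"],
        # ROOT step: uses the folder containing "merged" as input and output.
        ["python", "prepare_output_file.py", "--input", output_dir, *common,
         "--root", "--output", output_dir, "--condor"],
    ]
    for argv in steps:
        print(shlex.join(argv))
        if not dry_run:
            # Assumption: the call blocks until the step is complete.
            subprocess.run(argv, check=True)
    return steps


run_mc_steps(
    "/absolute/input/path",
    "/absolute/output/path",
    "/absolute/path/to/cat_mc.json",
    "/absolute/path/to/varDict_mc.json",
)
```

Keeping the dry run as the default makes it easy to check the assembled paths before anything is submitted.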
Edited by Nico Harringer