Faster prepare output file (!218) · Merge requests · cms-analysis / General / HiggsDNA

Nico Harringer requested to merge faster-prepare-output-file into master May 23, 2024

prepare_output_file.py now uses condor by default to process files, which speeds up the workflow. To process the files locally instead, pass the --local flag.

With condor, processing data requires one additional step compared to local processing:

Data processing with condor

Get Merged files (Data: Step 1)

python prepare_output_file.py --input /absolute/input/path --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --merge --output /absolute/output/path --condor

Get Merged files (Data: Step 2)

python prepare_output_file.py --input /absolute/input/path --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --merge --output /absolute/output/path --merge-data-only --condor

Get Root files (Data: Step 3)

python prepare_output_file.py --input /absolute/input_path/to_folder_with_merged --cats --catDict /absolute/path/to/cat_data.json --varDict /absolute/path/to/varDict_data.json --syst --root --output /absolute/input_path/to_folder_with_merged --condor
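The three data steps above differ only in a few flags, which a short Python sketch can make explicit. The helper name `build_data_commands` and its parameters are hypothetical and not part of HiggsDNA; the flags are copied from the commands above, and step 3 is assumed to read from and write to the output folder of steps 1 and 2. The sketch only builds and prints the command lines; it does not submit anything.

```python
import shlex


def build_data_commands(input_path, output_path, cat_dict, var_dict):
    """Return the three data command lines (as argv lists) in the order they are run.

    Hypothetical helper for illustration; only flags shown in this merge request are used.
    """
    base = ["python", "prepare_output_file.py"]
    # Flags shared by all three steps.
    common = ["--cats", "--catDict", cat_dict, "--varDict", var_dict, "--syst"]

    # Step 1: merge the parquet files.
    step1 = base + ["--input", input_path] + common + [
        "--merge", "--output", output_path, "--condor"]
    # Step 2: same call with the additional --merge-data-only flag.
    step2 = base + ["--input", input_path] + common + [
        "--merge", "--output", output_path, "--merge-data-only", "--condor"]
    # Step 3: convert to ROOT; input and output are both the folder holding "merged"
    # (assumed here to be the output folder of the previous steps).
    step3 = base + ["--input", output_path] + common + [
        "--root", "--output", output_path, "--condor"]
    return [step1, step2, step3]


if __name__ == "__main__":
    commands = build_data_commands(
        "/absolute/input/path",
        "/absolute/output/path",
        "/absolute/path/to/cat_data.json",
        "/absolute/path/to/varDict_data.json",
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try: each step should only be launched once the jobs of the previous step have finished.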

Once the parquet files are merged (after step 1), a folder named "merged" is created in /absolute/output/path. To produce the ROOT files, use /absolute/output/path (which now contains the "merged" subfolders) as the new input folder. Processing MC samples works in a similar way:

MC processing with condor

Get Merged files

python prepare_output_file.py --input /absolute/input/path --cats --catDict /absolute/path/to/cat_mc.json --varDict /absolute/path/to/varDict_mc.json --syst --merge --output /absolute/output/path --condor

Get Root files

python prepare_output_file.py --input /absolute/input_path/to_folder_with_merged --cats --catDict /absolute/path/to/cat_mc.json --varDict /absolute/path/to/varDict_mc.json --syst --root --output /absolute/input_path/to_folder_with_merged --condor
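Since the ROOT step reads from the folder that holds the "merged" subfolders, it can be useful to confirm they exist before launching it. The following is a minimal sketch of such a check; the function name `find_merged_folders` is hypothetical, and the exact nesting of the "merged" folders below the output path is an assumption, so the search is recursive.

```python
from pathlib import Path


def find_merged_folders(output_path):
    """Return all directories named "merged" below output_path, sorted by path.

    Hypothetical helper: the exact depth of the "merged" folders is assumed unknown,
    so the whole tree is searched.
    """
    root = Path(output_path)
    return sorted(p for p in root.rglob("merged") if p.is_dir())


if __name__ == "__main__":
    folders = find_merged_folders("/absolute/output/path")
    if folders:
        for folder in folders:
            print(f"found: {folder}")
    else:
        print('no "merged" folder found; the merge step may not have finished yet')
```

An empty result suggests the merge jobs have not completed (or wrote to a different location), in which case the ROOT step would have nothing to convert.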

Edited May 29, 2024 by Nico Harringer
