From 7d725e2395eff43113fcad35a1cbb9becb177f56 Mon Sep 17 00:00:00 2001 From: Steffen Korn <steffen.korn@cern.ch> Date: Thu, 15 Jun 2023 20:16:23 +0200 Subject: [PATCH 1/3] Updating README --- README.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index 5d53d3fd..199f2fd6 100644 --- a/README.md +++ b/README.md @@ -16,19 +16,22 @@ structure should ideally be the following: ``` MVA-Trainer/ ├── config/ + ├── docs/ + ├── python/ ├── scripts/ - └── [other directories] + └── [other files] ``` -While [config](config) will host your config files, [scripts](scripts) hosts the main Python3 scripts as well as some helper modules. +* [config](config) will host your config files. +* [python](python) hosts the main Python3 script as well as all necessary modules. +* [docs](docs) hosts the web-based documentation. +* [scripts](scripts) hosts some scripts (e.g.) for HTCondor, ROOT macros and CI-related scripts. -If you are working on a local system the following command takes care of the entire setup: +If you are working on a local system the following command takes care of the setup: ```sh source setup.sh ``` The setup script checks whether all required libraries are installed. In case you are missing any libraries please install them manually. For this you can use in example `pip3`. -Check out `requirement.txt` and install the required modules. - -Furthermore, the setup script also creates aliases for you such that you can run the code from anywhere. It creates the `Trainer`, `Converter`, `Evaluate`, `MVAInjector`, `Optimise` and `SummariseOptimisation` aliases which serve as shortcuts to the actual scripts. \ No newline at end of file +Check out `requirement.txt` and install the required modules. 
\ No newline at end of file -- GitLab From 053f72b7127be22ca2d62c9fb1027105b9062899 Mon Sep 17 00:00:00 2001 From: Steffen Korn <steffen.korn@cern.ch> Date: Thu, 15 Jun 2023 20:40:55 +0200 Subject: [PATCH 2/3] Updating web documentation --- docs/Setup/file.md | 35 ++++---------------------- docs/Short_walkthrough/RunInjection.md | 6 +++-- 2 files changed, 9 insertions(+), 32 deletions(-) diff --git a/docs/Setup/file.md b/docs/Setup/file.md index afa0f227..c325ba31 100644 --- a/docs/Setup/file.md +++ b/docs/Setup/file.md @@ -15,33 +15,8 @@ Please install any missing packages. Each time you start a new shell you need to run the setup script again. It will set environmental variables that are picked up by the code. The code itself is python based and hence does not need any compilation. -# Package dependencies - -The code should work using these package versions: - -| **Package** | **Version** | -| ------ | ------ | -| Python | 3.8 or later| -| uproot | 5.02 or later -| pandas | 1.5.2 or later | -| scikit-learn | 1.2.0 | -| skl2onnx | 1.14.0 or later | -| tables | 3.8.0 or later | -| matplotlib | 3.6.2 or later | -| pydot | 1.4.2 or later | -| torch | 1.12.1 or later | - -The following lines should install the necessary prerequisities: -```sh -python3 -m pip install --no-cache-dir --upgrade uproot==5.0.2 -python3 -m pip install --no-cache-dir --upgrade pandas==1.5.2 -python3 -m pip install --no-cache-dir --upgrade scikit-learn==1.2.0 -python3 -m pip install --no-cache-dir --upgrade tables==3.8.0 -python3 -m pip install --no-cache-dir --upgrade matplotlib==3.6.2 -python3 -m pip install --no-cache-dir --upgrade pydot==1.4.2 -python3 -m pip install --no-cache-dir --upgrade skl2onnx -python3 -m pip install --no-cache-dir torch==1.12.1 torchvision torchaudio -python3 -m pip install --no-cache-dir torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -``` - -Please raise an issue if you find versions that are leading to conflicts between packages. 
\ No newline at end of file +Depending on your system of choice different methods to run the code might be advisable. +You may +* manually install the packages and run the software. This makes sense when you are working on a "local" device +* run the software within a virtual environment. This makes sense when you run on (e.g.) lxplus or your local cluster +* use a docker container. This makes sense when you have docker available and don't want to touch anything in the code. \ No newline at end of file diff --git a/docs/Short_walkthrough/RunInjection.md b/docs/Short_walkthrough/RunInjection.md index cebcbed1..4fe8834b 100644 --- a/docs/Short_walkthrough/RunInjection.md +++ b/docs/Short_walkthrough/RunInjection.md @@ -55,6 +55,8 @@ To inject NN predictions back into Ntuples you can do: python3 python/mva-trainer.py -i <input_path> -o <output_path> -c <config_path> --inject --processes <n_processes> --filter <filter_string> ``` Hereby `<input_path>` describes the path to input Ntuples. The `<output_path>` argument describes the path to an output directory in which the injected Ntuples will be stored. The directory and the sub-directory structure will automatically be created for you. The config file is **again** passed via the `<config_path>` argument. -You can pass multiple config files simultaneously inject multiple neural networks into Ntuples using the `--additionalconfigs` argument. -Using the *processes* argument you can specify a number of independent processes that are run simultaneously. Each process will inject the prediction into one root file and start a new process upon termination. Using the `filter` argument you can restrict your injection using a string as a wildcard. +For the injection you can pass a few specific command-line options: +* `--additionalconfigs`: Pass additional config files to inject multiple neural networks into Ntuples simultaneously. 
+* `--processes`: Using the `--processes` argument you can specify the number of independent processes that run simultaneously. Each process injects the prediction into one ROOT file, and a new process starts once it terminates. Running many processes can be very memory-consuming because each one loads an entire ROOT file, so restrict this to a lower number (e.g. 2-4) in case you have memory limitations. Using the `--filter` argument you can restrict your injection using a string as a wildcard. +* `--outputonly`: You might be interested in getting only the output of your model and not all the other branches in the trees. In this case add `--outputonly`. -- GitLab From 42748265d4a423c35c1d987c9745adea19d80345 Mon Sep 17 00:00:00 2001 From: Steffen Korn <steffen.korn@cern.ch> Date: Thu, 15 Jun 2023 20:41:07 +0200 Subject: [PATCH 3/3] Updating help strings and order --- python/mva-trainer.py | 100 +++++++++++++++++++++--------------------- 1 file changed, 51 insertions(+), 49 deletions(-) diff --git a/python/mva-trainer.py b/python/mva-trainer.py index 2b6d4391..63081640 100755 --- a/python/mva-trainer.py +++ b/python/mva-trainer.py @@ -38,72 +38,74 @@ if __name__ == "__main__": parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter) common_arguments = parser.add_argument_group(title="\033[1mCommon arguments\033[0m") injection_arguments = parser.add_argument_group(title="\033[1mInjection-specific arguments\033[0m") + optimisation_arguments = parser.add_argument_group(title="\033[1mOptimisation-specific arguments\033[0m") other_arguments = parser.add_argument_group(title="\033[1mOther arguments\033[0m") # Common arguments common_arguments.add_argument("-c", "--configfile", help="Config file", required=True) common_arguments.add_argument("--convert", - help="flag to set in case you want to run the conversion", - action='store_true', - dest="convert", - default=False) + help="flag to set in case you want to run the conversion", 
action='store_true', + dest="convert", + default=False) common_arguments.add_argument("--train", - help="flag to set in case you want to run the training", - action='store_true', - dest="train", - default=False) + help="flag to set in case you want to run the training", + action='store_true', + dest="train", + default=False) common_arguments.add_argument("--evaluate", - help="flag to set in case you want to run the evaluation", - action='store_true', - dest="evaluate", - default=False) + help="flag to set in case you want to run the evaluation", + action='store_true', + dest="evaluate", + default=False) common_arguments.add_argument("--inject", - help="flag to set in case you want to run the injection", - action='store_true', - dest="inject", - default=False) + help="flag to set in case you want to run the injection", + action='store_true', + dest="inject", + default=False) common_arguments.add_argument("--optimise", - help="flag to set in case you want to run the optimisation", - action='store_true', - dest="optimise", - default=False) + help="flag to set in case you want to run the optimisation", + action='store_true', + dest="optimise", + default=False) # Injection-specific arguments injection_arguments.add_argument("-i", "--input_path", help="Ntuple path") injection_arguments.add_argument("-o", "--output_path", help="output path for the converted ntuples") injection_arguments.add_argument("-f", - "--filter", - nargs="+", - dest="wildcard", - help="additional string to filter input files", - default=['']) + "--filter", + nargs="+", + dest="wildcard", + help="additional string to filter input files", + default=['']) injection_arguments.add_argument("--ignore", - nargs="+", - dest="ignore", - help="additional string to ignore input files", - default=['']) + nargs="+", + dest="ignore", + help="additional string to ignore input files", + default=['']) injection_arguments.add_argument("-t", "--treefilter", dest="treewildcard", help="additional string to filter 
trees", type=str, default='') injection_arguments.add_argument("--processes", - help="the number of parallel processes to run", - type=int, - default=8) + help="the number of parallel processes to run", + type=int, + default=8) injection_arguments.add_argument("--additionalconfigs", help="Additional config files for injection", nargs="+",) injection_arguments.add_argument("--outputonly", help="Only inject the output and don't clone the full tree", action="store_true", dest="outputonly", default=False) - injection_arguments.add_argument("--optimisationpath", - dest="optimisationpath", - help="Path to directory to store optimisation configs.") - injection_arguments.add_argument("--nModels", - dest="nModels", - help="Number of configs to be created for hyperparameter optimisation.", - type=int, - default=10) - injection_arguments.add_argument("--HO_options", - help="Hyperparameter running option", - choices=[ - "Converter", - "Trainer", - "Evaluater", - "All"], - dest="HO_options", - default="Trainer") + # Optimisation-specific arguments + optimisation_arguments.add_argument("--optimisationpath", + dest="optimisationpath", + help="Path to directory to store optimisation configs.") + optimisation_arguments.add_argument("--nModels", + dest="nModels", + help="Number of configs to be created for hyperparameter optimisation.", + type=int, + default=10) + optimisation_arguments.add_argument("--HO_options", + help="Hyperparameter running option", + choices=[ + "Converter", + "Trainer", + "Evaluater", + "All"], + dest="HO_options", + default="Trainer") # Other arguments other_arguments.add_argument("--plots", nargs="+", -- GitLab
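
Note on PATCH 3/3: moving `--optimisationpath`, `--nModels` and `--HO_options` from the injection group into the new optimisation group should only affect how `--help` is laid out, since `argparse` argument groups do not influence parsing. The following is a minimal sketch to illustrate this, not the actual contents of `python/mva-trainer.py`: it reproduces only a subset of the options from the patch, drops the ANSI bold escapes from the group titles, and the function name `build_parser` and the file name `config.cfg` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical, reduced version of the grouped parser from the patch."""
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    # Groups only structure the --help output; parsing is unaffected.
    common = parser.add_argument_group(title="Common arguments")
    injection = parser.add_argument_group(title="Injection-specific arguments")
    optimisation = parser.add_argument_group(title="Optimisation-specific arguments")

    common.add_argument("-c", "--configfile", help="Config file", required=True)
    common.add_argument("--inject", action="store_true",
                        help="flag to set in case you want to run the injection")
    common.add_argument("--optimise", action="store_true",
                        help="flag to set in case you want to run the optimisation")

    injection.add_argument("--processes", type=int, default=8,
                           help="the number of parallel processes to run")
    injection.add_argument("--outputonly", action="store_true",
                           help="Only inject the output and don't clone the full tree")

    optimisation.add_argument("--nModels", type=int, default=10,
                              help="Number of configs to be created for hyperparameter optimisation.")
    optimisation.add_argument("--HO_options", default="Trainer",
                              choices=["Converter", "Trainer", "Evaluater", "All"],
                              help="Hyperparameter running option")
    return parser


# "config.cfg" is a placeholder; no file is opened by the parser itself.
args = build_parser().parse_args(["-c", "config.cfg", "--optimise", "--nModels", "5"])
print(args.nModels, args.HO_options, args.processes)  # prints: 5 Trainer 8
```

Calling `build_parser().print_help()` lists the three optimisation options under their own heading, which is the user-visible effect of the regrouping.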