Umami merge requests
https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests
2022-02-09T16:22:24+01:00

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/408
Fixing parallel processing of categories in pdf sampling
2022-02-09T16:22:24+01:00
Alexander Froch
This MR fixes the parallel running of the pdf sampling method.

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/406
Update docstrings in for resampling and add inline comments
2022-02-09T16:28:02+01:00
Alexander Froch
This MR updates the docstrings (not all) for the resampling methods. Also, a lot of inline comments for code explanation are added.

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/404
Adding docstring updates and inline comments
2022-02-08T16:25:44+01:00
Alexander Froch
This MR adds a bit more docstring updates and inline comments for the resampling

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/401
Copy config bugfix
2022-02-07T15:56:36+01:00
Samuel Van Stroud
Fix copy path (go up one dir) and ensure out dir exists before attempting write.
By default, overwrite existing configs, but warn the user about this.

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/393
Preparation cleanups
2022-02-03T15:20:14+01:00
Manuel Guth
- Cleaning up doc strings
- adding debug option printing all loaded input files
- printing only once which sample is written out and not after each batch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/392
allowing preprocess config without !include
2022-02-02T18:10:34+01:00
Manuel Guth
related to !386

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/391
Fix track masking in the scaling
2022-02-04T10:49:20+01:00
Samuel Van Stroud
Masking previously happened in a few different places, in a few different ways.
- Checking the first track variable for `NaN`, and building a mask from this. Using the first variable was unstable, as we are relying on it being `float`, which is not always the case. This meant the masking broke for the new samples in some cases.
- Elsewhere, we used a `tracks == 0` check, after running `np.nan_to_num`. This is also broken as there are some possible `int` vars (e.g. `JFVertexIndex`, `leptonID`) for which `0` is a valid value, and the default for padded tracks is `-1`.
This implements a solution for both problems, using the `valid` flag (which is designed for this use) where possible, and, if it is not available, defaulting to finding the first `float` variable and using a `NaN` check.
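
The masking strategy described above can be illustrated with a minimal sketch. This is not the code from the MR: the function name `get_track_mask` and the example field names `leptonID` and `d0` in the toy array are assumptions chosen for illustration, and the tracks are assumed to be stored as a NumPy structured array of shape `(n_jets, n_tracks)`.

```python
import numpy as np


def get_track_mask(tracks: np.ndarray) -> np.ndarray:
    """Return a boolean mask that is True for real (non-padded) tracks.

    Hypothetical helper: prefers the `valid` flag and otherwise falls back
    to a NaN check on the first floating-point variable.
    """
    names = tracks.dtype.names
    if names is None:
        raise ValueError("Expected a structured array with named fields.")

    # Preferred: the dedicated flag, if the sample provides it.
    if "valid" in names:
        return tracks["valid"].astype(bool)

    # Fallback: the first float variable, where padded tracks are NaN.
    for name in names:
        if np.issubdtype(tracks.dtype[name], np.floating):
            return ~np.isnan(tracks[name])

    raise ValueError("No `valid` flag and no float variable to build a mask.")


# Toy sample: 2 jets with up to 3 tracks each, padded entries at the end.
# The first variable is an integer, so a NaN check on it would not work,
# and 0 is a valid value for both variables, so `tracks == 0` would not either.
tracks = np.zeros((2, 3), dtype=[("leptonID", "i4"), ("d0", "f4")])
tracks["leptonID"] = [[0, 11, -1], [0, -1, -1]]
tracks["d0"] = [[0.1, -0.2, np.nan], [0.0, np.nan, np.nan]]

mask = get_track_mask(tracks)
```

In this toy sample the second jet has a real track with `d0 == 0.0` and `leptonID == 0`, which a `tracks == 0` check would wrongly treat as padding.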
@mguth @alfroch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/387
Fixing bug, missing tracks labels in preprocessed sample
2022-02-01T13:41:50+01:00
Stefano Franchellucci
This MR is addressing issue #132.
As mentioned in the issue, the problem was a hard-coded call to `"track_labels"`, whereas since MR [!285](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/285) this should be specific to the track collection, thus `"track_labels"` -> `f"{tracks_name}_labels"`.
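
As a rough illustration of the change from a fixed key to a per-collection key, the sketch below builds the label name from the track collection name. The helper name `labels_dataset_name` and the collection names `tracks` and `tracks_loose` are assumptions for this example, not taken from the MR.

```python
def labels_dataset_name(tracks_name: str) -> str:
    """Build the label key for one track collection (hypothetical helper)."""
    return f"{tracks_name}_labels"


# Each track collection gets its own labels entry instead of a shared,
# hard-coded "track_labels" key. Collection names here are made up.
label_keys = [labels_dataset_name(name) for name in ("tracks", "tracks_loose")]
```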
Closes #132
~"bug fix"

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/386
Copy config files during pp
2022-02-03T12:39:29+01:00
Samuel Van Stroud
So that the user can understand which settings were used to produce a training sample, we copy the configs as they are used during preprocessing to the output destination.
Closes #133

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/385
doc string improvements
2022-02-02T18:58:14+01:00
Manuel Guth

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/384
Fixing pdf sampling files path
2022-02-01T15:11:39+01:00
Alexander Froch
This MR changes the path of the `PDF_sampling` folder, where all the intermediate files of the pdf sampling are saved, from the `sample_path` to the `file_path` (in the current example this would be from `hybrids` to `preprocessed`). With this, you can run the `--prepare` step once for all classes you might want to use (the resulting files are saved in `hybrids`) and then run different preprocessings (different classes, for example) with those files in `hybrids`. If you ran the pdf sampling as it is now, the `PDF_sampling` folder (which changes) would be overwritten every time.

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/383
Small preprocessing changes
2022-01-31T14:22:19+01:00
Samuel Van Stroud
- Allow for no jet cuts in config file
- Ensure integer chunk size in prepare stage
- ~~Ensure `preprocessing_tools` is tracked by git by removing `preprocessing_*` from the `gitignore` (@mguth I don't know if you can suggest an alternative if this breaks some desired behaviour)~~
- Ensure output dir exists before writing indices file

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/382
Adding version and new test ci preprocess location
2022-01-31T09:50:27+01:00
Manuel Guth
This MR adds
- version of package in `setup.py` and in module itself
- improving error message if `outfield_name` is not a `.h5` file
- renaming preprocessing integration test output folder to not be confused in `.gitignore`

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/374
Merge master in protected branch
2022-01-27T11:13:37+01:00
Alexander Froch
This MR adds the current master to `112-flexible-validation-test-file-definition`

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/371
Fixing path of the pre and past sampling distribution plots
2022-01-26T18:41:21+01:00
Alexander Froch
This MR concerns #127. The path where the distribution plots before and after sampling are saved is now defined by where the samples (from the `preparation` stage) are saved and not the absolute path where the command is called from.
Closes #127

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/366
Validation of variable content of h5 files
2022-01-25T14:13:45+01:00
Manuel Guth
This MR introduces the following changes:
- added `compare_h5_files_variables` function to `data_tools`
- automating variable content in Writing of resampling (in count method) to not fail if samples have different variable content
- adding table in preprocessing docs for `sampling`

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/363
Merge master
2022-01-21T15:49:02+01:00
Alexander Froch
Merging master in protected branch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/351
Merge Master in Validation remake branch
2022-01-19T11:46:37+01:00
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/348
restructure umami to remove tf dependencies for plotting
2022-01-21T13:46:33+01:00
Philipp Gadow
This MR moves some functions around to avoid TensorFlow dependencies for plotting-related tasks.
It also modifies the output of the `plot_input_variables.py` script to remove the hard-coded prefix.

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/346
Pylint improvements
2022-01-19T11:12:19+01:00
Manuel Guth
Addresses #105