Umami merge requests
https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests
2022-11-21T13:41:21+01:00

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/664
Simplify flavour map
2022-11-21T13:41:21+01:00
Tomke Schroer
## Summary
This MR introduces the following changes
* Saving the labels in resampling step as int labels -> no mapping needed in writing step
* Writing the one-hot labels only in writing step
* This also solves the problem if two labels are defined with the same label_value (but different variables) or the label_value is a list
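The move from integer labels to one-hot labels in the writing step can be illustrated with a minimal sketch. The function name `to_one_hot` and the plain-list representation are assumptions for this example and not the umami implementation.

```python
def to_one_hot(int_labels, n_classes):
    """Convert integer class labels into one-hot rows (illustrative helper)."""
    one_hot = []
    for label in int_labels:
        # Reject labels that do not correspond to a known class index.
        if not 0 <= label < n_classes:
            raise ValueError(f"label {label} outside [0, {n_classes})")
        row = [0] * n_classes
        row[label] = 1
        one_hot.append(row)
    return one_hot


# Integer labels as they might be stored after resampling (hypothetical values).
print(to_one_hot([0, 2, 1], n_classes=3))
```

Because the integer label is already the class index, no lookup of `label_value` is needed at this point, which is why identical or list-valued `label_value` entries no longer cause ambiguity.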
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Tomke Schroer

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/663
Adding full precision calculation of the scale/shift dicts
2022-11-10T15:31:50+01:00
Alexander Froch
## Summary
This MR introduces the following changes
* Enforce full precision for the calculation of the scaling/shifting values
Relates to the following issues
* #212
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/662
Adapting split in train/val/test
2022-11-02T18:35:14+01:00
Alexander Froch
## Summary
This MR introduces the following changes
* Changing the ratio split between train/val/test to 80%/10%/10%
Relates to the following issues
* Closes #215
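As a rough illustration of the 80%/10%/10% split, the sketch below computes how many entries each part would receive. The function name `split_counts`, the unit being split, and the choice to assign rounding leftovers to the test part are assumptions for this example, not the umami implementation.

```python
def split_counts(n_entries, percentages=(80, 10, 10)):
    """Return (n_train, n_val, n_test) for a percentage split (illustrative)."""
    if sum(percentages) != 100:
        raise ValueError("percentages must add up to 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n_entries * percentages[0] // 100
    n_val = n_entries * percentages[1] // 100
    # Whatever is left after rounding down goes to the test part.
    n_test = n_entries - n_train - n_val
    return n_train, n_val, n_test


print(split_counts(1000))
```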
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/660
Adding integration test for DL1* with tfrecords
2022-10-31T17:46:49+01:00
Alexander Froch
## Summary
This MR introduces the following changes
* Enable hybrid validation sample creation for all integration tests
* Adding integration test for DL1* with tfrecords
* Fixing issue in the tfrecord writing/loading for DL1*
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/659
Introduced writing of validation samples
2022-10-27T17:08:09+02:00
Nikita Ivvan Pond
## Summary
This MR introduces the following changes
* Allows the --hybrid_validation flag to be used in the write step of preprocessing. This applies scaling/shifting, and outputs in the same format as the training files.
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Nikita Ivvan Pond

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/657
Add VR track jet configs
2023-01-31T10:17:43+01:00
Philipp Gadow
## Summary
This MR introduces the following changes
* add configuration files for VR track jets
* fix bug in correct_fractions function inside the pdf sampling that resulted in negative jet numbers
* add a segment in the documentation about VR track jets
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Frederic Renner

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/655
Fixing issue in the try except blocks of the preprocessing plots
2022-10-21T14:56:39+02:00
Alexander Froch
## Summary
This MR introduces the following changes
* Fixing an issue in the try/except blocks of the preprocessing plots, which caused the plots to be produced incorrectly. This is now fixed.
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/654
Setting default value for concat_jet_tracks
2022-10-19T14:08:04+02:00
Alexander Froch
## Summary
This MR introduces the following changes
* Setting default for `concat_jet_tracks` to `False` if it is not defined in the config
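A fallback of this kind can be sketched with a dictionary lookup. Only the option name `concat_jet_tracks` comes from the MR. The helper name and the idea that the config is available as a plain `dict` are assumptions for illustration.

```python
def get_concat_jet_tracks(config):
    """Return the `concat_jet_tracks` setting, defaulting to False (illustrative)."""
    # dict.get returns the fallback when the key is missing from the config.
    return bool(config.get("concat_jet_tracks", False))


print(get_concat_jet_tracks({}))
print(get_concat_jet_tracks({"concat_jet_tracks": True}))
```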
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/653
Adding support for non-top level and special named jet- and track collections
2022-10-24T14:10:57+02:00
Alexander Froch
## Summary
This MR introduces the following changes
* Add support for non-top-level jet and track collections. This can now be set via the "collecton_name" option in the "preparation" part of the preprocessing config. After the preparation step, the jet and track collections are always top level.
* Add support for differently named jet collections. The name defaults to "jets" and can now be set via the "jets_name" option in the "preparation" part of the preprocessing config. After the preparation step, the jet collection is always top level and is called "jets".
* Documentation for the new options is also added.
Tagging @dguest @tmadula
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/651
ttbar-merging implementation
2022-11-17T09:04:20+01:00
Jackson Barr
## Summary
This MR introduces the following changes
* Merges mc21 single and dileptonic ttbar events into a single ttbar file for training
Relates to the following issues
* #210
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good_practices_code/)

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/649
rewrite selection
2022-12-05T14:06:51+01:00
Tomke Schroer
## Summary
This MR introduces the following changes
* correct loading of selections in test file
* rewriting selection of global config to make more complex selection possible
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Tomke Schroer

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/648
Organise/improve train file writing
2022-11-04T10:01:57+01:00
Samuel Van Stroud
## Summary
This MR introduces the following changes
* Use lzf compression by default (improve read speed)
* Map flavour labels from 0, 4, 5 -> 0, 1, 2
* Store all jet and track type datasets in respective groups
* Save valid flag for track type data
* Make sure `preprocessed` dir is created before resampling stage
* Error if no input files are found during prepare stage
* Clean up taus preprocessing file (do we really need to keep this? Perhaps we can just write in the docs that the class labels line needs to be changed, or start to include taus by default?).
* Option to concatenate jet and track inputs (on by default since I believe this is used for most track based taggers)
The main change is the reorganisation of the train file which I guess will break some things. The structure is now as below, which is more organised and much more suited for multiple track-type groups.
```
/jets Group
/jets/inputs Dataset {41808, 2}
/jets/labels Dataset {41808}
/jets/labels_one_hot Dataset {41808, 3}
/jets/weight Dataset {41808}
/tracks_loose Group
/tracks_loose/inputs Dataset {41808, 40, 21}
/tracks_loose/labels Dataset {41808, 40, 2}
/tracks_loose/valid Dataset {41808, 40}
```
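The flavour label remapping listed above can be sketched as a simple lookup. The names `LABEL_MAP` and `remap_labels` are hypothetical and the code is not taken from umami. It only illustrates how the raw values 0, 4 and 5 become the contiguous class indices stored in `/jets/labels`.

```python
# Raw flavour label value -> contiguous class index, as described in the MR.
LABEL_MAP = {0: 0, 4: 1, 5: 2}


def remap_labels(raw_labels, label_map=LABEL_MAP):
    """Map raw flavour labels to contiguous class indices (illustrative)."""
    # An unknown raw label raises a KeyError instead of being silently dropped.
    return [label_map[label] for label in raw_labels]


print(remap_labels([5, 0, 4, 4]))
```

Contiguous indices of this kind can be used directly as positions in a one-hot vector, which matches the width of 3 shown for `/jets/labels_one_hot`.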
Relates to the following issues
* ticking several boxes here https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/issues/207
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Samuel Van Stroud

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/646
Adding proper hybrid validation sample creation
2022-10-11T16:50:22+02:00
Alexander Froch
## Summary
This MR introduces the following changes
* Adding `Preprocessing-samples.yaml`, which now holds all the samples created by the `prepare` step (to tidy up the main config)
* Adding a proper way to produce the resampled hybrid validation samples. This can now be done when running `--resampling` by also giving the `--hybrid_validation` flag.
* Adding the needed options for the creation of the resampled hybrid validation samples in the config files.
* Documentation is updated for the new creation.
* Added the creation to the DL1r integration tests (the quickest one, since it does not use tracks)
* Renaming some options in the preprocessing config file (therefore no backwards compatibility!)
Relates to the following issues
* Touching #140
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good_practices_code/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/644
Adding yaml to requirements
2022-09-26T15:55:43+02:00
Manuel Guth
## Summary
This MR introduces the following changes
* Adding `yaml` to requirements mainly because in the `umami-slim` image no `yaml` is installed
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/641
Fixing ufunc issue in scale/shift application
2022-09-20T09:41:40+02:00
Alexander Froch
## Summary
This MR introduces the following changes
* Fixing an issue with the application of the scale/shift values. When the types differ (float32 for the variable and float64 for the scale/shift value), an error is raised: `cannot cast ufunc 'subtract' output from dtype('float64') to dtype('float32') with casting rule 'same_kind'`. This is fixed by replacing the in-place `ufunc` operators `-=` and `/=` with the usual longer form.
Tagging @sargyrop
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/635
readding #!/usr/bin/env python to executable scripts
2022-09-12T11:52:59+02:00
Manuel Guth
## Summary
This MR introduces the following changes
* In https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/591 all the `#!/usr/bin/env python` lines were removed. They need to be added back to the executable scripts, otherwise the scripts cannot run in the packaged images.
## Conformity
- [ ] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/632
Adding function to flatten nested lists
2022-09-02T16:02:48+02:00
Manuel Guth
## Summary
This MR introduces the following changes
* adding `flatten_list` function to flatten any arbitrarily nested list, especially useful when using anchors for the sample cuts
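A function with this behaviour could look like the recursive sketch below. The name `flatten_list` comes from the MR, but the body is an assumption written for illustration and may differ from the umami implementation. The sample cuts in the usage lines are hypothetical.

```python
def flatten_list(nested):
    """Flatten an arbitrarily nested list into a single flat list (illustrative)."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            # Recurse into sub-lists, e.g. cut lists pulled in via YAML anchors.
            flat.extend(flatten_list(item))
        else:
            flat.append(item)
    return flat


# Hypothetical cuts: a shared list reused via an anchor ends up nested.
shared_cuts = [{"pt_min": 20}, {"eta_max": 2.5}]
sample_cuts = [shared_cuts, {"flavour": 5}]
print(flatten_list(sample_cuts))
```

Only `list` instances are unpacked here, so dictionaries describing individual cuts are kept intact.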
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/631
adding randomise option to input_h5 block in config
2022-08-31T16:59:24+02:00
Manuel Guth
## Summary
This MR introduces the following changes
* Adding an option to randomise the order of the file names that are read in the prepare step. This is especially useful if you have several data-taking campaigns and initialise them all at once: the validation and test samples are then also representative and do not only take the first samples from one campaign.
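The idea of randomising the file order can be sketched as follows. The function name, the `seed` parameter, and the example file names are assumptions for illustration. The actual option name and behaviour in the `input_h5` block are not shown here.

```python
import random


def randomised_file_list(paths, seed=42):
    """Return the file paths in a reproducible random order (illustrative)."""
    # Sort first so the result does not depend on the order of the input.
    ordered = sorted(paths)
    # A dedicated Random instance keeps the global random state untouched.
    random.Random(seed).shuffle(ordered)
    return ordered


# Hypothetical file names from two campaigns.
files = ["campaign_a_0.h5", "campaign_a_1.h5", "campaign_d_0.h5", "campaign_d_1.h5"]
print(randomised_file_list(files))
```

With a shuffled order, taking the first few files for validation or test no longer draws from a single campaign only.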
## Conformity
- [x] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [x] [Documentation](https://umami-docs.web.cern.ch)
- [x] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [x] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Manuel Guth

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/625
Adding possibility to evaluate classes the freshly trained tagger is not trained on
2022-09-21T13:33:18+02:00
Alexander Froch
## Summary
This MR introduces the following changes
* Adding the new evaluation option `extra_classes_to_evaluate`. Here you can add classes for which the tagger is not trained. When the evaluation is run, these extra classes are also loaded and the tagger is evaluated with them. With this, you can test the behaviour of the freshly trained tagger for jets it is not trained for.
* Adding a fix for the confusion matrix. If a flavour is not used for the matrix, although the network was evaluated on it, the plotting would return an error. This is now fixed by removing all unwanted flavour jets from the matrix plotting.
* Making the rejection calculation more robust. When a `ZeroDivisionError` is given, the value added to the rejection dict is now `NaN`.
Relates to the following issues
* Closes #129
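The more robust rejection calculation can be illustrated with a small sketch. It assumes that the rejection is computed as the number of background jets divided by the number of background jets passing the cut. The function name and signature are hypothetical and not the umami code.

```python
import math


def rejection(n_background, n_background_passing):
    """Background rejection, returning NaN when no jet passes (illustrative)."""
    try:
        return n_background / n_background_passing
    except ZeroDivisionError:
        # No background jet passes the cut, so the rejection is undefined.
        return float("nan")


print(rejection(100, 4))
print(math.isnan(rejection(100, 0)))
```

Storing `NaN` keeps the rejection dict complete, so downstream code can decide how to treat undefined entries.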
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch

https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/622
Remove preprocessing config from loading functions
2022-08-30T17:45:57+02:00
Alexander Froch
## Summary
This MR introduces the following changes
* Removing the preprocessing config from the load functions and replacing it with the scale dict, the only thing that is needed from the config.
* `get_test_sample` and `get_test_sample_trks` can now take either the paths of the `var_dict` and the `scale_dict` or the loaded dicts directly
* Adding the possibility to define which step of `evaluate_model.py` is run via a command line argument.
* Removing the redundant `tagger` command line option from `evaluate_model.py`. The tagger is a required variable in the train config and is therefore always given.
* Adding a progress bar to the different steps of the `evaluate_model.py` script. It is off by default and is shown when calling `evaluate_model.py` with `-v`.
## Conformity
- [X] [Changelog entry](https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/blob/master/changelog.md)
- [X] [Documentation](https://umami-docs.web.cern.ch)
- [X] [Development guidelines](https://umami-docs.web.cern.ch/setup/development/)
- [X] [Style guides](https://umami-docs.web.cern.ch/setup/development/good-practices/)
Alexander Froch