inference merge requests
https://gitlab.cern.ch/hh/tools/inference/-/merge_requests
2024-03-01T17:34:32+01:00

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/89
Updating the naming convention of the POIs
2024-03-01T17:34:32+01:00
Aravind Thachayath Sugunan
* Moving (c3,d4) --> (kl,k4).
* The HHHSample takes in c3,d4 to stay consistent with the sample name; internally it stores kl,k4 (c3+1, d4+1).
* Cleanup of the code.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/88
Rework of the 'skip' feature for additional non non-resonant HH signals such...
2024-01-31T12:57:13+01:00
Torben Lange
Rework of the 'skip' feature for additional signals that are not non-resonant HH, such as SH and, later, resonant ones, as needed for new models.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/87
HHH nonresonant model
2024-02-01T14:18:42+01:00
Alexandra Carvalho Antunes De Oliveira
Work with @mazumdar @mukherje (I did not find Aravind to tag now)
Developments to come:
- make the model return the formula to the inference draw step (because this is missing, only likelihood scans are working now)
- implement reset_pois and get_formulae to have the samples list to check (if necessary)
- For now, @tolange will implement a hotfix for the skip-samples check
- add kt in description (might need more signal samples)
- retest the basis stability
- integration with the HH model, which can already be done with the kl-kt-c2 one for definiteness
- be sure that c3 = kl without any factors floating
- integration with HH modelling
- implement r_hhh as opposed to r, which would be the sum of HHH + HH GGF + (VBF HH ?)
- integration H BR scaling
- integration with single H scaling (placeholder for when we add kt to modeling to HHH part)
As soon as the hotfix and the upper limits are working, I will mark this as ready even if not all expected features are finished.
Ps.: pipelines will always fail because of commas and spaces; I would not spend time on this until the very end.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/86
Fix weird bug with new numpy version.
2023-11-28T19:10:51+01:00
Torben Lange
Comparisons with np.nan and numpy ndarrays don't work as intended in the current version. This hotfix should be backwards compatible and, while less clean, give the same results.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/85
New Model for EFT interpretations
2023-11-28T19:29:35+01:00
Torben Lange
Adding new theory model interpretations using the HEFT parametrisation of our data cards.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/84
Make combine v9.1.0 the default, customize workspace performance flags
2023-06-28T10:13:06+02:00
Marcel Rieger <marcel.rieger@cern.ch>
This PR contains two things:
- It makes combine v9.1.0 the default version as it is the recommended version since recently.
- Certain flags added to the workspace creation for performance reasons are optional now. More details below.
#### Details on performance flags
`CreateWorkspace` is running with 5 flags that increase the performance of limit and likelihood computations:
`--optimize-simpdf-constraints cms`, `--X-pack-asympows`, `--X-optimizeMHDependency fixed`, `--use-histsum`, and `--no-wrappers`.
Observations:
- There are two things that make the limits fast: `--optimize-simpdf-constraints cms` and `--use-histsum --no-wrappers`. Each of them accounts for about 50% of the performance boost.
- There is no issue with `--optimize-simpdf-constraints cms` whatsoever, so it should be always safe to use that one.
- `--use-histsum` and `--no-wrappers` always have to be used together. When used alone, each of them leads to failures (either already in text2workspace.py or Segfaults in combine).
- `--X-optimizeMHDependency fixed` only works when both `--use-histsum` and `--no-wrappers` are enabled, but it doesn't increase the performance on top of these two any further.
- `--X-pack-asympows` works when used alone, but not when `--optimize-simpdf-constraints cms` is enabled, **unless** `--use-histsum` is set.
- **But most importantly**: when `--use-histsum --no-wrappers` is set, the result of FitDiagnostics is unusable as it's missing all per-channel info and postfit histograms are collapsed to single bins.
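The observations above can be read as a small set of compatibility rules. The following is a minimal sketch that only encodes the rules as listed here; the helper names `check_workspace_flags` and `fit_diagnostics_usable` are hypothetical and are not part of the inference tools or of combine.

```python
def check_workspace_flags(flags):
    """Return a list of problems for a set of workspace flags.

    Hypothetical helper: it only mirrors the observations listed above.
    """
    flags = set(flags)
    problems = []

    histsum = "--use-histsum" in flags
    no_wrappers = "--no-wrappers" in flags

    # --use-histsum and --no-wrappers fail when used alone
    if histsum != no_wrappers:
        problems.append("--use-histsum and --no-wrappers must be used together")

    # --X-optimizeMHDependency fixed needs both of them
    if "--X-optimizeMHDependency fixed" in flags and not (histsum and no_wrappers):
        problems.append(
            "--X-optimizeMHDependency fixed requires --use-histsum and --no-wrappers"
        )

    # --X-pack-asympows conflicts with the simpdf optimization unless histsum is set
    if (
        "--X-pack-asympows" in flags
        and "--optimize-simpdf-constraints cms" in flags
        and not histsum
    ):
        problems.append(
            "--X-pack-asympows with --optimize-simpdf-constraints cms requires --use-histsum"
        )

    return problems


def fit_diagnostics_usable(flags):
    """FitDiagnostics output is unusable when both histsum flags are set."""
    return not {"--use-histsum", "--no-wrappers"} <= set(flags)


ALL_FLAGS = [
    "--optimize-simpdf-constraints cms",
    "--X-pack-asympows",
    "--X-optimizeMHDependency fixed",
    "--use-histsum",
    "--no-wrappers",
]

print(check_workspace_flags(ALL_FLAGS))
print(fit_diagnostics_usable(ALL_FLAGS))
```

With all five flags set, no rule is violated, but the FitDiagnostics check reports the result as unusable, which is the trade-off the actions below address.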
Actions:
- We can safely add `--optimize-simpdf-constraints cms`. It makes things faster and doesn't break anything.
- By default, none of the other flags should be enabled, as users would run into issues with FitDiagnostics, which they will most likely always need.
- However, there is an `--optimize-limits` flag in CreateWorkspace now that adds the additional four flags.
- In the combination, we can always set this flag to True by default. This can also be done conveniently with an env variable DHI_WORKSPACE_OPTIMIZE_LIMITS=1 so that nothing changes for the workflows there.
- One should just keep in mind that FitDiagnostics won't work and, in case it's needed, one needs to have a second set of workspaces created with `--optimize-limits False`.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/83
Fix plotting issues with contour style in likelihood
2023-06-15T13:08:18+02:00
Torben Lange
Fixes issue #49

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/82
For r_qqhh upper limit scan in grid mode
2023-11-28T19:25:33+01:00
Yihui Lai
For r_qqhh upper limit scan with HH combination.
The grid mode does not work for the r_qqhh scan because `cmd = re.sub(r"^(.+--redefineSignalPOIs\s+[^\s+]\s+)(.+)$", r"\1{} \2".format(repl), cmd)` only matches a single character after `--redefineSignalPOIs`: the character class `[^\s+]` has no quantifier, so it consumes exactly one character. If `r_qqhh` is used as the POI, the pattern does not match and `--singlePoint xx` is missing from the commands.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/81
Fix small typo that prevented unblinded eft benchmark plots.
2023-06-09T09:14:02+02:00
Torben Lange
The observed limit was added by default, which caused the task to fail if blinded. It was added again if unblinded (so twice in that case). Removed "observed" from the default requested values so that it is only added when the task is set to unblind.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/80
Add optimized ws parameters.
2023-04-03T14:46:28+02:00
Marcel Rieger <marcel.rieger@cern.ch>
This PR adds parameters to the CreateWorkspace command as mentioned in #40.
Closes #40.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/79
Support for Python 3 and remote targets
2023-04-02T19:15:39+02:00
Marcel Rieger <marcel.rieger@cern.ch>
This PR contains two main changes.
1. Python 3 support (which is required for point 2).
2. At some point one wants / needs to use submission tools like crab for running many fits with high parallelism. For this to work, the way that tasks handle their inputs and outputs has to be compliant with law remote targets. However, as it turns out, this is already mostly the case and there are just a few changes + 2 minor fixes required:
#### Tasks to adjust
- [x] Workspace
- [x] Snapshot
- [x] Limits
- [x] Likelihoods
- [x] Significances
- [x] GOF
- [x] Pulls & impacts
- [x] Postfit
- [x] EFT
- [x] Resonant
- [x] Exclusion
#### Fixes
- [x] ~~The current GLOBUS_THREAD_MODEL setting makes gfal2 timeout in parallel processing (`--workers >1`) → Most likely another weird interference with PyROOT.~~ Turned out to be a Python 2 issue. Fixed with Python 3.
- [x] ~~Remote bundles cannot be fetched after CMSSW setup in remote jobs → xrootd library in remote CMSSW env incompatible with gfal (but not in local one?)~~ Opened #43 to follow up.
Closes #25.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/78
Dummy model
2024-02-06T23:24:49+01:00
Alexandra Carvalho Antunes De Oliveira
Use as `--hh-model "model_dummy.model_dummy"` in the law command line.
The dummy model
- does not remove any process from the datacard
- does not add a physics model in the t2w step
- for point limits, still divides by theory (HH datacards in developing analyses and HHH datacards in the SM are made using the SM cross section, so that still makes sense)
Tested with
- PlotUpperLimitsAtPoint
- PlotLikelihoodScan (for lik scan in r)
- PlotPullsAndImpacts
- PlotGoodnessOfFit
No priority at all to merge
Useful for
- developing analyses that still do not have all necessary signals to use the physics model
- developing production modes, for which a model is still being developed
- resonant (pulls and GOF)
Related to [issue 37](https://gitlab.cern.ch/hh/tools/inference/-/issues/37)

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/77
Resonant workflow
2023-06-07T14:06:18+02:00
Marcel Rieger <marcel.rieger@cern.ch>
This PR is intended to provide all tools necessary to create resonant limits.
- [x] Task structure
- [x] Plot functions
- [x] Documentation
- [x] Example cards
- [x] Tests
- [x] CI integration

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/76
Add changes from !69.
2023-03-25T14:37:49+01:00
Marcel Rieger <marcel.rieger@cern.ch>
For some reason, the source branch for !69 disappeared.
This PR recovers the changes.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/75
Add GridDataInterpolator.
2023-03-25T11:11:05+01:00
Marcel Rieger <marcel.rieger@cern.ch>
Fixes 2D interpolation for uneven grids.
Closes #29.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/74
Small fixes found on the way of doing common plots
2023-04-03T11:24:27+02:00
Alexandra Carvalho Antunes De Oliveira
This is ready!
The pipeline fails in the place where I allow the code to pick up from old runs, so that part is commented out for fast recovery if needed.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/73
Refactoring
2023-03-24T12:13:15+01:00
Marcel Rieger <marcel.rieger@cern.ch>
(PR details and descriptions below)
## Summary of changes that affect the usage
### Software updates
- Combine has been updated to v9.0.0 with CMSSW 11.3.4. It is recommended to update all software by using `DHI_REINSTALL_SOFTWARE=1 source setup.sh <setup_name>` once.
### Plot style updates
- The `--paper` and `--summary` task parameters are removed. Instead, one can add `--style paper`, `--style summary`, or even `--style paper,summary` (comma-separated). It would be best if the default style is preserved and custom styles depend only on the value(s) of the passed style flags. See e.g. `plots.exclusion.plot_exclusion_and_bestfit_2d` for an example.
- In case people / groups want to use their own plot function, this is possible now by adding a `--plot-function` parameter. This function will be called with the exact same arguments.
- To tweak plots outside of task structures, there is now the option `--save-plot-data`, which saves all arguments that would be passed to plot functions in a pkl file. This file can be loaded to obtain the arguments, which can then be passed to the plot function.
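
A minimal sketch of the `--save-plot-data` round trip, assuming the pkl file holds a dict of keyword arguments. The file layout, the `replay_plot` helper and the stand-in `dummy_plot` function are assumptions for illustration only and do not describe the tool's actual format.

```python
import os
import pickle
import tempfile


def replay_plot(pkl_path, plot_func, **overrides):
    """Load saved plot arguments and call plot_func with optional tweaks.

    Assumes (hypothetically) that the pkl file contains a dict of kwargs.
    """
    with open(pkl_path, "rb") as f:
        kwargs = pickle.load(f)
    kwargs.update(overrides)  # tweak individual arguments outside of the task
    return plot_func(**kwargs)


def dummy_plot(paths, x_label="x", style=None):
    """Stand-in for a real plot function; it only echoes its arguments."""
    return "{} | {} | {}".format(",".join(paths), x_label, style)


# Demo: write a fake plot-data file, then replay it with a changed style.
with tempfile.TemporaryDirectory() as tmp:
    pkl_path = os.path.join(tmp, "plot_data.pkl")
    with open(pkl_path, "wb") as f:
        pickle.dump({"paths": ["plot.pdf"], "x_label": "kl"}, f)
    result = replay_plot(pkl_path, dummy_plot, style="paper")

print(result)
```

In practice, `dummy_plot` would be replaced by the actual plot function (or a custom one passed via `--plot-function`), and the overrides by whatever arguments need tweaking.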
### Datacard manipulation scripts
- All scripts have a default value for the `--directory` setting now, meaning that when it is omitted, datacards are not updated in place by mistake. If in-place updates are required, one should explicitly pass `--directory none` or `--directory ''` (empty string).
### Code-style & testing
- Each commit triggers a CI job that checks the code style according to a rather loose flake8 style (see the .flake8 file in the project directory). Before pushing, developers should check the style of their changes with `./lint.sh`, or by directly configuring the IDE to show linting issues while developing. See [pipelines](https://gitlab.cern.ch/hh/tools/inference/-/pipelines) for more info.
- There is a test pipeline now that can trigger every single inference task based on example datacards. For that, go to "CI/CD > Pipelines" and click on "Run Pipeline". Each test job stores pdf and png versions of the output plots and ensures that they are accessible for one month. [Example](https://gitlab.cern.ch/hh/tools/inference/-/jobs/28326424/artifacts/browse).
### Changes due to law updates
- Job submission should run much faster now since the number of files that are sent with the jobs is drastically reduced.
- The `--start-branch A` and `--end-branch B` parameters no longer exist. Instead, use `--branches A:B` which leads to identical outputs.
## PR details
This PR is meant to refactor the current state of the project.
This is currently still in a draft stage and should only be merged after the items below were resolved and discussed.
Items under "Refactoring" as well as "Critical fixes" should be solved in the context of this PR, whereas those under "Fixes" and "Improvements" are meant for documentation purposes and could be carried over to separate PRs. Unchecked items marked with ❗️ are required to be solved before this PR can be merged.
Currently open PRs should be able to be merged right after *this* one.
Refactoring:
- [x] Streamline setup routine
- [x] Cleanup models
- [x] Cleanup base tasks
- [x] Cleanup actual tasks and plot methods in tandem
- [x] Snapshots
- [x] Likelihoods
- [x] Significances
- [x] Limits
- [x] Benchmark limits
- [x] Exclusion
- [x] GOF
- [x] Pulls & impacts
- [x] Postfit
- [x] Test tasks
- [x] Study tasks
- [x] Other EFT things
- [x] Cleanup hooks
- [x] Cleanup scripts
- [x] Cleanup documentation
Critical fixes:
- [x] Update to latest recommended combine version
- [x] Use CMSSW-based combine installation
- [x] Failing setup under python 2
- [x] CLs grid scan method for limit extraction (#20, #21)
- [x] Axis ordering in 2D c2 scans (#18)
- [x] Fix plot library dependencies (#22)
- [x] Add updated example cards and fix tests (!70)
- [x] Add a default directory to all datacard manipulation scripts
- [x] Adapt documentation to changes
Smaller fixes:
- [x] Fix param label position in study.Enhancement plots
- [x] ~2D nll scan, start at Combine POI or at custom point for recompute best fit~ Opened #28 to follow up.
- [x] ~Repeat best fit if on one axis there is no error in 2D fit~ Opened #27 to follow up.
Improvements:
- [x] Adjust submission to use only cmssw-based combine
- [x] Update underlying law version
- [x] Add mechanism for dumping plot data
- [x] Implement plot data dumping in all plot tasks
- [x] Add mechanism for switching plot functions
- [x] ~Add documentation focussing on developing aspects~ Opened #26 to follow up.
- [x] ~Full update to python 3 (actual issue for #22)~ Opened #25 to follow up.
- [x] Enforce coding rules and tests in CI/CD (test based on `dhi/tasks/test.py`)
- [x] Add back previous plot styles using new `--style` feature.
Fixes #18, #20, #21, #22, #23.
Closes !68, !70.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/72
DatacardTask: cache datacard resolving
2023-03-24T12:11:51+01:00
Benjamin Fischer
This brings:
- consistency: (sub)tasks do not become inconsistent with each other if globbing for datacards changes due to new/removed files
- massive speed-up: instantiating workflow branch tasks (e.g. in UpperLimits) would previously resolve the datacards parameter repeatedly, slowing down task-tree building and workflow starting by a factor of up to 100x (depending on storage load; this was enough to bring down EOS directories/nodes that contained the datacards)
Marcel Rieger <marcel.rieger@cern.ch>, Manfred Peter Fackeldey

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/71
[setup] Enables setting Version of CMSSW interactively
2023-03-24T13:09:51+01:00
Dennis Noll
Enables setting the version of CMSSW interactively.
Only queried and set when not using standalone combine.

https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/70
Add new testcards.
2023-03-24T11:56:53+01:00
Torben Lange