# tools issues
https://gitlab.cern.ch/groups/hh/tools/-/issues

## [Bug] "poi_mins" not defined
https://gitlab.cern.ch/hh/tools/inference/-/issues/58 · Yihui Lai · 2024-03-21

`poi_mins` is needed in [plots/likelihoods.py](https://gitlab.cern.ch/hh/tools/inference/-/blob/master/dhi/plots/likelihoods.py?ref_type=heads#L1102) when making plots; otherwise it always returns `[None, None]`.
However, `poi_mins` is not defined in [tasks/likelihoods.py](https://gitlab.cern.ch/hh/tools/inference/-/blob/master/dhi/tasks/likelihoods.py?ref_type=heads).
`poi_min` needs to be changed to `poi_mins` in two places:
- (1) `PlotMultipleLikelihoodScans` https://gitlab.cern.ch/hh/tools/inference/-/blob/master/dhi/tasks/likelihoods.py?ref_type=heads#L626
- (2) `PlotMultipleLikelihoodScansByModel` https://gitlab.cern.ch/hh/tools/inference/-/blob/master/dhi/tasks/likelihoods.py?ref_type=heads#L782

## Warning on adding too many POIs
https://gitlab.cern.ch/hh/tools/inference/-/issues/57 · Alexandra Carvalho Antunes De Oliveira · 2024-03-18

If we just keep adding every new case/process, there will be problems with job submission, as condor does not like overly long filenames (example below) -- perhaps the model being called can set the list used for naming, or the submission json file can use a hash.
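A minimal sketch of the hash idea (a hypothetical helper, not existing inference code): parameter strings beyond a length limit are replaced by a short, stable digest when building job file names.

```python
import hashlib

def short_job_name(prefix, params, max_len=100):
    """Return prefix__params, hashing params when the result would be too long."""
    name = "%s__%s" % (prefix, params)
    if len(name) <= max_len:
        return name
    # sha1 is stable across runs, so resubmissions map to the same file name
    digest = hashlib.sha1(params.encode("utf-8")).hexdigest()[:10]
    return "%s__%s" % (prefix, digest)
```

With such a scheme the submission file name stays bounded no matter how many POIs and parameters are added, at the price of human readability for long combinations.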
```
OSError: [Errno 36] File name too long:
/afs/cern.ch/work/a/acarvalh/HHH/jobs/LikelihoodScan/datacards_07c7d36d6e/m125.0/poi_r_gghh_r_hhh/HHH_multiclass_v2_mus/htcondor_jobs_0To220__poi_r_gghh_r_hhh__scan_r_gghh_-30.0_30.0_n11_r_hhh_-1200.0_1200.0_n20__params_r1.0_r_qqhh1.0_r_vhh1.0_kl1.0_kt1.0_CV1.0_C2V1.0_k41.0_C20.0_A0.0_CA1.0_LA0.0_LE0.0_M20.0_B0.1_MHE125.01_MHP0.0_MA0.0_Z60.0_TB1000.0_CBA0.0_LQ0.0_MQ1000.0_XI0.0_kl_EFT1.0_kt_EFT1.0_C2_EFT0.0_cosbma0.0_tanbeta10.0_c30.0_d40.0_k31.0.json
```

## additions in preparation to resonant HH/HY combination
https://gitlab.cern.ch/hh/tools/inference/-/issues/56 · Alexandra Carvalho Antunes De Oliveira · 2024-02-06

I list here the lessons-learned additions that we will need, linked to the patches in my fork to explain better what I mean.
- HH part (task already existent)
- [ ] To have the fit stable across all masses/analyses we used a custom factor in the signal normalization, which is taken into account in plotting [here](https://gitlab.cern.ch/acarvalh/inference/-/blob/resonant/dhi/tasks/eft.py?ref_type=heads#L797). That could be an external option instead of a hardcoded value, as individual analyses do not need it [1]
- [ ] The final plotter is completely external, [here](https://gitlab.cern.ch/cms-b2g/generalsummary/-/tree/B2G_EXO_reviews?ref_type=heads). Because of that, with every plot we had a human-readable file in json format, as [here](https://gitlab.cern.ch/acarvalh/inference/-/blob/resonant/dhi/tasks/eft.py?ref_type=heads#L584-595), that would be directly copied and interpreted by the external plotter (I feel strongly about keeping this workflow as it is, to interface well with the B2G summary plots, where not all analyses use inference or fancy python output formats). Before trying to implement anything here, I want to see whether the HEPData output of inference can be made interpretable by the external plotter [1]. I mention this here because this json file is used in practically all of the following patches.
- [ ] We did not have the snapshot step implemented in the eft workflow (where the resonant task is located on the fork). To circumvent that and make the fit stable, I implemented a signal injection such that the expected run produces a json file (the above-mentioned one), and the observed run then looks for that expected json file and uses it to tweak the fit borders, as [here](https://gitlab.cern.ch/acarvalh/inference/-/blob/resonant/dhi/tasks/eft.py?ref_type=heads#L368-377) (this was a problem especially for boosted analyses) -- before implementing anything here I would like someone to test the B2G-23-005 fit with snapshot and without signal injection, to see whether any action is still needed.
- [ ] Initially I added an option to make it possible to skip some masses [here](https://gitlab.cern.ch/acarvalh/inference/-/blob/resonant/dhi/tasks/eft.py?ref_type=heads#L67-76), or to run only a few. As implemented it would change the hash of the Limits step output, so it was not used, even though it would have been handy for debugging. We used more handmade ways to debug a couple of failing masses instead.
- [ ] I was finding masses in datacards by pattern rather than by the mass itself as a full naming convention (there is a good reason, related to the first HY bullet). That can cause confusion, e.g. `300` with `3000`, which in the end we would fix by hand in the json file before the external plotter :-p. This is something to pay attention to if we are doing something similar in master.
- [ ] For GOF and impacts we needed the [dummy_model](https://gitlab.cern.ch/hh/tools/inference/-/merge_requests/78). Some of the people working on this simply commented out the hh_model call from the workspace making for the GOF/impacts workflow, so as not to maintain another inference installation hehe... I tried to keep the dummy_model branch up to date with master, but I am not sure that went well and I think I broke that branch. It would be good if someone tests it [1] before asking to merge it into master again.
- [ ] A place where we made a lot of mistakes in GOF and impacts was having to keep in mind that the signal would have to be injected as part of the external command AND would not quite be the one from the json file, but one taking into account the real signal normalization, [here](https://gitlab.cern.ch/hh/results/datacards_run2_resonant_hh_hy/-/blob/master/commands/hh_fitDiag_280s.sh?ref_type=heads#L22). I am not sure there is an automatic solution to this other than being very attentive when constructing commands (and keeping those under version control).
- HY part - Implement the task in resonant workflow :-D
- [ ] In the fork I implemented it as layers of the resonant plot [here](https://gitlab.cern.ch/acarvalh/inference/-/blob/resonant/dhi/tasks/eft.py?ref_type=heads#L1446-1463) (reading the json file mentioned above). I know you would rather not link to plot results but to the limits files. But it might be a good idea to keep it layered in the resonant plot for a practical debugging reason -- when debugging fits it was quite handy to be able to run MX-by-MX from the `PlotResonantLimitsHY` task using something like `--datacard-pattern "datacard_mass_X400_Y(.+)\.txt"`
- [ ] There is more to pay attention to, to be added here [TO BE CONTINUED]
- WARNING: I do not have time now to be thorough, so I list the most important points / the ones freshest in my memory.
- This description and list may keep evolving --> when it is no longer evolving, this warning will be removed
- What is marked with [1] is also useful for the HVT combination (which has a shorter timescale), and is therefore also interesting for Yihui or Sitian to implement as a feature and link in this issue.

## Problems with prefit injection (python3)
https://gitlab.cern.ch/hh/tools/inference/-/issues/55 · Torben Lange · 2023-11-20 · assigned to Marcel Rieger

The prefit injection that is needed in 4b does not seem to be working at the moment. There seems to be at least one problem with the different behaviour of `map` in python 2 and 3 ![image](/uploads/e25ffa88cd419378d1b3fda74d925680/image.png) @mrieger, as this might also occur in other places, can you have a look at that?

## Fix problems with multi signal POIs
https://gitlab.cern.ch/hh/tools/inference/-/issues/54 · Alexandra Carvalho Antunes De Oliveira · 2023-07-17

If we want to make 2D fit scans floating a set of POIs, we need to allow the "-P" option to be duplicated, see [here](https://gitlab.cern.ch/acarvalh/inference/-/commit/7878833768880022b3cc7b5ff6b851a3d4fa982e).
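Assembling the command with one "-P" per POI could look like this (a hypothetical sketch of the idea; the actual command building in inference differs):

```python
# Hypothetical sketch: emit one "-P" argument per POI when building the
# MultiDimFit command, instead of allowing only a single, non-repeatable "-P"
pois = ["r_gghh", "r_hhh"]

poi_args = []
for poi in pois:
    poi_args += ["-P", poi]

cmd = ["combine", "-M", "MultiDimFit", "--algo", "grid"] + poi_args
```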
I put it as low priority as I am not sure whether we want that for the legacy nonresonant HH.

## overlay stat-only for likelihood scans
https://gitlab.cern.ch/hh/tools/inference/-/issues/53 · Alexandra Carvalho Antunes De Oliveira · 2023-06-27

Currently, at least in H+HH, I am doing it quite manually.
I cannot find it in the docs or by digging through conversations, but we had a way to do this for the Nature paper, no? Or did we do it manually there as well?

## floating and freezing more flexible
https://gitlab.cern.ch/hh/tools/inference/-/issues/52 · Alexandra Carvalho Antunes De Oliveira · 2023-06-26

Things we noticed in H+HH:
- To do stat-only fits, or to try other freezing options on top of what inference provides, we added an option to pass additional freezing options with more freedom, [example here](https://gitlab.cern.ch/acarvalh/inference/-/commit/4e77c18bbc4f16ff5f3bfbe47e722e70057ea660)
- To do 2D fits with floating options, we need to repeat the "-P" argument, which by default inference does not allow; the hack on the H+HH branch is [just this](https://gitlab.cern.ch/acarvalh/inference/-/commit/7878833768880022b3cc7b5ff6b851a3d4fa982e)

## Bias tests workflow
https://gitlab.cern.ch/hh/tools/inference/-/issues/51 · Alexandra Carvalho Antunes De Oliveira · 2023-06-21

In both the H+HH and the resonant reviews we are being asked for bias tests, which we were probably not asked for on the Nature publication because it was in too much of a hurry.
Below is what they look like.
My plan is to implement that workflow in a new branch on top of the dummy_model one, unless this feature already exists somewhere in the docs and I missed it.
![image](/uploads/a1bba52773286e28146957dc52e5d339/image.png)

## Inappropriate grid_max setting for r_qqhh scan
https://gitlab.cern.ch/hh/tools/inference/-/issues/50 · Yihui Lai · 2023-06-15

In grid mode the `grid_min`/`grid_max` values are rounded by [round_digits](https://gitlab.cern.ch/hh/tools/inference/-/blob/master/dhi/hooks/run2_combination.py#L187-188) to two significant digits. This is fine if grid_max < 100.
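For illustration, a standard significant-digit rounding (a generic sketch; the actual `round_digits` hook may behave differently) keeps much more of the value with three significant digits than with two:

```python
from math import floor, log10

def round_sig(value, digits):
    """Round a value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # shift the decimal point so that `digits` significant figures survive
    return round(value, -int(floor(log10(abs(value)))) + digits - 1)

print(round_sig(101.63, 2))  # 100.0
print(round_sig(101.63, 3))  # 102.0
```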
However, for the r_qqhh scan, when `0 < r_qqhh < 3`, grid_max > 100. For example, when r_qqhh = 0, `(exp, up, down) = (30.63, 66.13, 15.03)`, therefore `(grid_min, grid_max) = (0.0, 101.63)`. But `round_digits` forces `grid_max` to be 11, giving an inappropriate grid range. Should we round the edges to 3 significant digits?

## Broken likelihood plot style
https://gitlab.cern.ch/hh/tools/inference/-/issues/49 · Torben Lange · 2023-06-15

![Screenshot_2023-06-13_at_11.15.20](/uploads/b8ca239d0139b2852b67afe910b933ab/Screenshot_2023-06-13_at_11.15.20.png) The style seems broken (contours, contours_hcomb); `--show-significances 1,2,3` also only gives up to 2 sigma...

## avoid user-specific in setups
https://gitlab.cern.ch/hh/tools/inference/-/issues/48 · Alexandra Carvalho Antunes De Oliveira · 2023-06-06

I just realized that when running a setup file, inference auto-completes the setup file with user-specific information:
```
export DHI_USER="blabla"
export DHI_DATA=blabla
export DHI_STORE_BUNDLES="/bla/bla"
export DHI_STORE_EOSUSER=blabla
export DHI_SOFTWARE="/bla/bla/data/software"
```
To share setups when running in large groups, it would be nice to avoid that, or to make the paths relative when written by setup.sh, e.g.
```
export DHI_USER=""
export DHI_DATA="$DHI_BASE/data"
export DHI_STORE_BUNDLES=$DHI_STORE
export DHI_STORE_EOSUSER=$DHI_STORE
export DHI_SOFTWARE="$DHI_BASE/data/software"
```

## Add option to unblind and mask data
https://gitlab.cern.ch/hh/tools/inference/-/issues/47 · Alexandra Carvalho Antunes De Oliveira · 2023-06-07

To be able to see the postfit expected.

## Observed upper limits stabilization
https://gitlab.cern.ch/hh/tools/inference/-/issues/46 · Alexandra Carvalho Antunes De Oliveira · 2023-05-17

It was found in the resonant combination that when the rate needed to converge changes by orders of magnitude, combine becomes unstable. That can also be an issue for the shape BM results (@tolange).
The solution there was to first do one fit with inference expected, before running with `--unblinded True`, and then have the observed run pick up its starting point and boundaries from the expected run.
Example below (there I have a json file with the limits that I later pass to the general plots area, and for each run I take as "expected_limits" the unblinded output of that same task with "__unblinded" removed from the path). Here this could be done smarter, e.g. taking the values from the root file that inference saves in the same area:
```
import json

try:
    # read the limits from the preceding expected run
    with open(expected_limits) as ff:
        data_exp = json.load(ff)
    key = str(float(self.branch_map[self.branch]))
    rstart = data_exp[key]["limit"] / float(scale)
    rstart_srt = ",r=%s" % float(rstart)
    # constrain the observed fit to a window around the expected limit
    rboundaries = " --rMin %f --rMax %f " % (rstart / 10, rstart * 10)
except (IOError, OSError, KeyError, ValueError):
    # no usable expected limits: fall back to combine defaults
    rstart_srt = ""
    rboundaries = ""
```
And then in the limits command do
```
" --setParameters MX={mass_X}{rstart_srt} "
" {rboundaries} "
```

## Adding basic interpretation lines to the resonant workflow
https://gitlab.cern.ch/hh/tools/inference/-/issues/45 · Alexandra Carvalho Antunes De Oliveira · 2023-05-12

The idea is to port the drawing of the WED theory interpretation to master, as an exercise for Xuanhao Zhang to get familiar with the workflow. I could not find him to tag. He is going to tag himself to follow up here with the relevant MR.

## Parallelization of Hessian impacts
https://gitlab.cern.ch/hh/tools/inference/-/issues/44 · Alexandra Carvalho Antunes De Oliveira · 2023-04-14

@fmonti is implementing this in a combine fork (but not on top of combine 9...).
We may like to have this for our results; we could btw have a branch here with that combine version + support for parallelization in the pulls task.
To merge when this feature is merged into combine.

## Fix gfal2 bindings in remote CMSSW env
https://gitlab.cern.ch/hh/tools/inference/-/issues/43 · Marcel Rieger · 2023-04-02

With !79 merged, the tools have support for storing outputs on remote storage elements.
However, in remote jobs, the gfal2 python bindings seem to have issues in the shallow CMSSW env shipped with jobs.
This is not critical at this point since all outputs are stored locally by default, but it should still be fixed asap.

## On automatic naming cards combinations
https://gitlab.cern.ch/hh/tools/inference/-/issues/42 · Alexandra Carvalho Antunes De Oliveira · 2023-03-30

It is super handy to have human-friendly names if we set things up nicely.
For long combinations, however, we can run into this error:
`OSError: [Errno 36] File name too long: '/afs/.../.../HH_resonant/jobs/CreateWorkspace/bbbb_spin0_HH_v2_datacard_mass_900__bbbb_boosted_spin0_HH_v2_datacard_mass_900__bbgg_spin0_HH_v2_datacard_mass_900__bbtt_spin0_HH_v2_datacard_mass_900__bbww_spin0_HH_v2_datacard_mass_900__bbww_boosted_spin0_HH_v2_datacard_mass_900__multilepton_spin0_HH_v2_datacard_mass_900'`
I think we could simply tell law to ignore this error, something like
```
import errno

try:
    ...  # try stuff (e.g. creating the long output path)
except OSError as oserr:
    if oserr.errno != errno.ENAMETOOLONG:
        # a different error: do not swallow it
        raise
    # caught ENAMETOOLONG...now what?
```

## option of dictionary for multiwords/multinames
https://gitlab.cern.ch/hh/tools/inference/-/issues/41 · Alexandra Carvalho Antunes De Oliveira · 2023-03-28

If we have a long list of things to overlay, I always find myself trying to match the MULTICARDS and MULTINAMES bash variables.
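The two parallel variables could be collapsed into one mapping. A hypothetical sketch of such a dictionary file (the file name and keys are made up for illustration):

```python
import json

# Hypothetical dictionary file contents: label -> datacard path,
# replacing the parallel MULTICARDS / MULTINAMES bash variables
mapping_json = """
{
  "bbbb": "cards/bbbb/datacard.txt",
  "bbtt": "cards/bbtt/datacard.txt"
}
"""

mapping = json.loads(mapping_json)
# names and cards are derived from one source, so they cannot get out of sync
multinames = sorted(mapping)
multicards = [mapping[name] for name in multinames]
```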
It could be an option to allow giving a dictionary file instead.

## add option of hh_model None on command line
https://gitlab.cern.ch/hh/tools/inference/-/issues/37 · Alexandra Carvalho Antunes De Oliveira · 2023-04-06

For new HH-related searches, e.g. HHH, to be able to start using the tools.

## Refactor config of analyses
https://gitlab.cern.ch/hh/tools/inference/-/issues/36 · Alexandra Carvalho Antunes De Oliveira · 2023-03-25

Right now we have 3-4 1D arrays with the same keys -- that could be one dictionary.
With that I would like to introduce a caddy-only common plot for internal consumption
I will only do that with some green light.
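A minimal sketch of the proposed refactoring (the array names and values are hypothetical, not the actual inference config):

```python
# Hypothetical example: several parallel per-analysis mappings with the same
# keys, merged into a single dictionary keyed by analysis name
colors = {"bbbb": 2, "bbtt": 4}
labels = {"bbbb": "HH #rightarrow bbbb", "bbtt": "HH #rightarrow bb#tau#tau"}

analyses = {
    key: {"color": colors[key], "label": labels[key]}
    for key in colors
}
```

After the merge, each analysis is described in one place, so adding or removing an analysis touches a single entry instead of 3-4 arrays.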