MC Job Options issues
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues
Updated: 2021-05-18T11:32:21+02:00

Follow-up from "LO EFT samples for 4top"
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/142
Updated: 2021-05-18T11:32:21+02:00
Author: Spyros Argyropoulos

The following discussions from !1161 should be addressed:
> This shouldn't be part of a JobOption. The first part was fixed properly in 21.6.60 and the second part is obviously gonna cause problems. `ATHENA_PROC_NUMBER` is set to 8 because the machine has 8 cores, it shouldn't be set to 80 in the JOs.
Should we add the following checks/changes:
- if ATHENA_PROC_NUMBER > 1 and release < 21.2.60 => ERROR
- if ATHENA_PROC_NUMBER > 1 => run only 1 event in CI
- change the way we check whether the jO changes ATHENA_PROC_NUMBER. This would only be safe to catch in the transform, but until it is implemented there we could change the check to forbid any use of ATHENA_PROC_NUMBER (not even printing it), e.g. look in the jO and give an error if there is an uncommented line with "ATHENA_PROC_NUMBER" in it
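The last check in the list could be sketched roughly as follows. This is only an illustration, not the existing CI code: the function name `find_proc_number_lines` and the example jO text are made up, and the comment stripping is deliberately naive (it does not understand `#` inside string literals).

```python
def find_proc_number_lines(jo_text, token="ATHENA_PROC_NUMBER"):
    """Return (line_number, line) pairs for uncommented lines mentioning the token."""
    hits = []
    for number, line in enumerate(jo_text.splitlines(), start=1):
        # Drop everything after the first '#'. This is a naive comment stripper
        # that ignores '#' characters inside string literals.
        code = line.split("#", 1)[0]
        if token in code:
            hits.append((number, line.strip()))
    return hits


# Hypothetical jO content, for illustration only.
example_jo = """\
# ATHENA_PROC_NUMBER is handled by the production system
import os
os.environ["ATHENA_PROC_NUMBER"] = "80"
evgenConfig.nEventsPerJob = 1000
"""

for number, line in find_proc_number_lines(example_jo):
    print("ERROR: line %d uses ATHENA_PROC_NUMBER: %s" % (number, line))
```

Note that this flags reads as well as assignments, which is stricter than what a transform-level check would need.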
@cgutscho

Milestone: S1.2021
Assignee: Spyros Argyropoulos

check_unique_controlFile.sh fails when it shouldn't?
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/141
Updated: 2021-05-13T09:00:06+02:00
Author: Jeff Shahinian

[check_unique_controlFile.sh](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/check_unique_controlFile.sh) is apparently a new part of the CI. I noticed that it fails even when given symlinks. For example, when uploading JOs (with symlinks to one control file) that look like this:
```
$ ls -a *
100001:
myJO_1.py
myControlFile.py
100002:
myJO_2.py
myControlFile.py -> ../100001/myControlFile.py
```
The CI job fails and recommends that you use symlinks (even if you already are):
```
ERROR: Duplicate file(s) found:
./100xxx/100001/myControlFile.py
If the files have exactly the same content, please only keep one physical file replacing the rest with symbolic links.
If the files have differences consider renaming the files that you added.
You can check for differences with diff -w file1 file2
```
Perhaps we need to add ```-type f``` to [this line](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/check_unique_controlFile.sh#L23) as well?
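For illustration, the intended behaviour (count only physical files, never symlinks) can be written down in a few lines of Python. This is a sketch under assumptions, not a port of the shell script: the function name and the `.py` suffix filter are placeholders, and the real script may select control files differently.

```python
import os
from collections import defaultdict


def find_duplicate_control_files(top, suffix=".py"):
    """Map each file name to the physical (non-symlink) copies found under top."""
    copies = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Symbolic links are the recommended way to share a file, so skip them.
            if name.endswith(suffix) and not os.path.islink(path):
                copies[name].append(path)
    # Only names with more than one physical copy count as duplicates.
    return {name: sorted(paths) for name, paths in copies.items() if len(paths) > 1}
```

With the layout shown above, `myControlFile.py` has one physical copy and one symlink, so nothing would be reported.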
Here's an example of a failing CI job:
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/13818890
Tagging @sargyrop
Best,
Jeff

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Strange behaviour of commit script when athena is skipped and run time is > 1h
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/140
Updated: 2021-05-06T11:03:19+02:00
Author: Spyros Argyropoulos

For example in a case with 1 event, CPU=3.09h, the output is the following:
![Screenshot_2021-05-06_at_09.46.11](/uploads/63cb47b36de5f75bb8e9e6a275602971/Screenshot_2021-05-06_at_09.46.11.png)
which is correct, but when skipping athena:
![Screenshot_2021-05-06_at_09.45.42](/uploads/c01022a7990d31535fb7cca7aa2e6a4c/Screenshot_2021-05-06_at_09.45.42.png)
the
```
printGood -f "\tOK: CI job time estimate: $cpu hours, but athena will not run in the CI"
```
message is not printed because the script never reaches that point.

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Handle EVNT->EVNT jobs in CI and logParser
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/139
Updated: 2021-06-17T13:30:43+02:00
Author: Spyros Argyropoulos

These jobs produce a `log.afterburn` instead of `log.generate`.
- [x] I would need an example to see how to treat this
- [x] How can we identify that it's an EVNT->EVNT job from the log?
- [x] Do we need to modify the Gen_tf command?
- [x] Test with `700267`

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Check for multiple instances of TestHepMC (and TestLHE?)
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/138
Updated: 2021-04-01T14:20:41+02:00
Author: Christian Gutschow

In general, the transform will create an instance of TestHepMC (and in the future also TestLHE) and run some checks as part of the job. For some setups the default thresholds used in these packages may be too strict and occasionally we get JOs that try to loosen them a bit, which is usually fine.
We recently had a case (!1066) where a fresh instance of TestHepMC was created, and the thresholds were tweaked on the new instance but not on the one that the transform had already created, which was then causing issues down the line.
Could we catch this sort of thing in the CI? I imagine it would just be a case of checking for a line like
```
genSeq += TestHepMC()
```
and throwing an error?

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Default value of inputFilesPerJob not known to logParser (?)
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/137
Updated: 2021-02-22T11:33:28+01:00
Author: Christian Gutschow

Hi,
it was observed in ATLMCPROD-8962 that the logParser doesn't seem to know that the default value for `inputFilesPerJob` is 1 for setups that use input LHE/EVNT files, seeing as it's printing:
```
---------------------
Generate transform params:
---------------------
- ecmEnergy = 13000.0
- nEventsPerJob = 20000
- Requested output events = 20000
- transform = Gen_tf
- inputFilesPerJob = 0
- inputGeneratorFile = 100001/mc15_13TeV.100001.CompHepPy8EG_HbbarZlljj600GeV.evgen.TXT.e0000/TXT.100001._000001.tar.gz
- evgenkeywords = not found <- WARNING: Keyword check has not been performed. Please check that the keywords used in the jobOption are in the allowed list of keywords: https://gitlab.cern.ch/atlas/athena/-/blob/21.6/Generators/EvgenJobTransforms/share/file/evgenkeywords.txt
ERROR: 1 input files used while inputFilesPerJob=0
```
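The default described above could be applied when the parsed parameters are interpreted. A minimal sketch, assuming the parser holds the printed parameters in a dict of strings; the function name and the `inputEVNTFile` key are hypothetical, and treating `0` as "not set" is an assumption based on the printout above rather than on the logParser code.

```python
def effective_input_files_per_job(params):
    """Return inputFilesPerJob, falling back to 1 when input files are used."""
    value = int(params.get("inputFilesPerJob", 0))
    # 'inputEVNTFile' is a hypothetical key name used here for illustration.
    uses_input = bool(params.get("inputGeneratorFile") or params.get("inputEVNTFile"))
    # Assumption: a value of 0 means "not set explicitly" in the parsed log.
    if value == 0 and uses_input:
        return 1
    return value
```

With the values printed above (`inputFilesPerJob = 0` and a non-empty `inputGeneratorFile`) this returns 1, so the comparison against the single input file used would no longer fail.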
See example logs [here](/afs/cern.ch/user/a/aytul/public/JIRA-2021-Request-AllSignalSamples/900111/900123/).

Human-readable tarball sizes
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/136
Updated: 2021-01-23T10:18:19+01:00
Author: Christian Gutschow

Would it be possible to convert the units to e.g. MB in the print out here?
```
/eos/user/m/mgignac/mc/mc_13TeV.Sh_2210_Zee_EnhFun_pTV2_valid.GRID.tar.gz size : 137166995 Files above 100MB can't be accepted.
```

Sanity check for EVNT-to-EVNT transforms
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/135
Updated: 2021-06-17T11:07:17+02:00
Author: Christian Gutschow

Hi,
here's an [example JO](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/950xxx/950096/mc.Sh_2210_Zee_E2Etransform_valid.py) for an EVNT-to-EVNT transform.
This basically clones an input EVNT, but only copies the event if it passes some Athena filter, hence most of the logic being protected by the `if runArgs.trfSubstepName == 'afterburn':` statement.
Now, because it copies the original EVNT, the new EVNT would have the MC channel number (or run number in the HepMC GenEvent) set to the original DSID and not the new DSID (of the E2E transform JO).
This can now be patched using the `postSeq.CountHepMC.CorrectRunNumber = True` flag seen at the bottom. Could we use the CI to catch cases where such a JO is being added, but that tag is missing from the JO?
(In principle, there is a printout in the `log.afterburn` produced by an E2E transform which one could grep for, but the CI doesn't handle jobs without input EVNT files yet.)
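A static check on the JO text might be enough for this. A rough sketch, assuming that the `afterburn` guard is a reliable sign of an E2E transform JO (taken from the description above, not verified against other JOs); the function name is made up and the comment stripping is naive.

```python
import re

# Guard that the example JO uses around its E2E logic.
AFTERBURN_GUARD = re.compile(r"runArgs\.trfSubstepName\s*==\s*['\"]afterburn['\"]")
# The flag that patches the run number in the copied events.
CORRECT_RUN_NUMBER = re.compile(
    r"^\s*postSeq\.CountHepMC\.CorrectRunNumber\s*=\s*True\b", re.MULTILINE
)


def missing_correct_run_number(jo_text):
    """True if the JO looks like an E2E transform but never sets CorrectRunNumber."""
    # Strip comments naively so a commented-out flag does not count.
    code = "\n".join(line.split("#", 1)[0] for line in jo_text.splitlines())
    return bool(AFTERBURN_GUARD.search(code)) and not CORRECT_RUN_NUMBER.search(code)
```

The CI could call this on every newly added JO and fail when it returns `True`.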
Thoughts/ideas?

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Typo in MadSpin config for DSID 500326?
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/134
Updated: 2021-01-19T11:05:31+01:00
Author: Hongtao Yang

Hi,
When checking the config https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/500xxx/500326/MadGraphControl_SM4topsLOInclusive.py#L116, I noticed that this line seems to have a typo:
```
set Nevents_for_max_weigth 75
```
With the current config, the MC production with 21.6.55 will give the following error
```
generate 01:57:55 Py:MadGraphUtils ERROR Command "generate_events run_01" interrupted with error:
generate 01:57:55 Py:MadGraphUtils ERROR InvalidCmd : Unknown options Nevents_for_max_weigth
```
I think this line should be fixed to
```
set Nevents_for_max_weight 75
```
Once it is fixed, the MC production with 21.6.55 can proceed without the above error.
Best regards,
Hongtao

scripts/commit_new_dsid.sh crashes when reading directories
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/133
Updated: 2021-01-15T11:20:37+01:00
Author: Petr Jacka

```
./scripts/commit_new_dsid.sh Var3c/* -m="Message" --dry-run
```
It fails when it tries to convert the JO directories inside the Var3c directory, with the message:
```
Traceback (most recent call last):
File "scripts/jo_utils.py", line 87, in <module>
_parse(args.DSIDs)
File "scripts/jo_utils.py", line 10, in _parse
dsids = [ int(d) for d in dsids ] # turn strings to integers
File "scripts/jo_utils.py", line 10, in <listcomp>
dsids = [ int(d) for d in dsids ] # turn strings to integers
ValueError: invalid literal for int() with base 10: 'Var3c/py8_yprod_var3cDown'
```
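The traceback shows every argument being passed straight to `int()`. One way to avoid the crash is to separate numeric DSIDs from directory names first. This is a sketch only, not the actual fix applied in `scripts/jo_utils.py`; the function name is hypothetical.

```python
def split_dsid_arguments(arguments):
    """Separate purely numeric DSIDs from directory-style arguments."""
    dsids, directories = [], []
    for argument in arguments:
        if argument.isdigit():
            dsids.append(int(argument))
        else:
            # e.g. 'Var3c/py8_yprod_var3cDown': keep it as a path instead of calling int().
            directories.append(argument)
    return dsids, directories
```

The caller can then decide whether directory arguments are allowed in that code path or should produce a readable error instead of a `ValueError`.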
This issue was introduced in this commit: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/commit/2546cd6015fd7a1b95ebfeafa31613c1645e421a
It is still possible to run the script when the directories are renamed to dummy DSID numbers:
./scripts/commit_new_dsid.sh -d=100000,100001 -m="Adding ttgamma MG+Py8 Var3c variation samples" --dry-run
I attached a tar file with the Var3c directory.
[Var3c.tar.gz](/uploads/2d6676234f4a093b0b06806f8e4e3196/Var3c.tar.gz)

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Harmonisation of printouts in transform and generator interfaces
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/132
Updated: 2021-06-17T13:03:33+02:00
Author: Spyros Argyropoulos

Most of the printouts from the transform have the following format:
```
IDENTIFIER KEYWORD = VALUE
```
e.g.
```
08:29:01 Py:Gen_tf INFO .transform = Gen_tf
```
however this is **very often not the case**. A few examples:
```
08:29:01 Py:Gen_tf INFO nEventsPerJob set to 2000
08:29:01 Py:Gen_tf INFO Requested output events 100
08:29:01 Py:Gen_tf WARNING Could not find evgenkeywords.txt file EvgenJobTransforms/evgenkeywords.txt in $JOBOPTSEARCHPATH
05:14:02 Nb of events : 20000
```
This means that new checks that would otherwise be trivial to implement require changes in several places (e.g. !863) and the introduction of "hacky" logic.
We should make sure that new printouts always conform to the correct format `IDENTIFIER KEYWORD = VALUE`, both in the **transform** and in the **generator interfaces**, and that each such line is **printed only once in log.generate**.
I am not sure what the best approach is here. Perhaps put this in place as a "coding rule" and make everyone aware of it. (Strict checks would probably be more time-consuming to implement than just putting coding practices in place.)
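To show why the uniform format matters for the parser: with it, a single regular expression covers every printout. A minimal sketch, assuming that IDENTIFIER means the timestamp, logger name and level seen in the example line, and that the leading `.` in `.transform` is optional; both are assumptions, and `parse_printout` is a made-up name.

```python
import re

# Timestamp, logger name, level, then 'keyword = value'. The leading '.' seen in
# '.transform' is treated as optional; that is an assumption about the format.
LINE_FORMAT = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<source>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"\.?(?P<keyword>\w+)\s*=\s*(?P<value>.+?)\s*$"
)


def parse_printout(line):
    """Return (keyword, value) for a conforming line, or None otherwise."""
    match = LINE_FORMAT.match(line)
    return (match.group("keyword"), match.group("value")) if match else None
```

The `.transform = Gen_tf` line parses, while `nEventsPerJob set to 2000` and `Nb of events : 20000` both return `None`, so each of them currently needs its own special case.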
@ewelina I just opened this so that we somehow bring it up with the generator experts to make things easier in the future. You probably know best how to address this and maybe can discuss this in a GIT meeting.

Milestone: Future

LogParser fails to pickup nevents keyword
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/131
Updated: 2020-12-09T18:06:20+01:00
Author: Christian Gutschow

From @avroy:
```
When trying to check some new JOs that were generated using 21.6.54 and cc7, the logPArser failed with the following error
Traceback (most recent call last):
File "scripts/logParser.py", line 296, in madgraphChecks
neventsMG=int(float(generatorDict['"nevents"'][0]))
IndexError: list index out of range
I think the error is associated with the fact that in the new log file, the keyword is logged as nevents (i.e. without the quotes). You can find the log file in the uploaded zipball in https://its.cern.ch/jira/browse/ATLMCPROD-8926
Please look at JOs/200xxx/200001/log.generate
```

Milestone: S2.2020
Assignee: Spyros Argyropoulos

Support for Centos7 releases
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/130
Updated: 2021-01-10T17:33:29+01:00
Author: Christian Gutschow

Starting with release 21.6.51, the releases are built for Centos7 machines and so we should not be using SLC6 containers in the CI for those anymore (and gridpacks prepared on C7 machines are fine to use for those releases).

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Improve handling of madgraph checks
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/129
Updated: 2021-01-01T18:32:48+01:00
Author: Spyros Argyropoulos

Instead of reading the whole file for the madgraph checks, make use of an appropriate dictionary, where values can be overwritten.
ATLMCPROD-8252

Milestone: S1.2021
Assignee: Spyros Argyropoulos

evgen keywords not always being checked
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/128
Updated: 2021-01-14T18:12:46+01:00
Author: Christian Gutschow

Just stumbled across this by accident:
The key words listed in `evgenConfig.keywords` should match the ones in the [official list](https://gitlab.cern.ch/atlas/athena/-/blob/21.6/Generators/EvgenJobTransforms/share/file/evgenkeywords.txt). It turns out that when the transform doesn't find the official list in the JobOptions search path for some reason, it will be unable to check for potential mismatches and hence also not be able to print an error message.
If there's an undefined key word, the transform _should_ print:
```
msg = "evgenConfig.keywords contains non-standard keywords: %s. " % ", ".join(evil_keywords)
msg += "Please check the allowed keywords list and fix."
```
but if it cannot find the standard list it just says
```
08:29:01 Py:Gen_tf WARNING Could not find evgenkeywords.txt file EvgenJobTransforms/evgenkeywords.txt in $JOBOPTSEARCHPATH
```
in the log and the CI continues happily, see example log here:
```
/eos/atlas/atlascerngroupdisk/phys-gener/WeakBoson/SingleBoson/log/log.generate
```
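The comparison itself is small. A minimal sketch, assuming the JO keywords and the contents of `evgenkeywords.txt` are already available as a list and a string; the function name is hypothetical, and the case-insensitive matching is an assumption rather than the transform's documented rule.

```python
def non_standard_keywords(jo_keywords, allowed_keywords_text):
    """Return the JO keywords that do not appear in the allowed list."""
    allowed = {
        line.strip().lower()
        for line in allowed_keywords_text.splitlines()
        if line.strip()
    }
    # Case-insensitive comparison is an assumption, not necessarily the transform's rule.
    return [keyword for keyword in jo_keywords if keyword.strip().lower() not in allowed]
```

A non-empty result would be reported as an error, independently of whether the transform managed to find the list itself.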
Could we get the logParser to perform the check as well?

Milestone: S1.2021
Assignee: Spyros Argyropoulos

ERROR Directory <dsid> does not exist
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/127
Updated: 2020-11-17T10:07:49+01:00
Author: Frank Siegert

With the attached [mcjoboptions.tar.gz](/uploads/f37339d16c15c97b60c40ca97f0aeb54/mcjoboptions.tar.gz) I have problems adding a new setup using the commit script. The `--dry-run` works fine, but the `-n` (or identically without `-n`) errors out as follows:
```
[15:29 tauruslogin3: mcjoboptions]$ ./scripts/commit_new_dsid.sh wip/testForSpyros -m='Sherpa 2.2.10 test for Spyros' --dry-run
INFO: will use following remote for pushing: origin
Will use branch: dsid_fsiegert_wiptestForSpyros...
Will create new branch: dsid_fsiegert_wiptestForSpyros
Checking jO consistency and DSID ranges ...
Will move wip/testForSpyros to 700xxx/700119
New DSID directory: wip/testForSpyros ...
OK: log.generate file found.
OK: log.generate file contains no errors
OK: CI job expected to last less than 1h - time estimate: 0.09 hours
Will now add files to git commit
File: wip/testForSpyros/log.generate cannot be added to the commit. Skipping.
Will add: wip/testForSpyros/log.generate.short
Will add: wip/testForSpyros/mc_13TeV.Sh_2210_tttt_muQHT2.GRID.tar.gz
Will add: wip/testForSpyros/mc.Sh_2210_testForSpyros.py
[15:30 tauruslogin3: mcjoboptions]$ ./scripts/commit_new_dsid.sh wip/testForSpyros -m='Sherpa 2.2.10 test for Spyros' -n
INFO: will use following remote for pushing: origin
Will use branch: dsid_fsiegert_wiptestForSpyros...
Will create new branch: dsid_fsiegert_wiptestForSpyros
Checking jO consistency and DSID ranges ...
ERROR: Directory 700119 does not exist
```

Milestone: S2.2020
Assignee: Spyros Argyropoulos

incorrect printout for multi-block DSIDs
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/126
Updated: 2020-12-22T12:49:06+01:00
Author: Christian Gutschow

A minor issue, but I just came across a case in !765 where I submitted two JOs, one for a physics block and one for the validation block and the commit script ended up saying:
```
The following DSIDs have been assigned:
100xxx/100000 -> 950xxx/950098
100xxx/100001 -> 500xxx/500332
Run: ./scripts/commit_new_dsid.sh -d=950098-500332 -m="aMC@NLOPy8 triphoton setups for PMG pub note" to push them to git
```
Note the range being suggested for the `-d` flag :sweat_smile:

Milestone: Future

JO shouldn't hardcode ATHENA_PROC_NUMBER
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/125
Updated: 2020-11-14T13:51:07+01:00
Author: Christian Gutschow

The environment variable for multi-threading `ATHENA_PROC_NUMBER` should be set by prodsys, not the JOs.
Can we make the CI fail if the JOs try to assign a value to that? (The JOs are free to ask if this environment variable exists and what its value is, e.g. to pass it into Madgraph, but they shouldn't try to overwrite its value.)
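Distinguishing an assignment from a read could be done with a pattern along these lines. This is a sketch that only covers the two most direct ways of setting an environment variable from Python (`os.environ[...] = ...` and `os.putenv(...)`); the function name is made up, and indirect forms such as `os.environ.update(...)` would slip through.

```python
import re

# Matches 'os.environ["ATHENA_PROC_NUMBER"] = ...' (but not '==') and
# 'os.putenv("ATHENA_PROC_NUMBER", ...)'.
ASSIGNMENT = re.compile(
    r"os\.environ\[\s*['\"]ATHENA_PROC_NUMBER['\"]\s*\]\s*=(?!=)"
    r"|os\.putenv\(\s*['\"]ATHENA_PROC_NUMBER['\"]"
)


def sets_proc_number(jo_text):
    """True if the JO text appears to assign a value to ATHENA_PROC_NUMBER."""
    # Strip comments naively so commented-out assignments are ignored.
    code = "\n".join(line.split("#", 1)[0] for line in jo_text.splitlines())
    return bool(ASSIGNMENT.search(code))
```

Reading the variable, comparing it, or testing for its presence in `os.environ` does not trigger the check.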
See e.g. MR !745 where this had to be corrected, and [this JO](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/421xxx/421006/mc.MGPy8EG_A14NNPDF23_tWgamma_art.py) where it's used in an acceptable way.

Milestone: S2.2020
Assignee: Spyros Argyropoulos

New Pythia 8 checks for changing parameters
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/124
Updated: 2023-10-26T16:13:33+02:00
Author: Spyros Argyropoulos

Implement code to use new developments by Giancarlo mentioned in AGENE-1915.
- [ ] To be seen which of these should result in an error and which should be a warning.
- [ ] Also check if this catches the bug reported in ATLMCPROD-7723

Milestone: S1.2021
Assignee: Spyros Argyropoulos

logParser picks out wrong COM energy
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/123
Updated: 2020-09-21T16:08:05+02:00
Author: Christian Gutschow

See e.g. !676 where it extracted `ecmEnergy = 13000` even though the `log.generate` was for 8 TeV:
```
/afs/cern.ch/user/c/cgutscho/public/forSpyros/log.generate
```
Why though?

Milestone: S2.2020
Assignee: Spyros Argyropoulos