MC Job Options issues
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues
Updated: 2021-04-22T16:46:58+02:00

Long compilation time when running MadGraph in atlas/slc6-atlasos causing CI timeouts
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/101
Updated: 2021-04-22T16:46:58+02:00
Author: Jason Robert Veatch

## How to reproduce the problem
```
# Mount cvmfs
sudo mkdir -p /cvmfs/atlas.cern.ch
sudo mkdir -p /cvmfs/atlas-condb.cern.ch
sudo mkdir -p /cvmfs/grid.cern.ch
sudo mkdir -p /cvmfs/sft.cern.ch
sudo mount -t cvmfs atlas.cern.ch /cvmfs/atlas.cern.ch
sudo mount -t cvmfs atlas-condb.cern.ch /cvmfs/atlas-condb.cern.ch
sudo mount -t cvmfs grid.cern.ch /cvmfs/grid.cern.ch
sudo mount -t cvmfs sft.cern.ch /cvmfs/sft.cern.ch
# Get the docker image
docker pull atlas/slc6-atlasos
# Run image in a container and mount cvmfs
docker run -it -v /cvmfs:/cvmfs b4cfa1203c45
# Inside the docker container get the mcjoboptions repo (or alternatively you can copy it from your local area with docker cp)
kinit USER@CERN.CH
git clone https://:@gitlab.cern.ch:8443/atlas-physics/pmg/mcjoboptions.git
cd mcjoboptions
git checkout dsid_jveatch_500538
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
./scripts/run_athena.sh
```
## Debugging
#### Bottleneck: compilation time
Comparing the running times at several execution points on lxplus and in the container, the problem seems to lie in the compilation time:
```
Docker (running on a dual-core laptop with cvmfs mounted via fuse):
generate 19:24:54 INFO: Using LHAPDF v6.2.3 interface for PDFs
generate 19:26:19 INFO: Compiling source…
generate 19:31:53 INFO: ...done, continuing with P* directories => 334 sec
generate 19:31:53 INFO: Compiling StdHEP (can take a couple of minutes) ...
generate 19:45:23 INFO: …done. => 810 sec
generate 19:45:24 INFO: Compiling on 1 cores
generate 19:45:24 INFO: Compiling P0_gg_ttx...
generate 19:54:37 INFO: P0_gg_ttx done. => 553 sec
vs lxplus (interactive run)
10:15:08 INFO: Using LHAPDF v6.2.3 interface for PDFs
10:15:14 INFO: Compiling source...
10:15:26 INFO: ...done, continuing with P* directories => 12 sec
10:15:26 INFO: Compiling StdHEP (can take a couple of minutes) ...
10:16:04 INFO: …done. => 38 sec
10:16:05 INFO: Compiling on 1 cores
10:16:05 INFO: Compiling P0_gg_ttx...
10:16:45 INFO: P0_gg_ttx done. => 40 sec
```
#### Size/memory
The container has 53 GB of available space, and at the point where the compilation becomes slow the container is only ~230 MB in size, so much smaller => **disk size does not seem to be causing the slowdown**
The available memory was changed from 1GB to 8GB without any effect on the compilation time in the container.
#### Reading from cvmfs
I ran a script that 1) reads all the lines from a file that lives on cvmfs and 2) copies this script to a local directory and removes it.
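The script itself is not attached to the issue. A minimal sketch of such a benchmark, assuming hypothetical helper names `time_reads` and `time_copies` and measuring wall-clock time only (the numbers quoted in this issue come from `time`, which also reports user and sys), could look like this:

```python
import shutil
import tempfile
import time
from pathlib import Path


def time_reads(path, repeats):
    """Read every line of `path` `repeats` times; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "rb") as handle:
            for _line in handle:
                pass  # only the read itself is being timed
    return time.perf_counter() - start


def time_copies(path, repeats):
    """Copy `path` to a scratch directory and delete the copy, `repeats` times."""
    start = time.perf_counter()
    with tempfile.TemporaryDirectory() as scratch:
        for _ in range(repeats):
            copy = Path(scratch) / Path(path).name
            shutil.copy(path, copy)
            copy.unlink()
    return time.perf_counter() - start
```

Calling both functions with `repeats=500` on a file under `/cvmfs`, once on the host and once inside the container, would give a comparison of the same kind. Repeated reads of one file are likely to be served from a cache after the first pass, so the result mostly reflects per-call overhead of the mount.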
The local run on my laptop (with cvmfs mounted with fuse) gives this:
```
Reading 500 times
real 0m21.504s
user 0m12.937s
sys 0m8.429s
Copying 500 times
real 0m4.993s
user 0m0.620s
sys 0m2.440s
```
Running the script from the container, where the locally available cvmfs directory (see above) is mounted to the container as a volume, gives this:
```
Reading 500 times
real 1m44.217s
user 0m18.329s
sys 0m20.376s
Copying 500 times
real 0m3.716s
user 0m0.570s
sys 0m0.981s
```
**So reading a file seems to be 5x slower when running from the docker container**
#### Next steps
* [ ] To debug further we would need to know exactly how cvmfs is mounted in the gitlab runner
* [ ] Also need to check whether there is any correlation between slow reading times on cvmfs and MG: does MG call the compilers from cvmfs or read any other info from cvmfs? Probably
---
Original report from Jason - similar issues observed with other processes which are apparently very different than this one (an NLO one and a LO one with a long decay chain)
Job [#7937441](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/7937441) failed for 9a6a4445a5bcf7ae08ac81888cccd79ef4cc4af3:
Dear experts,
The run_athena job for my branch times out. I have been trying to debug this from my side, but I am at a loss about how to proceed. The estimated execution time from each log.generate.short is ~0.1 hours, so I wouldn't expect this to be an issue. Could you please advise?
Thanks in advance,
Jason

Milestone: Future
Assignee: Spyros Argyropoulos

Add checks for input files
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/102
Updated: 2020-07-30T07:37:56+02:00
Author: Spyros Argyropoulos

Add checks:
* [ ] no `evgenConfig.inputfilecheck`
* [ ] no `evgenConfig.inputconfcheck` allowed
both are always in the top JO
Also
* [ ] Restructure checks so that everything related to reading the jO is done in one place and everything related to reading the log is done in `logParser`

Milestone: S2.2020

Harmonise whitelist with Gen_tf
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/104
Updated: 2021-01-04T15:27:52+01:00
Author: Spyros Argyropoulos

Currently the transform allows setups which are explicitly excluded in the whitelist, e.g. `DSID/dat/*.dat` which is excluded here: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/whitelist.sh#L9 as discussed in !298
I no longer remember why we excluded some cases but we should definitely harmonise what is done in the transform and what is done in the CI.
@ewelina could you go through the whitelist and let me know what is treated differently there and in `Gen_tf` so that we harmonise?
Tag @cgutscho @fsiegert

Milestone: S1.2021

int conversion of a string "nevents" that contains a float
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/105
Updated: 2020-04-29T21:17:33+02:00
Author: Xiaohu Sun

Quite often people define nevents by multiplying a bunch of numbers (safe margin, truth efficiency etc.), then nevents is a float. The log file would contain
20:49:39 Py:MadGraphUtils INFO Setting nevents = 11000.0.
where "11000.0" is picked by logParser as a string.
Then in the check script
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/logParser.py#L271
neventsMG=int(generatorDict['nevents'][0])
will crash, as int("11000.0") would not work.
ValueError: invalid literal for int() with base 10
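One possible way to make the conversion tolerant of float-formatted values is to go through `float` first and reject values that are not whole numbers. This is only a sketch: `parse_nevents` is a hypothetical helper name, not a function in `logParser.py`.

```python
def parse_nevents(raw):
    """Convert a logged nevents string such as '11000', '11000.' or '11000.0' to int."""
    value = float(raw.strip())
    if not value.is_integer():
        # A fractional number of events is almost certainly a jO mistake
        raise ValueError("nevents is not a whole number: %r" % raw)
    return int(value)
```

With this helper, `parse_nevents("11000.0")` returns the integer 11000 instead of raising the `ValueError` above.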
Would this be fixed? Thanks!
Best,
Xiaohu

Due date: 2020-04-30

Job Failed #8184903
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/106
Updated: 2020-04-29T21:17:04+02:00
Author: Xiaohu Sun

Job [#8184903](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/8184903) failed for 127510e74ddbca868a29efecbd1b8c6144bf63b8:

logParser crash due to double print of nevents in MG
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/107
Updated: 2020-12-09T10:09:13+01:00
Author: Spyros Argyropoulos

logParser failing again because of double print-out of nevents:
```
22:32:58 Py:MadGraphUtils INFO Setting nevents = 11000.
22:33:05 Py:MadGraphUtils INFO "nevents" = 11000
```
The first printout seemed to be the old implementation before the restructuring in rel. 21.6.23, however I don't understand why both printouts are printed now. Is this expected @zmarshal @hmildner @mcfayden ?
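Until the double printout is understood, a parser could accept both forms and only complain if they disagree. The following is a sketch under that assumption; `NEVENTS_RE` and `extract_nevents` are hypothetical names and this is not the current `logParser` logic.

```python
import re

# Matches both `Setting nevents = 11000.` and `"nevents" = 11000`
NEVENTS_RE = re.compile(r'(?<!\w)"?nevents"?\s*=\s*(\d+(?:\.\d*)?)')


def extract_nevents(log_lines):
    """Return the set of distinct nevents values printed by MadGraphUtils."""
    values = set()
    for line in log_lines:
        if "Py:MadGraphUtils" not in line:
            continue
        match = NEVENTS_RE.search(line)
        if match:
            # float() copes with a trailing dot, int() then normalises the value
            values.add(int(float(match.group(1))))
    return values
```

A caller would treat a set with exactly one element as success and flag anything else as inconsistent printouts.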
The jO is attached - provided by @ewelina - this was run in 21.6.27.
[mc.MGPy8EG_A14NNPDF23_tWgamma.py](/uploads/66b17b0604410f93d826969cc504c7ef/mc.MGPy8EG_A14NNPDF23_tWgamma.py)
Just to say: if this is expected, we can easily change the behaviour to parse lines containing `"nevents"` (with quotes). Currently it tries to find lines containing `nevents` (without quotes), and since the printout is different (trailing dot) the first print-out is not parsed correctly.

Milestone: S1.2020
Assignee: Spyros Argyropoulos

logParser crash in CI not handled correctly
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/108
Updated: 2020-05-05T20:25:13+02:00
Author: Spyros Argyropoulos

See https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/8197685

Milestone: S1.2020
Assignee: Spyros Argyropoulos

logParser fails in CI when run on MadGraph due to nevents check
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/109
Updated: 2020-05-17T14:24:09+02:00
Author: Spyros Argyropoulos

As seen in !412 when running a jO with:
```
evgenConfig.nEventsPerJob = 10000
nevents = runArgs.maxEvents*1.2 if runArgs.maxEvents>0 else 1.1*evgenConfig.nEventsPerJob
```
`logParser` fails with
```
ERROR: Increase nevents to be generated in MG from 120 to 11000
```

Milestone: S1.2020
Assignee: Spyros Argyropoulos
Due date: 2020-05-16

unwarranted logParser fail at commit-script stage?
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/110
Updated: 2020-05-11T15:51:37+02:00
Author: Christian Gutschow

From @mgignac:
The commit script complained on [this line](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/logParser.py#L222), but I'm not sure that the logic is correct in the logParser. In my log file, I have a single message that's being flagged:
`Matrix_Element_Handler::GenerateOneEvent(): Point for '2_3__u__u__W+__d__u' exceeds maximum by 15.4543.`
And when the above line fails, it's dividing by zero (e.g. `nEventsRequested` is not set).
```
Traceback (most recent call last):
File "scripts/logParser.py", line 624, in <module>
main()
File "scripts/logParser.py", line 485, in main
sherpaChecks(opts.INPUT_FILE)
File "scripts/logParser.py", line 223, in sherpaChecks
logwarn("","WARNING: be aware of: "+str(numexceeds*100./nEventsRequested)+"% of the event weights exceed the maximum by a factor of ten")
ZeroDivisionError: float division by zero
```

commit_new_dsid.sh creates wrong links at the -n step
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/111
Updated: 2020-06-03T09:54:09+02:00
Author: Judita Mamuzic

Dear Experts,
I would like to upload a new JO and its control file for a case, in SUSY, where many DSIDs use the same control file. The folder structure is:
```
Dir1:
mc.1.py -> Control.py
Control.py
Dir2:
mc.2.py -> ../Dir1/Control.py
```
The jobs run successfully, and the first step of checks with commit_new_dsid.sh using --dry-run is also successful. However, when the option -n is used in the second step like:
```
./scripts/commit_new_dsid.sh -d=100001-100082 -n -m="SUSY direct stau, TFilt."
```
the linked files become wrong.
Initial input:
```
ls -lah 100xxx/100001/mc* 100xxx/100002/mc*
100xxx/100001/mc.MGPy8EG_StauStauDirect_120p0_1p0_TFilt.py -> SUSY_SimplifiedModel_StauStauDirect.py
100xxx/100002/mc.MGPy8EG_StauStauDirect_160p0_1p0_TFilt.py -> ../100001/SUSY_SimplifiedModel_StauStauDirect.py
```
After step -n:
```
ls -lah 501xxx/501047/mc* 501xxx/501048/mc*
501xxx/501047/mc.MGPy8EG_StauStauDirect_120p0_1p0_TFilt.py -> SUSY_SimplifiedModel_StauStauDirect.py
501xxx/501048/mc.MGPy8EG_StauStauDirect_160p0_1p0_TFilt.py -> ../../501xxx/501047/mc.MGPy8EG_StauStauDirect_160p0_1p0_TFilt.py
```
where the last file should be:
```
501xxx/501048/mc.MGPy8EG_StauStauDirect_160p0_1p0_TFilt.py -> ../501047/SUSY_SimplifiedModel_StauStauDirect.py
```
It seems there is a problem with the copy of files with a soft link.
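The expected link target can be derived from the old target and the old-to-new directory mapping. The sketch below illustrates that computation only; `retarget` and `moved_dirs` are hypothetical names and this is not the code in `commit_new_dsid.sh`.

```python
import os.path


def retarget(old_link, old_target, new_link, moved_dirs):
    """Return the relative target the renumbered link should point to.

    old_link:   path of the link before renumbering (relative to the repo root)
    old_target: the link's stored, possibly relative, target
    new_link:   path of the link after renumbering
    moved_dirs: dict mapping old DSID directories to new ones
    """
    # Resolve the stored target against the directory of the old link
    resolved = os.path.normpath(os.path.join(os.path.dirname(old_link), old_target))
    target_dir, target_name = os.path.split(resolved)
    # If the directory holding the target was renumbered too, follow it
    new_target_dir = moved_dirs.get(target_dir, target_dir)
    new_resolved = os.path.join(new_target_dir, target_name)
    # Express the result relative to the directory of the new link
    return os.path.relpath(new_resolved, os.path.dirname(new_link))
```

For the example above, with `100xxx/100001` mapped to `501xxx/501047` and `100xxx/100002` mapped to `501xxx/501048`, this yields `../501047/SUSY_SimplifiedModel_StauStauDirect.py` for the second link, which is the target the report expects.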
I attach here the reduced example.
Many thanks for your help.
Cheers,
Judita
/cc @gstark , @sargyrop , @wfawcett , @cgutscho
[100xxx_short.tar.gz](/uploads/2ac3b683ca55e00c418bc56071864798/100xxx_short.tar.gz)

Assignee: Spyros Argyropoulos

CI/logParser addition: maximum value for inputFilesPerJob
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/112
Updated: 2020-05-19T10:36:36+02:00
Author: Christian Gutschow

The maximum number of input LHE/EVNT files is `inputFilesPerJob=1000`.
Could this be added to the CI/logParser (whichever is best)?

Assignee: Spyros Argyropoulos

Improve `check_modified_files` behaviour
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/113
Updated: 2020-08-04T10:02:29+02:00
Author: Spyros Argyropoulos

Do a local rebase before checking what changed to avoid failed pipelines for commits that are behind master.

Milestone: S2.2020
Assignee: Spyros Argyropoulos

Limit of inputFilesPerJob
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/114
Updated: 2021-01-21T18:13:58+01:00
Author: Xiaohu Sun

In this ticket https://its.cern.ch/jira/browse/ATLMCPROD-8583 the request needs external LHE files.
In order to have 10000 events per job, we have to set inputFilesPerJob to 200 for some of the JOs. But this triggers an error in the logParser checks, which limit inputFilesPerJob to 100.
We cannot cut 10000 events to 5000 to bring inputFilesPerJob back within the limit, because that will touch the CPU hour limit (5000 in the JO takes <1 hour to finish in this case).
Could you suggest how to proceed?
Thanks!

Assignee: Christian Gutschow

Wrong printing of branches using a DSID
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/115
Updated: 2020-08-01T16:41:21+02:00
Author: Spyros Argyropoulos

I had a wrong error message when I tried to commit JOs for 421332:
the message I got was that dsid_jveatch_600076 already uses this DSID.
I have checked this branch and it was not the case.
I found that this DSID was used in one of the earlier branches awaiting approval.
I think the problem is that the list of branches
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/check_jo_consistency.py#L118
is ordered from the newest branch to the oldest, and when a new branch is submitted for merging it is updated with the changes introduced in other branches awaiting approval. This way the newest branch is always reported as the one already using a given DSID (in case of conflict).

Milestone: S2.2020
Assignee: Spyros Argyropoulos

Don't commit emacs backup files
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/116
Updated: 2020-06-23T15:59:46+02:00
Author: Christian Gutschow

Currently the files ending in `blah~` seem to be included by the commit scripts, see e.g. [here](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/tree/master/500xxx/500908).
Cheers,
Chris

Assignee: Spyros Argyropoulos

Check number of files in gridpack
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/117
Updated: 2023-03-01T07:42:53+01:00
Author: Christian Gutschow

The number of files in a gridpack shouldn't exceed 80k, otherwise some grid sites will crash. This has happened a number of times recently, e.g. for the FxFx job where the gridpack contained several files per Feynman diagram. MadGraph control cleans up logs and .o files in the latest release, but for older releases it would be good to have a dedicated pipeline step that throws an error if the number of files in the gridpack is larger than 80k. Probably something like `tar -ztvf *.tgz *.tar.gz` could work?

Milestone: S2.2020
Assignee: Spyros Argyropoulos

Add checks for `inputfilecheck` and `inputGeneratorFile`
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/118
Updated: 2020-08-03T10:25:32+02:00
Author: Christian Gutschow

Please see this test commit: 52aa8087
which has the following two lines in the JO:
```
evgenConfig.inputfilecheck = 'PhPy8EG_NNPDF30LO_EWK_ZZeeee'
runArgs.inputGeneratorFile = 'PhPy8EG_NNPDF30LO_EWK_ZZeeee._00052.events.tar.gz'
```
The first one I thought the CI would already be catching [along with `inputconfcheck`, no?] and the second one is clearly a problem for central production.
Can we catch these? I guess the logParser should already throw an error before the files are even committed to gitlab.

Milestone: S2.2020
Assignee: Spyros Argyropoulos

Mentioning ATLMCPROD ticket in MR doesn't push link to Jira any longer
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/119
Updated: 2020-07-31T10:47:09+02:00
Author: Christian Gutschow

... not sure there's much we can do about this though?
Any ideas anyone?

Allow runArgs to be referred to in JOs but not to be overwritten by JOs
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/120
Updated: 2020-08-22T13:08:09+02:00
Author: Christian Gutschow

See !631 for an example.

Milestone: S2.2020
Assignee: Spyros Argyropoulos
Due date: 2020-08-14

logParser rejects logs with nEventsPerJob > 10k
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/121
Updated: 2020-08-28T16:51:47+02:00
Author: Christian Gutschow

Following the successful test in ATLMCPROD-8659, we should allow cases where `nEventsPerJob` is a multiple of 10k.
Currently it fails saying
```
- CountHepMC Events passing all checks and written = 20000 <-- ERROR: Not an acceptable number of events for production (1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000)
```

Milestone: S2.2020
Assignee: Spyros Argyropoulos

Bug: handling of jobs with external LHE file in logParser step
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/122
Updated: 2020-09-06T13:46:23+02:00
Author: Spyros Argyropoulos

When external LHE files are used, `log.generate.short` is added to the commit but `run_athena` just skips the job without producing any `log.generate_ci` file. The `check_logParser` job then treats this as a bug, because if `log.generate.short` is present `log.generate_ci` should also be present at this point in the CI, and it complains; see !652

Milestone: S2.2020
Assignee: Spyros Argyropoulos
Due date: 2020-09-04

logParser picks out wrong COM energy
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/123
Updated: 2020-09-21T16:08:05+02:00
Author: Christian Gutschow

See e.g. !676 where it extracted `ecmEnergy = 13000` even though the `log.generate` was for 8 TeV:
```
/afs/cern.ch/user/c/cgutscho/public/forSpyros/log.generate
```
Why though?

Milestone: S2.2020
Assignee: Spyros Argyropoulos

New Pythia 8 checks for changing parameters
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/124
Updated: 2023-10-26T16:13:33+02:00
Author: Spyros Argyropoulos

Implement code to use new developments by Giancarlo mentioned in AGENE-1915.
- [ ] To be seen which of these should result in an error and which should be a warning.
- [ ] Also check if this catches the bug reported in ATLMCPROD-7723

Milestone: S1.2021
Assignee: Spyros Argyropoulos

JO shouldn't hardcode ATHENA_PROC_NUMBER
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/125
Updated: 2020-11-14T13:51:07+01:00
Author: Christian Gutschow

The environment variable for multi-threading `ATHENA_PROC_NUMBER` should be set by prodsys, not the JOs.
Can we make the CI fail if the JOs try to assign a value to that? (The JOs are free to ask if this environment variable exists and what its value is, e.g. to pass it into MadGraph, but they shouldn't try to overwrite its value.)
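A CI check for this could be as simple as scanning the JO text for assignments while leaving reads alone. The following is a rough sketch: `find_proc_number_assignments` is a hypothetical name, only two common ways of setting the variable are covered, and the comment stripping is deliberately crude.

```python
import re

# Patterns that set the variable; plain reads such as
# os.environ['ATHENA_PROC_NUMBER'] or "'ATHENA_PROC_NUMBER' in os.environ" do not match.
ASSIGN_PATTERNS = [
    re.compile(r"""os\.environ\[\s*['"]ATHENA_PROC_NUMBER['"]\s*\]\s*=(?!=)"""),
    re.compile(r"""os\.putenv\(\s*['"]ATHENA_PROC_NUMBER['"]"""),
]


def find_proc_number_assignments(jo_text):
    """Return the 1-based line numbers where a JO assigns ATHENA_PROC_NUMBER."""
    hits = []
    for lineno, line in enumerate(jo_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # crude: drops everything after the first '#'
        if any(pattern.search(code) for pattern in ASSIGN_PATTERNS):
            hits.append(lineno)
    return hits
```

The CI job would fail whenever the returned list is non-empty and print the offending line numbers.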
See e.g. MR !745 where this had to be corrected, and [this JO](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/421xxx/421006/mc.MGPy8EG_A14NNPDF23_tWgamma_art.py) where it is used in an acceptable way.

Milestone: S2.2020
Assignee: Spyros Argyropoulos

incorrect printout for multi-block DSIDs
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/126
Updated: 2020-12-22T12:49:06+01:00
Author: Christian Gutschow

A minor issue, but I just came across a case in !765 where I submitted two JOs, one for a physics block and one for the validation block and the commit script ended up saying:
```
The following DSIDs have been assigned:
100xxx/100000 -> 950xxx/950098
100xxx/100001 -> 500xxx/500332
Run: ./scripts/commit_new_dsid.sh -d=950098-500332 -m="aMC@NLOPy8 triphoton setups for PMG pub note" to push them to git
```
Note the range being suggested for the `-d` flag :sweat_smile:

Milestone: Future

ERROR Directory <dsid> does not exist
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/127
Updated: 2020-11-17T10:07:49+01:00
Author: Frank Siegert

With the attached [mcjoboptions.tar.gz](/uploads/f37339d16c15c97b60c40ca97f0aeb54/mcjoboptions.tar.gz) I have problems adding a new setup using the commit script. The `--dry-run` works fine, but the `-n` (or identically without `-n`) errors out as follows:
```
[15:29 tauruslogin3: mcjoboptions]$ ./scripts/commit_new_dsid.sh wip/testForSpyros -m='Sherpa 2.2.10 test for Spyros' --dry-run
INFO: will use following remote for pushing: origin
Will use branch: dsid_fsiegert_wiptestForSpyros...
Will create new branch: dsid_fsiegert_wiptestForSpyros
Checking jO consistency and DSID ranges ...
Will move wip/testForSpyros to 700xxx/700119
New DSID directory: wip/testForSpyros ...
OK: log.generate file found.
OK: log.generate file contains no errors
OK: CI job expected to last less than 1h - time estimate: 0.09 hours
Will now add files to git commit
File: wip/testForSpyros/log.generate cannot be added to the commit. Skipping.
Will add: wip/testForSpyros/log.generate.short
Will add: wip/testForSpyros/mc_13TeV.Sh_2210_tttt_muQHT2.GRID.tar.gz
Will add: wip/testForSpyros/mc.Sh_2210_testForSpyros.py
[15:30 tauruslogin3: mcjoboptions]$ ./scripts/commit_new_dsid.sh wip/testForSpyros -m='Sherpa 2.2.10 test for Spyros' -n
INFO: will use following remote for pushing: origin
Will use branch: dsid_fsiegert_wiptestForSpyros...
Will create new branch: dsid_fsiegert_wiptestForSpyros
Checking jO consistency and DSID ranges ...
ERROR: Directory 700119 does not exist
```

Milestone: S2.2020
Assignee: Spyros Argyropoulos

evgen keywords not always being checked
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/128
Updated: 2021-01-14T18:12:46+01:00
Author: Christian Gutschow

Just stumbled across this by accident:
The key words listed in `evgenConfig.keywords` should match the ones in the [official list](https://gitlab.cern.ch/atlas/athena/-/blob/21.6/Generators/EvgenJobTransforms/share/file/evgenkeywords.txt). It turns out that when the transform doesn't find the official list in the JobOptions search path for some reason, it will be unable to check for potential mismatches and hence also not be able to print an error message.
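A logParser-side version of this check could compare the JO keywords against the allowed list directly. The sketch below assumes the list is a plain text file of whitespace-separated keywords and that matching is case-insensitive; both are assumptions here, and `load_allowed` and `non_standard_keywords` are hypothetical names.

```python
def load_allowed(lines):
    """Build the set of allowed keywords from the lines of the keyword file."""
    allowed = set()
    for line in lines:
        allowed.update(word.lower() for word in line.split())
    return allowed


def non_standard_keywords(jo_keywords, allowed):
    """Return the JO keywords that are not in the allowed set, in input order."""
    return [keyword for keyword in jo_keywords if keyword.lower() not in allowed]
```

An empty result means every keyword is known; anything else could be reported as an error even when the transform itself only printed the warning about the missing file.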
If there's an undefined key word, the transform _should_ print:
```
msg = "evgenConfig.keywords contains non-standard keywords: %s. " % ", ".join(evil_keywords)
msg += "Please check the allowed keywords list and fix."
```
but if it cannot find the standard list it just says
```
08:29:01 Py:Gen_tf WARNING Could not find evgenkeywords.txt file EvgenJobTransforms/evgenkeywords.txt in $JOBOPTSEARCHPATH
```
in the log and the CI continues happily, see example log here:
```
/eos/atlas/atlascerngroupdisk/phys-gener/WeakBoson/SingleBoson/log/log.generate
```
Could we get the logParser to perform the check as well?

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Improve handling of madgraph checks
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/129
Updated: 2021-01-01T18:32:48+01:00
Author: Spyros Argyropoulos

Instead of reading the whole file for the madgraphchecks, make use of an appropriate dictionary, where values can be overwritten.
ATLMCPROD-8252

Milestone: S1.2021
Assignee: Spyros Argyropoulos

Support for Centos7 releases
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/130
Updated: 2021-01-10T17:33:29+01:00
Author: Christian Gutschow

Starting with release 21.6.51, the releases are built for Centos7 machines and so we should not be using SLC6 containers in the CI for those anymore (and gridpacks prepared on C7 machines are fine to use for those releases).

Milestone: S1.2021
Assignee: Spyros Argyropoulos

LogParser fails to pickup nevents keyword
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/131
Updated: 2020-12-09T18:06:20+01:00
Author: Christian Gutschow

From @avroy:
```
When trying to check some new JOs that were generated using 21.6.54 and cc7, the logPArser failed with the following error
Traceback (most recent call last):
File "scripts/logParser.py", line 296, in madgraphChecks
neventsMG=int(float(generatorDict['"nevents"'][0]))
IndexError: list index out of range
I think the error is associated with the fact that in the new log file, the keyword is logged as nevents (i.e. without the quotes). You can find the log file in the uploaded zipball in https://its.cern.ch/jira/browse/ATLMCPROD-8926
Please look at JOs/200xxx/200001/log.generate
```

Milestone: S2.2020
Assignee: Spyros Argyropoulos

Harmonisation of printouts in transform and generator interfaces
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/132
Updated: 2021-06-17T13:03:33+02:00
Author: Spyros Argyropoulos

Most of the printouts from the transform have the following format:
```
IDENTIFIER KEYWORD = VALUE
```
e.g.
```
08:29:01 Py:Gen_tf INFO .transform = Gen_tf
```
however this is **very often not the case**. A few examples:
```
08:29:01 Py:Gen_tf INFO nEventsPerJob set to 2000
08:29:01 Py:Gen_tf INFO Requested output events 100
08:29:01 Py:Gen_tf WARNING Could not find evgenkeywords.txt file EvgenJobTransforms/evgenkeywords.txt in $JOBOPTSEARCHPATH
05:14:02 Nb of events : 20000
```
This means that new checks that would otherwise be trivial to implement require changes in several places (e.g. !863) and the introduction of logic which is "hacky".
We should make sure that new printouts always conform to the correct format `IDENTIFIER KEYWORD = VALUE`, both in the **transform** and in the **generator interfaces**, and the above line should be **printed only once in log.generate**
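To illustrate why a single format helps, one regular expression would then be enough to parse every such line. This is a sketch only: the pattern assumes the timestamp, identifier and log level layout seen in the examples in this issue, and `parse_printout` is a hypothetical name.

```python
import re

# HH:MM:SS  IDENTIFIER  LEVEL  KEYWORD = VALUE
PRINTOUT_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<identifier>\S+)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<keyword>\S+)\s*=\s*(?P<value>.+?)\s*$"
)


def parse_printout(line):
    """Return (identifier, keyword, value) for a conforming line, else None."""
    match = PRINTOUT_RE.match(line)
    if match is None:
        return None
    return match.group("identifier"), match.group("keyword"), match.group("value")
```

The conforming example is parsed into its three parts, while lines such as `nEventsPerJob set to 2000` or `Nb of events : 20000` return `None` and would each need their own special case.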
I am not sure what is the best approach here. Perhaps put this in place as a "coding rule" and make everyone aware of this. (Strict checks would probably be more time-consuming to implement than just putting in place coding practices)
@ewelina I just opened this so that we somehow bring it up with the generator experts to make things easier in the future. You probably know best how to address this and maybe can discuss this in a GIT meeting.Futurehttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/133scripts/commit_new_dsid.sh crashes when reading directories2021-01-15T11:20:37+01:00Petr Jackascripts/commit_new_dsid.sh crashes when reading directories```
./scripts/commit_new_dsid.sh Var3c/* -m="Message" --dry-run
```
It fails when it tries to convert the JO directories inside the Var3c directory, with the message:
```
Traceback (most recent call last):
File "scripts/jo_utils.py", line 87, in <module>
_parse(args.DSIDs)
File "scripts/jo_utils.py", line 10, in _parse
dsids = [ int(d) for d in dsids ] # turn strings to integers
File "scripts/jo_utils.py", line 10, in <listcomp>
dsids = [ int(d) for d in dsids ] # turn strings to integers
ValueError: invalid literal for int() with base 10: 'Var3c/py8_yprod_var3cDown'
```
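The traceback shows that every argument is passed straight to `int()`. One possible way to tolerate directory arguments, sketched here under the assumption that numeric DSIDs and directory paths can be handled separately downstream, is to split them before converting. The helper name `split_dsid_args` is hypothetical and does not exist in `jo_utils.py`.

```python
def split_dsid_args(args):
    """Separate purely numeric DSIDs from directory-like arguments.

    Hypothetical helper: a sketch of one possible approach, not the
    actual fix applied in the repository.
    """
    dsids, directories = [], []
    for arg in args:
        token = arg.strip().rstrip("/")
        if token.isdigit():
            dsids.append(int(token))  # e.g. "100000" -> 100000
        else:
            directories.append(token)  # e.g. "Var3c/py8_yprod_var3cDown"
    return dsids, directories


print(split_dsid_args(["100000", "Var3c/py8_yprod_var3cDown"]))
```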
This issue was introduced in this commit: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/commit/2546cd6015fd7a1b95ebfeafa31613c1645e421a
It is still possible to run the script when the directories are renamed to dummy DSID numbers:
`./scripts/commit_new_dsid.sh -d=100000,100001 -m="Adding ttgamma MG+Py8 Var3c variation samples" --dry-run`
I attached a tar file with Var3c directory.
[Var3c.tar.gz](/uploads/2d6676234f4a093b0b06806f8e4e3196/Var3c.tar.gz)S1.2021Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/134Typo in MadSpin config for DSID 500326?2021-01-19T11:05:31+01:00Hongtao YangTypo in MadSpin config for DSID 500326?Hi,
When I checked the config https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/500xxx/500326/MadGraphControl_SM4topsLOInclusive.py#L116, I noticed that this line seems to have a typo:
```
set Nevents_for_max_weigth 75
```
With the current config, the MC production with 21.6.55 will give the following error:
```
generate 01:57:55 Py:MadGraphUtils ERROR Command "generate_events run_01" interrupted with error:
generate 01:57:55 Py:MadGraphUtils ERROR InvalidCmd : Unknown options Nevents_for_max_weigth
```
I think this line should be fixed to
```
set Nevents_for_max_weight 75
```
After it is fixed, the MC production with 21.6.55 can proceed without the above error.
Best regards,
Hongtaohttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/135Sanity check for EVNT-to-EVNT transforms2021-06-17T11:07:17+02:00Christian GutschowSanity check for EVNT-to-EVNT transformsHi,
here's an [example JO](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/950xxx/950096/mc.Sh_2210_Zee_E2Etransform_valid.py) for an EVNT-to-EVNT transform.
This basically clones an input EVNT, but only copies the event if it passes some Athena filter, hence most of the logic being protected by the `if runArgs.trfSubstepName == 'afterburn':` statement.
Now, because it copies the original EVNT, the new EVNT would have the MC channel number (or run number in the HepMC GenEvent) set to the original DSID and not the new DSID (of the E2E transform JO).
This can now be patched using the `postSeq.CountHepMC.CorrectRunNumber = True` flag seen at the bottom. Could we use the CI to catch cases where such a JO is being added, but that tag is missing from the JO?
(In principle, there is a printout in the `log.afterburn` produced by an E2E transform which one could grep for, but the CI doesn't handle jobs without input EVNT files yet.)
Thoughts/ideas?S1.2021Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/136Human-readable tarball sizes2021-01-23T10:18:19+01:00Christian GutschowHuman-readable tarball sizesWould it be possible to convert the units to e.g. MB in the print out here?
```
/eos/user/m/mgignac/mc/mc_13TeV.Sh_2210_Zee_EnhFun_pTV2_valid.GRID.tar.gz size : 137166995 Files above 100MB can't be accepted.
```https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/137Default value of inputFilesPerJob not known to logParser (?)2021-02-22T11:33:28+01:00Christian GutschowDefault value of inputFilesPerJob not known to logParser (?)Hi,
it was observed in ATLMCPROD-8962 that the logParser doesn't seem to know that the default value for `inputFilesPerJob` is 1 for setups that use input LHE/EVNT files, seeing as it's printing:
```
---------------------
Generate transform params:
---------------------
- ecmEnergy = 13000.0
- nEventsPerJob = 20000
- Requested output events = 20000
- transform = Gen_tf
- inputFilesPerJob = 0
- inputGeneratorFile = 100001/mc15_13TeV.100001.CompHepPy8EG_HbbarZlljj600GeV.evgen.TXT.e0000/TXT.100001._000001.tar.gz
- evgenkeywords = not found <- WARNING: Keyword check has not been performed. Please check that the keywords used in the jobOption are in the allowed list of keywords: https://gitlab.cern.ch/atlas/athena/-/blob/21.6/Generators/EvgenJobTransforms/share/file/evgenkeywords.txt
ERROR: 1 input files used while inputFilesPerJob=0
```
See example logs [here](/afs/cern.ch/user/a/aytul/public/JIRA-2021-Request-AllSignalSamples/900111/900123/).https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/138Check for multiple instances of TestHepMC (and TestLHE?)2021-04-01T14:20:41+02:00Christian GutschowCheck for multiple instances of TestHepMC (and TestLHE?)In general, the transform will create an instance of TestHepMC (and in the future also TestLHE) and run some checks as part of the job. For some setups the default thresholds used in these packages may be too strict and occasionally we get JOs that try to loosen them a bit, which is usually fine.
We recently had a case (!1066) where a fresh instance of TestHepMC was created, and the thresholds were tweaked on the new instance but not on the one that the transform had already created, which then caused issues down the line.
Could we catch this sort of thing in the CI? I imagine it would just be a case of checking for a line like
```
genSeq += TestHepMC()
```
and throwing an error?S1.2021Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/139Handle EVNT->EVNT jobs in CI and logParser2021-06-17T13:30:43+02:00Spyros ArgyropoulosHandle EVNT->EVNT jobs in CI and logParserThese jobs produce a `log.afterburn` instead of `log.generate`.
- [x] I would need an example to see how to treat this
- [x] How can we identify that it's an EVNT->EVNT job from the log?
- [x] Do we need to modify the Gen_tf command?
- [x] Test with `700267`S1.2021Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/140Strange behaviour of commit script when athena is skipped and run time is > 1h2021-05-06T11:03:19+02:00Spyros ArgyropoulosStrange behaviour of commit script when athena is skipped and run time is > 1hFor example in a case with 1 event, CPU=3.09h, the output is the following:
![Screenshot_2021-05-06_at_09.46.11](/uploads/63cb47b36de5f75bb8e9e6a275602971/Screenshot_2021-05-06_at_09.46.11.png)
which is correct, but when skipping athena:
![Screenshot_2021-05-06_at_09.45.42](/uploads/c01022a7990d31535fb7cca7aa2e6a4c/Screenshot_2021-05-06_at_09.45.42.png)
the
```
printGood -f "\tOK: CI job time estimate: $cpu hours, but athena will not run in the CI"
```
message is not printed because the script never reaches that point.S1.2021Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/141check_unique_controlFile.sh fails when it shouldn't?2021-05-13T09:00:06+02:00Jeff Shahiniancheck_unique_controlFile.sh fails when it shouldn't?[check_unique_controlFile.sh](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/check_unique_controlFile.sh) is apparently a new part of the CI. I noticed that it fails even when given symlinks. For example, when uploading JOs (with symlinks to one control file) that look like this:
```
$ ls -a *
100001:
myJO_1.py
myControlFile.py
100002:
myJO_2.py
myControlFile.py -> ../100001/myControlFile.py
```
The CI job fails and recommends that you use symlinks (even if you already are):
```
ERROR: Duplicate file(s) found:
./100xxx/100001/myControlFile.py
If the files have exactly the same content, please only keep one physical file replacing the rest with symbolic links.
If the files have differences consider renaming the files that you added.
You can check for differences with diff -w file1 file2
```
Perhaps we need to add ```-type f``` to [this line](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/check_unique_controlFile.sh#L23) as well?
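The intent of restricting the search to regular files can be sketched in Python as follows. This is only an illustration of the idea behind `-type f` (which, without `-L`, does not match symbolic links); the function name `physical_duplicates` is hypothetical and the actual check lives in the shell script linked above.

```python
import os
from collections import defaultdict


def physical_duplicates(root):
    """Map each file name to its physical copies, ignoring symbolic links.

    Only names that occur as more than one regular file are returned,
    so a control file plus symlinks pointing at it is not reported.
    """
    by_name = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are the recommended layout, skip them
            by_name[name].append(path)
    return {name: sorted(paths) for name, paths in by_name.items() if len(paths) > 1}
```

With the directory layout shown above (one physical `myControlFile.py` and one symlink to it), this sketch returns an empty dictionary.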
Here's an example of a failing CI job:
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/13818890
Tagging @sargyrop
Best,
JeffS1.2021Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/142Follow-up from "LO EFT samples for 4top"2021-05-18T11:32:21+02:00Spyros ArgyropoulosFollow-up from "LO EFT samples for 4top"The following discussions from !1161 should be addressed:
> This shouldn't be part of a JobOption. The first part was fixed properly in 21.6.60 and the second part is obviously gonna cause problems. `ATHENA_PROC_NUMBER` is set to 8 because the machine has 8 cores, it shouldn't be set to 80 in the JOs.
Should we add the following checks/changes:
- if ATHENA_PROC_NUMBER > 1 and release < 21.6.60 => ERROR
- if ATHENA_PROC_NUMBER > 1 => run only 1 event in CI
- change the way we check whether the jO changes ATHENA_PROC_NUMBER - this would only be safe to catch in the transform btw, but until it is implemented there we could change the check to not use anywhere ATHENA_PROC_NUMBER (not even printing it), so e.g. look in the jO and if there is an uncommented line with "ATHENA_PROC_NUMBER" in it then give error
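The last check in the list could look roughly like the sketch below. The function name `mentions_uncommented` is hypothetical, and the comment stripping is deliberately naive (it cuts at the first `#`, even inside a string literal), so treat it as an illustration rather than a proposed implementation.

```python
def mentions_uncommented(jo_text, token="ATHENA_PROC_NUMBER"):
    """Return True if `token` appears on any line outside a '#' comment."""
    for line in jo_text.splitlines():
        code = line.split("#", 1)[0]  # naive: also cuts at '#' inside strings
        if token in code:
            return True
    return False


jo_with_override = "import os\nos.environ['ATHENA_PROC_NUMBER'] = '80'\n"
jo_with_comment = "# ATHENA_PROC_NUMBER is set by the system\nnevents = 100\n"
print(mentions_uncommented(jo_with_override))
print(mentions_uncommented(jo_with_comment))
```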
@cgutschoS1.2021Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/143Follow-up from ATLMCPROD-93322021-06-07T14:15:29+02:00Spyros ArgyropoulosFollow-up from ATLMCPROD-9332The following discussion from !1198 should be addressed:
- [ ] @cgutscho started a [discussion](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1198#note_4542318): (+5 comments)
> Hi @sargyrop - this might be a long shot, but do you think this is something we can catch in the CI? e.g. if there's a variable in MadGraph JOs that has `gridpack` or `grid_pack` in the name and it's still set to `True`, we put out a warning ... ?
>
> Cheers,
> Chris
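
The check suggested in the quoted message could be sketched as a simple scan of the JO text. The pattern and the name `gridpack_flags_set` are assumptions for illustration; real JOs may set such flags in other ways (dictionary entries, function arguments) that this sketch would miss.

```python
import re

# Matches assignments such as "gridpack_mode = True" or "save_grid_pack=True".
# The variable name is matched case-insensitively, the value "True" is not.
GRIDPACK_TRUE_RE = re.compile(
    r"^[ \t]*(?P<name>\w*(?i:grid_?pack)\w*)[ \t]*=[ \t]*True\b",
    re.MULTILINE,
)


def gridpack_flags_set(jo_text):
    """Return the names of gridpack-like variables still set to True."""
    return [m.group("name") for m in GRIDPACK_TRUE_RE.finditer(jo_text)]


print(gridpack_flags_set("gridpack_mode = True\nnevents = 100\n"))
```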
Make logParser fail if the info in Chris's message below appearsS1.2021https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/144Pipelines failing when only links are included?2021-06-21T16:50:31+02:00Spyros ArgyropoulosPipelines failing when only links are included?The following discussion from !1225 should be addressed:
- [ ] @jshahini started a [discussion](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1225#note_4588898): (+1 comment)
> Hi @cgutscho
>
> Indeed it is a duplicate, but this is by design in order to clear the CI. To give some context, these JOs are for a SUSY grid expansion.
>
> I originally tried to upload everything using only symlinks to that control file, but the CI pipelines were failing, claiming that the jobs couldn't find ```MadGraphControl_SimplifiedModel_GG_directRPVLQD.py```
>
> So I duplicated the control file you pointed to and included it in this MR so that the pipelines would succeed. After the MR gets accepted, I was going to make another one where I change all the control files to be symlinks to ```/502xxx/502416/MadGraphControl_SimplifiedModel_GG_directRPVLQD.py```. That way, there would be no duplicated control files floating around.
>
> I realize this is remarkably convoluted, so I'm more than happy to hear other ideas about preparing the JOs for grid expansions in R21.
>
> Cheers,
> Jeff
Failed pipeline: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/pipelines/2741834
![Screenshot_2021-06-21_at_14.50.38](/uploads/1b1ebf50941d6c15803a23b2ad2bcd32/Screenshot_2021-06-21_at_14.50.38.png)S1.2021Spyros ArgyropoulosSpyros Argyropoulos2021-06-27https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/145Powheg-specific check needs adjusting2021-07-22T12:24:42+02:00Christian GutschowPowheg-specific check needs adjustingSeen in !1257: If the process is bb4l, i.e. it includes `PowhegControl/PowhegControl_bblvlv_Common.py`, then it won't need to include the Main31 include as well, which is what the logParser is currently checking.
cc @jkretzhttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/147Support for 7-digit DSIDs2022-04-01T15:44:37+02:00Christian GutschowSupport for 7-digit DSIDsDraft proposal discussed in [GIT meeting](https://indico.cern.ch/event/1061701/#23-support-for-7-digit-dsids-i):
* Stick to current repo, look into sparse checkouts in case response time becomes unsatisfactory
* keep 6-digit structure as is, group 7-digits in millions, then in thousands
* Retain grouping by generator (?) for 7-digits (`15xxxxx` Madgraph, `16xxxxx` Powheg etc.)
* can use `10xxxxx` - `14xxxxx` for large BSM scans for instance
* keep using 6-digits for now, but introduce simple mechanism to disable 6-digit booking altogether at some pointChristian GutschowChristian Gutschowhttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/148use unweighted filter efficiency to calculate number of required input events2021-08-02T12:04:37+02:00Jan Kretzschmaruse unweighted filter efficiency to calculate number of required input eventsIn !1289 I had the issue that the commit script appears to use the "weighted filter efficiency" to compute the number of needed input events, as opposed to the "unweighted" one. This is not correct: take the example (attached) of highly-weighted input events, where the filter preferentially removes high-weight events, so that we get
Filter Efficiency = 0.570255 [10000 / 17536]
Weighted Filter Efficiency = 0.014686 [26912615882800.000000 / 1832511495909600.000000]
The relevant number to see if the job runs is the unweighted number, as this really tells us how many events need to pass.
[log.generate.gz](/uploads/b16f899af8781a4312d63c03ce12cf79/log.generate.gz)
Maybe a separate issue: it appears there is a blanket 10% safety margin applied. Note that, while I have not calculated the right number exactly, this can be both too little and too much, and a safety margin of 4*sqrt[target output events]/[filter eff] would probably be better (this would be ~4sigma)
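As a rough sketch, the proposed margin can be written out as below. It simply implements the formula as stated, using the unweighted filter efficiency; the function names are hypothetical and this is not the code used by the commit script.

```python
import math


def relative_margin(n_output, n_sigma=4.0):
    """Safety margin relative to the nominal input: n_sigma / sqrt(n_output)."""
    return n_sigma / math.sqrt(n_output)


def required_input_events(n_output, filter_eff, n_sigma=4.0):
    """Nominal input (n_output / eff) plus n_sigma * sqrt(n_output) / eff."""
    if not 0.0 < filter_eff <= 1.0:
        raise ValueError("filter efficiency must be in (0, 1]")
    nominal = n_output / filter_eff
    margin = n_sigma * math.sqrt(n_output) / filter_eff
    return math.ceil(nominal + margin)


# With no filter, 10000 output events need 400 extra input events (4%).
print(required_input_events(10000, 1.0))
```

The relative margin shrinks as 1/sqrt(N), which is why a flat 10% is generous for large jobs and possibly too small for jobs with few output events.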
Example 1: number of output events is 10000, so "4sigma" would be ~400 events, or just 4%
Example 2: number of output events is 50, so "4sigma" would be ~14 events, or 14%Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/149NLHE check not working well in CI2022-03-12T09:33:06+01:00Christian GutschowNLHE check not working well in CIAs seen in !1304.
Maybe this check could be based on the extrapolated number of events?
> For a sufficiently large `nEventsPerJob` the 10% is more conservative than 4 sigma, so no harm done, albeit a bit more inefficient. For low `nEventsPerJob` 4sigma is more conservative and people would probably anyway notice this in their local runs and choose a larger margin (e.g. factors of 25 rather than 1.1). So arguably the 4sigma thing is probably not needed in practice, provided people actually test their setups locally...
Maybe @jkretz should indicate what his preference is for the long run, but at the moment this new rule is causing the CI to crash frequently for no good reason, so we need to patch somehow.
Another option is to remove the 4sigma rule and go back to the previous check with 10% extra events, or that we have the 4sigma rule only for jobs that have a small number of requested output events (although it's actually impossible to know if people just test with a low number of events)https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/150Make CI job that sends email to conveners2021-09-30T13:43:46+02:00Spyros ArgyropoulosMake CI job that sends email to conveners- when commit message contains [skip modfiles]
- also when files are actually modified ? (we probably want this as well - some people add skip modfiles when there's no reason to)Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/151Support for multiple tarballs with different COM energy2021-10-12T15:31:16+02:00Christian GutschowSupport for multiple tarballs with different COM energyThis should not have crashed:
Job [#16844556](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/16844556) failed for 9efbabb68119e3a4a2bf19113c4d8f3ffeb2b9e9:
It used to be working, so not sure what's changed?Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/152Skip the check for 10% extra LHE events in case of LHE-only jobs2021-11-10T11:51:30+01:00Jan KretzschmarSkip the check for 10% extra LHE events in case of LHE-only jobsHi,
In preparing some LHE-only jobs with MG (e.g. https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1462) I hit the issue that this check https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/logParser.py#L307-316 demands "10% more" LHE events. Obviously, in an LHE-only job this can never be passed, as the number of events requested is the number of events in the LHE file. Can this check be disabled? I guess the condition would be sth like "--outputTXTFile" present and "--outputEVNTFile" absent.
Looking at this again, I wonder if skipping this check in case of externally supplied LHE files actually makes sense as stated in the comment `# This check only makes sense if no external LHE inputs are used` - you'd normally want this to be checked also in this case?
Thanks, Jan
PS: for the above MR I circumvented the issue by hacking logParser locally to remove these lines(!)Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/153Do not throw warning from dummy CI job if a jO has not been committed2021-12-08T11:47:38+01:00Spyros ArgyropoulosDo not throw warning from dummy CI job if a jO has not been committedSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/154CI cannot deal with integration grids for several ECM being present2021-11-16T21:10:54+01:00Jan KretzschmarCI cannot deal with integration grids for several ECM being presentI'm trying to commit new jO that come with integration grids. https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1485
As we want to run it at different sqrt(s) values, we need several of those. The Gen_tf.py transform works this out correctly.
However, the CI bails on trying to copy the GRID file, because it expects just a single file https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/run_athena.sh#L136
Probably the best solution would be to copy the right file for the sqrt(s) value in question, which would be something like `mc_${sqrts}TeV.*.GRID.tar.gz` instead of `*.GRID.tar.gz`.
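A minimal sketch of that selection is shown below, assuming the centre-of-mass energy is available in GeV (as in the `ecmEnergy` printout) and that non-integer TeV values are written with `p` in place of the decimal point. The helper names are hypothetical and this is not how `run_athena.sh` is implemented.

```python
import fnmatch


def sqrts_label(ecm_gev):
    """Convert an energy in GeV to the file-name label, e.g. 13600 -> '13p6'."""
    whole, frac = divmod(int(round(ecm_gev)), 1000)
    if frac == 0:
        return str(whole)
    # Zero-pad to three digits, then drop trailing zeros: 160 -> '16', 20 -> '02'
    return "{}p{}".format(whole, str(frac).zfill(3).rstrip("0"))


def select_grid_files(filenames, ecm_gev):
    """Keep only the GRID tarballs matching the requested energy."""
    pattern = "mc_{}TeV.*.GRID.tar.gz".format(sqrts_label(ecm_gev))
    return fnmatch.filter(filenames, pattern)


files = ["mc_13TeV.Sh_Zee.GRID.tar.gz", "mc_13p6TeV.Sh_Zee.GRID.tar.gz"]
print(select_grid_files(files, 13600))
```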
And ${sqrts} should be either an integer like 5,7,8,13,14 or 8p16 or 13p6 for special values.Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/155MR not being linked on JIRAs2021-12-03T14:57:38+01:00Matthew GignacMR not being linked on JIRAsIn some recent requests, it was noticed that the MRs are not being linked on JIRA. For example see: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1510Spyros ArgyropoulosSpyros Argyropoulos2021-12-05https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/156commit_new_dsid.sh replace correct symbolic links2022-02-20T19:39:51+01:00Yiming Abulaiticommit_new_dsid.sh replace correct symbolic linksDear,
The commit_new_dsid.sh script replaces the DSID path in a symbolic link even if the symbolic link is correct.
For example, my symbolic link is "../../510xxx/510250/file". When the script copies 100xx/* to the DSID directory, "../../510xxx/510250/" is replaced by the first new DSID (for example "../../511xxx/511424/").
Is it possible to let commit_new_dsid.sh keep the symbolic links when it is in 500xx-999xx range?
Cheers,
AbletSpyros ArgyropoulosSpyros Argyropoulos2022-02-20https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/157logParser.py didn't grep inputGeneratorFile2022-03-02T09:41:23+01:00Yiming AbulaitilogParser.py didn't grep inputGeneratorFileHi
---- This is for LHE files as inputs -----
Below is the content of log.generate.short; this log file was generated with 21.6.75
---------
- estimated CPU for CI job = 0.00 hrs
- using release = AthGeneration-21.6.75
- ecmEnergy = 13000.0
- inputGeneratorFile = 09:20:14 Py:Gen_tf INFO inputGeneratorFile = TXT.440365._000001.tar.gz
- randomSeed = 1234
- EVNT to EVNT = False
- LHEonly = False
---------------
The inputGeneratorFile field is messed up here.
But the logParser.py works fine with old releases like 21.6.56. The reason is that the athena printout changed in the new release (this is what I observed):
Print out from 21.6.56:
--> 16:06:18 Py:Gen_tf INFO inputGeneratorFile used TXT.440329._000001.tar.gz
Print out from 21.6.75:
--> 09:09:41 Py:Gen_tf INFO inputGeneratorFile = TXT.440363._000001.tar.gz
You can see the change from "used" to "=".
The line [L159](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/logParser.py#L159) has to check both "used" and "=" to accommodate changes made in athena.
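A pattern that accepts both forms could look like the sketch below. The regular expression and the name `input_generator_file` are assumptions for illustration and are not copied from `logParser.py`.

```python
import re

# Accept both the old ("used") and the new ("=") separator.
INPUT_FILE_RE = re.compile(r"inputGeneratorFile\s+(?:used|=)\s+(?P<name>\S+)")


def input_generator_file(line):
    """Return the input file name from a Gen_tf log line, or None."""
    match = INPUT_FILE_RE.search(line)
    return match.group("name") if match else None


old = "16:06:18 Py:Gen_tf INFO inputGeneratorFile used TXT.440329._000001.tar.gz"
new = "09:09:41 Py:Gen_tf INFO inputGeneratorFile = TXT.440363._000001.tar.gz"
print(input_generator_file(old))
print(input_generator_file(new))
```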
Cheers,
AbletSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/158Handle purple list in logParser2022-05-05T16:15:51+02:00Spyros ArgyropoulosHandle purple list in logParserTBD if this is something that we can catch and report e.g. in the parser
See !1673Spyros ArgyropoulosSpyros Argyropoulos2022-04-01https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/159Allow run_athena to run 13p6 TeV2022-05-10T11:04:58+02:00Spyros ArgyropoulosAllow run_athena to run 13p6 TeVCurrently only `mc15_13TeV` and `mc16_13TeV` external LHE files are handled in run_athena. Should be extended to `mc16_13p6TeV` and `mc21_13p6TeV`.
Check if we can automate this so that we don't need to hard-code things as done now.Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/160check_unique_controlFile too loose2022-04-10T13:51:04+02:00Spyros Argyropouloscheck_unique_controlFile too looseI notice in !1694 that due to mistakes in previous MRs there are spurious errors, check https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/20185882
This is because https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/507xxx/507035/MadGraphControl_Py8EG_2HDMa_monoH_common.py is a file rather than a link.
This means that all the future MRs that will point to this control file (and it's almost certain we'll have more) will all have the same spurious errors.
In fact this is going to happen for all files that should be links and weren't so the situation is going to become worse with time. I would therefore propose to make the test stricter, i.e. we should not allow two physical files with the same name and the same content. If the content is different then the name should be adjusted. And we should not allow the CI job to fail.
Is there any objection @jkretz @mgignac @cgutscho ?Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/161logParser expects randomSeed arg even for E2E jobs2022-03-25T14:51:01+01:00Jan KretzschmarlogParser expects randomSeed arg even for E2E jobsHi,
in https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1725 I think I 'discovered' an issue: the logParser requires the argument 'randomSeed' to be present also for EVNT-to-EVNT (afterburner filtering) jobs. The problem is that even if you give the argument, it is not printed to the log file. Not sure if this is an issue of the transform, but in any case these jobs are not really supposed to need random numbers in most cases.
Thanks, JanSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/162Job Failed #208263692022-04-04T13:40:29+02:00Christian GutschowJob Failed #20826369Job [#20826369](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/20826369) failed for 06fa2b7f0bbd6b1dd30f9dd4e073b3b2206ff634:https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/163ecmEnergy in logParser fixed to 13000 GeV?2022-04-06T11:25:01+02:00Jan KretzschmarecmEnergy in logParser fixed to 13000 GeV?I have the impression that logParser fixes the ecmEnergy to 13000 GeV in the log.generate.short irrespective what the given log file gives
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/logParser.py#L471
This impacted https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1748
(I can fix that MR myself now)https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/164Some improvements for commit script2022-04-10T13:43:26+02:00Spyros ArgyropoulosSome improvements for commit script- The removal of the `.sh` extension means that 1) the OS thinks it's an executable and will try to run it instead of opening it with an editor 2) the editor doesn't recognise that it's python so there is no highlighting etc. Generally I see no reason for having removed the extension and I think we should put it back (also same for other scripts where extensions have been removed)
- [L164](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/commit_new_dsid#L164) seems to be a leftover debug statement. I don't see the reason for a long dump of the DSIDs that are going to be processed; the printout statements that follow show this more clearly
![Screenshot_2022-04-10_at_11.12.20](/uploads/eeabafaa609b285bd55e375886126cd8/Screenshot_2022-04-10_at_11.12.20.png)Christian GutschowChristian Gutschowhttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/165Decoding issue in commit script / logParser ?2022-04-10T17:20:38+02:00Spyros ArgyropoulosDecoding issue in commit script / logParser ?![Screenshot_2022-04-10_at_11.44.33](/uploads/8e97d70a5350e335d2005d379cc8d393/Screenshot_2022-04-10_at_11.44.33.png)
appears with python 3.6.8 (same as on lxplus) - content at /afs/cern.ch/user/a/aivina/public/JO_TEST_DIR
Not reproducible on lxplus or locally
@aivinaSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/166Add checks for python2/3 compatibility of jO2022-04-21T17:21:24+02:00Spyros ArgyropoulosAdd checks for python2/3 compatibility of jOan example DSID is 830099: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/830xxx/830099/mc.H7EG_jetjet_72_Cluster_JZ1.py
R21: 21.6.85
R22: You can try 22.6.13 (later releases have issues with EvtGen_i — should be fixed soon).Spyros ArgyropoulosSpyros Argyropoulos2022-04-25https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/167Fix logParser bug when running in CI2022-04-20T22:28:20+02:00Spyros ArgyropoulosFix logParser bug when running in CI![Screenshot_2022-04-20_at_21.08.27](/uploads/ed6760a1cbe977212c0904faf484fd7a/Screenshot_2022-04-20_at_21.08.27.png)![Screenshot_2022-04-20_at_21.08.27](/uploads/ed6760a1cbe977212c0904faf484fd7a/Screenshot_2022-04-20_at_21.08.27.png)Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/168logParser unsupported locale setting2022-04-21T20:53:54+02:00Spyros ArgyropouloslogParser unsupported locale settingFull error message:
INFO: New DSID directory: 100xxx/100001 ...
OK: log.generate file found.
Traceback (most recent call last):
File "scripts/logParser.py", line 8, in <module>
locale.setlocale(locale.LC_CTYPE, f'{lang}.UTF-8')
File "/usr/lib64/python3.6/locale.py", line 598, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
ERROR: logParser run failed.
Need output of
- locale
- locale -a
- env
- which machine you are running on
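A possible fallback, sketched under the assumption that the failure comes from the requested locale not being installed on the machine. The function name and the list of candidate locales are hypothetical and not taken from `logParser.py`:

```python
import locale


def set_ctype_with_fallback(candidates=("en_US.UTF-8", "C.UTF-8", "C")):
    """Try each candidate locale for LC_CTYPE and return the first one accepted.

    The candidate names are assumptions; "C" is kept last because it is
    available on every system. Returns None if nothing was accepted.
    """
    for name in candidates:
        try:
            return locale.setlocale(locale.LC_CTYPE, name)
        except locale.Error:
            # This locale is not installed here, try the next candidate.
            continue
    return None
```

With this approach an unsupported locale degrades to a plainer one instead of aborting the script at import time; whether that is acceptable depends on what the UTF-8 setting is needed for.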
@yanlin @nishuSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/169CI job to check if EVNTs have been obsoleted to run for modified jO2022-11-16T14:32:57+01:00Spyros ArgyropoulosCI job to check if EVNTs have been obsoleted to run for modified jO- if a jO is modified
- check if EVNT containers exist & have > 0 associated files, if so throw an error
e.g. https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1795Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/170New commit script failing for directories with arbitrary names2022-05-06T12:59:13+02:00Spyros ArgyropoulosNew commit script failing for directories with arbitrary names![Screenshot_2022-05-06_at_11.16.03](/uploads/a87f3ca9c98c1d19745c55cc15380114/Screenshot_2022-05-06_at_11.16.03.png)
This is because of:
```
newDSID += parseDSIDList(args.DSID)
```
seems to be adding an empty item in the listChristian GutschowChristian Gutschowhttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/171CI download to check for grid input LHEs with mc21_13p6TeV scope2022-05-10T07:28:54+02:00Jan KretzschmarCI download to check for grid input LHEs with mc21_13p6TeV scopeIn https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1824 I hit the problem, that Grid input LHEs are always trying to be downloaded with mc15_13TeV or mc16_13TeV scope, I think the origin of this is here:
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/run_athena.sh#L115-138
First, it would probably be preferable to derive the CME string from the CME value rather than fixing it to 13 TeV.
Second, we now also need to allow mc21.
As the shortest patch, we could just keep extending the list of allowed scopes, with mc21_13p6TeV being the obvious one to add immediately.
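A minimal sketch of deriving the CME string from the CME value, written in Python for illustration only (`run_athena.sh` is a shell script). The function names are hypothetical, the campaign prefix still has to come from elsewhere, and only TeV-scale energies are handled:

```python
def cme_tag(ecm_gev):
    """Format a centre-of-mass energy in GeV as a scope-style tag.

    13000 becomes '13TeV' and 13600 becomes '13p6TeV'.
    """
    tev = ecm_gev / 1000.0
    text = f"{tev:g}"  # drops a trailing '.0', so 13.0 is printed as '13'
    return text.replace(".", "p") + "TeV"


def scope_for(campaign, ecm_gev):
    """Combine a campaign prefix such as 'mc21' with the energy tag."""
    return f"{campaign}_{cme_tag(ecm_gev)}"
```

For example, `scope_for("mc21", 13600)` gives `mc21_13p6TeV` and `scope_for("mc16", 13000)` gives `mc16_13TeV`.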
Many thanks, Janhttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/172EVNTtoEVNT and ECM: fails to download input EVNT in CI2022-05-17T16:03:17+02:00Matthew GignacEVNTtoEVNT and ECM: fails to download input EVNT in CIHi,
A few of the CI jobs fail for some E2E jOs in mc21: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/21697381
Looks like the $ECMENERGY environment variable is not correctly set. In this type of transform, the ECM is not required and not written into the log, which is probably the underlying issue. Any ideas on how to detect this?
As an aside, I noticed that the log.generate.short is claiming ecmEnergy of 13000 GeV, though the transform was run with 13600 GeV (not that it matters...). I guess the 13000 is a default somewhere, which may be causing some issues?
Cheers,
MatthewSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/173logParser crash when using gridpacks2022-05-17T10:36:18+02:00Dominic HirschbuehllogParser crash when using gridpacksWhen running the logParser for a log.generate which used a gridpack, it crashes with:
"/afs/cern.ch/work/d/dhirsch/stop13TeV/mcjoboptions/scripts/logParser.py", line 368, in powhegChecks
if not glob.glob(f"{os.path.dirname(logFile)}/mc_*TeV.*.GRID.tar.gz"):
File "/cvmfs/atlas.cern.ch/repo/sw/software/22.6/sw/lcg/releases/LCG_101_ATLAS_18/Python/3.9.6/x86_64-centos7-gcc11-opt/lib/python3.9/posixpath.py", line 152, in dirname
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not list
The problem is that the function
powhegChecks(logFile)
expects a filename, but gets the file content.
A similar problem should occur for MadGraph: the function is defined correctly
with `def madgraphChecks(logContent)`, but the variable `logFile` used to check the gridpack
is not defined anywhere.
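One way to avoid the mix-up, as a sketch only: keep the gridpack lookup in a helper that takes the log file path explicitly, so the check functions can go on receiving the log content. The helper name `has_gridpack` is hypothetical; the glob pattern is the one from the traceback above.

```python
import glob
import os


def has_gridpack(log_path):
    """Return True if a gridpack tarball sits next to the given log file.

    Raises TypeError early, with a clear message, if the log content
    (a list of lines) is passed instead of the file name.
    """
    if not isinstance(log_path, (str, bytes, os.PathLike)):
        raise TypeError("has_gridpack expects the log file path, not its content")
    log_dir = os.path.dirname(os.path.abspath(log_path))
    return bool(glob.glob(os.path.join(log_dir, "mc_*TeV.*.GRID.tar.gz")))
```

Both `powhegChecks` and `madgraphChecks` could then be given the result of this helper (or the path itself) as a separate argument.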
Cheers
DominicSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/174Arithmetic expression error in check_logParser.sh2022-05-17T16:23:22+02:00Spyros ArgyropoulosArithmetic expression error in check_logParser.shSee https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/21845100See https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/21845100https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/175Follow-up from "Allow to run mc21_13p6 E2E jobs in CI"2022-05-17T11:08:25+02:00Spyros ArgyropoulosFollow-up from "Allow to run mc21_13p6 E2E jobs in CI"Use grep to extract scope of EVNT files to download in E2E jobs instead of hardcoding
The following discussion from !1831 should be addressed:
- [ ] @cgutscho started a [discussion](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1831#note_5560365): (+8 comments)
> The energy might not be in the log, but it is in the file, no? If we grep the log for the release and set it up we could run `checkMetaSG.py` on the file, e.g.
> ```
> checkMetaSG EVNT.24960681._002005.pool.root.1 | grep beam_energy
> beam_energy | [6500000.0]
> | beam_energy | int | 6500000
> ```
> which is a bit more faff but should work "in general"?https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/176Job Failed #220519692022-05-26T00:34:53+02:00Jan KretzschmarJob Failed #22051969Job [#22051969](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/22051969) failed for 6964c28821e9ec0e33f4548a73191e02293e6c5b:
Hi @sargyrop , this pipeline appears to throw a couple of warnings and errors eventually. This is using a pretty old SL6 release, so maybe it cannot be fixed...? Maybe you can just have a quick look and let us know.
thanks, JanSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/177Allow Pythia8_ShowerWeights to be used with Pythia >= 8.3072022-06-06T07:01:42+02:00Spyros ArgyropoulosAllow Pythia8_ShowerWeights to be used with Pythia >= 8.307See AGENE-1478See AGENE-1478Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/178Catch users trying to include files in private paths2022-06-27T20:18:19+02:00Matthew GignacCatch users trying to include files in private pathsI doubt this happens again in the future, so probably pretty low priority, but it would be nice to introduce some checks into the CI to ensure users aren't doing something crazy, like trying to import from user defined afs paths...
See: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/1885
Cheers,
Matthewhttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/179run_athena.sh: ERROR: could not find scope2022-06-24T16:59:38+02:00Yiming Abulaitirun_athena.sh: ERROR: could not find scopeHi,
When I try to commit new JOs which use LHE files as input, I get `ERROR: could not find scope` from the run_athena.sh script.
I tried different formats in the log.generate.short file:
inputGeneratorFile = mc15_13TeV:TXT.16395453._000001.tar.gz.1
or
inputGeneratorFile = TXT.16395453._000001.tar.gz.1
or
inputGeneratorFile = TXT.16395453._000001.events
None of these worked. run_athena.sh searches for `*"mc15"*` in the inputGeneratorFile file name, but if I add "mc15_13TeV:" to the inputGeneratorFile then it later complains that it cannot download "mc15_13TeV:mc15_13TeV:<files>" because of the two scopes.
In the past, "inputGeneratorFile = TXT.16395453._000001.tar.gz.1" worked fine, but now I am not sure how to pass the LHE file names.
The original JIRA request: https://its.cern.ch/jira/browse/ATLMCPROD-10013
one of the failed pipeline: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/22771311
my commit branch: dsid_yabulait_601385
Please let me know if I missed something or if there is a potential bug in the run_athena.sh.
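To illustrate the double-scope problem described above, here is a small Python sketch (not how `run_athena.sh` actually does it) that accepts the file name with or without a scope prefix and never adds the scope twice. The function names and the idea of passing a default scope are assumptions:

```python
def split_scope(entry, default_scope=None):
    """Split 'scope:name' into (scope, name); a bare name gets default_scope."""
    scope, sep, name = entry.partition(":")
    if not sep:
        # No colon in the entry, so it is a bare file name.
        return default_scope, entry
    return scope, name


def download_id(entry, default_scope):
    """Build a single 'scope:name' identifier, whichever form was given."""
    scope, name = split_scope(entry, default_scope)
    return f"{scope}:{name}"
```

Both `mc15_13TeV:TXT.16395453._000001.tar.gz.1` and `TXT.16395453._000001.tar.gz.1` then map to the same identifier when the default scope is `mc15_13TeV`.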
Cheers,
AbletSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/180Bug in check for non-reachable files2022-06-29T10:45:47+02:00Spyros ArgyropoulosBug in check for non-reachable filesATLMCPROD-10024
```
grep include mc.Ph_PDF4LHC21_WpH125J_Wincl_MINLO_LHE.py
include("PowhegControl/PowhegControl_HWj_Common.py")
include("Pythia8_i/Pythia8_A14_NNPDF23LO_EvtGen_Common.py")
include("Pythia8_i/Pythia8_Powheg_Main31.py")
include("Pythia8_SMHiggs125_inc.py")
```
This appears as a bug because `PowhegControl` and `Pythia8_i` are known to `Gen_tf` but are obviously not present in the DSID directory. To do this properly one would actually need to run `Gen_tf` (where `Gen_tf` looks for the jO is based on what is in the cmake file which might change).
So basically all tests should be removed. Perhaps one which can stay is to check if there is any include pointing to `afs` but this only happened once in 1500 MRs, so I would prefer to completely remove this check.
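If the `afs` check were kept, it could be as small as the following sketch. The function name, the regular expression and the list of forbidden prefixes are assumptions made for illustration, not the existing CI code:

```python
import re

# Matches include("...") or include('...') and captures the target path.
INCLUDE_RE = re.compile(r"""include\(\s*["']([^"']+)["']\s*\)""")


def private_includes(jo_text, forbidden_prefixes=("/afs/",)):
    """Return the include targets in jo_text that start with a forbidden prefix."""
    targets = INCLUDE_RE.findall(jo_text)
    return [t for t in targets if t.startswith(forbidden_prefixes)]
```

Includes such as `PowhegControl/PowhegControl_HWj_Common.py` pass untouched, since the check only looks at the prefix and never tries to resolve the file.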
@mgignac any objection?Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/181Job Failed #230158082022-07-05T15:02:11+02:00Yanlin LiuJob Failed #23015808Dear experts,
I got an error for the pipeline:
Job [#23015808](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/23015808) failed for ad8e70db868cd7e4408ea26a93d08be00e2bc4a7:
Actually I used the softlink to the control file previously registered in 502xxx/502547/MGCtrl_Py8EG_NNPDF30nlo_Leptophilicmutau_2muZp_4mu_4pt2.py. I'm wondering what is the cause here.
Thank you!
Best,
YanlinSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/182Job Failed #230315882022-07-05T20:49:55+02:00Yanlin LiuJob Failed #23031588Job [#23031588](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/23031588) failed for b97c45079ee2667cd494db53c8f56fcaee597831:
Hi @sargyrop ,
Can I bother you with another question related to this error? It seems that the event generation step of the pipeline succeeded, but the log parser step failed with "Number of input LHE events: 174 <-- Needs to be higher than 233". I wonder what the motivation for the threshold of 233 events is.
Tagging the requester @dai .
Thank you!
Best,
Yanlinhttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/183Automatic creation of MR2022-08-23T08:37:22+02:00Spyros ArgyropoulosAutomatic creation of MR- [x] Create script that adds checks and creates MR
- [x] Approve MR
- [x] Merge open MR
- [x] Incorporate the script into CI
- [x] Add relevant parts in commit scriptSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/184Job Failed #240867952022-08-22T12:01:09+02:00Yanlin LiuJob Failed #24086795Hi @aivina, there seems an error for the pipeline after you merged the JOs. I'm not sure what caused the issue. Do you have any idea?
Also tagging @sargyrop in case he has any insights on this.
Job [#24086795](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/24086795) failed for e886f2393fba727fbd83c8c44eb0db19f88926cc:https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/185Rebasing branching for automatically created MR when merging2024-02-18T07:24:20+01:00Spyros ArgyropoulosRebasing branching for automatically created MR when mergingWhen trying to merge automatically created MR we can get
```
./scripts/merge_request_api.sh -m
MR: 1993 - approvals left: 0
Merging 1993
{"message":"Branch cannot be merged"}
```
when previous MRs were merged in between.
One would have to rebase:
![Screenshot_2022-08-23_at_09.34.00](/uploads/d2d8219eaf65ec850b6169a63c62b2c8/Screenshot_2022-08-23_at_09.34.00.png)
from the CI which will launch another pipeline.
Need to see how to treat this in the pipeline.Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/186Automatically merging MR from pipeline2022-08-25T09:35:17+02:00Spyros ArgyropoulosAutomatically merging MR from pipelineWhen running `merge_request_api.sh -m` from the pipeline we get a method not allowed because the pipeline is still running.
Need to fix this in order to enable an automatic merging - perhaps turn it into a "merge when pipeline succeeds"
See !2001Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/187Job Failed #241681202022-08-25T18:07:12+02:00Yanlin LiuJob Failed #24168120Hi @sargyrop , @jkretz and @mgignac,
Sorry to bother you. These JOs will use LHE files as inputs. The official LHE datasets are being produced in this request: https://its.cern.ch/jira/browse/ATLMCPROD-10109. Therefore, for the local validation, we used privately generated LHE files. I wonder whether the error seen here is related to this.
Thanks a lot!
Best,
Yanlin
Job [#24168120](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/24168120) failed for f16364ca1b7a88cb648fc4b31be83ebea3d6b4ef:https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/188Allow to skip Nfiles check in gridpack2022-09-06T20:27:53+02:00Spyros ArgyropoulosAllow to skip Nfiles check in gridpackThe following discussion from !2014 should be addressed:
- [ ] @narayan started a [discussion](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/merge_requests/2014#note_5975654): (+3 comments)
> Hi @sargyrop
>
> Shall I do ``[skip Athena]`` in order to get this merged ?
>
> Cheers
> RohinSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/189Job Failed #249610542022-10-05T09:10:18+02:00Yang LiuJob Failed #24961054Job [#24961054](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/24961054) failed for 492434176f479e1ec018b7891f3b211cd8387eb0:
Hi @sargyrop, it seems that some time ago, when the squark version of the 2-step decay via sleptons was uploaded, the checks on modified files were skipped, which left the CO duplicated in 505622 and 505945.
This time, I need to upload the CO and JO files for the extended points of GG_N2_2LN1, which use the same CO as 505622 and 505945.
The ideal way to do this is to fix the existing duplication by redirecting the CO in 505945 to the one in 505622, as this commit does.
But as far as I can see, changing an existing file needs extra approval.
So I'm contacting you here to find out what I should do to make this happen.
Many thanks
Cheers
YangSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/190Remove possibility to skip all pipelines2022-10-08T10:37:24+02:00Spyros ArgyropoulosRemove possibility to skip all pipelinesWe should remove the `[skip all]` option since it is abused with no reason.
Need to think how to redesign the pipeline to make this happen.Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/191`notify_changes` doesn't work properly if pipeline has run before MR is made2022-11-16T14:32:57+01:00Spyros Argyropoulos`notify_changes` doesn't work properly if pipeline has run before MR is madeWhen the job runs on the branch it will not find an open MR and will therefore not send an email to the conveners.
Probable solution:
make a rule to only run this job on MR pipelines.Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/192setup_athena pipeline failing2022-11-17T10:18:51+01:00Spyros Argyropoulossetup_athena pipeline failingSee !2152
The branch pipeline succeeds: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/pipelines/4772445
Merge request pipeline succeeds when `log.generate.short` is present: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/25883448
Merge request pipeline fails when `log.generate.short` is not present: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/25883547
Branch pipeline also fails when `log.generate.short` is not present: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/25883631Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/193Turn off branch pipelines2023-01-03T08:22:29+01:00Spyros ArgyropoulosTurn off branch pipelinesIf someone opens a MR when a branch pipeline is still running 2 concurrent pipelines are created.
One will fail since the one that finishes first will push to the branch and then the last CI job will try to push to a branch that is behind.
![Screenshot_2022-11-17_at_16.36.11](/uploads/afe31bf16cbf3496ccb1bf6f2703d7ac/Screenshot_2022-11-17_at_16.36.11.png)
We should turn off all branch pipelines.Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/194Notify changes assigns MR to wrong person2022-11-20T17:08:34+01:00Spyros ArgyropoulosNotify changes assigns MR to wrong personneed to assign to convenersneed to assign to convenersSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/195Job Failed #262950132022-12-06T16:23:29+01:00Arpan GhosalJob Failed #26295013Job [#22933691](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/22933691) failed for ea50f46043f0c61e3433e23f573da905e5273a06:
Hi @mcgensvc,
My pipeline fails because of no space left on device. Any suggestions on how I might mitigate this issue?
Thankshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/196ERROR: /cvmfs/.../gridpack files are not readable to all users2022-12-13T09:37:44+01:00Yiming AbulaitiERROR: /cvmfs/.../gridpack files are not readable to all usersHi,
When adding gridpack files via symbolic links, the GitLab CI reported the following errors.
At first it said that mcgensvc and atlcvmfs can read the file. Then it complained that not all users can read the file.
Isn't read permission for mcgensvc and atlcvmfs enough? Or should we make the file readable to all ATLAS users?
ERRORs are taken from here: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/26431913
EOS Console [root://eosuser.cern.ch] |/>
The access rights are set to: egroup:atlas-phys-exotics-tchannel-neutrinos:rxm!dq,u:atlcvmfs:rxm!dq,u:mcgensvc:rxm!dq,u:jneundor:rwx
OK: mcgensvc can read /eos/user/j/jneundor/HeavyNGeneration/gridpacks_regeneration
OK: atlcvmfs can read /eos/user/j/jneundor/HeavyNGeneration/gridpacks_regeneration
ERROR: file /eos/user/j/jneundor/HeavyNGeneration/gridpacks_regeneration/mc_13TeV.aMCPy8EG_NNPDF3NLO_HeavyN_ee_mN10TeV.GRID.tar.gz is not readable by all users.
The permissions are set to:Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/197check_jo_consistency failed2022-12-19T16:20:55+01:00Yiming Abulaiticheck_jo_consistency failedHi,
When I committed a second time after fixing something in the control file, the consistency check failed.
The consistency check is looking at other files that are not part of my commits (I am trying to register the 518405-518446 range).
But the job failed due to errors related to 421xxx/421100/..
So how can I fix this?
Cheers,
Ablet
Error part:
OK: No generator full name is found
Generators used: ['Py8', 'EG']
ERROR: file /builds/atlas-physics/pmg/mcjoboptions/scripts/../421xxx/421100/mc.Py8EG_A14NNPDF23LO_Ztautau.py contains includes pointing to MC15JobOptions
Failed Job
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/26543765Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/198Investigate usage of sparse checkout2023-09-22T08:26:33+02:00Spyros ArgyropoulosInvestigate usage of sparse checkout# Old solution
```
mkdir mcjoboptions
cd mcjoboptions
git init
git remote add -f origin ssh://git@gitlab.cern.ch:7999/atlas-physics/pmg/mcjoboptions.git
git config core.sparseCheckout true
echo scripts > .git/info/sparse-checkout
echo common >> .git/info/sparse-checkout
echo .gitignore >> .git/info/sparse-checkout
git pull origin master
```
This works - **notice that the `700xxx` directory is not cloned - the script automatically finds the correct DSID directory that should be created**
![Screenshot_2023-01-03_at_09.39.15](/uploads/bbf3ad31b4bdc38bf29c5e8723b1fa4b/Screenshot_2023-01-03_at_09.39.15.png)
# New solution
Needs git 2.26 or higher
```
mkdir mcjoboptions
cd mcjoboptions
git init
git remote add -f origin ssh://git@gitlab.cern.ch:7999/atlas-physics/pmg/mcjoboptions.git
git sparse-checkout init
git sparse-checkout set common scripts .gitignore
git pull origin master
```Spyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/199Branch CI pipline is not running!2023-01-04T18:07:30+01:00Yiming AbulaitiBranch CI pipline is not running!Hi @sargyrop
I pushed a new branch an hour ago.
dsid_yabulait_519065
But CI pipeline is not running! Is it disabled?
I used sparse checkout, and it is working much faster than before.
mkdir mcjoboptions
cd mcjoboptions
git init
git remote add -f origin ssh://git@gitlab.cern.ch:7999/atlas-physics/pmg/mcjoboptions.git
git sparse-checkout init
git sparse-checkout set common scripts .gitignore
git pull origin masterSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/200Failures on grid coming from treatment of runArgs.jobConfig not caught in CI2023-02-02T07:45:25+01:00Spyros ArgyropoulosFailures on grid coming from treatment of runArgs.jobConfig not caught in CIOriginal file: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/9938e84d065b3b479dd57e063889896daa7ff9e7/521xxx/521163/MadGraphControl_TRSM_HHH.py#L21
Original MR: !2251
Pipeline passed: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/pipelines/5008939
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/9938e84d065b3b479dd57e063889896daa7ff9e7/521xxx/521163/MadGraphControl_TRSM_HHH.py#L21
This then failed on the grid: see ATLMCPROD-10348 example log: https://bigpanda.cern.ch//media/filebrowser/a8f96442-63a3-4574-aee8-53704fce19da/mc15_13TeV/tarball_PandaJob_5732336628_SiGNET/log.generate
Fix in: !2288
New file in: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/521xxx/521163/MadGraphControl_TRSM_HHH.py
Offending line seems to be:
```
f_list = os.listdir(runArgs.jobConfig[0])
```
where `runArgs.jobConfig[0]` on the grid seems to evaluate to `521163` while on the CI it would evaluate to `../521163`. Not clear from which level `Gen_tf.py` runs on the grid.
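A defensive way to write the directory lookup in the control file, sketched under the assumption that the DSID directory is either in the working directory (as seems to be the case on the grid) or one level up (as in the CI). The helper name and the search order are hypothetical; how `Gen_tf.py` itself resolves `jobConfig` is not covered here:

```python
import os


def resolve_job_config_dir(job_config, search_bases=(".", "..")):
    """Return an absolute path to the jobConfig directory.

    Tries job_config relative to each base in turn, so both '521163'
    and '../521163' resolve as long as the directory exists.
    """
    for base in search_bases:
        candidate = os.path.abspath(os.path.join(base, job_config))
        if os.path.isdir(candidate):
            return candidate
    raise FileNotFoundError(f"jobConfig directory not found: {job_config}")
```

The control file could then call `os.listdir(resolve_job_config_dir(runArgs.jobConfig[0]))` instead of listing `runArgs.jobConfig[0]` directly.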
@mborodin could you point me to the code that executes `Gen_tf.py` on the grid?Spyros ArgyropoulosSpyros Argyropoulos2023-02-05https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/201Understand how `runArgs.jobConfig` is evaluated on the grid2023-02-01T14:36:56+01:00Spyros ArgyropoulosUnderstand how `runArgs.jobConfig` is evaluated on the gridSpyros ArgyropoulosSpyros Argyropouloshttps://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/202Ensure that `run_athena` calls `Gen_tf` the same way as it would run on the grid2023-02-01T14:37:04+01:00Spyros ArgyropoulosEnsure that `run_athena` calls `Gen_tf` the same way as it would run on the gridSpyros ArgyropoulosSpyros Argyropoulos