MC Job Options issues
https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues
2020-04-29T21:17:04+02:00

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/106
Job Failed #8184903
2020-04-29T21:17:04+02:00
Xiaohu Sun

Job [#8184903](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/8184903) failed for 127510e74ddbca868a29efecbd1b8c6144bf63b8:

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/100
Make -m obligatory in commit script
2020-04-28T18:32:35+02:00
Spyros Argyropoulos

* [x] Remove current parsing logic
* [x] Check that skipping athena,logParser works as before

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/94
Automatic assignment of dsids
2020-04-17T19:56:01+02:00
Spyros Argyropoulos

* [x] implement logic to only assign blocks of DSIDs
* [x] test 1: 10 DSIDs that don't fill in a gap
* [x] test 2: continuous set of DSIDs that fills in a gap
* [x] test 3: 1 DSID should be assigned to the lowest possible DSID
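
The three tests above can be illustrated with a minimal sketch of block assignment. This is not the commit script's actual logic: the function name `find_free_block`, the DSID range, and the set of used DSIDs are all hypothetical, and the sketch assumes that "assigning a block" means picking the lowest run of `n` consecutive free DSIDs.

```python
def find_free_block(used, n, lo, hi):
    """Return the lowest start DSID such that start..start+n-1 are all free
    inside [lo, hi], or None if no such block exists.

    `used`, `lo` and `hi` are illustrative inputs, not the real DSID ranges.
    """
    used = set(used)
    run_start, run_len = None, 0
    for dsid in range(lo, hi + 1):
        if dsid in used:
            # A used DSID interrupts the current run of free DSIDs.
            run_start, run_len = None, 0
            continue
        if run_start is None:
            run_start = dsid
        run_len += 1
        if run_len == n:
            return run_start
    return None


# Hypothetical situation: a gap of two free DSIDs (100003, 100004).
used = {100000, 100001, 100002, 100005, 100006}
single = find_free_block(used, 1, 100000, 100099)   # lowest free DSID
pair = find_free_block(used, 2, 100000, 100099)     # exactly fills the gap
block = find_free_block(used, 10, 100000, 100099)   # too large for the gap
```

With these inputs a single DSID and a pair both land in the gap, while a block of ten is placed after the last used DSID instead of being split across the gap.
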
**Once the above works**
* [x] test `git fetch origin master && git checkout -b new origin/master` - if it works replace in master
* [x] implement flag for specifying branch name (`--branch`)
* [x] replace `-n` with `--dry`
* [x] implement flag for moving DSIDs but not pushing to git (`--nogit`)
* [x] keep track of the DSIDs that have changed and suggest the command to run after dry run
* [x] update list of DSIDs in commit script (provided with `-d` argument)
* [x] Update README

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/97
Handling of jO files processed in non-Unix filesystems
2020-04-17T13:57:23+02:00
Spyros Argyropoulos

From @avroy
> calculating [nEvents](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/scripts/run_athena.sh#L70) failed with the following errors:
```
(standard_in) 1: illegal character: ^M
(standard_in) 1: illegal character: ^M
```
> Naively, this is due to carriage returns, which are not processed uniformly across operating systems
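
To make the failure mode concrete, here is a minimal Python sketch of detecting and removing carriage returns from the contents of a jO file. It deletes every carriage return byte, which is the same effect the `sed` substitution discussed in this issue aims for. The helper names and the sample line are hypothetical, and this is not what the CI currently does.

```python
def has_carriage_returns(data: bytes) -> bool:
    """True if the file contents contain any carriage return (byte 0x0D)."""
    return b"\r" in data


def strip_carriage_returns(data: bytes) -> bytes:
    """Delete every carriage return, leaving Unix line endings only."""
    return data.replace(b"\r", b"")


# Hypothetical jO line saved with Windows (CRLF) line endings.
sample = b"evgenConfig.nEventsPerJob = 10000\r\n"
cleaned = strip_carriage_returns(sample)
```

A check like `has_carriage_returns` could be used to reject the file with a clear message, while `strip_carriage_returns` would silently repair it. Which of the two is preferable is a policy decision for the CI.
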
## Todo
See how to handle this:
* during commit script? : probably not ideal since not everyone uses it
* doing a `dos2unix` in the CI? : might require special image - need to see if `dos2unix` is available in the images we use
* doing a `sed 's/^M//g'` as described [here](https://stackoverflow.com/questions/2658931/why-error-illegal-character-m?answertab=votes#tab-top) in all CI jobs? : @avroy can you test whether this works for you?

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/90
Job Failed #7575152 'RunArguments' object has no attribute 'inputGeneratorFile'
2020-04-16T13:32:39+02:00
Xiaohu Sun

Job [#7575152](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/7575152) failed for c3035481c6ec0ec00926a14dc33427bb6b590fb1:
Dear experts,
To my understanding this relates to external LHE files. Please correct me if I am wrong.
The only connection between the JO / DSID and the external LHE files seems to be in the spreadsheet.
The spreadsheet is not known by the CI system while uploading JOs.
Do we need some special setups to point to some afs or eos location of the external LHE files?
Thanks!
Best,
Xiaohu

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/99
Throw error in CI if no `evgenConfig.nEventsPerJob` is used in the file
2020-04-08T10:51:29+02:00
Spyros Argyropoulos

Perhaps better to incorporate into #98

S1.2020
Spyros Argyropoulos
2020-04-05

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/96
Reviewing changes in control files etc
2020-04-03T14:45:59+02:00
Spyros Argyropoulos

I am wondering whether it makes sense to introduce [CODEOWNERS](https://docs.gitlab.com/ee/user/project/code_owners.html) so that changes such as !317 can be verified by the appropriate people (e.g. in Exotics normally theory contacts would be responsible for a control file).
Tagging @amoroso @cgutscho @fsiegert @gstark

Future

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/93
CI addition to JO name check: no minus signs allowed
2020-03-26T11:10:45+01:00
Christian Gutschow

It looks like the production system doesn't allow "-" in the JO name, can we get the CI to check this?

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/61
Potential harmonisation of evgen transform checks with CI
2020-03-20T09:37:59+01:00
Spyros Argyropoulos

The checks implemented here: https://gitlab.cern.ch/atlas/athena/blob/21.6/Generators/EvgenJobTransforms/share/skel.GENtoEVGEN.py
are supposed to be the same as the checks we use in the CI: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/blob/master/scripts/check_jo_consistency.py. However, since they are coded in two completely independent places, they are bound to go out of sync.
We should see if there's a way to tie them together so that they never go out of sync.
Carrying over from #2

S1.2020

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/91
Commit script: block usage of DSIDs that are already used in any remote branch
2020-03-16T13:55:42+01:00
Spyros Argyropoulos

Currently the commit script checks that a DSID is not used in remote branches **only if the DSID is outside the allowed range**.
We should extend this to cover all DSIDs, so that users who try to assign DSIDs themselves do not create problems with other MRs.

S1.2020
Spyros Argyropoulos
2020-03-15

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/89
logParser fail due to inputGeneratorFile argument
2020-03-12T17:57:53+01:00
Serena Palazzo (serena.palazzo@cern.ch)

Dear all,
I just added a branch (spalazzo_600017) with 4 new JOs to register. These JOs depend on existing LHE files, and I tested them locally by giving the argument --inputGeneratorFile=../mc15_13TeV/410659/TXT.15180944._016918.tar.gz.1. These files are therefore outside the directories with the JOs. Locally, the log parser ran correctly, but now that I have pushed the branch I get this error:
Logfile error in log.generate: "AttributeError: 'RunArguments' object has no attribute 'inputGeneratorFile'"
PyJobTransforms.transform.execute 2020-03-12 16:49:47,279 WARNING Transform now exiting early with exit code 65 (Non-zero return code from generate (8); Logfile error in log.generate: "AttributeError: 'RunArguments' object has no attribute 'inputGeneratorFile'")
Here is the full pipeline: https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/jobs/7564892
I guess this error is due to the fact that inside the JO directory I should put the soft link with the LHE file (as done for the gridpacks). If this is the case, I was wondering if it would be possible/useful to add a warning in the parser check to advise whether the inputGeneratorFile is linked correctly or not.
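
One part of the warning suggested above could be a check for links in the JO directory that do not resolve to an existing file. The sketch below is only an illustration of that idea: the function name `dangling_links` is hypothetical, and it makes no assumption about how the parser identifies which link is the LHE input.

```python
import os


def dangling_links(directory):
    """Return the names of symlinks in `directory` whose target is missing.

    os.path.exists() follows symlinks, so it is False for a broken link
    even though os.path.islink() is True.
    """
    bad = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path) and not os.path.exists(path):
            bad.append(name)
    return bad
```

An empty list means every link in the directory points at something readable from the machine running the check. A non-empty list could be turned into a warning that names the offending links.
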
Thanks in advance!
Cheers,
Serena

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/86
Check that external files are world-readable
2020-03-12T17:18:40+01:00
Christian Gutschow

Can we implement a check that sym-linked files are world-readable with something like
```
"$(find "$filename" -perm -004)"
```
in case the cvmfs sync script cannot easily be patched? Not clear to me whether this is better done in the CI or as part of the commit script. If the latter is possible, perhaps that would be a good point to flag this up, but if people sneakily try to bypass the commit script, perhaps we should also check it in the CI?

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/88
Issue with logParser due to the nEventsPerJob
2020-03-12T09:19:07+01:00
Serena Palazzo (serena.palazzo@cern.ch)

Dear all,
I am preparing an MC request and I have an issue with the parser step. In my JO I have nEventsPerJob=10000. But, if I run the following command line:
./scripts/commit_new_dsid.sh -d=600017 -n
I have the following error:
---------------------
Performance metrics:
---------------------
- actual CPU (500 events) = 0.02 hrs
- CPU extrapolated to 10000 events = 0.3 hrs
- CPU = 0.33 hrs <-- ERROR: Too low CPU time - should be between 6-12h. Adjust nEventsPerJob!
- estimated CPU for CI job = 0.00 hrs
- Virtual memory = 1497.668 Mb
---------------------
Others:
---------------------
- Effective lumi (fb-1): 0.0133816141973574 <-- WARNING: low effective luminosity
- Total no. of events: 500 <-- WARNING: This total is low enough that the mu profile may be problematic - INFORM MC PROD
---------------------
Summary:
---------------------
Errors : 1 , Warnings : 2 -> Errors encountered! Not ready for production!
ERROR: log.generate contains errors
Fix them before committing anything!
So it seems that 10k is not enough. But on the other hand, I cannot increase the number because otherwise there will be issues in production (according to @dhirsch). How can I solve this issue?
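
For orientation, the arithmetic behind the CPU error can be sketched as follows. This assumes the parser scales the measured CPU time linearly with the number of events, which is an assumption and not a statement about the actual `logParser` code. The 6 to 12 hour window is taken from the error message above, and the function names are hypothetical.

```python
def extrapolate_cpu_hours(cpu_hours_test, n_test_events, n_events_per_job):
    """Scale the CPU time of a short test run linearly to a full job."""
    return cpu_hours_test / n_test_events * n_events_per_job


def cpu_within_window(cpu_hours, low=6.0, high=12.0):
    """Check the extrapolated CPU time against the window in the error message."""
    return low <= cpu_hours <= high


# Illustrative numbers only: a test run of 500 events taking 0.5 CPU hours.
full_job = extrapolate_cpu_hours(0.5, 500, 10000)
ok = cpu_within_window(full_job)
```

Under this linear assumption, a job whose extrapolated CPU time is far below the window can only pass by raising `nEventsPerJob`, which is exactly the conflict described in this issue.
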
Attached you can find the log.generate and the JO.
Thanks in advance!
Cheers,
Serena

[log.generate](/uploads/c0da6de6b6c075e818791e4216e1b511/log.generate)
[mc.PhH7EG_H7UE_716_tchan_lept_antitop.py](/uploads/32012a54748990b2cfc5f9fb44b46561/mc.PhH7EG_H7UE_716_tchan_lept_antitop.py)

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/87
Allow jO files to be links in whitelist
2020-03-09T15:22:42+01:00
Spyros Argyropoulos

As needed in !265
`mc.*.py` should be allowed as a link too

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/81
Remove check that CountHepMC == nEventsPerJob
2020-03-04T10:14:12+01:00
Spyros Argyropoulos

Currently in the `logParser` we check that `CountHepMC == nEventsPerJob`
where `CountHepMC` is extracted from the following line:
```
11:06:58 CountHepMC INFO Events passing all checks and written = 2
```
and `nEventsPerJob` is taken from
```
10:47:13 Py:Gen_tf INFO .nEventsPerJob = 200 # (Integer) number of input events per job. Possible values: value >= 0
```
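
As an illustration of the check under discussion, the two values can be pulled out of log lines of the form quoted above with regular expressions. This is a sketch and not the `logParser` implementation: the pattern names and `parse_counts` are hypothetical, and the patterns are written only against the two example lines shown here.

```python
import re

# Patterns written against the two example log lines quoted in this issue.
COUNT_RE = re.compile(
    r"CountHepMC\s+INFO\s+Events passing all checks and written = (\d+)"
)
NEVENTS_RE = re.compile(r"\.nEventsPerJob = (\d+)")


def parse_counts(log_text):
    """Return (CountHepMC, nEventsPerJob) as ints, or None where not found."""
    count = COUNT_RE.search(log_text)
    nevents = NEVENTS_RE.search(log_text)
    return (
        int(count.group(1)) if count else None,
        int(nevents.group(1)) if nevents else None,
    )


sample_log = (
    "10:47:13 Py:Gen_tf INFO .nEventsPerJob = 200 # (Integer) number of input events per job.\n"
    "11:06:58 CountHepMC INFO Events passing all checks and written = 2\n"
)
written, requested = parse_counts(sample_log)
```

For the two quoted lines the values differ, which is the kind of situation where the current check would report a mismatch.
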
I have a vague recollection that this helped to identify some cases for MG where the multiplier was not adjusting for the generation efficiency properly and the transform still managed to run, however I do not exactly remember the cases and I would not be able to reproduce it.
Since this check has been a source of some headaches recently we would propose to remove it unless someone thinks that this is necessary.
Tagging MG experts: @zmarshal @mcfayden @hmildner @svonbudd and @ewelina @lcorpe in case you want to discuss in the git meeting (also @fsiegert @cgutscho, with whom we discussed this already)

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/73
Read nEventsPerJob from JO instead of logfile
2020-03-04T10:14:12+01:00
Frank Siegert

Currently the logParser checks whether `nEventsPerJob` is set correctly to fulfil the runtime limits of the grid job. It uses the `nEventsPerJob` written out to the `log.generate`, but that might not be the correct one in cases where `nEventsPerJob` had to be changed after the test job. Instead the value in the JO should be used.
These checks should probably be removed from the logParser and moved into an individual check of the JO (together with the `log.generate.short`?).
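
A standalone check of the JO could read the value directly from the file text. The sketch below assumes the JO sets it with a plain integer assignment to `evgenConfig.nEventsPerJob` and that the last such assignment wins. Values computed at runtime would not be found this way, and the function name is hypothetical.

```python
import re

# Matches uncommented lines such as "evgenConfig.nEventsPerJob = 10000".
NEVENTS_JO_RE = re.compile(
    r"^[ \t]*evgenConfig\.nEventsPerJob[ \t]*=[ \t]*(\d+)", re.MULTILINE
)


def n_events_per_job(jo_text):
    """Return the last literal nEventsPerJob assignment in the JO, or None."""
    matches = NEVENTS_JO_RE.findall(jo_text)
    return int(matches[-1]) if matches else None


# Hypothetical JO contents for illustration.
jo_text = (
    "evgenConfig.description = 'example'\n"
    "#evgenConfig.nEventsPerJob = 5\n"
    "evgenConfig.nEventsPerJob = 10000\n"
)
value = n_events_per_job(jo_text)
```

Returning `None` when no assignment is found gives the caller a natural place to raise an error for JOs that never set the value.
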
I'm attaching one [log.generate](/uploads/467a185ec3c90313d200154de32fd980/log.generate) file for a [recent merge request (DSID 700008)](https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/merge_requests/222) which without modifications leads to:
```
- CPU = 0.88 hrs <-- ERROR: Too low CPU time - should be between 6-12h. Adjust nEventsPerJob!
```

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/85
Add check that only 1 jO is inside a DSID directory
2020-03-03T13:33:33+01:00
Spyros Argyropoulos

Check that only one file called `mc.*.py` is in a DSID directory.

S1.2020
Spyros Argyropoulos

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/82
logParser throw error if release used is not AthGeneration
2020-03-02T08:58:06+01:00
Spyros Argyropoulos

Recently a jO with `AthGenerationExternals` was used and this was not caught in logParser or any other test.
Should add the test and throw an error.

S1.2020
Ewelina Maria Lobodzinska

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/80
Improve following of recursive links in eos and run_athena CI jobs
2020-03-01T09:26:02+01:00
Spyros Argyropoulos

Suggestion from Frank: use `readlink -f`
Need to see if this is accessible in the CI bash version or perhaps find a bash version where it is (if it's small enough).

S1.2020
Spyros Argyropoulos
2020-03-01

https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/issues/79
Recursive extraction of GRID links
2020-02-24T10:11:18+01:00
Spyros Argyropoulos

This is causing the failure in !256.
It seems that the scripts are for some reason not using enough steps to resolve
```
950xxx/950031/mc_13TeV.Sh_228_ttbar_AllHadronic_EnhMaxHTavrgTopPT.GRID.tar.gz -> ../../700xxx/700050/mc_13TeV.Sh_228_ttbar_AllHadronic_EnhMaxHTavrgTopPT.GRID.tar.gz
```
back to
```
/eos/user/c/cgutscho/mc/700047/mc_13TeV.Sh_228_ttbar_AllHadronic_EnhMaxHTavrgTopPT.GRID.tar.gz
```
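
Resolving such a chain in one step can be sketched in Python with `os.path.realpath`, which follows every link on the path much like `readlink -f`. This is only an illustration of the intended behaviour, not the CI scripts' code, and the function name `resolve_link` is hypothetical.

```python
import os


def resolve_link(path):
    """Follow the whole chain of symlinks and return the final target.

    Raises FileNotFoundError if the chain ends at a file that does not exist,
    for example because the original file was removed from its eos area.
    """
    final = os.path.realpath(path)
    if not os.path.exists(final):
        raise FileNotFoundError(f"{path} resolves to missing file {final}")
    return final
```

Because the whole chain is resolved in one call, the number of intermediate links no longer matters, and a removed original file shows up as an explicit error.
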
By the way @cgutscho @fsiegert isn't this workflow problematic? You have multiple files pointing to `/eos/user/c/cgutscho/mc/700047/mc_13TeV.Sh_228_ttbar_AllHadronic_EnhMaxHTavrgTopPT.GRID.tar.gz`, which has already been transferred to cvmfs, so in principle Chris can decide to remove this file from his eos area and the CI would fail again.

S1.2020
Spyros Argyropoulos
2020-02-22