Commit 883f385f, authored 5 years ago by Mark Stockton
Add log file checking to find failed child jobs and set the mother return code to the result
Tested by force-killing a child process
Parent: 48ffd9ba
HLT/Trigger/TrigTransforms/TrigTransform/python/trigRecoExe.py (22 additions, 3 deletions)
@@ -17,7 +17,7 @@ import subprocess
 from PyJobTransforms.trfExe import athenaExecutor
 #imports for preExecute
-from PyJobTransforms.trfUtils import asetupReport, cvmfsDBReleaseCheck
+from PyJobTransforms.trfUtils import asetupReport, cvmfsDBReleaseCheck, lineByLine
 import PyJobTransforms.trfEnv as trfEnv
 import PyJobTransforms.trfExceptions as trfExceptions
 from PyJobTransforms.trfExitCodes import trfExit as trfExit
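The `lineByLine` name imported above is consumed in `postExecute` by a loop that unpacks `(line, lineCounter)` pairs. A generator of that shape might look like the sketch below. This is a hypothetical stand-in for illustration, not the PyJobTransforms implementation: the name `line_by_line_sketch`, the newline stripping, and the 1-based counter are all assumptions.

```python
import os
import tempfile


def line_by_line_sketch(filename):
    """Lazily yield (line, line_number) pairs from a text file.

    Hypothetical stand-in for illustration only; not the PyJobTransforms helper.
    """
    with open(filename, "r") as handle:
        for line_counter, line in enumerate(handle, start=1):
            # Strip only the trailing newline; counting starts at 1 (assumption).
            yield line.rstrip("\n"), line_counter


def demo():
    # Write a small throwaway log file and read it back through the generator.
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
        tmp.write("first\nsecond\n")
        path = tmp.name
    try:
        return list(line_by_line_sketch(path))
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

In this sketch the file is opened only when iteration starts, so a failure to open it surfaces in the consuming loop rather than at the call that creates the generator.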
@@ -195,8 +195,27 @@ class trigRecoExecutor(athenaExecutor):
     def postExecute(self):
-        #TODO
-        #need to check for HLTMPPU.*Child Issue in the log file and throw an error message if there so we catch that the child died
+        #Adding check for HLTMPPU.*Child Issue in the log file
+        # Throws an error message if there so we catch that the child died
+        # Also sets the return code of the mother process to mark the job as failed
+        # Is based on trfValidation.scanLogFile
+        log = self._logFileName
+        msg.debug('Now scanning logfile {0}'.format(log))
+        # Using the generator so that lines can be grabbed by subroutines if needed for more reporting
+        try:
+            myGen = lineByLine(log, substepName=self._substep)
+        except IOError as e:
+            msg.error('Failed to open transform logfile {0}: {1:s}'.format(log, e))
+        for line, lineCounter in myGen:
+            # Check to see if any of the hlt children had an issue
+            if 'Child Issue' in line > -1:
+                try:
+                    signal = int((re.search('signal ([0-9]*)', line)).group(1))
+                except AttributeError:
+                    #text signal not found so just return 0
+                    signal = 0
+                msg.error('Detected issue with HLTChild, setting mother return code to %s' % (signal) )
+                self._rc = signal
         msg.info("Check for expert-monitoring.root file")
         #the BS-BS step generates the files:
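Review note: the scan added in this hunk can be exercised outside the transform with a small standalone function. The sketch below is an illustration under stated assumptions, not the committed code: the function name `child_issue_return_code` and the sample log lines are invented, and it differs from the diff in two ways. It uses a plain `'Child Issue' in line` test, because `'Child Issue' in line > -1` is a chained comparison, equivalent to `('Child Issue' in line) and (line > -1)`, and under Python 3 comparing a `str` with an `int` raises `TypeError`. It also requires at least one digit after `signal` (`[0-9]+`), because with `[0-9]*` a line such as `signal unknown` still matches with an empty group and `int('')` raises `ValueError`, which the `except AttributeError` clause does not catch.

```python
import re

# Requires at least one digit after "signal " (an assumption about the log format).
SIGNAL_PATTERN = re.compile(r"signal ([0-9]+)")


def child_issue_return_code(lines):
    """Return the signal number from the last 'Child Issue' line seen.

    Returns None when no line mentions 'Child Issue'. A 'Child Issue' line
    without a parsable signal number yields 0, mirroring the fallback in the diff.
    """
    rc = None
    for line in lines:
        if "Child Issue" in line:  # plain membership test, no chained comparison
            match = SIGNAL_PATTERN.search(line)
            rc = int(match.group(1)) if match else 0
    return rc


if __name__ == "__main__":
    # Hypothetical log lines, made up for this example.
    sample = [
        "HLTMPPU INFO all children running",
        "HLTMPPU ERROR Child Issue: process terminated by signal 9",
    ]
    print(child_issue_return_code(sample))
```

Returning `None` when nothing is found lets a caller tell "no child issue" apart from "child issue with an unknown signal"; the diff instead leaves `self._rc` untouched in the first case.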