pseudomerge of TrigHLTMonitoring packages from rel 21 to master, ATR-18999

Pseudomerge of the TrigHLTMonitoring packages from rel 21 to master, as described in ATR-18999

Command to run the pseudomerge:
git-package-pseudomerge.py --packages Trigger/TrigMonitoring/TrigHLTMonitoring --source upstream/21.0-TrigMC --target upstream/master --stage 1

The following files are affected:

Files where the rel 21.0 version has been kept unaltered:
Trigger/TrigMonitoring/TrigHLTMonitoring/python/HLTMonFlags.py (new)
Trigger/TrigMonitoring/TrigHLTMonitoring/python/PackagesToInterrogate.py (new)
Trigger/TrigMonitoring/TrigHLTMonitoring/python/ToolInterrogator.py (new)
Trigger/TrigMonitoring/TrigHLTMonitoring/share/runMaM.py (new)
Trigger/TrigMonitoring/TrigHLTMonitoring/java/GUI/TrigMaMGUI.java
Trigger/TrigMonitoring/TrigHLTMonitoring/java/TrigMaMGUI_P1.sh
Trigger/TrigMonitoring/TrigHLTMonitoring/java/TrigMaMGUI_TRIGGERDBREPR.sh
Trigger/TrigMonitoring/TrigHLTMonitoring/python/HLTMonTriggerList.py
Trigger/TrigMonitoring/TrigHLTMonitoring/python/MenuAwareMonitoring.py
Trigger/TrigMonitoring/TrigHLTMonitoring/python/MenuAwareMonitoringStandalone.py
Trigger/TrigMonitoring/TrigHLTMonitoring/python/OracleInterface.py
Trigger/TrigMonitoring/TrigHLTMonitoring/python/scripts/MCKtoCOOLmanual.py
Trigger/TrigMonitoring/TrigHLTMonitoring/run/TrigHLTMon_tf.py
Trigger/TrigMonitoring/TrigHLTMonitoring/share/HLTMonitoring_topOptions.py
Trigger/TrigMonitoring/TrigHLTMonitoring/share/TrigHLTMonCommon_jobOptions.py
Trigger/TrigMonitoring/TrigHLTMonitoring/share/addMonTools.py
Trigger/TrigMonitoring/TrigHLTMonitoring/share/skeleton.HLTMon_tf.py

Files where master has been kept unaltered:
Trigger/TrigMonitoring/TrigHLTMonitoring/CMakeLists.txt
Trigger/TrigMonitoring/TrigHLTMonitoring/src/IHLTMonTool.cxx

Deleted file:
Trigger/TrigMonitoring/TrigHLTMonitoring/cmt/requirements

File where master has been used with alterations:
Trigger/TrigMonitoring/TrigHLTMonitoring/src/HLTMonTool.cxx
where the lines

if (chain.find("gsc") != std::string::npos) {
  continue;
}

have been removed to match the jet trigger update from the pseudomerge of https://gitlab.cern.ch/atlas/athenaprivate1/merge_requests/16753
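
For reviewers, the effect of dropping that filter can be sketched as follows. This is an illustrative Python model, not the actual C++ in HLTMonTool.cxx: the helper `select_chains` and the chain names are hypothetical and only show that chains whose name contains "gsc" are no longer skipped.

```python
def select_chains(chains, skip_gsc):
    """Return the chains that the loop would go on to process.

    skip_gsc=True models master before this MR (names containing "gsc"
    are skipped); skip_gsc=False models the code after this MR.
    """
    selected = []
    for chain in chains:
        # Python equivalent of:
        #   if (chain.find("gsc") != std::string::npos) continue;
        if skip_gsc and "gsc" in chain:
            continue
        selected.append(chain)
    return selected


# Hypothetical chain names, for illustration only.
chains = ["HLT_j420", "HLT_j225_gsc420_boffperf_split", "HLT_mu26_ivarmedium"]

before = select_chains(chains, skip_gsc=True)
after = select_chains(chains, skip_gsc=False)
print(before)
print(after)
```

With the filter removed, the second list is simply the full input list.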

  • This merge request affects 1 package:

    • Trigger/TrigMonitoring/TrigHLTMonitoring
  • Thanks Elin! I think this should also resolve ATR-19096.

    For reviewers: please note that this is a sweep of existing code from release 21.

    Cheers,
    Rafal

  • :negative_squared_cross_mark: CI Result FAILURE

                      Athena                AthSimulation
    externals         :white_check_mark:    :white_check_mark:
    cmake             :white_check_mark:    :white_check_mark:
    make              :white_check_mark:    :white_check_mark:
    required tests    :o:                   :white_check_mark:
    optional tests    :cloud:               :white_check_mark:

    Full details available at NICOS MR-20813-2019-02-01-14-01
    :white_check_mark: Athena: number of compilation errors 0, warnings 0
    :white_check_mark: AthSimulation: number of compilation errors 0, warnings 0
    :pencil: The CI Jenkins server has been switched to https://atlas-sit-ci.cern.ch. It is accessible world-wide (behind CERN SSO). In old links to the Jenkins server, aibuild080.cern.ch:8080 should be replaced with atlas-sit-ci.cern.ch. For experts only: Jenkins output [CI-MERGE-REQUEST 33128]

  • It looks like the q431 test failed because of:

    PyJobTransforms.trfValidation.scanLogFile 2019-02-01 13:11:49,862 INFO Error message "AlgErrorAuditor                                     ERROR Illegal Return Code: Algorithm HLTMonManager reported an ERROR, but returned a StatusCode "SUCCESS"" was ignored at line 5360 (structured match)
    PyJobTransforms.trfExe.validate 2019-02-01 13:11:50,071 ERROR Fatal error in athena logfile (level ERROR)
    PyJobTransforms.transform.execute 2019-02-01 13:11:50,071 CRITICAL Transform executor raised TransformLogfileErrorException: Fatal error in athena logfile: "Logfile error in log.ESDtoAOD: "HLTMonManager.HLTMuonMon                            ERROR Exception thrown by fillChainDQA""
    PyJobTransforms.transform.execute 2019-02-01 13:11:53,222 WARNING Transform now exiting early with exit code 68 (Fatal error in athena logfile: "Logfile error in log.ESDtoAOD: "HLTMonManager.HLTMuonMon                            ERROR Exception thrown by fillChainDQA"")

    @ebergeas, do you have any ideas where this could be happening? I think the test can be rerun with Reco_tf.py --AMI q431.

  • More relevant lines from the full log (https://atlas-sit-ci.cern.ch/job/CI-test-driver/23170/consoleFull):

    13:36:10 ESDtoAOD 13:11:08 AthenaEventLoopMgr                                   INFO   ===>>>  start processing event #1183769295, run #330470 21 events processed so far  <<<===
    13:36:10 ESDtoAOD 13:11:08 CutFlowSvc                                           INFO calling addEvent(334130731, 1)
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links
    13:36:10 ESDtoAOD 13:11:08 StoreGateSvc                                      WARNING retrieve(const): No valid proxy for object HLT_CaloCellContainer_TrigCaloCellMaker  of type CaloCellContainer(CLID 2802)
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.CaloCells                              INFO could not retrieve the CaloCellContainer: HLT_CaloCellContainer_TrigCaloCellMaker
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.HLTMuonMon                            ERROR Exception thrown by fillChainDQA
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.HLTMuonMon                            ERROR fill() for HLTMonManager.HLTMuonMon, returned StatusCode::FAILURE!
    13:36:10 ESDtoAOD 13:11:08 StoreGateSvc                                      WARNING retrieve(const): No valid proxy for object HLT_xAOD__TrigMissingETContainer_TrigEFMissingET_mht_em  of type xAOD::TrigMissingETContainer(CLID 1134334)
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.HLTMETMon                           WARNING Could not retrieve TrigMissingETContainer with key HLT_xAOD__TrigMissingETContainer_TrigEFMissingET_mht_em from TDS
    13:36:10 ESDtoAOD 13:11:08 HLTMonManager.HLTBjetMon                             INFO  ===> No trigger fired neither for TriggerChainBjet of size: 8 nor for TriggerChainMujet of size: 4 RETURN from HLTBjetMonTool::fill() !
    13:36:10 ESDtoAOD 13:11:08 AlgErrorAuditor                                     ERROR Illegal Return Code: Algorithm HLTMonManager reported an ERROR, but returned a StatusCode "SUCCESS"
    13:36:10 ESDtoAOD 13:11:08 Error policy described in https://twiki.cern.ch/twiki/bin/view/AtlasComputing/ReportingErrors
    13:36:10 ESDtoAOD 13:11:09 AthenaEventLoopMgr                                   INFO   ===>>>  done processing event #1183769295, run #330470 22 events processed so far  <<<===
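
To cut through the repeated warnings, the ERROR lines can be pulled out of an excerpt like this and grouped by component. A minimal sketch, assuming the column layout seen above (CI timestamp, step name, job timestamp, component, level, message); the regular expression and the function name `errors_by_component` are my own and not part of any ATLAS tool.

```python
import re
from collections import defaultdict

# Assumed layout: "<CI time> <step> <job time> <component> <LEVEL> <message>"
LINE_RE = re.compile(
    r"^\s*\d\d:\d\d:\d\d\s+\S+\s+\d\d:\d\d:\d\d\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<level>VERBOSE|DEBUG|INFO|WARNING|ERROR|FATAL)\s+"
    r"(?P<message>.*)$"
)


def errors_by_component(lines):
    """Map component name -> list of ERROR/FATAL messages it printed."""
    found = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in ("ERROR", "FATAL"):
            found[match.group("component")].append(match.group("message").strip())
    return dict(found)


# A few lines copied from the excerpt above.
excerpt = [
    "13:36:10 ESDtoAOD 13:11:08 HLTMonManager.xAODCaloClusters                    WARNING no valid HLT cell links",
    "13:36:10 ESDtoAOD 13:11:08 HLTMonManager.HLTMuonMon                            ERROR Exception thrown by fillChainDQA",
    "13:36:10 ESDtoAOD 13:11:08 HLTMonManager.HLTMuonMon                            ERROR fill() for HLTMonManager.HLTMuonMon, returned StatusCode::FAILURE!",
    '13:36:10 ESDtoAOD 13:11:08 AlgErrorAuditor                                     ERROR Illegal Return Code: Algorithm HLTMonManager reported an ERROR, but returned a StatusCode "SUCCESS"',
    "13:36:10 ESDtoAOD 13:11:08 Error policy described in https://twiki.cern.ch/twiki/bin/view/AtlasComputing/ReportingErrors",
]

result = errors_by_component(excerpt)
for component, messages in result.items():
    print(component, len(messages))
```

Lines that do not follow the assumed layout, such as the "Error policy described in ..." line, are simply ignored.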
  • Oops, that does not look good. The errors seem to come from HLTMuonMon, so I am guessing that some of my choices between 21.0 and master do not work with the choices made for the muon package. I am alerting @markowen and @nakahama. I will of course try to test this too.

  • Hi, thanks @ebergeas for letting us know. I am tagging @ynoguchi, who is responsible for the muon monitoring package, in case he can post any hints or possible solutions.

  • Hi,

    This will also be investigated from my side.

    Regards,

    Yohei

  • Hi again, so I can replicate the problem on lxplus using Reco_tf.py --AMI q431.

    Now, there are two separate problems here: the fillChainDQA issue, which seems related to the MuonMon, and the "error error" from HLTMonManager (an ERROR is reported while the returned StatusCode is "SUCCESS").

    The fillChainDQA issue:
    This string only occurs in the following files:

    TrigMuonMonitoring/TrigMuonMonitoring/HLTMuonMonTool.h 
    TrigMuonMonitoring/src/CommonMon.cxx
    TrigMuonMonitoring/src/HLTMuonMonTool.cxx

    The error message is printed in the last of these.

    When I look at the pseudomerge MR for these files, https://gitlab.cern.ch/atlas/athenaprivate1/merge_requests/15693/diffs#56d4ed1ef6a8a48bc493d474a9b59f96698de452, it looks like one important change is that what was called MuFast in rel 21.0 is now called L2MuonSA. However, neither of these names is explicitly mentioned in TrigHLTMonitoring/, so I can't really see how a change in those files would trigger an error here. I will try to recompile with just MuonMon.

    As for the error error:
    I have problems locating where HLTMonManager is defined. It is mentioned in many files in TrigMonitoring/ but only in two files in TrigMonitoring/TrigHLTMonitoring:

    TrigHLTMonitoring/share/HLTIDtrkMon_DumpTDT.py
    TrigHLTMonitoring/share/addMonTools.py

    and in neither of these places is it defined, i.e. I cannot change how errors are handled from this function. @fwinkl @smh @wiedenma @tamartin any advice?

    Thanks!
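
The kind of search described above (which files mention a given string) can be reproduced with a few lines of Python. This is only a sketch: the function name `files_mentioning` is my own, and the demo builds a throwaway directory with made-up contents rather than touching a real athena checkout.

```python
import os
import tempfile


def files_mentioning(root, needle, suffixes=(".h", ".cxx", ".py")):
    """Return sorted paths (relative to root) of files whose text contains needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                if needle in handle.read():
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)


# Demo on a temporary tree; the file contents are invented for illustration.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src"))
    with open(os.path.join(root, "src", "CommonMon.cxx"), "w") as handle:
        handle.write("StatusCode HLTMuonMonTool::fillChainDQA() { /* ... */ }\n")
    with open(os.path.join(root, "src", "Other.cxx"), "w") as handle:
        handle.write("// nothing relevant here\n")
    demo_hits = files_mentioning(root, "fillChainDQA")
    print(demo_hits)
```

Pointing `root` at a package directory and passing "HLTMonManager" or "fillChainDQA" as `needle` would give the same kind of file list as quoted in this comment.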

  • Hmm I couldn't reproduce it now, so I'll give the CI another run, but maybe I missed something on my side.

    I'm pretty sure the exception is thrown somewhere in TrigMuonMonitoring/src/CommonMon.cxx, but this file has 4600 lines and it's terrible to read.

    As for the "error error" - it looks related to the use of StatusCode::RECOVERABLE in the code handling the exception. It should be changed, but not as part of this MR. Let's see when we solve the exception.

    I'll keep looking, but I'm quite busy today. Others are welcome to jump in.

    Cheers,
    Rafal

  • Jenkins please retry a build

  • Hi @ebergeas, could you add ATLAS Robot as Developer to your new fork? (https://atlassoftwaredocs.web.cern.ch/gittutorial/gitlab-fork/#add-your-friendly-build-bot)

    Thanks, Tadej (L1)

  • Hi @tadej done now, thought I did it upon forking. Thanks!

  • Hi @rbielski , where is the StatusCode::RECOVERABLE statement? If it should be fixed, why not now? (And if it isn't in TrigHLTMonitoring why didn't it appear before?)

    As for the exception in TrigMuonMonitoring/src/CommonMon.cxx, what I can't understand is why this error appears now and not when the muon packages were pseudomerged. So it must be something in the interplay with my packages.

    I'll try to compile just TrigMuonMonitoring now.

  • I believe lines 800-917 here:
    https://acode-browser.usatlas.bnl.gov/lxr/source/athena/Trigger/TrigMonitoring/TrigMuonMonitoring/src/HLTMuonMonTool.cxx#0800
    are what causes:

    ERROR Illegal Return Code: Algorithm HLTMonManager reported an ERROR, but returned a StatusCode "SUCCESS"

    but the program wouldn't even go into this state if there was no exception thrown from line 802.
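
One reading of the mechanism discussed here, written as a toy Python model rather than the real Gaudi/Athena C++: the tool catches the exception, logs at ERROR level and returns a non-success code, while the manager still returns SUCCESS, so a check that compares "was an ERROR logged?" with "what was returned?" complains. All class and function names below are hypothetical, and the assumption that the manager does not propagate the tool's return code is mine; the thread itself only says that StatusCode::RECOVERABLE is used in the exception handling.

```python
from enum import Enum


class StatusCode(Enum):
    SUCCESS = "SUCCESS"
    RECOVERABLE = "RECOVERABLE"
    FAILURE = "FAILURE"


class Log:
    """Collects (source, level, message) records, like a tiny message service."""

    def __init__(self):
        self.records = []

    def error(self, source, message):
        self.records.append((source, "ERROR", message))

    def has_error(self):
        return any(level == "ERROR" for _, level, _ in self.records)


def fill_chain_dqa():
    # Stand-in for whatever throws inside the real fillChainDQA.
    raise RuntimeError("something went wrong while filling")


def tool_fill(log):
    try:
        fill_chain_dqa()
    except Exception:
        log.error("HLTMuonMon", "Exception thrown by fillChainDQA")
        return StatusCode.RECOVERABLE
    return StatusCode.SUCCESS


def manager_execute(log):
    status = tool_fill(log)
    if status is not StatusCode.SUCCESS:
        log.error("HLTMuonMon", "fill() returned a non-success code")
    # Assumption: the manager carries on and reports success regardless.
    return StatusCode.SUCCESS


def audit(log, returned):
    """True if an ERROR was logged although SUCCESS was returned."""
    return log.has_error() and returned is StatusCode.SUCCESS


log = Log()
returned = manager_execute(log)
illegal = audit(log, returned)
print(returned.value, illegal)
```

In this model the complaint disappears either when no exception is thrown or when the manager returns something other than SUCCESS after an error was logged.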

  • Hi,

    Sorry for the delay. I am starting to look at this. Do I understand correctly that the error occurs when "Reco_tf.py --AMI q431" is run with the latest nightly of master plus the branch to be merged?

    Regards,

    Yohei

    Edited by Yohei Noguchi