ChronoStatSvc: Add locks for MT safety.
Add a lock around m_chronoEntities to prevent crashes occasionally seen in MT jobs. Note that the timing measurements will still be garbage in MT jobs --- this just tries to avoid the observed crashes.
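For readers unfamiliar with the pattern, here is a minimal sketch of what "a lock around a map" can look like in standard C++. It is not the code from this merge request: `Entity`, `GuardedTimers`, `start`, `starts` and `hammer` are hypothetical names, and the real m_chronoEntities container and its element type may differ. The sketch only shows that serializing access to the map removes the data race; it says nothing about whether the timings themselves are meaningful in MT jobs.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-tag timing record.
struct Entity {
  unsigned long nStarts = 0;
};

// Hypothetical wrapper: every access to the map goes through one mutex.
class GuardedTimers {
public:
  // Record a start for `tag`; the lock covers the lookup/insert and the update.
  void start( const std::string& tag ) {
    std::lock_guard<std::mutex> guard( m_mutex );
    ++m_entities[tag].nStarts;
  }

  // Read the number of recorded starts for `tag` (0 if the tag is unknown).
  unsigned long starts( const std::string& tag ) const {
    std::lock_guard<std::mutex> guard( m_mutex );
    auto it = m_entities.find( tag );
    return it == m_entities.end() ? 0 : it->second.nStarts;
  }

private:
  mutable std::mutex            m_mutex;
  std::map<std::string, Entity> m_entities;
};

// Update one tag from several threads at once and return the final count.
unsigned long hammer( unsigned nThreads, unsigned nCalls ) {
  GuardedTimers            timers;
  std::vector<std::thread> workers;
  for ( unsigned t = 0; t < nThreads; ++t )
    workers.emplace_back( [&timers, nCalls] {
      for ( unsigned i = 0; i < nCalls; ++i ) timers.start( "algo" );
    } );
  for ( auto& w : workers ) w.join();
  return timers.starts( "algo" );
}
```

With the lock in place, `hammer( 4, 1000 )` returns 4000 on every run. Without it, concurrent insertions into the `std::map` are undefined behaviour, which is the kind of occasional crash described above.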
Activity
- Resolved by Marco Clemencic
If this still produces wrong results, isn't a crash better, since then the service won't be used at all, rather than being used while its output is wrong?
Also note that there was an effort for a new timing in !787 (merged), maybe it's worth pushing that forward instead?
- [2018-12-04 08:07] Validation started with lhcb-gaudi-merge#647
- [2018-12-05 00:07] Automatic merge failed in lhcb-lcg-dev4#736
- [2018-12-05 00:07] Automatic merge failed in lhcb-dd4hep#80
- [2018-12-05 00:07] Automatic merge failed in lhcb-lcg-dev3#732
- [2018-12-05 00:10] Automatic merge failed in lhcb-tdr-test#381
- [2018-12-05 00:11] Automatic merge failed in lhcb-gaudi-head#2081
- [2018-12-05 00:12] Automatic merge failed in lhcb-sanitizers#85
- [2018-12-06 00:03] Automatic merge failed in lhcb-dd4hep#81
- [2018-12-06 00:04] Automatic merge failed in lhcb-tdr-test#382
- [2018-12-06 00:05] Automatic merge failed in lhcb-lcg-dev3#733
- [2018-12-06 00:06] Automatic merge failed in lhcb-lcg-dev4#737
- [2018-12-06 00:06] Automatic merge failed in lhcb-gaudi-head#2082
- [2018-12-06 00:07] Automatic merge failed in lhcb-sanitizers#86
- [2018-12-06 13:05] Automatic merge failed in lhcb-tdr-test#383
- [2018-12-07 00:04] Automatic merge failed in lhcb-lcg-dev4#738
- [2018-12-07 00:06] Automatic merge failed in lhcb-dd4hep#82
- [2018-12-07 00:06] Automatic merge failed in lhcb-tdr-test#384
- [2018-12-07 00:06] Automatic merge failed in lhcb-lcg-dev3#734
- [2018-12-07 00:08] Automatic merge failed in lhcb-sanitizers#87
- [2018-12-07 00:09] Automatic merge failed in lhcb-gaudi-head#2083
- [2018-12-07 09:06] Automatic merge failed in lhcb-lcg-dev4#739
- [2018-12-07 09:12] Automatic merge failed in lhcb-gaudi-head#2084
- [2018-12-07 09:31] Automatic merge failed in lhcb-lcg-dev4#739
- [2018-12-08 00:07] Automatic merge failed in lhcb-lcg-dev4#740
- [2018-12-08 00:07] Automatic merge failed in lhcb-tdr-test#385
- [2018-12-08 00:07] Automatic merge failed in lhcb-lcg-dev3#735
- [2018-12-08 00:07] Automatic merge failed in lhcb-sanitizers#88
- [2018-12-08 00:09] Automatic merge failed in lhcb-gaudi-head#2085
- [2018-12-09 00:04] Automatic merge failed in lhcb-dd4hep#84
- [2018-12-09 00:04] Automatic merge failed in lhcb-lcg-dev4#741
- [2018-12-09 00:05] Automatic merge failed in lhcb-tdr-test#386
- [2018-12-09 00:06] Automatic merge failed in lhcb-lcg-dev3#736
- [2018-12-09 00:06] Automatic merge failed in lhcb-sanitizers#89
- [2018-12-09 00:07] Automatic merge failed in lhcb-gaudi-head#2086
- [2018-12-10 00:04] Automatic merge failed in lhcb-lcg-dev4#742
- [2018-12-10 00:04] Automatic merge failed in lhcb-lcg-dev3#737
- [2018-12-10 00:04] Automatic merge failed in lhcb-dd4hep#85
- [2018-12-10 00:08] Automatic merge failed in lhcb-tdr-test#387
- [2018-12-10 00:08] Automatic merge failed in lhcb-sanitizers#90
- [2018-12-10 00:09] Automatic merge failed in lhcb-gaudi-head#2087
- [2018-12-11 00:04] Automatic merge failed in lhcb-lcg-dev3#738
- [2018-12-11 00:04] Automatic merge failed in lhcb-dd4hep#86
- [2018-12-11 00:05] Automatic merge failed in lhcb-tdr-test#388
- [2018-12-11 00:05] Automatic merge failed in lhcb-lcg-dev4#743
- [2018-12-11 00:10] Automatic merge failed in lhcb-sanitizers#91
- [2018-12-11 00:13] Automatic merge failed in lhcb-gaudi-head#2088
- [2018-12-12 00:04] Automatic merge failed in lhcb-lcg-dev4#744
- [2018-12-12 00:04] Automatic merge failed in lhcb-lcg-dev3#739
- [2018-12-12 00:05] Automatic merge failed in lhcb-dd4hep#87
- [2018-12-12 00:05] Automatic merge failed in lhcb-tdr-test#389
- [2018-12-12 00:08] Automatic merge failed in lhcb-gaudi-head#2089
- [2018-12-12 00:13] Automatic merge failed in lhcb-sanitizers#92
- [2018-12-12 14:43] Validation started with lhcb-gaudi-head#2090
- [2018-12-13 00:03] Validation started with lhcb-lcg-dev4#745
- [2018-12-13 00:04] Validation started with lhcb-dd4hep#88
- [2018-12-13 00:05] Validation started with lhcb-tdr-test#390
- [2018-12-13 00:05] Validation started with lhcb-lcg-dev3#740
- [2018-12-13 00:10] Validation started with lhcb-sanitizers#93
- [2018-12-13 00:15] Validation started with lhcb-gaudi-head#2091
- [2018-12-14 00:04] Validation started with lhcb-lcg-dev3#741
- [2018-12-14 00:04] Validation started with lhcb-lcg-dev4#746
- [2018-12-14 00:05] Validation started with lhcb-dd4hep#89
- [2018-12-14 00:05] Validation started with lhcb-sanitizers#94
- [2018-12-14 00:08] Validation started with lhcb-tdr-test#391
- [2018-12-14 00:18] Validation started with lhcb-gaudi-head#2092
- [2018-12-15 00:03] Validation started with lhcb-lcg-dev4#747
- [2018-12-15 00:03] Validation started with lhcb-lcg-dev3#742
- [2018-12-15 00:04] Validation started with lhcb-dd4hep#90
- [2018-12-15 00:05] Validation started with lhcb-tdr-test#392
- [2018-12-15 00:10] Validation started with lhcb-sanitizers#95
- [2018-12-15 00:12] Validation started with lhcb-gaudi-head#2093
- [2018-12-15 08:25] Validation started with lhcb-gaudi-head#2094
- [2018-12-16 00:03] Validation started with lhcb-gaudi-head#2095
- [2018-12-16 00:04] Validation started with lhcb-lcg-dev3#743
- [2018-12-16 00:05] Validation started with lhcb-lcg-dev4#748
- [2018-12-16 00:05] Validation started with lhcb-tdr-test#393
- [2018-12-16 00:06] Validation started with lhcb-sanitizers#96
- [2018-12-17 00:04] Validation started with lhcb-dd4hep#92
- [2018-12-17 00:04] Validation started with lhcb-lcg-dev4#749
- [2018-12-17 00:06] Validation started with lhcb-lcg-dev3#744
- [2018-12-17 00:07] Validation started with lhcb-tdr-test#394
- [2018-12-17 00:07] Validation started with lhcb-sanitizers#97
- [2018-12-17 00:08] Validation started with lhcb-gaudi-head#2096
- [2018-12-17 01:08] Validation started with lhcb-lcg-dev3#744
- [2018-12-18 00:04] Validation started with lhcb-lcg-dev4#750
- [2018-12-18 00:04] Validation started with lhcb-lcg-dev3#745
- [2018-12-18 00:05] Validation started with lhcb-dd4hep#93
- [2018-12-18 00:06] Validation started with lhcb-sanitizers#98
- [2018-12-18 00:07] Validation started with lhcb-tdr-test#395
- [2018-12-18 00:09] Validation started with lhcb-gaudi-head#2097
- [2018-12-19 00:04] Automatic merge failed in lhcb-lcg-dev4#751
- [2018-12-19 00:04] Automatic merge failed in lhcb-lcg-dev3#746
- [2018-12-19 00:04] Automatic merge failed in lhcb-dd4hep#94
- [2018-12-19 00:06] Automatic merge failed in lhcb-tdr-test#396
- [2018-12-19 00:10] Automatic merge failed in lhcb-sanitizers#99
- [2018-12-19 00:12] Automatic merge failed in lhcb-gaudi-head#2098
- [2018-12-20 00:03] Automatic merge failed in lhcb-lcg-dev4#752
- [2018-12-20 00:04] Automatic merge failed in lhcb-lcg-dev3#747
- [2018-12-20 00:07] Automatic merge failed in lhcb-dd4hep#95
- [2018-12-20 00:08] Automatic merge failed in lhcb-sanitizers#100
- [2018-12-20 00:09] Automatic merge failed in lhcb-tdr-test#397
- [2018-12-20 00:13] Automatic merge failed in lhcb-gaudi-head#2099
- [2018-12-21 00:03] Automatic merge failed in lhcb-lcg-dev3#748
- [2018-12-21 00:05] Automatic merge failed in lhcb-lcg-dev4#753
- [2018-12-21 00:08] Automatic merge failed in lhcb-dd4hep#96
- [2018-12-21 00:08] Automatic merge failed in lhcb-tdr-test#398
- [2018-12-21 00:16] Automatic merge failed in lhcb-sanitizers#101
- [2018-12-21 00:18] Automatic merge failed in lhcb-gaudi-head#2100
- [2018-12-22 00:04] Automatic merge failed in lhcb-dd4hep#97
- [2018-12-22 00:05] Automatic merge failed in lhcb-lcg-dev3#749
- [2018-12-22 00:06] Automatic merge failed in lhcb-lcg-dev4#754
- [2018-12-22 00:06] Automatic merge failed in lhcb-tdr-test#399
- [2018-12-22 00:09] Automatic merge failed in lhcb-gaudi-head#2101
- [2018-12-22 00:11] Automatic merge failed in lhcb-sanitizers#102
- [2018-12-23 00:03] Automatic merge failed in lhcb-dd4hep#98
- [2018-12-23 00:03] Automatic merge failed in lhcb-sanitizers#103
- [2018-12-23 00:03] Automatic merge failed in lhcb-lcg-dev3#750
- [2018-12-23 00:03] Automatic merge failed in lhcb-lcg-dev4#755
- [2018-12-23 00:04] Automatic merge failed in lhcb-tdr-test#400
- [2018-12-23 00:06] Automatic merge failed in lhcb-gaudi-head#2102
- [2018-12-24 00:04] Automatic merge failed in lhcb-sanitizers#104
- [2018-12-24 00:04] Automatic merge failed in lhcb-dd4hep#99
- [2018-12-24 00:06] Automatic merge failed in lhcb-gaudi-head#2103
- [2018-12-24 00:06] Automatic merge failed in lhcb-tdr-test#401
- [2018-12-24 00:06] Automatic merge failed in lhcb-lcg-dev4#756
- [2018-12-24 00:07] Automatic merge failed in lhcb-lcg-dev3#751
- [2018-12-25 00:04] Automatic merge failed in lhcb-lcg-dev4#757
- [2018-12-25 00:04] Automatic merge failed in lhcb-lcg-dev3#752
- [2018-12-25 00:04] Automatic merge failed in lhcb-dd4hep#100
- [2018-12-25 00:04] Automatic merge failed in lhcb-tdr-test#402
- [2018-12-25 00:06] Automatic merge failed in lhcb-sanitizers#105
- [2018-12-25 00:09] Automatic merge failed in lhcb-gaudi-head#2104
- [2018-12-26 00:03] Automatic merge failed in lhcb-lcg-dev4#758
- [2018-12-26 00:04] Automatic merge failed in lhcb-lcg-dev3#753
- [2018-12-26 00:04] Automatic merge failed in lhcb-dd4hep#101
- [2018-12-26 00:06] Automatic merge failed in lhcb-sanitizers#106
- [2018-12-26 00:07] Automatic merge failed in lhcb-tdr-test#403
- [2018-12-26 00:11] Automatic merge failed in lhcb-gaudi-head#2105
- [2018-12-27 00:04] Automatic merge failed in lhcb-dd4hep#102
- [2018-12-27 00:05] Automatic merge failed in lhcb-tdr-test#404
- [2018-12-27 00:05] Automatic merge failed in lhcb-lcg-dev4#759
- [2018-12-27 00:05] Automatic merge failed in lhcb-sanitizers#107
- [2018-12-27 00:06] Automatic merge failed in lhcb-lcg-dev3#754
- [2018-12-27 00:08] Automatic merge failed in lhcb-gaudi-head#2106
- [2018-12-28 00:03] Automatic merge failed in lhcb-sanitizers#108
- [2018-12-28 00:03] Automatic merge failed in lhcb-lcg-dev4#760
- [2018-12-28 00:06] Automatic merge failed in lhcb-lcg-dev3#755
- [2018-12-28 00:07] Automatic merge failed in lhcb-tdr-test#405
- [2018-12-28 00:08] Automatic merge failed in lhcb-gaudi-head#2107
- [2018-12-29 00:04] Automatic merge failed in lhcb-lcg-dev3#756
- [2018-12-29 00:04] Automatic merge failed in lhcb-lcg-dev4#761
- [2018-12-29 00:05] Automatic merge failed in lhcb-tdr-test#406
- [2018-12-29 00:06] Automatic merge failed in lhcb-sanitizers#109
- [2018-12-29 00:07] Automatic merge failed in lhcb-gaudi-head#2108
- [2018-12-30 00:03] Automatic merge failed in lhcb-dd4hep#105
- [2018-12-30 00:03] Automatic merge failed in lhcb-lcg-dev4#762
- [2018-12-30 00:03] Automatic merge failed in lhcb-lcg-dev3#757
- [2018-12-30 00:05] Automatic merge failed in lhcb-tdr-test#407
- [2018-12-30 00:06] Automatic merge failed in lhcb-sanitizers#110
- [2018-12-30 00:06] Automatic merge failed in lhcb-gaudi-head#2109
- [2018-12-31 00:03] Automatic merge failed in lhcb-lcg-dev4#763
- [2018-12-31 00:03] Automatic merge failed in lhcb-sanitizers#111
- [2018-12-31 00:04] Automatic merge failed in lhcb-lcg-dev3#758
- [2018-12-31 00:05] Automatic merge failed in lhcb-tdr-test#408
- [2018-12-31 00:05] Automatic merge failed in lhcb-dd4hep#106
- [2018-12-31 00:08] Automatic merge failed in lhcb-gaudi-head#2110
- [2019-01-01 00:03] Automatic merge failed in lhcb-sanitizers#112
- [2019-01-01 00:03] Automatic merge failed in lhcb-lcg-dev3#759
- [2019-01-01 00:04] Automatic merge failed in lhcb-lcg-dev4#764
- [2019-01-01 00:05] Automatic merge failed in lhcb-dd4hep#107
- [2019-01-01 00:06] Automatic merge failed in lhcb-tdr-test#409
- [2019-01-01 00:08] Automatic merge failed in lhcb-gaudi-head#2111
- [2019-01-02 00:04] Automatic merge failed in lhcb-lcg-dev4#765
- [2019-01-02 00:05] Automatic merge failed in lhcb-lcg-dev3#760
- [2019-01-02 00:05] Automatic merge failed in lhcb-tdr-test#410
- [2019-01-02 00:07] Automatic merge failed in lhcb-sanitizers#113
- [2019-01-02 00:07] Automatic merge failed in lhcb-dd4hep#108
- [2019-01-02 00:10] Automatic merge failed in lhcb-gaudi-head#2112
- [2019-01-03 00:03] Automatic merge failed in lhcb-lcg-dev4#766
- [2019-01-03 00:05] Automatic merge failed in lhcb-sanitizers#114
- [2019-01-03 00:06] Automatic merge failed in lhcb-dd4hep#109
- [2019-01-03 00:06] Automatic merge failed in lhcb-lcg-dev3#761
- [2019-01-03 00:06] Automatic merge failed in lhcb-tdr-test#411
- [2019-01-03 00:08] Automatic merge failed in lhcb-gaudi-head#2113
- [2019-01-04 00:03] Automatic merge failed in lhcb-lcg-dev4#767
- [2019-01-04 00:04] Automatic merge failed in lhcb-lcg-dev3#762
- [2019-01-04 00:05] Automatic merge failed in lhcb-dd4hep#110
- [2019-01-04 00:06] Automatic merge failed in lhcb-tdr-test#412
- [2019-01-04 00:06] Automatic merge failed in lhcb-sanitizers#115
- [2019-01-04 00:07] Automatic merge failed in lhcb-gaudi-head#2114
- [2019-01-05 00:03] Automatic merge failed in lhcb-sanitizers#116
- [2019-01-05 00:03] Automatic merge failed in lhcb-dd4hep#111
- [2019-01-05 00:04] Automatic merge failed in lhcb-lcg-dev4#768
- [2019-01-05 00:04] Automatic merge failed in lhcb-lcg-dev3#763
- [2019-01-05 00:05] Automatic merge failed in lhcb-gaudi-head#2115
- [2019-01-05 00:08] Automatic merge failed in lhcb-tdr-test#413
- [2019-01-06 00:03] Automatic merge failed in lhcb-sanitizers#117
- [2019-01-06 00:03] Automatic merge failed in lhcb-lcg-dev3#764
- [2019-01-06 00:04] Automatic merge failed in lhcb-lcg-dev4#769
- [2019-01-06 00:05] Automatic merge failed in lhcb-dd4hep#112
- [2019-01-06 00:05] Automatic merge failed in lhcb-gaudi-head#2116
- [2019-01-06 00:06] Automatic merge failed in lhcb-tdr-test#414
- [2019-01-07 00:04] Automatic merge failed in lhcb-dd4hep#113
- [2019-01-07 00:04] Automatic merge failed in lhcb-tdr-test#415
- [2019-01-07 00:05] Automatic merge failed in lhcb-lcg-dev4#770
- [2019-01-07 00:06] Automatic merge failed in lhcb-lcg-dev3#765
- [2019-01-07 00:06] Automatic merge failed in lhcb-sanitizers#118
- [2019-01-07 00:07] Automatic merge failed in lhcb-gaudi-head#2117
- [2019-01-07 16:00] Automatic merge failed in lhcb-lcg-dev4#771
- [2019-01-08 00:04] Automatic merge failed in lhcb-lcg-dev4#772
- [2019-01-08 00:06] Automatic merge failed in lhcb-lcg-dev3#766
- [2019-01-08 00:06] Automatic merge failed in lhcb-dd4hep#114
- [2019-01-08 00:07] Automatic merge failed in lhcb-tdr-test#416
- [2019-01-08 00:07] Automatic merge failed in lhcb-sanitizers#119
- [2019-01-08 00:08] Automatic merge failed in lhcb-gaudi-head#2118
- [2019-01-09 00:03] Automatic merge failed in lhcb-lcg-dev4#773
- [2019-01-09 00:04] Automatic merge failed in lhcb-dd4hep#115
- [2019-01-09 00:05] Automatic merge failed in lhcb-tdr-test#417
- [2019-01-09 00:05] Automatic merge failed in lhcb-sanitizers#120
- [2019-01-09 00:06] Automatic merge failed in lhcb-lcg-dev3#767
- [2019-01-09 00:09] Automatic merge failed in lhcb-gaudi-head#2119
- [2019-01-10 00:04] Validation started with lhcb-sanitizers#121
- [2019-01-10 00:04] Validation started with lhcb-lcg-dev4#774
- [2019-01-10 00:05] Validation started with lhcb-dd4hep#116
- [2019-01-10 00:05] Validation started with lhcb-tdr-test#418
- [2019-01-10 00:05] Validation started with lhcb-lcg-dev3#768
- [2019-01-10 00:08] Validation started with lhcb-gaudi-head#2120
- [2019-01-10 08:44] Validation started with lhcb-gaudi-head#2121
- [2019-01-10 08:49] Validation started with lhcb-lcg-dev4#774
Edited by Software for LHCb
assigned to @clemenci
@clemenci the proper error or exception approach is, of course, better, but this isn't what's implemented now, right?
Making something seem like it could be used (it passed the "doesn't crash" test) while producing wrong results seems dangerous...
Edited by Christoph Hasse
@ssnyder could you not just throw an exception, or better, return a StatusCode::FAILURE when you go into a multithreaded environment with this service activated?
- Resolved by Marco Clemencic
hi -
The situation is that in Atlas, we have many job configurations, and we want to start being able to run them with MT. ChronoStatSvc is enabled in many different places. From one test job, i see that if ChronoStatSvc is enabled in a MT job, i get random crashes due to it in roughly a few percent of runs.
I know there have been discussions about fixing ChronoStatSvc properly, but as far as i know, there's nothing immediately ready. (If it is just about ready, then by all means we should do that instead.) Nevertheless, i'm assuming that this will in fact get fixed in the not-too-distant future. However, random crashes cause difficulty in validating other things, so one would like to go and get rid of them if it can be done easily.
As Christoph says, we could go through the configurations and add a test on MT everywhere ChronoStatSvc is configured. I'm reluctant to do that though since it's probably a bunch of places but more importantly because one would then need to remember to undo those changes once a properly fixed ChronoStatSvc is available.
We could consider suppressing the ChronoStatSvc output for MT jobs. However, since we are expecting a proper fix, i wanted to keep this change relatively minimal. (My first try was actually to just replace the map with a tbb::concurrent_hash_map. However, this broke one of the Gaudi tests because a std::string was being leaked, and it didn't seem worth the effort to try to understand exactly why.) Similarly, Synced might be nicer, but as i'm expecting this code won't have a long lifetime (famous last words...), i don't think it's really worth changing.
In summary, this is a (hopefully!) short-term fix to prevent random ChronoStatSvc crashes from interfering with other validation until a proper solution is ready.
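The two policies debated in this thread can be contrasted in a small standalone sketch: refuse to start in an MT job, or start anyway and warn that the statistics are unreliable. Everything here is an assumption for illustration: `Status` is a hypothetical enum (not Gaudi's StatusCode), `MtPolicy` and `initializeTimers` are invented names, and the `multiThreaded` flag stands in for however the real service would detect an MT job.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical result type; a stand-in, not Gaudi's StatusCode.
enum class Status { Success, Failure };

// Hypothetical switch between the two behaviours discussed in the thread.
enum class MtPolicy { FailFast, WarnAndContinue };

// Decide what to do at initialization time.
// `multiThreaded` is assumed to be supplied by the caller.
Status initializeTimers( bool multiThreaded, MtPolicy policy, std::ostream& log ) {
  if ( !multiThreaded ) return Status::Success; // single-threaded: nothing to do

  if ( policy == MtPolicy::FailFast ) {
    // Reviewer's suggestion: make misuse impossible to overlook.
    log << "ERROR: timing service is not usable in MT jobs\n";
    return Status::Failure;
  }

  // Author's preference: keep existing configurations running, but flag the output.
  log << "WARNING: timing statistics are unreliable in MT jobs\n";
  return Status::Success;
}
```

The trade-off is the one argued above: failing fast protects users from trusting wrong numbers but requires touching every configuration that enables the service, while warning keeps those configurations running at the cost of output that must be ignored.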
added 1 commit
- 7916407a - Add a warning to ChonoStatSvc in MT jobs that statistics are unreliable.
mentioned in merge request !809 (merged)
changed milestone to %v31r0