TrkGlobalChi2Fitter: Fix handling of bad fits.
In GlobalChi2Fitter::calculateTrackParameters, we propagate track parameters along the track from one measurement on a track to another. The track parameters are also assocated with a surface. After the propagation we store the parameters in the GXFTrajectory along with a measurement. Later, there will be an assertion that the surface of the parameter matches the surface of the measurement.
There is, however, a maximum path length for the propagation, which is usually set to a hard-coded length of 10m. Once this length is reached, we stop and return a CurvilinearParameters object a dummy surface corresponding to the point at which the propagation stopped. But if we store this result in GXFTrajectory, then we can fail later with an assertion in TrackStateOnSurface.
We can exceed the maximum path length in cases where the fit is unstable.
Detect cases where the assertion in TrackStateOnSurface would fail and instead treat as a failed fit. Fixes test failure seen in ActsConfig in the dbg build.
Merge request reports
Activity
added frozen-tier0-violating label
Running the ActsConfig test in the dbg build takes a long time. But the crash can be seen in the opt build if this change is applied to TrkTrack:
diff --git a/Tracking/TrkEvent/TrkTrack/CMakeLists.txt b/Tracking/TrkEvent/TrkTrack/CMakeLists.txt index f1fd43ffa632..5c891a4f9164 100644 --- a/Tracking/TrkEvent/TrkTrack/CMakeLists.txt +++ b/Tracking/TrkEvent/TrkTrack/CMakeLists.txt @@ -25,3 +25,10 @@ atlas_add_executable(TrkTrack_test #Tests atlas_add_test(ut_TrkTrack_test SCRIPT TrkTrack_test) + +if ( "${CMAKE_BUILD_TYPE}" STREQUAL "RelWithDebInfo" ) + set_source_files_properties( + ${CMAKE_CURRENT_SOURCE_DIR}/src/TrackStateOnSurface.cxx + PROPERTIES + COMPILE_FLAGS " -UNDEBUG " ) +endif()
- Resolved by Shaun Roe
Hey
For me is fine to call this a failed fit since now we have a reason for it. Actually , these @sroe was trying to hunt down, thus the consistency check. Usually we hit those limits when the propagation went "awol"
I think he added also the
isSane
for the GXFTrackState that prints in this case [https://acode-browser1.usatlas.bnl.gov/lxr/source/athena/Tracking/TrkEvent/TrkTrack/src/GXFTrackState.cxx#0112] but do not see that used much.The trick you mention above might be of interest for him, although I think is the same issue. I am almost sure we had a JIRA ... maybe he recalls
Anyhow I recall he had seen rare issues in Muons not being at the "right" surface aka the assert failing. My only question is if we want to have a WARNING with the kind of message we have for the isSane() i.e say
"Fit failed due to inconcistent surfaces" and then print the associated surfaces?
GXFTrackState::isSane() const{ 0113 const GXFMaterialEffects* eff = m_materialEffects.get(); 0114 if (not consistentSurfaces(m_measurement.get(),m_trackpar.get(), eff) ){ 0115 std::cerr << "GXFTrackState::isSane. With :" << '\n'; 0116 std::cerr << "Types : " << m_tsType.to_string() << '\n'; 0117 std::cerr << "Surfaces differ! " << std::endl; 0118 if (m_trackpar) { 0119 std::cerr << "ParamSurf: [" << &(m_trackpar->associatedSurface()) 0120 << "] " << m_trackpar->associatedSurface() << std::endl; 0121 } 0122 if (m_measurement) { 0123 std::cerr << "measSurf: [" << &(m_measurement->associatedSurface()) 0124 << "] " << m_measurement->associatedSurface() 0125 << std::endl; 0126 } 0127 if (m_materialEffects) { 0128 std::cerr << "matSurf: [" 0129 << &(m_materialEffects->associatedSurface()) << "] " 0130 << m_materialEffects->associatedSurface() << std::endl; 0131 } 0132 return false; 0133 } 0134 return true; 0135 }
bool 0203 TrackStateOnSurface::isSane() const 0204 { 0205 bool surfacesDiffer = 0206 not Trk::consistentSurfaces(trackParameters(), 0207 measurementOnTrack(), 0208 materialEffectsOnTrack()); .......... 0232 }
Edited by Christos Anastopoulos
added Tracking main review-pending-level-1 labels
added Run3-DataReco-output-changed label
CI Result FAILURE (hash 9bfb3b70)Athena externals cmake make tests Full details available on this CI monitor view. Check the JIRA CI status board for known problems
Athena: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-EL9 9996] (remote access info)I think this is consistent with @sroe was hunting down. I suspect a few muon fits were hitting this. Anyhow, lets see if he has an opinion also in the code.
removed frozen-tier0-violating label
added offline-sw-review-approved label
Btw since this is main. I removed the FT0 label . The Run-3 output label should stay. Adding the magic label.
For the code the only thing is if we should also print a WARNING but depends how "often" it happens .... avoid spamming logs...
Edited by Christos Anastopoulosadded RC Attention Required label
removed review-pending-level-1 label
added review-user-action-required label
The CI pipeline is still in failure. The final review-approved label should only be applied once the CI passes. Changing tag to review-user-action-required
Hi @joany please review the code and say if is fine from your side
The user can do nothing (and even if could should absolutely not do something) about the references, this is the reason for adding offline-sw-review-approved and RC Attention Required ...
And if we are to update better if the code has passed code review or the comments are there, so do the least CI runs
In detail:
RC should tag Reconstruction (Simulation) and Software Coordinators on the MR, who may tag other experts if necessary.
The above should agree that the change is OK and also on the order in which concurrent output-changing MRs will be accepted.
All decisions should be recorded on the MR. Once the changes are agreed upon by the appropriate responsible persons according to the nature of the changes (Software/Reconstruction/Simulation Coordinators) the offline-sw-review-approved label should be added by those persons.
RC (or someone instructed by them) will update the reference files on eos and ask the developer to update the MR accordingly.
Due to the tricky sequencing related to reference file updates, other developers should not touch the reference files
None of them really involve the user or the shifter or relate to code .... the label if not so importand but please review the code ignoring the reference changes. If there other CI failures (not related to reference update) also comment on them ...
Edited by Christos Anastopoulos