Draft: Converter to Pr::Velo::Tracks

marked this merge request as draft

Thank you for this!

I have verified that it compiles and runs. I did a quick test in which I fed its output into an instance of TrackBeamLineVertexFinderSoA. The counters for the TBLV instance indicate that it did find PVs.

I want to do more precise testing before marking it ready for merge.

If you try it, please let me know if it suits your needs.

Yes, sure! @spradlin

@mgiza could you please check out this MR and run Moore TBLV with PVChecker, using these converted Allen VELO tracks as input?

Hello, I have been trying to compile the stack with this change and the latest release v55r12p4, and I get the following error:

File "/afs/cern.ch/work/m/mgiza/private/stack/Gaudi/InstallArea/x86_64_v3-el9-gcc13+detdesc-opt+g/bin/gaudirun.py", line 584, in <module>
    exec(o, g, l)
  File "<string>", line 1, in <module>
  File "/afs/cern.ch/work/m/mgiza/private/stack/Gaudi/InstallArea/x86_64_v3-el9-gcc13+detdesc-opt+g/bin/gaudirun.py", line 543, in __call__
    importOptions(arg)
  File "/afs/cern.ch/work/m/mgiza/private/stack/Gaudi/InstallArea/x86_64_v3-el9-gcc13+detdesc-opt+g/python/GaudiKernel/ProcessJobOptions.py", line 552, in importOptions
    _import_function_mapping[ext](optsfile)
  File "/afs/cern.ch/work/m/mgiza/private/stack/Gaudi/InstallArea/x86_64_v3-el9-gcc13+detdesc-opt+g/python/GaudiKernel/ProcessJobOptions.py", line 486, in _import_python
    exec(code, {"__file__": file})
  File "/afs/cern.ch/work/m/mgiza/private/stack/Moore/Hlt/RecoConf/options/examples/mooretblv_mc_2024_allen_tracks.py", line 19, in <module>
    from PyConf.Algorithms import GaudiAllenTrackViewsToV3Tracks
ImportError: cannot import name 'GaudiAllenTrackViewsToV3Tracks' from 'PyConf.Algorithms' (unknown location)

I have tried recompiling the stack, but the issue persists.

The algorithm name should be GaudiAllenVeloToPrVeloTracks.

The source file GaudiAllenTrackViewsToV3Tracks.cpp defines a set of track converters with different input and output types.

What is in your mooretblv_mc_2024_allen_tracks.py?

I have incorporated the conversion into the hlt1_allen python in Moore!3836. Attached to this comment is the quickly hacked script for the quick test that I mentioned above. It uses the converted tracks through the Allen-in-Moore interface implemented in Moore!3836.

allen_gaudi_prvelo_with_TBLV.py

Hello Patrick - I have switched the GaudiAllenTrackViewsToV3Tracks to GaudiAllenVeloToPrVeloTracks in the code and after a few small tweaks it is processing, thank you!

We see a rather low efficiency and a large fake rate:

PrimaryVertexChecker_d521d67f          INFO 00 all                    :   440272 from   565845 (  803234-237389  ) [ 77.81 %], false 323988 from reco.   764260 (  440272+323988) [ 42.39 %] 
PrimaryVertexChecker_d521d67f          INFO 01 isolated               :   233375 from   287466 (  407974-120508  ) [ 81.18 %], false 49071 from reco.   282446 (  233375+49071) [ 17.37 %] 
PrimaryVertexChecker_d521d67f          INFO 02 close                  :   206897 from   278379 (  395260-116881  ) [ 74.32 %], false 274917 from reco.   481814 (  206897+274917) [ 57.06 %] 
PrimaryVertexChecker_d521d67f          INFO 03 ntracks<10             :    38172 from    54605 (   54605-0       ) [ 69.91 %], false   40 from reco.    38212 (   38172+40  ) [  0.10 %] 
PrimaryVertexChecker_d521d67f          INFO 04 ntracks>=10            :   402100 from   511240 (  511240-0       ) [ 78.65 %], false 323948 from reco.   726048 (  402100+323948) [ 44.62 %] 
PrimaryVertexChecker_d521d67f          INFO 05 z<-50.0                :    90168 from   112565 (  161042-48477   ) [ 80.10 %], false 60613 from reco.   150781 (   90168+60613) [ 40.20 %] 
PrimaryVertexChecker_d521d67f          INFO 06 z in (-50.0, 50.0)     :   258345 from   339238 (  481253-142015  ) [ 76.15 %], false 190762 from reco.   449107 (  258345+190762) [ 42.48 %] 
PrimaryVertexChecker_d521d67f          INFO 07 z >=50.0               :    91759 from   114042 (  160939-46897   ) [ 80.46 %], false 72613 from reco.   164372 (   91759+72613) [ 44.18 %] 
PrimaryVertexChecker_d521d67f          INFO 08 decayBeauty            :     5782 from     7171 (    7175-4       ) [ 80.63 %], false 5084 from reco.   329770 (  324686+5084) [  1.54 %] 
PrimaryVertexChecker_d521d67f          INFO 09 decayCharm             :    65975 from    81754 (   81798-44      ) [ 80.70 %], false 48994 from reco.   389963 (  340969+48994) [ 12.56 %] 
PrimaryVertexChecker_d521d67f          INFO 10 decayStrange           :   438204 from   562831 (  593539-30708   ) [ 77.86 %], false 291318 from reco.   762192 (  470874+291318) [ 38.22 %] 
PrimaryVertexChecker_d521d67f          INFO 11 other                  :     2067 from     3013 (  209694-206681  ) [ 68.60 %], false 32670 from reco.   326055 (  293385+32670) [ 10.02 %] 
PrimaryVertexChecker_d521d67f          INFO 12 1MCPV                  :    86947 from   105065 (  105508-443     ) [ 82.76 %], false 62037 from reco.   148984 (   86947+62037) [ 41.64 %] 
PrimaryVertexChecker_d521d67f          INFO 13 2MCPV                  :    82605 from   102438 (  105130-2692    ) [ 80.64 %], false 54667 from reco.   137272 (   82605+54667) [ 39.82 %] 
PrimaryVertexChecker_d521d67f          INFO 14 3MCPV                  :    75005 from    95321 (  103655-8334    ) [ 78.69 %], false 48618 from reco.   123623 (   75005+48618) [ 39.33 %] 
PrimaryVertexChecker_d521d67f          INFO 15 4MCPV                  :    63566 from    82543 (   99879-17336   ) [ 77.01 %], false 41688 from reco.   105254 (   63566+41688) [ 39.61 %] 
PrimaryVertexChecker_d521d67f          INFO 16 5MCPV                  :    49176 from    65428 (   92555-27127   ) [ 75.16 %], false 34511 from reco.    83687 (   49176+34511) [ 41.24 %]
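
For cross-checking tables like the one above, here is a minimal sketch that recomputes the efficiency and fake rate from the counts in one PrimaryVertexChecker summary line. It assumes the line layout shown in this log; the names `LINE_RE` and `parse_pv_line` are hypothetical helpers and are not part of Moore or Rec.

```python
import re

# Matches one summary line of the efficiency table, in the layout printed above:
#   INFO <idx> <label> : <found> from <mc> ( ... ) [ <eff> %], false <fake> from reco. <reco> ( ... ) [ <fr> %]
LINE_RE = re.compile(
    r"INFO\s+(?P<idx>\d+)\s+(?P<label>.+?)\s*:\s*"
    r"(?P<found>\d+)\s+from\s+(?P<mc>\d+)\s+\(.*?\)\s+\[\s*(?P<eff>[\d.]+)\s*%\],\s*"
    r"false\s+(?P<fake>\d+)\s+from reco\.\s+(?P<reco>\d+)\s+\(.*?\)\s+\[\s*(?P<fr>[\d.]+)\s*%\]"
)


def parse_pv_line(line):
    """Return recomputed and printed rates for one line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    found, mc, fake, reco = (int(m[k]) for k in ("found", "mc", "fake", "reco"))
    return {
        "label": m["label"],
        "efficiency": found / mc,        # matched PVs / reconstructible MC PVs
        "fake_rate": fake / reco,        # false PVs / all reconstructed PVs
        "printed_efficiency": float(m["eff"]) / 100.0,
        "printed_fake_rate": float(m["fr"]) / 100.0,
    }


# The "00 all" line copied from the log above.
SAMPLE = (
    "PrimaryVertexChecker_d521d67f          INFO 00 all                    :   440272 from   565845 "
    "(  803234-237389  ) [ 77.81 %], false 323988 from reco.   764260 (  440272+323988) [ 42.39 %] "
)

if __name__ == "__main__":
    row = parse_pv_line(SAMPLE)
    print(f"{row['label']}: eff = {row['efficiency']:.4f}, fake rate = {row['fake_rate']:.4f}")
```

Applying the same function to the corresponding log from a run on Rec velo tracks would allow a category-by-category comparison of the two inputs.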

I will share my script here: mooretblv_mc_2024_allen_tracks.py

@spradlin do you happen to know if I should change something in the file? I have also checked running with your options, with the exception that I need "v3" for PVChecker, and I obtain the same results.

I will add the information about the PV tracks:

PrimaryVertexChecker_d521d67f          INFO 00 all                    : av. PV tracks: 268.33 [MC:  37.50]
PrimaryVertexChecker_d521d67f          INFO 01 isolated               : av. PV tracks: 258.90 [MC:  37.01]
PrimaryVertexChecker_d521d67f          INFO 02 close                  : av. PV tracks: 279.15 [MC:  38.03]
PrimaryVertexChecker_d521d67f          INFO 03 ntracks<10             : av. PV tracks:  51.67 [MC:   6.63]
PrimaryVertexChecker_d521d67f          INFO 04 ntracks>=10            : av. PV tracks: 289.09 [MC:  40.81]
PrimaryVertexChecker_d521d67f          INFO 05 z<-50.0                : av. PV tracks: 249.90 [MC:  34.49]
PrimaryVertexChecker_d521d67f          INFO 06 z in (-50.0, 50.0)     : av. PV tracks: 269.24 [MC:  37.50]
PrimaryVertexChecker_d521d67f          INFO 07 z >=50.0               : av. PV tracks: 284.18 [MC:  40.43]
PrimaryVertexChecker_d521d67f          INFO 08 decayBeauty            : av. PV tracks: 504.44 [MC:  68.37]
PrimaryVertexChecker_d521d67f          INFO 09 decayCharm             : av. PV tracks: 444.37 [MC:  62.89]
PrimaryVertexChecker_d521d67f          INFO 10 decayStrange           : av. PV tracks: 269.36 [MC:  37.69]
PrimaryVertexChecker_d521d67f          INFO 11 other                  : av. PV tracks:  62.40 [MC:   7.71]
PrimaryVertexChecker_d521d67f          INFO 12 1MCPV                  : av. PV tracks: 522.34 [MC:  75.32]
PrimaryVertexChecker_d521d67f          INFO 13 2MCPV                  : av. PV tracks: 331.23 [MC:  46.87]
PrimaryVertexChecker_d521d67f          INFO 14 3MCPV                  : av. PV tracks: 226.50 [MC:  32.91]
PrimaryVertexChecker_d521d67f          INFO 15 4MCPV                  : av. PV tracks: 169.93 [MC:  24.64]
PrimaryVertexChecker_d521d67f          INFO 16 5MCPV                  : av. PV tracks: 138.25 [MC:  19.78]

Thanks a lot for the check @mgiza !

Maybe it would help to check what the TrackChecker result on the converted velo tracks looks like. And also to plot the state distributions. Comparing both of these to the velo tracks from the Rec velo tracking algorithm could help track down why the PV efficiency is so low when using the converted Allen tracks.

Here is the TrackChecker result I have obtained: mooretblv_mc_2024_allen_tracks_mcchecking.txt

Thanks, the velo tracking efficiency looks reasonable (you can compare to the corresponding nightly tests for Allen and Rec velo tracking to be sure).

So maybe it has to do with the state information. Can you please plot the state variables to check them versus those obtained from Rec velo tracks?

Sorry for the delay, I had to slightly rewrite the PrimaryVertexChecker in order to obtain the proper track information. I have compiled the comparisons into the attached pdf. The red plots correspond to the PVs reconstructed using "converted" Allen (Hlt1) tracks in Moore TBLV (using this MR); the blue plots are "the reference" of Moore TBLV (Hlt2) tracks. The left plots correspond to tracks of MC true (fake == 0) vertices, the right plots to fake PVs (fake == 1).

I will need some help to properly understand what we obtain in the end. For example, there are an order of magnitude more tracks used for PV reconstruction than in the reference, which makes me wonder whether the way we reconstruct PVs using converted tracks is correct. Maybe we shouldn't pass all of them, or the tracks should be "merged" into some other form (class) of tracks. The true PVs now have a much larger average number of tracks used for reconstruction: Allen_tracks_comparison_plots.pdf

Thanks a lot for this detailed comparison @mgiza ! Seeing the differences in the vertex errors in x, y, z for true (and fake) PVs, I believe we need to also compare the covariance matrix elements of the tracks used as input. I suspect that we have large differences there, that could also explain the different number of tracks associated with a PV.

Hi Dorothea - in the added plots the errs are sqrt(cov) elements, so e.g. errx is sqrt(covxx), so that information is there. I can apply this simple transformation and plot the covariance elements directly if you'd like. I will add those plots in a second.

Here I added the plots corresponding to covariance elements (cov_ii is err_i*err_i): Allen_tracks_comparison_plots_with_covariance.pdf
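
The transformation mentioned here is just squaring the error. A minimal sketch of both directions, where the helper names and the numerical value are hypothetical and chosen only for illustration:

```python
import math


def err_to_cov(err):
    """Diagonal covariance element from an error: cov_ii = err_i * err_i."""
    return err * err


def cov_to_err(cov):
    """Error from a diagonal covariance element: err_i = sqrt(cov_ii)."""
    return math.sqrt(cov)


# Hypothetical value, for illustration only (not taken from the plots).
err_x = 0.05
cov_xx = err_to_cov(err_x)

if __name__ == "__main__":
    print(f"err_x = {err_x}, cov_xx = {cov_xx:.6g}, back to err_x = {cov_to_err(cov_xx):.6g}")
```

This only covers the diagonal elements; off-diagonal covariance terms cannot be recovered from the errors alone.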

Thanks! Were the Allen and Moore velo tracks produced from the same number of events? I'm surprised that the number of entries varies so much between them, and also between the histograms (errx, erry, errz). Is there lots of under/overflow?

Clearly we have a large difference in the state errors, and also a slight difference in the tx and ty distributions. I wonder if it could be an issue with the state location, and then an extrapolation that is missing too much. But the Allen velo Kalman state is the one extrapolated to the beamline, so that should in principle match the ClosestToBeam state set here.

It would be good to check the distributions of the velo states in the Allen framework as well. There is quite a bit of monitoring code already in https://gitlab.cern.ch/lhcb/Allen/-/blob/master/device/velo/simplified_kalman_filter/src/VeloKalmanFilter.cu. Would you feel comfortable running Allen and adding plots for the covariance matrix @mgiza ? If not, @bokutsen could take care of that.

@dovombru Is there a test or other script with an example of VeloKalmanFilter monitoring? I am also interested in them for running some tests to check the before-and-after of the converters more carefully.

Is it as easy as selecting a configuration with monitoring options?

@spradlin there is currently no test running the monitoring. However, it is the standard Gaudi monitoring histograms that are being filled (as used in data-taking), so to create them you can follow the instructions given here: !1754 (diffs). (I realized that MR was targeting 2024-patches, while only master is deployed online, so this will become visible on the Allen documentation web page once it is back-ported to master.)

Would it be more appropriate to target master with this MR?

answering @dovombru:

in both cases I used: options.evt_max = 100000
the number of entries differs depending on whether you look at a track variable (trX, trY, trTx, trTy, ...) or a PV variable (without tr.. in the name). So there are ~8 times more tracks, and this corresponds to 30% more vertices than in a scenario with the same reconstruction (Moore TBLV) using default Moore TBLV tracks
I don't see any under- or overflow

I have never used that piece of software, so I might either ask @bokutsen for help or ask him to check that (whichever you would find more convenient, Bogdan).

@spradlin yes, it makes sense to target master. All developments that are no hot-fixes for data-taking any more should target master from now on.

@mgiza thanks for the explanation. But the 30% more PVs are for Moore tracks, not Allen tracks. Which makes sense given the lower efficiency we see for Allen tracks.

I'm also still confused though that the histograms for errx, erry etc. have the same number of entries as the PV histograms. I thought these were the track errors? If not, can you please plot the track covariance matrix?

The 8 times more tracks seems not physical, so we should understand that. Can you please point me to the input file you were using? I would like to check the velo tracking in standalone Allen and check the number of tracks.

Hello Dorothea - I looked at the sum of true and fake vertices, so 705 535 using converted tracks and 508 801 for Moore default tracks (but you are right, there are fewer true PVs for converted tracks than for default ones).

You are right, the covariance elements obtained from errx are created using information from reconstructed vertices, since I used the PVChecker - I will look into covariance information using "all" tracks, so the same information as tX, tY etc. (it wasn't easily accessible)

I am attaching the input file here: mdf_input_and_conds_mc_2024.py (I only checked the first 100k entries)

@bokutsen if you have some time, could you have a look at it?

Hi all, Thank you for the input file @mgiza, yes sure I can take a look at this later today or tomorrow and send you the plots for distributions of the velo states

@mgiza, I ran Allen on mdf_input_and_conds_mc_2024.py. In addition to the default monitoring in https://gitlab.cern.ch/lhcb/Allen/-/blob/master/device/velo/simplified_kalman_filter/src/VeloKalmanFilter.cu, I plotted a histogram for each of the state parameters at the beamline from the Kalman fit: KF_states_Allen_mc_2024.pdf. The ROOT file with the histograms is also attached: KF_states_Allen_mc_2024.root. Let me know if I missed anything or if something else needs to be plotted.

Ah, right, and I was running on only 100 events. Of course, the statistics can be increased if needed.

hi @bokutsen, here you can ask @dovombru

Thanks @bokutsen for plotting the track state for the Allen tracks. We would now need the same for the velo tracks obtained from Rec velo tracking, and from converted Allen tracks.

@mgiza or @spradlin would you be able to check that? @spradlin this could be part of the check for comparing track information before and after the converter you mentioned above.

Maybe @bokutsen can push his additions of monitoring histograms to a branch and give you the exact Allen compilation and command used to run?

In the PV finding, we only use cxx (c00) and cyy (c11). Digging up an older presentation from @mgiza on the Kalman filter initialization, I found the distributions for Rec velo tracks on p. 8.

It looks like the distributions of c00 and c11 of Rec velo tracks reach larger values (around 0.03) than those of Allen velo tracks (around 0.01). Assuming that nothing has changed in the Kalman fit since these plots were made, that can explain the difference when using Allen velo tracks as input for Moore TBLV PV finding.

The different track covariance matrix distributions should be confirmed with the last version of the code. If they persist, we need to follow up on them and understand them. Differences in the cxx and cyy can affect some of the cuts used in TBLV, and the fitter itself.

Sure, I can share the branch with additional monitoring. I was running the Allen algorithms through Moore with:

./Moore/run gaudirun.py Moore/Hlt/Moore/tests/options/mdf_input_and_conds_mc_2024.py Moore/Hlt/RecoConf/options/allen_gaudi_pv.py |& tee MooreLog.log

With this line added to allen_gaudi_pv.py: options.histo_file = 'test_output.root'

Of course, it's also possible to get the monitoring plots from running Allen standalone, or through the testbench, as long as:

The velo tracking is running (VeloKalmanFilter)
options.histo_file is set somewhere

All of the changes for the plots can be found on this branch for Allen and on this branch for Moore.

@bokutsen Are the additional monitoring options in the branches https://gitlab.cern.ch/lhcb/Allen/-/tree/bokutsen_velo_states_monitoring?ref_type=heads and https://gitlab.cern.ch/lhcb/Moore/-/tree/bokutsen_velo_states_monitoring?ref_type=heads included in MRs?

No, though I can create an MR if needed.

added RTA label

added Event model enhancement new feature labels

mentioned in issue Moore#832

mentioned in merge request Moore!3836

changed the description

changed target branch from 2024-patches to master

added 123 commits

6e7f2857...fc847628 - 121 commits from branch master
383d8249 - First draft of PrVelo tracks converter
31368f0a - Fix output container size check

Detailed check of converted tracks

I was able to perform a detailed comparison of the Allen VELO tracks and the converted tracks to my satisfaction. Here I describe the comparison.

This analysis is based on debugging output hacked into a private stack. The hacks are not intended for inclusion in the codebase and do not appear in any pushed version of the projects.

Since one of the key use cases for the conversion from Allen consolidated VELO tracks to LHCb::Pr::Velo::Tracks is a comparison of the performance of the TBLV PV reconstruction algorithms in Allen and Rec, I wanted to ensure that the track information consumed by the Allen PV reconstruction sequence was identical to the track information seen by the Rec TrackBeamLineVertexFinderSoA when running on the output of the converter. To accomplish this, I hacked some debugging output into three functions:

1. Allen velo_kalman_filter::velo_kalman_filter(), whose output goes into the Allen PV reconstruction sequence:
   - Output track hits and metainformation from the input Allen::Views::Velo::Consolidated::Tracks
   - Output beamline state computed by the algorithm

2. The Allen-LHCb converter GaudiAllenVeloToPrVeloTracks::operator() created by this MR:
   - Output track hits and metainformation from the input Allen::Views::Velo::Consolidated::MultiEventTracks
   - Output beamline state information from the input Allen::Views::Physics::KalmanStates

3. Rec TrackBeamLineVertexFinderSoA::operator(), which performs PV reconstruction with the output of the Allen-LHCb converter defined in this MR, GaudiAllenVeloToPrVeloTracks:
   - Output track hits, track metainformation, and beamline states from both of the input LHCb::Pr::Velo::Tracks containers (one container for forward tracks and one container for backward tracks)

Output format

An identical YAML format was used for the new debugging output in each of the algorithms. I used YAML because I have been using it for other things recently and because it is relatively easy to parse out of a logfile and analyze. An example of logfile output from my hacked velo_kalman_filter for a single track:

velo_kalman_filter:
MyTrackRecord ---
reporter: velo_kalman_filter
container: velo_tracks_view
ordinal: 396
backward: true
nhits: 3
hits:
  - 551095031
  - 683246768
  - 748422865
state:
  - 0.00184726715
  - -0.00693941116
  - 11.164587
  - -0.301015288
  - -0.0801150575
covx:
  - 0.00178974541
  - 3.9701139e-05
  - 9.41126928e-07
covy:
  - 0.00178974541
  - 3.9701139e-05
  - 9.41126928e-07
... MyTrackRecord

The output YAML record for the track is everything between MyTrackRecord --- and ... MyTrackRecord (which were included as delimiters to aid parsing).

The output YAML for the corresponding track from GaudiAllenVeloToPrVeloTracks is

GaudiAllenVeloToPrVeloTracks_ad9...   DEBUG
MyTrackRecord ---
reporter: GaudiAllenVeloToPrVeloTracks
container: dev_velo_multi_event_tracks_view
ordinal: 396
backward: true
nhits: 3
hits:
  - 551095031
  - 683246768
  - 748422865
state:
  - 0.00184726715
  - -0.00693941116
  - 11.164587
  - -0.301015288
  - -0.0801150575
covx:
  - 0.00178974541
  - 3.9701139e-05
  - 9.41126928e-07
covy:
  - 0.00178974541
  - 3.9701139e-05
  - 9.41126928e-07
... MyTrackRecord

And the output YAML for the corresponding converted track from TrackBeamLineVertexFinderSoA is

TrackBeamLineVertexFinderSoA_f19...   DEBUG
MyTrackRecord ---
reporter: TrackBeamLineVertexFinderSoA
container: TracksBackwardLocation
ordinal: 143
backward: true
nhits: 3
hits:
  - 551095031
  - 683246768
  - 748422865
state:
  - 0.00184726715
  - -0.00693941116
  - 11.164587
  - -0.301015288
  - -0.0801150575
covx:
  - 0.00178974541
  - 3.9701139e-05
  - 9.41126928e-07
covy:
  - 0.00178974541
  - 3.9701139e-05
  - 9.41126928e-07
... MyTrackRecord
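
As an illustration of how records delimited this way can be pulled out of a logfile, here is a minimal sketch. It is not the script used for this check. It assumes only the flat layout shown above (scalar keys plus simple lists) and uses a small hand-rolled parser to stay within the standard library; with PyYAML installed, passing each delimited block to yaml.safe_load would be the more natural choice. The function names are hypothetical.

```python
def _scalar(text):
    """Convert a YAML-style scalar to bool, int or float, falling back to str."""
    if text in ("true", "false"):
        return text == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_track_records(log_text):
    """Return one dict per block between 'MyTrackRecord ---' and '... MyTrackRecord'."""
    records, current, list_key = [], None, None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line == "MyTrackRecord ---":
            current, list_key = {}, None          # start of a new record
        elif line == "... MyTrackRecord":
            if current is not None:
                records.append(current)           # end of the record
            current = None
        elif current is None or not line:
            continue                              # ordinary log output, ignored
        elif line.startswith("- "):
            if list_key is not None:
                current[list_key].append(_scalar(line[2:].strip()))
        else:
            name, _, value = line.partition(":")
            value = value.strip()
            if value:
                current[name.strip()] = _scalar(value)
                list_key = None
            else:
                list_key = name.strip()           # a list follows on the next lines
                current[list_key] = []
    return records


# Shortened copy of the velo_kalman_filter record shown above.
SAMPLE_LOG = """\
velo_kalman_filter:
MyTrackRecord ---
reporter: velo_kalman_filter
container: velo_tracks_view
ordinal: 396
backward: true
nhits: 3
hits:
  - 551095031
  - 683246768
  - 748422865
state:
  - 0.00184726715
  - -0.00693941116
  - 11.164587
  - -0.301015288
  - -0.0801150575
... MyTrackRecord
"""

if __name__ == "__main__":
    for rec in parse_track_records(SAMPLE_LOG):
        print(rec["reporter"], rec["ordinal"], rec["hits"])
```

Grouping the resulting dicts by their reporter key gives one collection of tracks per algorithm.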

Track associations for single event

I ran a slightly modified version of @mgiza's mooretblv_mc_2024_allen_tracks.py over the first event of mdf_input_and_conds_mc_2024.py and performed detailed comparisons of the YAML track records from the three hacked algorithms.

After parsing all of the YAML records and partitioning them into collections by the reporting algorithm (reporter keyword in my YAML record format), I found that each of the three algorithms reported 421 distinct tracks, as I would hope if the track containers are faithfully converted.

I then made association maps for each pair of collections based on the list of hits in the tracks. If the lists of hits are identical, then the two tracks are associated. My methods allowed for many-to-many associations. Each track of each collection was associated with exactly one track from each of the other collections. Again, a good sign and what I would expect from a faithful conversion.
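
A minimal sketch of a hit-based association of this kind is given below. It is an illustration under stated assumptions, not the code used for the check: tracks are dicts with a "hits" list, the comparison is order-sensitive, and `associate_by_hits` is a hypothetical helper name. The second track in each list is made up for the example.

```python
from collections import defaultdict


def associate_by_hits(tracks_a, tracks_b):
    """Map each index in tracks_a to the indices in tracks_b with an identical hit list.

    The result allows many-to-many associations: a track may match zero,
    one or several tracks of the other collection.
    """
    by_hits = defaultdict(list)
    for j, trk in enumerate(tracks_b):
        by_hits[tuple(trk["hits"])].append(j)
    return {
        i: list(by_hits.get(tuple(trk["hits"]), []))
        for i, trk in enumerate(tracks_a)
    }


# First track: hits from the example records (Allen ordinal 396, Rec ordinal 143).
# Second track: hypothetical hit IDs, for illustration only.
allen_tracks = [
    {"ordinal": 396, "hits": [551095031, 683246768, 748422865]},
    {"ordinal": 7, "hits": [1, 2, 3, 4]},
]
rec_tracks = [
    {"ordinal": 2, "hits": [1, 2, 3, 4]},
    {"ordinal": 143, "hits": [551095031, 683246768, 748422865]},
]

links = associate_by_hits(allen_tracks, rec_tracks)
# A faithful conversion should give exactly one partner per track.
one_to_one = all(len(matches) == 1 for matches in links.values())

if __name__ == "__main__":
    print(links, one_to_one)
```

Running the association in both directions (A to B and B to A) is needed to confirm that the mapping is one-to-one and not merely that every track in A has a partner.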

Test state equivalence for associated tracks

For each pair of associated tracks, I compared the reported states and covariance matrices with the Python/NumPy isclose() function for floating-point comparisons. There were no detected differences!

I also checked that the forward/backward classifications for associated tracks were identical. They were.
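
The comparison described in this section can be sketched as follows. This is a minimal illustration, not the actual script: it uses the standard-library math.isclose as a stand-in for NumPy's isclose, the tolerances are assumptions (the ones used in the check are not stated), and `tracks_equivalent` is a hypothetical helper name.

```python
import math


def tracks_equivalent(rec_a, rec_b, rel_tol=1e-6, abs_tol=1e-12):
    """True if two track records agree in direction flag, state and covariance.

    rel_tol and abs_tol are assumed values, chosen only for this sketch.
    """
    if rec_a["backward"] != rec_b["backward"]:
        return False
    for key in ("state", "covx", "covy"):
        a, b = rec_a[key], rec_b[key]
        if len(a) != len(b):
            return False
        if not all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            for x, y in zip(a, b)
        ):
            return False
    return True


# Values printed in the example records above (Allen ordinal 396, Rec ordinal 143).
allen_track = {
    "backward": True,
    "state": [0.00184726715, -0.00693941116, 11.164587, -0.301015288, -0.0801150575],
    "covx": [0.00178974541, 3.9701139e-05, 9.41126928e-07],
    "covy": [0.00178974541, 3.9701139e-05, 9.41126928e-07],
}
rec_track = dict(allen_track)  # the converted track printed the same values

# Hypothetical record with a modified x covariance, to show a failing comparison.
shifted_track = dict(allen_track, covx=[0.03, 3.9701139e-05, 9.41126928e-07])

if __name__ == "__main__":
    print(tracks_equivalent(allen_track, rec_track))      # identical values
    print(tracks_equivalent(allen_track, shifted_track))  # differing covariance
```

Because the debugging output is printed with finite precision, a comparison like this tests agreement of the printed values, not bitwise equality of the in-memory floats.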

Conclusion

I was able to perform a very detailed before-and-after comparison for 421 tracks in a single event. The new converter seems to do exactly what was intended!

That is awesome @spradlin ! Really huge thank you for this detailed debugging!

added 75 commits

31368f0a...6c96b00a - 70 commits from branch master
a3a2e398 - First draft of PrVelo tracks converter
eecc6e53 - Fix output container size check; Add counters; Update state transfer code.
1fd49885 - Additional velo states monitoring
204c2634 - Updated the binning of the additional monitoring histograms to match that of...
50da03f5 - Updated names and titles of additional monitoring histograms

added 1 commit

ea753b5a - Better treatment of EndVelo state for VeloForward tracks.

added 11 commits

ea753b5a...21c9c6c9 - 4 commits from branch master
f38ccc86 - First draft of PrVelo tracks converter
eb1dd4e5 - Fix output container size check; Add counters; Update state transfer code.
5c87cb92 - Additional velo states monitoring
d654a4f2 - Updated the binning of the additional monitoring histograms to match that of...
61722858 - Updated names and titles of additional monitoring histograms
26a2e1c2 - Better treatment of EndVelo state for VeloForward tracks.
2542174a - Simplified state covariance transfer
