One could consider this a duplicate of #106; it depends on how we view ProbNN.
The DLLs are effectively created as part of the reconstruction and so we can unconditionally include them in the ('proto-')'particle' definition. I'd say ProbNNs are so ubiquitous now that we could do the same for them.
However ProbNNs have a history of different tunings which the DLLs do not. If one wishes to replace the ProbNNs held by a 'particle' then the approach in #106 is more flexible (you can create a new particle by dropping the existing extra-info column and adding a new column with some specific ProbNN values).
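To make the column-replacement idea concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and for illustration only: `ParticleTable`, `drop`, `with_column` and the column name `probnn_p` are invented names, not the actual event model or whatever #106 ends up providing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticleTable:
    """Hypothetical column-wise particle container (not the real event model)."""

    columns: dict  # column name -> list of per-particle values

    def drop(self, name):
        """Return a new table without the named column."""
        return ParticleTable({k: v for k, v in self.columns.items() if k != name})

    def with_column(self, name, values):
        """Return a new table with an extra column of matching length."""
        values = list(values)
        for existing in self.columns.values():
            if len(existing) != len(values):
                raise ValueError("new column length does not match the table")
        return ParticleTable({**self.columns, name: values})


# Original particles carry the ProbNN values produced by the reconstruction.
original = ParticleTable({"pt": [1.2, 3.4], "probnn_p": [0.10, 0.90]})

# A 'new particle' is built by dropping that column and adding a retuned one.
retuned = original.drop("probnn_p").with_column("probnn_p", [0.15, 0.85])
```

The original table is left untouched, so both tunings can coexist as separate 'particle' definitions if needed.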
A third option is a hybrid: define 'ProbNN' as the output of 'the reconstruction', i.e. the thing we run in HLT2, and then any other quantities are extra info by definition, which will include later ProbNN tunings. The downside with this approach is that the analyst and analysis tools will then have to always reference the 'new' ProbNNs when they want them rather than just doing particle.probnn_proton() or whatever.
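A rough sketch of what the hybrid would look like from the analyst's side, again in plain Python with made-up names: the `Particle` class, the `extra_info` dictionary and the key `ProbNNp_retuned` are all assumptions for illustration, and only `probnn_proton()` is taken from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Particle:
    """Hypothetical particle: the reconstruction ProbNN is a fixed member."""

    reco_probnn_proton: float  # value produced by 'the reconstruction'
    extra_info: dict = field(default_factory=dict)  # anything added later

    def probnn_proton(self):
        """Always returns the reconstruction-time ProbNN value."""
        return self.reco_probnn_proton


p = Particle(reco_probnn_proton=0.80)

# A later tuning lives in extra info under an explicit (hypothetical) key.
p.extra_info["ProbNNp_retuned"] = 0.65

default_value = p.probnn_proton()               # the 'special' default
retuned_value = p.extra_info["ProbNNp_retuned"]  # must be requested by name
```

This shows the downside directly: the accessor never changes meaning, but every use of a newer tuning has to spell out which one it wants.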
That is the most flexible way, yes. It is also more cumbersome though, so if we think some ProbNN tuning is always going to be available after the reconstruction we could consider making it a 'permanent' member of the Charged{Basic,Neutral} objects.
We might be able to do this with ProbNN but we cannot rule out other PID classifiers coming along in the future, so I am afraid whatever we do we need to support things which exist in only part of the dataset.
I am debating whether the 'default' ProbNN is treated in this way or if we make it special in the same way we do for e.g. the DLLs.
My guess is that we would be happy making the output of the 'default' ProbNN tuning a property of the particle, i.e. assuming that it will always be run as part of 'the reconstruction'. If folks want to 'replace' it with other tunings we can consider either rebuilding the whole particle or adding specific ExtraInfo columns (once we know how to do that).
Right. That's a reasonable and pragmatic way forward. The other approach would be to prioritise sorting out #106 and get ProbNN support as a byproduct. I can see pros and cons either way.
prioritise sorting out #106 and get ProbNN support as a byproduct
as that solves a structural problem, and not just this one 'incidental' case.
At some point, the structural problem has to be addressed anyway, so ignoring it will just waste effort and generate 'hacks' which in turn will make things more complicated and painful to maintain and evolve.