One difference between v1 and v2 Particles that I've noticed: in Particle::v1, the particleID() method returns LHCb::ParticleID&, while in Particle::v2, the pid() accessor returns int. I suspect that dealing with LHCb::ParticleID is more correct (e.g. it overloads == operator such that one can check ID==211 or ID=='pi+'). I'm not sure how to best deal with that in v2. Should we leave std::vector<int> in the ParticleIDs container and then somehow convert it to the vector of LHCb::ParticleID on-the-fly?
Designs
Child items
0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
The SOACollection (assuming this is an SOACollection) only has two possible types: float and int. The correct thing to do I think would be to create a strong type that has int as internal representation.
fix the particle v2 accessor, i.e. make it return the 'right' type instead of int... the fact that the 'back end' happens to store an int is an internal detail, and the rest of the world shouldn't have to know that. (and please add a static_assert that verifies that the type is std::trivially_copyable so that one can indeed just copy its memory representation to/from eg. an int).
Thanks @decianm, @graven. Finally coming back to this. Implementing ID and ABSID returning int was indeed trivial. I.e. I can now do things like F.ID == 211 and it even passes qmtests. However, looking closer to the code I'm not sure whether we need to change the return value to LHCb::ParticleID and if yes, how to deal with it:
I suspect to make python configuration to deal with LHCb::ParticleID will require some changes at the python level. And I'm not sure this is actually better than dealing with int. E.g. to be able to write LoKi style in python F.ABSID == "pi+" will probably require some nontrivial changes to https://gitlab.cern.ch/lhcb/Rec/-/blob/master/Phys/FunctorCore/python/Functors/grammar.py, am I right?
Alternatively, to allow for string representations of ID, one can foresee a helper function, e.g. F.ID == int_id('pi+') which returns integer ID given a string ID. In that case we won't need to deal with LHCb::ParticleID at all.
Finally, if for some reason we decide to go ahead with converting to LHCb::ParticleID instead of using int, overloading operators etc. If I understand correctly, the Particle::v2 accessor should return std::vector<LHCb::ParticleID> instead of std::vector<int> that is stored in SOACollection. Since LHCb::ParticleID is nothing but additional interface over simple int, it is conceptually straightforward to convert one into another, but is it possible w/o recreating the std::vector? E.g. if LHCb::ParticleID was inherited from int one could probably have done some cast, but in reality it is a class that has int as the (only) private member. Is there a machinery in C++ to allow such a conversion? Is it what you mean @graven when you mention std::trivially_copyable? Sorry for possibly a dumb question, but I'm only starting to discover the depths of C++17.
let's start with the last point -- I wouldn't create an std::vector<LHCb::ParticleID> in the first place... what you want is to provide a 'facade' (aka 'proxy') on top of some storage which stores some range of ints (potentially as SIMD types), but never lets the storage 'leak' out into the rest of the code, and instead provides a 'vector-like' interface which pretends that the content are LHCb::ParticleID -- see eg here for such an example. Note that this works (efficiently) because the data which is stored and which are presented can be trivially converted between one another, and it is 'small'.
Then the middle point -- your int_id is basically just a means of creating an ParticleID from a string -- i.e. it is (one of the) ParticleID constructors... Given that this F.ID == "pi+" only happens once during construction, there is nothing wrong with taking the rhs of the == and doing a one-time conversion to some internal representation, i.e. the == should take the right hand side, on try to construct an object of the type on the lhs from whatever it gets. So as long as you can create an LHCb::ParticleID from a string, this would be fine.
Finally, the first point: yes, it will be a bit of work to support F.ABSID == "pi+" but I'm sure it will be worthwhile, as that is a one-time effort that will avoid lots of people from having to look up the PDG documentation again and again - so surely integrated over the lifetime of this code, it will save time, effort and mistakes (eg. I wouldn't be able to spot it if someone used 3122 instead of 3212, but if it said 'Lambda' and 'Sigma0' I would have a better chance of realizing whether its used correctly). So yes, it's a (one time) investment, but lots of people will be (or should be!) grateful (albeit they will probably never explicitly acknowledge it).
Thanks @graven. My 1st point did not actually mean I want to drop the ID == 'pi+' functionality, but rather that one could think of using the int representation instead of ParticleID internally, and delegate the string-to-int conversion to the python part, or explicitly to the RHS part of the equality (2nd point). But your answer to the 3rd bullet clarifies things, I think. I'll need to digest it (I understand what you mean conceptually, but need to see how this applies to the actual ThOr code, and whether this needs any changes on the python side).
l164 in the ParticleIDs structure, which returns a scalar. Fine, that would be easy to fix.
l180 in ParticleIDProxy. As I understand, Proxy is a SIMD version of a parent object, and looking at the load_vector implementation, it can only deal with either int or float. So this should probably be left untouched.
So at this point I'm not sure where the ParticleID interface should replace int such that it will be available to the python world.
Another point is that in fact ID=="pi+" functionality is not provided by the ParticleID, and one would need to use ParticlePropertySvc to convert string to int representation.
Another point is that in fact ID=="pi+" functionality is not provided by the ParticleID, and one would need to use ParticlePropertySvc to convert string to int representation.
The point is that overloading ParticleID's "==" operator is probably not a good idea because the string-to-int (or string-to-ParticleID) transformation would need to be done for every element of the container. So it would be better to make this conversion once, and I'm not sure where it should be placed.
What could be done more easily is e.g. the functor like ISID("pi+") that returns true/false, I could follow the example of MASSWITHHYPOTHESIS which uses similar idea. But that would introduce another ThOr-LoKi difference in syntax.
One doesn't exclude the other: don't forget that ID=="pi+" is not actually doing any work -- it creates an object which represents this operation, and that instance is subsequently called (again and again) with 'some type' and returns bool. So what is required is make sure that ID==somestring is (internally) represented as (or translated into) ISID("pi+"). So from that point of view, the problem can be split in two parts: first implement ISID(somestring) and once that works, figure out how to 'map' ID==somestring onto ISID(some string) -- i.e. ID==somestring is just 'syntactic sugar' which is effectively just another way to 'spell out' ISID(some string).
Any ideas how to debug the pipeline that is failing for MR !2575 (merged) (https://gitlab.cern.ch/lhcb/Rec/-/jobs/16644679) ? The problem seems to be with docstring generation, and the reason is that I had to add support for multiple argument types in grammar.py which are now specified in a tuple, see
!2575 (diffs)
(this is needed to be able to pass either integer or string particle ID to the IS(ABS)ID functors).
It is possible to reproduce this pipeline in a terminal?
Coming back to this MR, once the "check-formatting" pipeline is fixed. So what is implemented currently is the syntax like:
Functions returning integer ID: ID == 321, ABSID == 321
Predicates checking integer or symbolic ID: ISID('K+'), ISABSID(321)
I still don't get how to implement the syntax like ID == 'K+', because operator== in this case is handled by python, not by C++. I suspect one would need to modify the Python part in grammar.py in order for it to work, or am I misunderstanding something?
I would only support ISID('K+') and forget (for the time being) the pythonic syntactic sugar of writing ID == 'K+' instead...
(as I would not want any 321 to appear at the python level -- just not legible enough -- i.e. I would even drop ISID(321))
Answering to myself: looking closer at grammar.py it seems like pythonic __eq__ is delegated back to operator== in C++, so it seems that defining the friend bool operator==() might help.
I left the minimal functionality in !2575 (merged): only ISID and ISABSID functors which accept string PID (the C++ implementation accepts both integer and string, but the python declaration only accepts string).
TODO: find out how to overload operator== for whatever object Particle::pid() returns to allow ID == "K+". Still not obvious to me.