The problem turned out to be in GeoModelRead, in the caching of the volumes that were built.
The copyNumber was used to create a cache key, together with tableID and volumeID.
But the string tableID-volID-copyN was identifying each volume uniquely... so, shared volumes were created as unique instances when restoring the GeoModel tree from file.
I fixed that by removing the copyNumber from the key used for the cache. In that way, all volumes with the same tableID and volID, but with different copyNumber, share the same Volume instance.
Designs
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related or that one is blocking others.
Learn more.
@dellacqu@boudreau and @mbandier , following our discussion on this yesterday, this is the original ticket for the duplicated-volumes-issue.
The issue turned out to be caused by poor implementation of the cache algorithm by myself.
Basically, in GeoModelRead there is a cache mechanism taking notes of the volumes that were already built to re-use PhysVols instances when volumes are shared. But that caching mechanism was using the copy number as part of the label to identify the PhysVol instances. And that is wrong because the copy number identifies a single, unique PhysVol. Therefore, all shared PhysVols were seen as "unique".
As I recalled yesterday, I had fixed that in the MR !131 (merged) by removing the copy number from the label used to identify volumes in the cache.
But unfortunately, while the MR works very well on macOS, Joe found out that the same fix (which is a simple fix) did not work on Ubuntu because GeoModelRead ended up in an infinite loop somehow...
So, we ended up reverting the change.
Now, I'm trying to track down the problem and why a very simple change like this can cause the code to work fine on one platform and not at all on another one.
After much investigation we finally realised that we had two main issues:
In GeoModelRead, there was a cache to keep track of the nodes that had been restored; the cache was in place to correctly reuse the nodes when they were "shared". The problem was that the cache itself kept track of the number of copies of a given PhysVol node, and that copy-number was used for the cache key. So, in the end, all shared PhysVol nodes were restored as unique node. The issue didn't introduce any error in the GeoModel tree structure itself, the output geometry was correct (the GeoModel tree was, de facto, "flattened", without shared nodes), but the number of nodes and the resulting memory footprint were larger. This has been fixed in !131 (merged) .
After fixing point 1., the number of restored GeoPhys nodes was correct, but the 'read' operation entered in a very long loop that, on given machines, ended up with a crash because they went out of memory. After much investigation, we realised that, while fixing the cache issue, removing the copy number from the cache key introduced a problem in the traversal of the GeoModel tree. When writing back a restored GeoModel tree, the Write action, in fact, visited the shared nodes more than once, creating many more connections than needed. This happened only with the "leaves", the child volumes that did not have other child volumes. And that made the output SQLite file have a larger (HUGE!) number of connections between child volumes, which ended up in "out-of-memory" crashes on some platforms. This has been fixed by introducing a new cache to keep track of the visited volumes while traversing the tree for saving the child volumes. This has been fixed in !174 (merged).
Now geometries which are restored from an SQLite file have the same physical volume tree as when they were saved.
However, we need further work to assure that shared instances of IdTags,, Transforms, Alignable Transforms, and NameTags be shared after a save/restore cycle. This is addressed in #60 .