Improve jet loading times
Because we apply compression to the jet dataset, we have to load all variables, even though most use cases only need a couple at a time.
In a test I did a while ago, I compared loading just jet pt for 500k jets from a standard dump and from a dump with all SV1, JF and SoftMuon variables removed. The loading time went from around 3 seconds to 0.6 seconds.
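For anyone wanting to repeat this kind of measurement, here is a minimal sketch of timing a single-variable read with `h5py`. It is not the script used for the numbers above. The dataset name `jets`, the variable names in `jet_dtype` and the helpers `make_jets` and `time_field_read` are assumptions made for illustration, and the toy file is far smaller than a real dump.

```python
import os
import tempfile
import time

import h5py
import numpy as np

# Hypothetical jet variables; a real dump has many more fields.
jet_dtype = np.dtype(
    [("pt", "f4"), ("eta", "f4"), ("SV1_masssvx", "f4"), ("JetFitter_mass", "f4")]
)


def make_jets(n_jets, seed=0):
    """Build a toy structured array standing in for the jet dataset."""
    rng = np.random.default_rng(seed)
    jets = np.zeros(n_jets, dtype=jet_dtype)
    for name in jet_dtype.names:
        jets[name] = rng.normal(size=n_jets).astype("f4")
    return jets


def time_field_read(path, field, dataset="jets"):
    """Return (values, seconds) for reading one field of a compound dataset."""
    start = time.perf_counter()
    with h5py.File(path, "r") as f:
        # Dataset.fields() selects a subset of compound fields (h5py >= 3.0).
        values = f[dataset].fields(field)[:]
    return values, time.perf_counter() - start


jets = make_jets(10_000)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "jets.h5")
    with h5py.File(path, "w") as f:
        # Compressed, as in the standard dump described above.
        f.create_dataset("jets", data=jets, compression="gzip")
    pt, seconds = time_field_read(path, "pt")

print(f"read {len(pt)} pt values in {seconds:.4f} s")
```

Running the same helper on a file with fewer variables, or one written without compression, gives a like-for-like comparison.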
There are two ways of improving this:
- Disable compression for the jet dataset so that we can read only specific variables. Hopefully this wouldn't increase file sizes too much, since jets are not the bulk of the dumped files (tracks are) and the jet dataset has no padding (padding is what makes compression essential for the track-like groups).
- Reduce the number of jet variables we save. One possible way of doing this would be to split the configs further: a "training" config which includes minimal jet info and the tracks, and a "full jet" config which contains all the additional jet variables we currently save, but doesn't save any tracks.
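Both options can be tried out on a toy file before touching the dumper. The sketch below is only an illustration under assumed names: `TRAINING_VARS`, `slim_jets` and `write_jets` are hypothetical and not part of any existing config or code, and the variable list is a placeholder. It writes the same jets compressed, uncompressed and slimmed, so the resulting file sizes can be compared directly.

```python
import os
import tempfile

import h5py
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical variable lists; the real configs would define these.
ALL_VARS = ["pt", "eta", "SV1_masssvx", "JetFitter_mass"]
TRAINING_VARS = ["pt", "eta"]


def slim_jets(jets, keep):
    """Return a packed copy of the structured array with only `keep` fields."""
    return rfn.repack_fields(jets[keep])


def write_jets(path, jets, compression=None):
    """Write jets to `path` and return the resulting file size in bytes."""
    with h5py.File(path, "w") as f:
        f.create_dataset("jets", data=jets, compression=compression)
    return os.path.getsize(path)


# Toy jets with random values standing in for a real dump.
rng = np.random.default_rng(0)
jets = np.zeros(10_000, dtype=[(name, "f4") for name in ALL_VARS])
for name in ALL_VARS:
    jets[name] = rng.normal(size=len(jets)).astype("f4")

slim = slim_jets(jets, TRAINING_VARS)

with tempfile.TemporaryDirectory() as tmp:
    sizes = {
        "compressed": write_jets(os.path.join(tmp, "c.h5"), jets, "gzip"),
        "uncompressed": write_jets(os.path.join(tmp, "u.h5"), jets, None),
        "slim": write_jets(os.path.join(tmp, "s.h5"), slim, None),
    }

for label, size in sizes.items():
    print(f"{label}: {size} bytes")
```

Random values compress poorly, so the size difference seen here says little about real dumps; the same comparison on an actual file would show how much disabling compression costs.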