Possible unpacker speed up
Summary
Profiling the unpacker (see the relevant logs section) shows that the most time-consuming operation is geo-tagging the VFAT block results: update_key in vfat.py accounts for 0.848 s of internal time out of 4.620 s total. It is currently done with the following function:
def update_key(self, key):
    return f'{key[:5]}{self.slot}:{self.link}:{self.pos}{key[10:]}'
Here a key is a string such as VFAT:$:!:#:POS, in which the placeholder segment $:!:# (characters 5 to 9) is replaced by the slot, link, and position.
One possible improvement is to use tuples instead of strings:
(slot, link, position, 'FIELD NAME')
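To make the comparison concrete, here is a minimal sketch of both keying schemes side by side. The class name VfatBlock, the geo attribute, and update_key_tuple are hypothetical names for illustration only; the real class in vfat.py may be structured differently. Whether the tuple variant is actually faster should be measured on the real unpacker rather than assumed.

```python
import timeit


class VfatBlock:
    """Hypothetical stand-in for the real VFAT block class."""

    def __init__(self, slot, link, pos):
        self.slot = slot
        self.link = link
        self.pos = pos
        # Assumption: the geographic prefix is fixed per block,
        # so it can be built once instead of once per key.
        self.geo = (slot, link, pos)

    def update_key(self, key):
        # Current approach from the issue: rebuild the string for every key.
        return f'{key[:5]}{self.slot}:{self.link}:{self.pos}{key[10:]}'

    def update_key_tuple(self, field):
        # Proposed approach: append the field name to the cached prefix.
        return self.geo + (field,)


block = VfatBlock(slot=1, link=2, pos=3)
print(block.update_key('VFAT:$:!:#:POS'))   # VFAT:1:2:3:POS
print(block.update_key_tuple('POS'))        # (1, 2, 3, 'POS')

# Rough timing of both variants; results depend on the machine.
n = 100_000
t_str = timeit.timeit(lambda: block.update_key('VFAT:$:!:#:POS'), number=n)
t_tup = timeit.timeit(lambda: block.update_key_tuple('POS'), number=n)
print(f'string keys: {t_str:.3f} s, tuple keys: {t_tup:.3f} s')
```

Note that in this sketch the field templates would shrink from 'VFAT:$:!:#:POS' to just 'POS', so the block definitions feeding update_key would need to change as well.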
We need to consider how this affects the selection of data frame columns in further analysis (although the column renaming can be mapped and applied once to the resulting data frame). Other suggestions are very welcome.
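As a rough illustration of the column-selection question, the sketch below builds a data frame from tuple-keyed records, selects columns through a MultiIndex, and shows the one-time rename back to flat string names. The sample values, the field name BC, and the level names are made up for the example and are not taken from the unpacker.

```python
import pandas as pd

# Hypothetical unpacked events keyed by (slot, link, position, field).
records = [
    {(1, 2, 3, 'POS'): 5, (1, 2, 3, 'BC'): 100, (1, 2, 4, 'POS'): 6},
    {(1, 2, 3, 'POS'): 7, (1, 2, 3, 'BC'): 101, (1, 2, 4, 'POS'): 8},
]

df = pd.DataFrame(records)
# Make the column index an explicit, named MultiIndex and sort it
# so that label-based lookups behave predictably.
df.columns = pd.MultiIndex.from_tuples(
    list(df.columns), names=['slot', 'link', 'pos', 'field']
)
df = df.sort_index(axis=1)

# Select one fully specified column.
one_column = df[(1, 2, 3, 'POS')]

# Select a given field across all slots, links, and positions.
pos_only = df.xs('POS', axis=1, level='field')

# One-time rename to the flat string names used today.
flat = df.copy()
flat.columns = [f'VFAT:{s}:{l}:{p}:{f}' for s, l, p, f in df.columns]

print(one_column.tolist())
print(pos_only.shape)
print(list(flat.columns))
```

With a MultiIndex, selecting by field or by position becomes a level lookup instead of string matching on column names, while the flat rename keeps existing analysis code working if that is preferred.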
Relevant logs and/or screenshots
3392054 function calls (3384732 primitive calls) in 4.620 seconds
Ordered by: internal time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1005048    0.848    0.000    0.848    0.000 vfat.py:47(update_key)
      209    0.511    0.002    0.513    0.002 {pandas._libs.lib.maybe_convert_objects}
   232650    0.454    0.000    0.828    0.000 generic_block.py:17(unpack_word)
   111672    0.325    0.000    1.173    0.000 vfat.py:54(<dictcomp>)
   214641    0.304    0.000    0.304    0.000 {method 'update' of 'dict' objects}
   232650    0.195    0.000    0.195    0.000 {method 'unpack' of '_cbitstruct.CompiledFormatDict' objects}
     9306    0.163    0.000    2.291    0.000 geb.py:47(unpack)
   223344    0.160    0.000    0.160    0.000 geb.py:44(update_key)
   111672    0.141    0.000    1.726    0.000 vfat.py:50(unpack)
     9306    0.137    0.000    2.870    0.000 amc.py:91(unpack)
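For anyone who wants to re-run the measurement after trying a change, a report in the same layout can be produced with the standard cProfile and pstats modules. This is only a sketch: profile_top and the work function are hypothetical helpers, and the real entry point of the unpacker would be passed in place of work.

```python
import cProfile
import io
import pstats


def profile_top(func, *args, limit=10, **kwargs):
    """Run func under cProfile and return (result, report text).

    The report is sorted by internal time (tottime), matching the
    "Ordered by: internal time" layout shown in the log above.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('tottime').print_stats(limit)
    return result, stream.getvalue()


def work(n):
    # Placeholder workload standing in for the real unpack call.
    return sum(i * i for i in range(n))


result, report = profile_top(work, 10_000)
print(report)
```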