Feature/tuples as column names

Mykhailo Dalchenko requested to merge feature/tuples_as_column_names into develop

Description

This MR contains two commits; we have to decide which one to follow:

  • the first with a partial implementation of column-name tuplization: ('FIELD NAME', slot, link, position)
  • the second extending the tuple with the block id and sub block: ('FIELD NAME', 'BLOCK ID', 'SUB BLOCK', slot, link, position)

Since the performance drop is not significant (and in any case it is an improvement over the f-strings), I'd go for the second option, as it would greatly simplify the analysis. The end goal for analysis is to convert the columns of the pandas.DataFrame returned by the unpacker to a pandas.MultiIndex, which can then easily be sliced or cross-sectioned to select the required subset of data (e.g. all VFATs, or all VFATs from AMC N GEB M, etc.).
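A minimal sketch of the analysis workflow described above, assuming the extended six-element tuples. The field names, the values, and the level names are hypothetical and not taken from the unpacker; only the tuple layout follows this MR. It also assumes that slot and link correspond to the AMC and GEB numbers.

```python
import pandas as pd

# Hypothetical column keys in the extended layout:
# (field name, block id, sub block, slot, link, position)
columns = [
    ("CHANNEL DATA", "PAYLOAD", "VFAT", 1, 0, 0),
    ("CHANNEL DATA", "PAYLOAD", "VFAT", 1, 0, 1),
    ("CHANNEL DATA", "PAYLOAD", "VFAT", 2, 3, 0),
    ("BX ID", "HEADER", "AMC", 1, 0, 0),
]

# Stand-in for the DataFrame returned by the unpacker (two dummy events).
df = pd.DataFrame({col: [i, i + 10] for i, col in enumerate(columns)})

# Turn the tuple column keys into a named MultiIndex.
level_names = ["field", "block_id", "sub_block", "slot", "link", "position"]
df.columns = pd.MultiIndex.from_tuples(list(df.columns), names=level_names)

# Sorting the columns keeps later selections well-defined and fast.
df = df.sort_index(axis=1)

# Cross-section: all VFAT columns.
vfats = df.xs("VFAT", axis=1, level="sub_block")

# Cross-section: all VFAT columns from slot 1, link 0.
amc1_geb0 = df.xs(
    ("VFAT", 1, 0), axis=1, level=["sub_block", "slot", "link"]
)
```

With the shorter four-element tuple the same approach works, but the block and sub block would have to be recovered from the field name before a selection like "all VFATs" is possible.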

Note: the documentation will require an update. I do not include it here for two reasons:

  • First we need to decide on the tuple complexity
  • There will be a separate MR on the documentation build

Related Issue

Closes #11 (closed)

How Has This Been Tested?

Tested by unpacking a test dataset. Timing on a 13" MBP (Fall 2014):

python -m timeit -n 1 "from gdh import unpacker; unpacker.run('./doc/notebooks/test_data/test_sdram.dat','sdram')"
1 loop, best of 5: 3.39 sec per loop

For comparison, the previous result was 3.9 s.
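To repeat the comparison from a script rather than the command line, the same best-of-5 measurement can be sketched with the standard timeit module. The helper name best_of is hypothetical; the commented call mirrors the command above and assumes gdh is installed and the test file is present.

```python
import timeit


def best_of(func, repeat=5):
    """Return the fastest of `repeat` single-call timings, in seconds."""
    # number=1 matches `-n 1`: each timing is one call of func.
    return min(timeit.repeat(func, number=1, repeat=repeat))


# Intended usage (not run here):
# from gdh import unpacker
# best_of(lambda: unpacker.run('./doc/notebooks/test_data/test_sdram.dat', 'sdram'))
```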

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.