Draft: Quantization Aware Training with Brevitas (for FPGA deployment) and Model Pruning

Sebastian Dittmeier requested to merge sachin_qat into dev

This branch still needs some cleanup (for example, of the YAML files), but I'd like to start the discussion on whether and how this can find its way into the CommonFramework. Features added, most of them tested with metric learning:

  • Quantization Aware Training with Brevitas for Metric Learning (implementation also started for the Interaction Network, but not yet fully tested); includes optional quantization of the input data
  • Iterative model pruning, which can take place after a fixed number of epochs, depending on validation_loss; rewinding of the learning rate can be enabled; an optional L1 loss can be included in the train loss
  • At the end of each epoch, the purity at a fixed efficiency of 98% is evaluated (metric learning), and the number of Bit Operations (BOPs) is calculated after the QONNX export; this is used for model size comparisons during parameter sweeps
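
To make the per-epoch metric concrete, here is a minimal sketch of "purity at a fixed efficiency". It is not the code in this branch: the function name `purity_at_fixed_efficiency` and the flat list of candidate-pair distances are assumptions for illustration. It assumes that efficiency is the fraction of true pairs that pass a distance cut in the embedding space, and that purity is the fraction of selected pairs that are true.

```python
import math


def purity_at_fixed_efficiency(distances, is_true, target_eff=0.98):
    """Hypothetical helper: purity at the loosest cut needed to reach target_eff.

    distances: embedding-space distance of each candidate pair
    is_true:   True where the pair is a true edge
    Returns (purity, threshold).
    """
    true_d = sorted(d for d, t in zip(distances, is_true) if t)
    if not true_d:
        raise ValueError("need at least one true pair")

    # Smallest number of true pairs that must be kept to reach the target.
    n_needed = max(math.ceil(target_eff * len(true_d)), 1)
    # Distance of the n_needed-th closest true pair defines the cut.
    threshold = true_d[n_needed - 1]

    n_selected = sum(1 for d in distances if d <= threshold)
    n_true_selected = sum(
        1 for d, t in zip(distances, is_true) if t and d <= threshold
    )
    return n_true_selected / n_selected, threshold


# Toy example: 4 true pairs, 2 fake pairs, efficiency target of 75 %.
dists = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
truth = [True, True, False, True, False, True]
purity, cut = purity_at_fixed_efficiency(dists, truth, target_eff=0.75)
```

In the toy example the cut lands at the third-closest true pair, so one fake pair is selected along with three true ones.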

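For the BOPs number, the branch takes the count from the QONNX export. As a rough cross-check, the sketch below uses an analytic estimate for a single fully connected layer that appears in the quantization-aware pruning literature; whether it matches the exported count exactly is an assumption. The function name `dense_bops` and its parameters are hypothetical.

```python
import math


def dense_bops(n_in, n_out, bits_act, bits_weight, pruned_fraction=0.0):
    """Hypothetical helper: estimated BOPs of one fully connected layer.

    Estimate: n_out * n_in * ((1 - f_p) * b_a * b_w + b_a + b_w + log2(n_in)),
    where f_p is the fraction of weights pruned to zero.
    """
    # Cost of the multiplications, reduced by the pruned weights.
    mult_term = (1.0 - pruned_fraction) * bits_act * bits_weight
    # Cost of accumulating the products.
    accum_term = bits_act + bits_weight + math.log2(n_in)
    return n_out * n_in * (mult_term + accum_term)


# Same toy layer at 8-bit precision, unpruned and with half the weights pruned.
full = dense_bops(n_in=4, n_out=2, bits_act=8, bits_weight=8)
half = dense_bops(n_in=4, n_out=2, bits_act=8, bits_weight=8, pruned_fraction=0.5)
```

Summing such a per-layer estimate over the network gives one number that reflects both bit width and sparsity, which is what makes it usable as a model size axis in a parameter sweep.
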
I'm open to suggestions on whether code pieces should be moved, implemented differently, etc.
