# Efficient inference: Model compression and hls4ml
In this part, we will start from the data and model you trained in Part 1. We will train a new model in a quantization-aware way using QKeras, compare its performance to the model from Part 1, and build the FPGA firmware for this quantized, sparse model using hls4ml.
We assume that you have already participated in Part 1, but if you have not, you can copy the necessary files from