Commit 0c28ec79 authored by ssummers

Add the final ROC evaluation

parent 7dbff527
%% Cell type:markdown id:3e175280 tags:
# CMSSW Emulator
In this exercise you will be guided through the steps to create, compile, and run the emulator of the hls4ml model you trained in part 2. Execute the code in these steps from the command line on `lxplus` after running `source setup.sh` from the `cms_mlatl1t_tutorial` repository.
When developing your own hls4ml NN emulators, you should compile and run your model emulator locally before delivering it to `cms-hls4ml`.
**Note:** you need to run the steps described below in the terminal before going through the cells in this notebook!
## Prerequisite
You will need the HLS project for the model from part 2.
## 1.
Copy the NN-specific part of the hls4ml project to the `cms-hls4ml` repo. We _don't_ copy `ap_types` since we'll reference them from the externals.
```shell
mkdir -p $MLATL1T_DIR/part3/cms-hls4ml/L1TMLDemo/L1TMLDemo_v1/NN
cp -r $MLATL1T_DIR/part2/L1TMLDemo_v1/firmware/{*.h,*.cpp,weights,nnet_utils} $MLATL1T_DIR/part3/cms-hls4ml/L1TMLDemo/L1TMLDemo_v1/NN
```
## 2.
As of `hls4ml` `0.8.1`, when run outside of Vivado HLS, the C++ code loads the weights from txt files. We need to force compilation of the weights from the header files instead.
This one-liner replaces the `#ifndef __SYNTHESIS__` guard, which causes the weights to be loaded from txt files, with `#ifdef __HLS4ML_LOAD_TXT_WEIGHTS__`, so that the weights are compiled in from the header files.
If you don't do this, you will see a runtime error like `ERROR: file w2.txt does not exist` when you `cmsRun`.
```shell
find $MLATL1T_DIR/part3/cms-hls4ml/L1TMLDemo/L1TMLDemo_v1/NN \( -type d -name .git -prune \) -o -type f -print0 | xargs -0 sed -i 's/#ifndef __SYNTHESIS__/#ifdef __HLS4ML_LOAD_TXT_WEIGHTS__/'
```
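To make the effect of the `sed` one-liner easier to inspect, here is a minimal Python sketch of the same substitution. The helper names `patch_weight_guard` and `patch_tree` are hypothetical and not part of the tutorial. Unlike `sed` without the `g` flag, `str.replace` swaps every occurrence on a line, which should only differ if the guard appears more than once on a single line.

```python
import os

OLD_GUARD = "#ifndef __SYNTHESIS__"
NEW_GUARD = "#ifdef __HLS4ML_LOAD_TXT_WEIGHTS__"


def patch_weight_guard(text):
    """Return text with the weight-loading guard swapped (hypothetical helper)."""
    return text.replace(OLD_GUARD, NEW_GUARD)


def patch_tree(root):
    """Patch every file under root in place, skipping .git directories.

    Returns the list of paths that were actually modified.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git so os.walk never descends into it, like `-name .git -prune`.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # surrogateescape and newline="" round-trip arbitrary bytes and line endings.
            with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
                original = f.read()
            patched = patch_weight_guard(original)
            if patched != original:
                with open(path, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
                    f.write(patched)
                changed.append(path)
    return changed
```

Calling `patch_tree` on the `NN` directory from step 1 and printing the returned list shows which files contained the guard. Running it a second time should return an empty list.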
## 3.
`make` the hls4ml emulator interface shared object
```shell
cd $MLATL1T_DIR/part3/cms-hls4ml/hls4mlEmulatorExtras
make
mkdir -p lib64
mv libemulator_interface.so lib64
```
## 4.
`make` the `L1TMLDemo` model shared object
```shell
cd $MLATL1T_DIR/part3/cms-hls4ml/L1TMLDemo
make
```
*Note* while developing, you might benefit from adding `-g` to `CXXFLAGS` to compile with debug symbols.
The Makefile line would change to `CXXFLAGS := -O3 -fPIC -std=$(CPP_STANDARD) -g`.
## 5.
Compile the CMSSW code with `scram build`
```shell
cd $CMSSW_BASE/src
scram b -j8
```
## 6.
Copy the `L1TMLDemo` model shared object to the CMSSW area.
```shell
mkdir -p $CMSSW_BASE/src/L1Trigger/L1TMLDemo/data
cp $MLATL1T_DIR/part3/cms-hls4ml/L1TMLDemo/L1TMLDemo_v1.so $CMSSW_BASE/src/L1Trigger/L1TMLDemo/data
```
## 7.
Run the test config over signal and background!
```shell
cd $CMSSW_BASE/src/L1Trigger/L1TMLDemo/test
cmsRun demoL1TMLNtuple.py signal=True
cmsRun demoL1TMLNtuple.py signal=False
```
We run over the same datasets as part 1:
- Signal: `/GluGlutoHHto2B2Tau_kl-1p00_kt-1p00_c2-0p00_TuneCP5_13p6TeV_powheg-pythia8/Run3Summer22MiniAODv4-130X_mcRun3_2022_realistic_v5-v2/MINIAODSIM`
- Background: `/SingleNeutrino_E-10-gun/Run3Summer23BPixMiniAODv4-130X_mcRun3_2023_realistic_postBPix_v2-v2/MINIAODSIM`
This will produce the files
- `L1TMLDemo_NanoAOD_signal.root`
- `L1TMLDemo_NanoAOD_background.root`
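Before opening the notebook, it can help to confirm that both `cmsRun` jobs wrote their output. The following is a minimal sketch using only the standard library. The helper name `missing_outputs` is hypothetical, and the directory you pass in is assumed to be the `test` directory where `cmsRun` was executed.

```python
import os

# File names as listed above.
EXPECTED_OUTPUTS = (
    "L1TMLDemo_NanoAOD_signal.root",
    "L1TMLDemo_NanoAOD_background.root",
)


def missing_outputs(test_dir):
    """Return the expected ntuple names that are not present in test_dir."""
    return [
        name
        for name in EXPECTED_OUTPUTS
        if not os.path.isfile(os.path.join(test_dir, name))
    ]
```

An empty list means both ntuples are in place. This only checks that the files exist, not that their contents are valid.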
*Note* when developing your own models, you may run into segmentation violations. The most common cause is a mismatch between the input and output data types set in the producer and the types used by the model emulator. In this emulator workflow, such a mismatch shows up as a runtime error rather than a compile-time error.
## 8.
Run the notebook `part3.ipynb`
# Notebook
Now we can read the predictions from our NanoAOD ntuple and check they make sense compared to part 1 and part 2.
%% Cell type:code id:8d652e36 tags:
``` python
import numpy as np
import uproot
import awkward as ak
import matplotlib.pyplot as plt
import mplhep
import os
d = os.environ['MLATL1T_DIR']
```
%% Cell type:markdown id:f40b86af tags:
## Load data
Load our signal and background data with `uproot`
%% Cell type:code id:e8fc68f0 tags:
``` python
f_sig = uproot.open(d + '/part3/cmssw/src/L1Trigger/L1TMLDemo/test/L1TMLDemo_NanoAOD_signal.root')
f_bkg = uproot.open(d + '/part3/cmssw/src/L1Trigger/L1TMLDemo/test/L1TMLDemo_NanoAOD_background.root')
y_sig_cmssw = ak.flatten(f_sig['Events/L1TMLDemo_y'].array()).to_numpy()
y_bkg_cmssw = ak.flatten(f_bkg['Events/L1TMLDemo_y'].array()).to_numpy()
```
%% Cell type:markdown id:2069399f tags:
## Histogram
Plot the score distribution for signal and background
%% Cell type:code id:d55f97b8 tags:
``` python
bins = np.linspace(0, 1, 100)  # 100 bin edges, i.e. 99 bins
w = bins[1]  # bin width (the first edge is 0)
h_sig, _ = np.histogram(y_sig_cmssw, bins=bins)
h_bkg, _ = np.histogram(y_bkg_cmssw, bins=bins)
h_sig = h_sig.astype('float') / np.sum(h_sig)
h_bkg = h_bkg.astype('float') / np.sum(h_bkg)
```
%% Cell type:markdown id:08e81144 tags:
## Plot
%% Cell type:code id:eda30259 tags:
``` python
mplhep.histplot(h_bkg, bins, label='Background')
mplhep.histplot(h_sig, bins, label='Signal')
plt.semilogy()
plt.legend()
plt.xlim(0,1)
plt.xlabel('CMSSW NN Emulator Prediction')
plt.ylabel('Frequency')
```
%% Cell type:code id:cc0de85f tags:
``` python
y_pred_hls = np.load(d+'/part2/y_pred_hls.npy')
y_pred_qkeras = np.load(d+'/part2/y_pred_qkeras.npy')
y_pred_float = np.load(d+'/part2/y_pred_float.npy')
y_test = np.load(d+'/part2/y_test.npy')
y_pred_cmssw = np.concatenate((y_sig_cmssw, y_bkg_cmssw))
ones_array = np.ones_like(y_sig_cmssw)
zeros_array = np.zeros_like(y_bkg_cmssw)
y_test_cmssw = np.concatenate((ones_array, zeros_array))
# Let's plot it!
from sklearn.metrics import roc_curve, roc_auc_score
def totalMinBiasRate():
    LHCfreq = 11245.6  # LHC revolution frequency in Hz
    nCollBunch = 2544  # number of colliding bunches
    return LHCfreq * nCollBunch / 1e3 # in kHz
fpr, tpr, thr = roc_curve(y_test, y_pred_float, pos_label=None, sample_weight=None, drop_intermediate=True)
roc_auc = roc_auc_score(y_test, y_pred_float)
hlsfpr, hlstpr, hlsthr = roc_curve(y_test, y_pred_hls, pos_label=1, sample_weight=None, drop_intermediate=True)
hlsroc_auc = roc_auc_score(y_test, y_pred_hls)
qfpr, qtpr, qthr = roc_curve(y_test, y_pred_qkeras, pos_label=None, sample_weight=None, drop_intermediate=True)
qroc_auc = roc_auc_score(y_test, y_pred_qkeras)
cmsswfpr, cmsswtpr, cmsswthr = roc_curve(y_test_cmssw, y_pred_cmssw, pos_label=None, sample_weight=None, drop_intermediate=True)
cmsswroc_auc = roc_auc_score(y_test_cmssw, y_pred_cmssw)
fpr *= totalMinBiasRate()
qfpr *= totalMinBiasRate()
hlsfpr *= totalMinBiasRate()
cmsswfpr *= totalMinBiasRate()
f, ax = plt.subplots(figsize=(8,6))
# plt.plot([0, 1], [0, 1], color='navy', lw=1, linestyle='--')
ax.tick_params(axis='both', which='major', labelsize=14)
ax.tick_params(axis='both', which='minor', labelsize=14)
ax.set_xlim(0,100)
ax.plot(fpr, tpr, color='#7b3294', lw=2, ls='dashed', label=f'Baseline (AUC = {roc_auc:.5f})')
ax.plot(qfpr, qtpr, color='#008837', lw=2, label=f'Quantized+Pruned (AUC = {qroc_auc:.5f})')
ax.plot(hlsfpr, hlstpr, color='#a6dba0', lw=2, ls='dotted', label=f'HLS Quantized+Pruned (AUC = {hlsroc_auc:.5f})')
ax.plot(cmsswfpr, cmsswtpr, color='red', lw=2, ls='dashed', label=f'CMSSW Quantized+Pruned (AUC = {cmsswroc_auc:.5f})')
ax.set_xlabel('L1 Rate (kHz)')
ax.set_ylabel('Signal efficiency')
ax.legend(loc="lower right")
ax.grid(True)
plt.show()
```
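The x-axis of the ROC plot above is the false positive rate on the background sample scaled by `totalMinBiasRate()`. The following standalone sketch works through that arithmetic with the same two constants. The helpers `fpr_to_rate_khz`, `rate_to_fpr`, and `efficiency_at_rate` are hypothetical names, and the working-point lookup is a simple illustration rather than the tutorial's method.

```python
LHC_REVOLUTION_FREQ_HZ = 11245.6  # same value as LHCfreq in the notebook
N_COLLIDING_BUNCHES = 2544        # same value as nCollBunch in the notebook


def total_min_bias_rate_khz():
    """Total bunch-crossing rate in kHz (about 28609 kHz)."""
    return LHC_REVOLUTION_FREQ_HZ * N_COLLIDING_BUNCHES / 1e3


def fpr_to_rate_khz(fpr):
    """Convert a background false positive rate into an L1 rate in kHz."""
    return fpr * total_min_bias_rate_khz()


def rate_to_fpr(rate_khz):
    """Convert an L1 rate budget in kHz back into a false positive rate."""
    return rate_khz / total_min_bias_rate_khz()


def efficiency_at_rate(rates_khz, efficiencies, target_rate_khz):
    """Best signal efficiency among ROC points whose rate fits the budget.

    Returns 0.0 if no point has a rate at or below target_rate_khz.
    """
    allowed = [
        eff
        for rate, eff in zip(rates_khz, efficiencies)
        if rate <= target_rate_khz
    ]
    return max(allowed, default=0.0)
```

With these constants, the 100 kHz upper limit of the plot corresponds to a false positive rate of roughly 0.0035, so only the far low-FPR end of each ROC curve is visible. Passing `cmsswfpr` (after scaling) and `cmsswtpr` to `efficiency_at_rate` would give the emulator's signal efficiency at a chosen rate budget.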
%% Cell type:code id:3c594b5e tags:
``` python
```