
Solving 8001 LHEScaleSumW-related issues ONCE AND FOR ALL

The issues related to LHEScaleSumW have been around for a long time. We have introduced some measures, but they do not seem to be enough to mitigate the problem fully.

Several recent workflows have been problematic:

  • CMSCOMPPR-63015
  • CMSCOMPPR-61363
  • CMSCOMPPR-60666
  • CMSCOMPPR-58351
  • CMSCOMPPR-59832

@lviliani noted that some jobs failed because of an uncaught SIGKILL:

WARNING: program /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_4_8/external/slc7_amd64_gcc10/bin/python3 -O /srv/job/WMTaskSpace/cmsRun1/lheevent/process/bin/internal/systematics.py events.lhe ./tmp_2_events.lhe --start_id=1001 --pdf=325300,316200,306000@0,322500@0,322700@0,322900@0,323100@0,323300@0,323500@0,323700@0,323900@0,305800,303200@0,292200@0,331300,331600,332100,332300@0,332500@0,332700@0,332900@0,333100@0,333300@0,333500@0,333700@0,14000,14066@0,14067@0,14069@0,14070@0,14100,14200@0,14300@0,27400,27500@0,27550@0,93300,61200,42780,315000@0,315200@0,262000@0,263000@0 --mur=1,2,0.5 --muf=1,2,0.5 --together=muf,mur --dyn=-1 --start_event=10218 --stop_event=15327 --result=./log_sys_2.txt --lhapdf_config=/cvmfs/cms.cern.ch/slc7_amd64_gcc10/external/lhapdf/6.4.0-68defff11ffd434c73727d03802bfb85/share/LHAPDF/../../bin/lhapdf-config launch ends with non zero status: -9. Stop all computation
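The status -9 in this log matches the convention of Python's subprocess module, where a negative return code -N means the child was terminated by signal N (9 is SIGKILL on Linux). Below is a minimal sketch of how a wrapper could turn such a code into an explicit message so that a killed systematics step is not silently ignored. The function name `describe_returncode` is hypothetical, and whether the gridpack launcher uses subprocess in exactly this way is an assumption.

```python
import signal


def describe_returncode(returncode):
    """Translate a subprocess-style return code into a readable message.

    Hypothetical helper: negative codes follow the POSIX convention used by
    Python's subprocess module (-N means "terminated by signal N").
    """
    if returncode is None:
        return "still running"
    if returncode == 0:
        return "success"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number not known on this platform.
            name = "signal %d" % -returncode
        return "killed by %s" % name
    return "exited with status %d" % returncode
```

On Linux, `describe_returncode(-9)` reports that the process was killed by SIGKILL, which is the case seen in the log above.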

Following the NanoAOD development, checking the number of produced weights has become a reasonable approach. A quick implementation can be found here.
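As an illustration of the weight-count check, here is a minimal sketch that counts the `<wgt>` entries in each `<event>` block of an LHE file and flags events whose count differs from the expected one. It is not the linked implementation: the function names are hypothetical, the expected count is assumed to be supplied by the caller, and the whole file is read into memory, which may not suit very large LHE files.

```python
import re

# Match complete <event> ... </event> blocks; \b keeps <eventgroup> out.
EVENT_RE = re.compile(r"<event\b.*?</event>", re.DOTALL)
# Match opening <wgt ...> tags; \b keeps <wgtgroup> out.
WGT_RE = re.compile(r"<wgt\b")


def count_weights_per_event(lhe_text):
    """Return the number of <wgt> entries found in each event block."""
    return [len(WGT_RE.findall(block)) for block in EVENT_RE.findall(lhe_text)]


def find_bad_events(lhe_text, expected):
    """Return the indices of events whose weight count differs from expected.

    `expected` is an assumption provided by the caller, e.g. the number of
    scale and PDF variations requested from the systematics step.
    """
    counts = count_weights_per_event(lhe_text)
    return [i for i, n in enumerate(counts) if n != expected]


def check_lhe_file(path, expected):
    """Read an LHE file and raise if any event has an unexpected weight count."""
    with open(path) as handle:
        bad = find_bad_events(handle.read(), expected)
    if bad:
        raise RuntimeError(
            "%d event(s) with unexpected weight count, first index %d"
            % (len(bad), bad[0])
        )
```

A job could call `check_lhe_file` after the systematics step and fail early with a clear error, instead of producing output with missing weights.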

Next steps:

  • Implement the above improvements in a gridpack
Edited by Sitian Qian