Error related to status register: "DC0:FPGA Measure"
During the tests run on serenity-3 from 25th to 27th July, I noticed the following error:
| [2] program (5.8 seconds) |
|----------------------------------------------------------------------------------------------------------------------|
| Parameters: |
| package=File("/data/bitfiles/emp-v0.6.4/serenity_vu9p_so2_runner-slu9p8x4-project-32301-concurrent-16_220326_1712/s… |
| » Extracting FW package |
| » Starting SMASH session |
| » Programming daughter card X0 |
| » An exception of type 'std::runtime_error' was thrown in Command::code(): Start-up failed. Run 'DC0:FPGA Measure |
| "Status Register"' for more information. |
| ❌ Error occurred in setup transition. (6.4s + 1.2s, progress 47%) |
I ran the tests in chunks of 20 iterations, each iteration performing power-on, programming, and data input twice (once with known data and once with random data). Within a single 20-iteration chunk, as many as 30% of the tests hit this problem, although a failure rate that high was seen in only one of the 20-iteration chunks I ran.
The following week, the error was observed only twice in 10 sets of 20 iterations, and only on serenity-3. It was not observed on serenity-1.
Tom suggested the following to debug:
* Add the '-x' option to the pytest command so that it exits on the first failure
* When it next exits with this "Start-up failed. Run ..." error in the program command, don't run any other commands after the failed program command; instead, SSH into the board and run this command:
SMASH_DEFAULT_CONFIG=/etc/serenity/board.smash /opt/smash/bin/smash.exe -q 'DC0:FPGA Measure "Status Register"'
That should print the decoded output of the FPGA's configuration status register
... and if that indicates that start up was successful, the start up check is probably just being run too soon on this line: https://gitlab.cern.ch/p2-xware/software/smash/-/blob/master/components/src/elements/XilinxFPGA.cpp#L692
This process gave the following output:
Status Register : STATUS REGISTER of SLR 0: 0x1080190c
CRC ERROR.....................false
DECRYPTOR ENABLED.............false
ALL MMCMs & PLLs LOCKED.......true
DCI-MATCH STATUS..............true
STARTUP COMPLETE..............false
GTS CFG_B STATUS..............false
GWE STATUS....................false
GHIGH_B STATUS................false
MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
INTERNAL INIT. FINISHED.......true
INIT_B PIN....................HIGH
INTERNAL DONE STATUS..........Pin is actively held LOW
DONE PIN......................LOW
IDCODE ERROR..................false
SECURITY ERROR................false
SYSMON OVER-TEMP..............false
STARTUP STATE-MACHINE PHASE...Phase 0
CFG-BUS WIDTH DETECTION.......x1
STATUS REGISTER of SLR 1: 0x1080190c
CRC ERROR.....................false
DECRYPTOR ENABLED.............false
ALL MMCMs & PLLs LOCKED.......true
DCI-MATCH STATUS..............true
STARTUP COMPLETE..............false
GTS CFG_B STATUS..............false
GWE STATUS....................false
GHIGH_B STATUS................false
MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
INTERNAL INIT. FINISHED.......true
INIT_B PIN....................HIGH
INTERNAL DONE STATUS..........Pin is actively held LOW
DONE PIN......................LOW
IDCODE ERROR..................false
SECURITY ERROR................false
SYSMON OVER-TEMP..............false
STARTUP STATE-MACHINE PHASE...Phase 0
CFG-BUS WIDTH DETECTION.......x1
STATUS REGISTER of SLR 2: 0x1080190c
CRC ERROR.....................false
DECRYPTOR ENABLED.............false
ALL MMCMs & PLLs LOCKED.......true
DCI-MATCH STATUS..............true
STARTUP COMPLETE..............false
GTS CFG_B STATUS..............false
GWE STATUS....................false
GHIGH_B STATUS................false
MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
INTERNAL INIT. FINISHED.......true
INIT_B PIN....................HIGH
INTERNAL DONE STATUS..........Pin is actively held LOW
DONE PIN......................LOW
IDCODE ERROR..................false
SECURITY ERROR................false
SYSMON OVER-TEMP..............false
STARTUP STATE-MACHINE PHASE...Phase 0
CFG-BUS WIDTH DETECTION.......x1
STATUS REGISTER of SLR 3: 0x1080190c
CRC ERROR.....................false
DECRYPTOR ENABLED.............false
ALL MMCMs & PLLs LOCKED.......true
DCI-MATCH STATUS..............true
STARTUP COMPLETE..............false
GTS CFG_B STATUS..............false
GWE STATUS....................false
GHIGH_B STATUS................false
MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
INTERNAL INIT. FINISHED.......true
INIT_B PIN....................HIGH
INTERNAL DONE STATUS..........Pin is actively held LOW
DONE PIN......................LOW
IDCODE ERROR..................false
SECURITY ERROR................false
SYSMON OVER-TEMP..............false
STARTUP STATE-MACHINE PHASE...Phase 0
CFG-BUS WIDTH DETECTION.......x1
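The raw value 0x1080190c reported for every SLR can be cross-checked against the decoded fields by hand. The sketch below does this in Python; the bit positions are an assumption based on the usual Xilinx UltraScale configuration status register layout and should be verified against the device documentation. The helper name `decode_status` and the keys `MODE PINS` and `STARTUP PHASE` are made up for this example.

```python
# Bit positions are ASSUMED from the Xilinx UltraScale configuration
# status register layout; verify them before relying on this.
FLAG_BITS = {
    "CRC ERROR": 0,
    "DECRYPTOR ENABLED": 1,
    "ALL MMCMs & PLLs LOCKED": 2,
    "DCI-MATCH STATUS": 3,
    "STARTUP COMPLETE": 4,
    "GTS CFG_B STATUS": 5,
    "GWE STATUS": 6,
    "GHIGH_B STATUS": 7,
    "INTERNAL INIT. FINISHED": 11,
    "INIT_B PIN": 12,
    "INTERNAL DONE STATUS": 13,
    "DONE PIN": 14,
    "IDCODE ERROR": 15,
    "SECURITY ERROR": 16,
    "SYSMON OVER-TEMP": 17,
}


def decode_status(value: int) -> dict:
    """Split a raw status register value into named fields."""
    fields = {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["MODE PINS"] = (value >> 8) & 0b111       # assumed bits 10:8
    fields["STARTUP PHASE"] = (value >> 18) & 0b111  # assumed bits 20:18
    return fields


status = decode_status(0x1080190C)
for name, val in status.items():
    print(f"{name:<26}{val}")
```

Under these assumptions the raw value agrees with the printed dump: the MMCM/PLL lock, DCI match, and both INIT fields are set, while the start-up complete and DONE bits are clear.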
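As the output above shows (STARTUP COMPLETE is false and DONE PIN is LOW on all four SLRs), the start-up itself did not complete, so this is not just a case of the check being run too soon. This needs to be addressed.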
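To flag this condition automatically inside the test loop, the decoded text could be parsed per SLR. The following is a rough Python sketch that assumes the output keeps the "NAME....value" layout shown in this issue; `parse_status_dump` and `slrs_not_started` are hypothetical helpers, and the sample is a shortened excerpt of the dump above.

```python
import re


def parse_status_dump(text: str) -> dict:
    """Return {slr_index: {field_name: value_string}} from the decoded dump."""
    slrs = {}
    current = None
    for line in text.splitlines():
        header = re.search(r"STATUS REGISTER of SLR (\d+):", line)
        if header:
            current = slrs.setdefault(int(header.group(1)), {})
            continue
        # Field lines look like "NAME" + a run of three or more dots + "value".
        field = re.match(r"\s*(.+?)\.{3,}(.+)$", line)
        if field and current is not None:
            current[field.group(1).strip()] = field.group(2).strip()
    return slrs


def slrs_not_started(slrs: dict) -> list:
    """List the SLR indices whose STARTUP COMPLETE field is not 'true'."""
    return sorted(i for i, f in slrs.items()
                  if f.get("STARTUP COMPLETE") != "true")


# Shortened excerpt of the dump in this issue (only a few fields kept).
sample = """Status Register : STATUS REGISTER of SLR 0: 0x1080190c
STARTUP COMPLETE..............false
INTERNAL INIT. FINISHED.......true
DONE PIN......................LOW
STATUS REGISTER of SLR 1: 0x1080190c
STARTUP COMPLETE..............false
STARTUP STATE-MACHINE PHASE...Phase 0
"""

parsed = parse_status_dump(sample)
failed = slrs_not_started(parsed)
```

For this excerpt `failed` contains both SLR 0 and SLR 1, matching what the full dump shows for all four SLRs.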
Edited by Shilpi Jain