Error related to status register: "DC0:FPGA Measure"

In the tests conducted from 25 to 27 July on serenity-3, I noticed the following recurring error:

| [2] program (5.8 seconds)                                                                                            |
|----------------------------------------------------------------------------------------------------------------------|
| Parameters:                                                                                                          |
| package=File("/data/bitfiles/emp-v0.6.4/serenity_vu9p_so2_runner-slu9p8x4-project-32301-concurrent-16_220326_1712/s… |
| » Extracting FW package                                                                                              |
| » Starting SMASH session                                                                                             |
| » Programming daughter card X0                                                                                       |
| » An exception of type 'std::runtime_error' was thrown in Command::code(): Start-up failed. Run 'DC0:FPGA Measure    |
| "Status Register"' for more information.                                                                             |
| ❌ Error occurred in setup transition. (6.4s + 1.2s, progress 47%)                                                    |

I ran the tests in chunks of 20 iterations, each iteration performing power-on, programming, and data input twice (once with known data and once with random data). Within a single 20-iteration chunk, as many as 30% of the tests showed this problem, although a rate that high occurred in only one of the 20-iteration chunks I ran.

The following week, the error was observed only twice in 10 sets of 20 iterations, and only on serenity-3. It was not observed on serenity-1.

Tom suggested the following to debug:

* Add the '-x' option to the pytest command so that it exits on the first failure
* When it next exits with this "Start-up failed. Run ..." error in the program command, don't run any other commands after the failed program command. SSH into the board, and then run this command:

SMASH_DEFAULT_CONFIG=/etc/serenity/board.smash /opt/smash/bin/smash.exe -q 'DC0:FPGA Measure "Status Register"'

That should print the decoded output of the FPGA's configuration status register. If that indicates that start-up was successful, the start-up check is probably just being run too soon on this line: https://gitlab.cern.ch/p2-xware/software/smash/-/blob/master/components/src/elements/XilinxFPGA.cpp#L692
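The two steps suggested above could be scripted roughly as follows. This is only a sketch: the helper names, the use of the board name as the SSH host, and the pytest arguments are assumptions for illustration; only the remote command is taken from the suggestion itself.

```python
import subprocess

# Remote command exactly as suggested above; it is interpreted by the board's shell.
REMOTE_CMD = (
    "SMASH_DEFAULT_CONFIG=/etc/serenity/board.smash "
    "/opt/smash/bin/smash.exe -q 'DC0:FPGA Measure \"Status Register\"'"
)


def status_query_argv(host):
    """Build the ssh invocation that reads the status register on the board."""
    # Assumption: 'host' is an SSH-reachable name for the board (e.g. "serenity-3").
    return ["ssh", host, REMOTE_CMD]


def run_tests_then_query(pytest_args, host):
    """Run pytest with -x; if it stops on a failure, query the status register."""
    # -x makes pytest exit on the first failure, so nothing else touches the FPGA.
    result = subprocess.run(["pytest", "-x", *pytest_args])
    if result.returncode == 0:
        return None  # no failure in this run, nothing to inspect
    # Note: any test failure ends up here, not only "Start-up failed";
    # check the pytest output before trusting the register dump.
    query = subprocess.run(status_query_argv(host), capture_output=True, text=True)
    return query.stdout
```

A call such as `run_tests_then_query(["tests/"], "serenity-3")` (both arguments hypothetical) would return the decoded register text only when the test run stopped on a failure.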

This process gave the following output:

Status Register : STATUS REGISTER of SLR 0: 0x1080190c
  CRC ERROR.....................false
  DECRYPTOR ENABLED.............false
  ALL MMCMs & PLLs LOCKED.......true
  DCI-MATCH STATUS..............true
  STARTUP COMPLETE..............false
  GTS CFG_B STATUS..............false
  GWE STATUS....................false
  GHIGH_B STATUS................false
  MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
  INTERNAL INIT. FINISHED.......true
  INIT_B PIN....................HIGH
  INTERNAL DONE STATUS..........Pin is actively held LOW
  DONE PIN......................LOW
  IDCODE ERROR..................false
  SECURITY ERROR................false
  SYSMON OVER-TEMP..............false
  STARTUP STATE-MACHINE PHASE...Phase 0
  CFG-BUS WIDTH DETECTION.......x1
STATUS REGISTER of SLR 1: 0x1080190c
  CRC ERROR.....................false
  DECRYPTOR ENABLED.............false
  ALL MMCMs & PLLs LOCKED.......true
  DCI-MATCH STATUS..............true
  STARTUP COMPLETE..............false
  GTS CFG_B STATUS..............false
  GWE STATUS....................false
  GHIGH_B STATUS................false
  MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
  INTERNAL INIT. FINISHED.......true
  INIT_B PIN....................HIGH
  INTERNAL DONE STATUS..........Pin is actively held LOW
  DONE PIN......................LOW
  IDCODE ERROR..................false
  SECURITY ERROR................false
  SYSMON OVER-TEMP..............false
  STARTUP STATE-MACHINE PHASE...Phase 0
  CFG-BUS WIDTH DETECTION.......x1
STATUS REGISTER of SLR 2: 0x1080190c
  CRC ERROR.....................false
  DECRYPTOR ENABLED.............false
  ALL MMCMs & PLLs LOCKED.......true
  DCI-MATCH STATUS..............true
  STARTUP COMPLETE..............false
  GTS CFG_B STATUS..............false
  GWE STATUS....................false
  GHIGH_B STATUS................false
  MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
  INTERNAL INIT. FINISHED.......true
  INIT_B PIN....................HIGH
  INTERNAL DONE STATUS..........Pin is actively held LOW
  DONE PIN......................LOW
  IDCODE ERROR..................false
  SECURITY ERROR................false
  SYSMON OVER-TEMP..............false
  STARTUP STATE-MACHINE PHASE...Phase 0
  CFG-BUS WIDTH DETECTION.......x1
STATUS REGISTER of SLR 3: 0x1080190c
  CRC ERROR.....................false
  DECRYPTOR ENABLED.............false
  ALL MMCMs & PLLs LOCKED.......true
  DCI-MATCH STATUS..............true
  STARTUP COMPLETE..............false
  GTS CFG_B STATUS..............false
  GWE STATUS....................false
  GHIGH_B STATUS................false
  MODE-PIN SETTINGS.............Master SPI x1, x2, x4, x8
  INTERNAL INIT. FINISHED.......true
  INIT_B PIN....................HIGH
  INTERNAL DONE STATUS..........Pin is actively held LOW
  DONE PIN......................LOW
  IDCODE ERROR..................false
  SECURITY ERROR................false
  SYSMON OVER-TEMP..............false
  STARTUP STATE-MACHINE PHASE...Phase 0
  CFG-BUS WIDTH DETECTION.......x1

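So start-up itself has not completed: on all four SLRs, STARTUP COMPLETE is false, the DONE pin is LOW, and the start-up state machine is still in Phase 0. This needs to be addressed.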
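As a cross-check, the raw value 0x1080190c can be decoded by hand. The sketch below assumes the bit positions of the Xilinx UltraScale configuration status register; they reproduce the decoded fields printed above, but they are not taken from the SMASH source and should be verified against the Xilinx configuration user guide. The function names are illustrative.

```python
# Assumed bit positions of single-bit fields in the configuration status register.
FLAG_BITS = {
    "CRC ERROR": 0,
    "DECRYPTOR ENABLED": 1,
    "ALL MMCMs & PLLs LOCKED": 2,
    "DCI-MATCH STATUS": 3,
    "STARTUP COMPLETE": 4,
    "GTS CFG_B STATUS": 5,
    "GWE STATUS": 6,
    "GHIGH_B STATUS": 7,
    "INTERNAL INIT. FINISHED": 11,
    "INIT_B PIN": 12,
    "INTERNAL DONE STATUS": 13,
    "DONE PIN": 14,
    "IDCODE ERROR": 15,
    "SECURITY ERROR": 16,
    "SYSMON OVER-TEMP": 17,
}


def decode_status(value):
    """Split a 32-bit status word into named fields (assumed layout)."""
    fields = {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["MODE PINS"] = (value >> 8) & 0b111       # assumed bits 10:8
    fields["STARTUP PHASE"] = (value >> 18) & 0b111  # assumed bits 20:18
    return fields


def startup_complete(value):
    """Treat start-up as finished only if end-of-startup and DONE are both set."""
    fields = decode_status(value)
    return fields["STARTUP COMPLETE"] and fields["DONE PIN"]


status = decode_status(0x1080190C)
for name, field_value in status.items():
    print(f"{name:<26}{field_value}")
```

With this layout, the value reported for every SLR decodes to locked MMCMs/PLLs and a finished internal init, but with STARTUP COMPLETE and DONE both clear, in agreement with the tool's output.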

Edited by Shilpi Jain