YARR issues
https://gitlab.cern.ch/YARR/YARR/-/issues
2024-01-09T20:23:55+01:00

https://gitlab.cern.ch/YARR/YARR/-/issues/14
No plots produced when InjVcalDiff range not divisible by step
2024-01-09T20:23:55+01:00
Magne Eik Lauritzen
When performing a thresholdscan, the following plots will not be produced on completion if the InjVcalDiff range is not divisible by the step parameter:
Noisemap 2D plot
S-curve 2D plot
Thresholdmap 2D plot
Daniel Joseph Antrim

https://gitlab.cern.ch/YARR/YARR/-/issues/20
Digital and Analog Injection prescan configuration
2024-01-09T20:23:56+01:00
Jia Jian Teoh
Enable/disable certain registers, e.g. ABC reg32[TestPatt, TestPattEnable, TestPulseEnable, LP/PR_Enable], reg33[CalPulseEnable], in prescan.
From ScanFactory::preScan(), this is supposed to be broadcast to all ABCs, as StarChips::makeGlobal() sets the global FE ID to 15.
Some work is needed on StarChips::writeNamedRegister(), StarChips::setAndWriteABCSubRegister(n, val, this_chipID), and StarCfg::setSubRegisterValue to get the correct chip_index and chip ID.
Set appropriate registers in prescan and test digital injection.
Using static test mode to verify:
* channel output pattern == input mask pattern
* number of triggers == number of hits.
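The two checks above could be expressed as a small helper. This is only a sketch: `staticTestPasses` and its arguments are hypothetical names, not existing YARR code, and it assumes that a set bit in the input mask pattern means the channel should produce exactly one hit per trigger.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of YARR.
// maskPattern[ch] is true where the input mask pattern sets the channel;
// hitsPerChannel[ch] is the number of hits read back for that channel.
bool staticTestPasses(const std::vector<bool>& maskPattern,
                      const std::vector<std::uint32_t>& hitsPerChannel,
                      std::uint32_t nTriggers) {
    if (maskPattern.size() != hitsPerChannel.size()) return false;
    for (std::size_t ch = 0; ch < maskPattern.size(); ++ch) {
        // Assumption: a patterned channel fires once per trigger, all others stay silent.
        const std::uint32_t expected = maskPattern[ch] ? nTriggers : 0;
        if (hitsPerChannel[ch] != expected) return false;
    }
    return true;
}
```

Comparing per-channel counts against the trigger count covers both bullets at once: a wrong output pattern and a wrong number of hits both show up as a mismatch.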
*Test pulse mode will be opened in another issue.
Strips integration
Olivier Arnaez

https://gitlab.cern.ch/YARR/YARR/-/issues/24
Strobe Delay Scan (generic parameter scan)
2024-01-09T20:23:56+01:00
Jia Jian Teoh
StarParameterLoop could in principle be used to scan any desired sub-register.
In this particular Strobe Delay Scan, the register to be scanned is ABC_reg03["STR_DEL"].
It controls the timing (delay) of an injected calibration pulse with respect to the arrival time of the command to actually issue that pulse.
The Strobe delay scan consists of 2 parts, a threshold scan at 2fC first, followed by the actual scan through the strobe delay. The threshold scan is used to get an approximation of the required threshold and isn't required if this is already known.
But, this issue is mainly about the latter.
Outputs:
1. STR_DEL distribution for each ABCStar. (2D_Hist-- x:Channel, y:STR_DEL, COLZ:hits)
2. Average Hit Probability per STR_DEL, fitted at the lower and higher end value (@ 50% efficiency) + resulting width of the working strobe delay range
3. Number of channels (y) of a certain upper end (falling edge) Strobe Delay DAC setting (x) of the working Strobe Delay region
4. An optimal STR_DEL setting at 25%/40%/50% of the working region, immediately take effect and/or save to the output config.json
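Outputs (2) and (4) could be prototyped along these lines. This is a sketch under assumptions: `StrobeDelayWindow` and `findWindow` are hypothetical names, the input is the average hit probability per STR_DEL value, and the 50% points are found by linear interpolation between neighbouring settings rather than by the fit mentioned in (2).

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type: edges of the working strobe delay region in STR_DEL units.
struct StrobeDelayWindow {
    double rising;   // where the efficiency first crosses the level going up
    double falling;  // where it next crosses the level going down
    double width() const { return falling - rising; }
    // Setting at a given fraction (e.g. 0.25, 0.40, 0.50) of the working region.
    double settingAt(double fraction) const { return rising + fraction * width(); }
};

// eff[i] is the average hit probability measured at STR_DEL = i.
std::optional<StrobeDelayWindow> findWindow(const std::vector<double>& eff,
                                            double level = 0.5) {
    std::optional<double> rising;
    for (std::size_t i = 0; i + 1 < eff.size(); ++i) {
        const double a = eff[i];
        const double b = eff[i + 1];
        const double x = static_cast<double>(i);
        if (!rising) {
            // Rising edge: interpolate linearly between the two settings.
            if (a < level && b >= level) rising = x + (level - a) / (b - a);
        } else if (a >= level && b < level) {
            // First falling edge after the rising edge closes the window.
            return StrobeDelayWindow{*rising, x + (a - level) / (a - b)};
        }
    }
    return std::nullopt;  // no complete window in the scanned range
}
```

Returning an empty optional when either edge is missing keeps a truncated scan range from silently producing a setting.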
Comment: (1) is the minimum requirement and should work for other sub-registers. (2)-(4) may be dealt with in different analysis/plotting scripts?
Strips integration
Jia Jian Teoh, Ryan Roberts

https://gitlab.cern.ch/YARR/YARR/-/issues/27
Noise Occupancy Scan
2024-01-09T20:23:56+01:00
Jia Jian Teoh
Measures the noise occupancy (down to the level of 10<sup>-6</sup>) without injecting charge. Try:
1. Default NoiseAnalysis
* create mask for noisy channel
2. Extension to (1). NoiseOcc as a function of threshold. The number of triggers is a function of the occupancy (a minimum of 50 hits are seen in more than 50% of the active readout channels).
* NoiseOccupancy (per chip) vs threshold
* NoiseOccupancy map [x:channel, y:threshold, colz: NoiseOcc] (maybe in further analysis script)
* Do a linear fit to a plot of log(noise occupancy) vs. threshold<sup>2</sup> to estimate the Gaussian noise for each module in ENC. (maybe in further analysis script)
Strips integration
Olivier Arnaez, Jia Jian Teoh, Ryan Roberts

https://gitlab.cern.ch/YARR/YARR/-/issues/31
N-point gain + response curve
2021-03-25T17:44:15+01:00
Jia Jian Teoh
**dependency: Threshold scan**
Threshold scans are performed for N different injected charges (BCAL).
The corresponding response curve is fitted linearly/quadratically to estimate the discriminator offset (mV @ 0 fC) and the channel gain (slope, mV/fC).
Strips integration
Zhengcheng Tao, Ryan Roberts

https://gitlab.cern.ch/YARR/YARR/-/issues/37
Distributed branch overview plan
2021-03-24T17:54:31+01:00
Bruce Joseph Gallop
The distributed branch has been a long-running and useful branch that has fallen behind in merging.
I think it's useful to start from a clean slate while picking ideas from that branch.
The following is a break down of things to do, which are hopefully small enough to be done in steps. It is intended that each of these becomes an issue in itself that can be further discussed as necessary. Some of this work has been done previously and it would also be useful to link that here.
[~~strikethrough~~ below indicates a separate issue has been filed]
* ~~The type of LoopActionBase::type and HistogramBase::getType. These are currently std::type_index, which isn't transferable to another process~~
* Moving more code out of scanConsole, for instance registering histogram and analysis algorithms and connecting clipboard inputs and outputs
* ~~DataProcessor generalisation. How to identify histogram vs analysis vs eventdata processor and connect them together~~
* A remote DataProcessor to send/receive items to a remote process
* ~~Mechanism for feedback (scan engine waits on signal from analysis, so what form should that signal take)~~
* ~~What RPC services are needed (ie classes with methods to call)~~
* ~~The transport mechanism (zeromq vs nanomsg vs TCP)~~
* ~~The RPC mechanism, how to encode methods and data. Hopefully this should be independent of the core~~
* ~~Setting up of processes and connecting them together (configuration of)~~
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/39
Use of type_index for LoopAction type discovery
2020-05-19T17:55:41+02:00
Bruce Joseph Gallop
When sending to remote machines we can't rely on std::type_index having the same representation.
This is a problem for LoopActionBase::type (and cf. another issue).
In LoopAction the type is mostly used to decide how to build histograms:
* Events from one cycle of a trigger loop go into the same bin
* Events from one cycle of a mask loop go into the same bin
* Events from each step of a threshold loop go into different bins
* Something else?
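One possible direction is a small tag that has the same representation on every machine and carries only the histogramming behaviour listed above. A minimal sketch, assuming an enum with a string form for the wire; `LoopStyle` and the helper names are hypothetical, not existing YARR types.

```cpp
#include <optional>
#include <string>

// Hypothetical replacement for std::type_index in LoopActionBase.
enum class LoopStyle { Trigger, Mask, Parameter, Other };

// Stable textual form that could be sent to another process.
std::string toString(LoopStyle s) {
    switch (s) {
        case LoopStyle::Trigger:   return "trigger";
        case LoopStyle::Mask:      return "mask";
        case LoopStyle::Parameter: return "parameter";
        case LoopStyle::Other:     break;
    }
    return "other";
}

// Inverse of toString; unknown names are rejected rather than guessed.
std::optional<LoopStyle> loopStyleFromString(const std::string& name) {
    if (name == "trigger")   return LoopStyle::Trigger;
    if (name == "mask")      return LoopStyle::Mask;
    if (name == "parameter") return LoopStyle::Parameter;
    if (name == "other")     return LoopStyle::Other;
    return std::nullopt;
}

// Trigger and mask cycles accumulate into the same bin;
// each step of a parameter (e.g. threshold) loop gets its own bin.
bool stepsGoToSeparateBins(LoopStyle s) {
    return s == LoopStyle::Parameter;
}
```

The "Something else?" case maps to `Other` here, which would need its own rule once such a loop exists.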
Also some discussion here: atlas-itk/sw/itksw-design#1
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/40
Use of std::type_index in HistogramBase
2020-05-05T23:16:41+02:00
Bruce Joseph Gallop
std::type_index is used in HistogramBase, which makes it difficult to transfer to a different process.
The main use is to decide in an analysis algorithm if a particular histogram has been generated by the appropriate histogrammer (eg OccupancyAnalysis processes OccupancyMaps rather than L1IDs).
So, what should the representation be instead? Some options:
* String (name of class that generated it)
* Some enum of particular types
* Defined by the scanConsole:
* OccupancyMap provides a token for the type it makes which is then used to notify OccupancyAnalysis of what to expect
* Some other method
* Split the stream generated by Histogrammer, so the stream from OccupancyMap can be connected directly to OccupancyAnalysis
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/41
DataProcessor generalisation
2020-05-05T23:14:59+02:00
Bruce Joseph Gallop
A DataProcessor takes a stream of data items from a ClipBoard, processes them and sends them to another ClipBoard (possibly a map).
Currently the chain of types is as follows:
```mermaid
graph LR
RxCore -- RawDataContainer --> FEDataProcessor
FEDataProcessor -- EventDataBase --> Histogrammer
Histogrammer -- HistogramBase --> Analysis
Analysis -- HistogramBase --> DATA
DATA[(Results)]
```
This is "hard-coded" in the types stored in the BookKeeper and the methods used to connect a pair of ClipBoards to a DataProcessor.
From a distributed point of view the question is how rigidly these connect to each other. For instance, if a processor advertises (in some way) that it can produce histograms, does that make it a Histogrammer, or an Analyser?
As a first pass, a rigid connection should be OK.
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/42
Distributed feedback
2023-07-28T17:28:07+02:00
Bruce Joseph Gallop
Feedback is used to send information backwards in the data processing pipeline, so an earlier stage (the ScanEngine) reaches a certain point and waits for information on how to proceed (from the Analysis).
There are currently two feedback methods, global and pixel. GlobalFeedbackBase is used to provide a small bit of information to an earlier step (potentially a single boolean value to stop the scan). PixelFeedbackBase is used to provide a histogram, for instance for updates during tuning.
These can both be achieved using the ClipBoard mechanism. This is demonstrated in branch devel_Distributed, but it should be possible to use the same code for the local feedback case.
Distributed processing
Bruce Joseph Gallop

https://gitlab.cern.ch/YARR/YARR/-/issues/43
Remote methods
2020-05-05T23:14:26+02:00
Bruce Joseph Gallop
Some notes on what methods to implement on a remote server.
This excludes bulk transfer of data, which is to be handled separately (i.e. a remote ClipBoard).
I believe the core methods are the following:
# Setup pipeline
For instance, for histogramming this would set up which actions to run on some set of data. A description of an input and output for this data is provided and any listeners required for this data are set up before the method completes. No data will arrive before the end of this method.
This can additionally describe connections to be made for feedback.
# Run pipeline
Process data in the pipeline. This continues until processing of all data is complete. The finish signal arrives on the data input (ie the scan has finished).
# Pipeline status
Eg How many data items have been processed or queued. Connection status of data in/out
# Run scan
This runs a scan, including setup of LoopActions and HwControllers. Setting up of pipelines is done before this so that processors are already ready to receive data.
# Scan status
Report current location in scan hierarchy.
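The methods listed so far could be collected into one interface that any RPC layer then exposes. The following is a sketch only: `RemoteService`, `PipelineStatus` and `LocalStub` are hypothetical names, the string arguments stand in for whatever description format is chosen, and the stub exists just to show the "set up pipelines before running a scan" ordering.

```cpp
#include <cstddef>
#include <string>

// Hypothetical status record for the "Pipeline status" method.
struct PipelineStatus {
    std::size_t processed = 0;  // data items already processed
    std::size_t queued = 0;     // data items waiting in the input
};

// Hypothetical interface mirroring the headings above.
class RemoteService {
public:
    virtual ~RemoteService() = default;
    virtual void setupPipeline(const std::string& description) = 0;
    virtual bool runPipeline() = 0;
    virtual PipelineStatus pipelineStatus() const = 0;
    virtual bool runScan(const std::string& scanConfig) = 0;
    virtual std::string scanStatus() const = 0;
};

// In-process stand-in, useful for exercising callers without any transport.
class LocalStub : public RemoteService {
public:
    void setupPipeline(const std::string&) override { pipelineReady = true; }
    bool runPipeline() override { return pipelineReady; }
    PipelineStatus pipelineStatus() const override { return status; }
    bool runScan(const std::string&) override {
        // Refuse to scan until a pipeline is ready to receive data.
        if (!pipelineReady) return false;
        location = "finished";
        return true;
    }
    std::string scanStatus() const override { return location; }

private:
    bool pipelineReady = false;
    PipelineStatus status;
    std::string location = "idle";
};
```

Keeping the interface free of transport details means the same calls could be backed by different RPC mechanisms during evaluation.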
# FE Configuration
TBD which process is responsible for loading FrontEnd configuration.
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/44
RPC mechanism
2021-03-25T20:44:53+01:00
Bruce Joseph Gallop
There are many RPC (remote procedure call) mechanisms. The one implemented in devel_Distributed is based on jsonrpc, encoded in msgpack.
At least during development it should be possible to switch between implementations (CORBA, HTTP, DCOP etc. (not real suggestions?)) to enable evaluation of different options.
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/45
Transport mechanism
2020-05-05T23:10:21+02:00
Bruce Joseph Gallop
This is related to distributed processing. This is about how data is transferred between processes. RPC might be implemented on top of this, but the low level should be used to transfer data.
Options include:
* TCP (simple and well understood)
* zeroMQ (extends to larger scale, allows for sending to and receiving from multiple end-points)
* UDP (direct, no arrival guarantees)
* Files (e.g. writing and reading a shared NFS disk). This is less efficient, but allows almost unbounded queuing when that's useful
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/46
Distributed process configuration
2020-05-05T23:10:04+02:00
Bruce Joseph Gallop
How to set up a set of processes to allow connections between them for distributed processing.
I've added a devel_remote_test branch (based on devel_Distributed branch), which sets up a server based on a small json file, describing hostname/ip and how that should be used to make a connection. This may need to be divided into client and server configurations. A similar configuration could be produced to allow connection of data input and output.
I suggest that this could be done under control of scanConsole (the "master" process) by adding an additional pipeline configuration, so for instance only one part of a chain could be run remotely. The servers should be as stateless as possible.
This doesn't quite answer the initial question of how to start processes running on remote hosts.
Distributed processing

https://gitlab.cern.ch/YARR/YARR/-/issues/52
RD53B: Two calibration slopes
2020-09-15T01:12:54+02:00
Timon Heim
RD53B has two Vcal slopes, need to implement those to be able to use them.
RD53B Implementation Advanced

https://gitlab.cern.ch/YARR/YARR/-/issues/60
Thread allocation
2023-01-27T17:40:44+01:00
Timon Heim
The current paradigm uses at least 2 threads per chip, but when changing the decoder from one for many chips to one for each chip this rises to 3 threads, i.e. a large number of threads will need to be created when running with many chips.
A potential solution for this is to register the ``process()`` functions of each ``DataProcessor`` in a thread pool.

https://gitlab.cern.ch/YARR/YARR/-/issues/62
FPGA stuck in command loop
2020-06-17T17:57:42+02:00
Simon Kristian Huiberts
This bug was observed in a default threshold scan on the linear FE.
It occurred when two parameters in the .json config file for the scan
were set to:
"time" = 60
"Count" = 0
It seemed like the FPGA would continuously send out Cal commands even after the scan was cancelled.
When starting a new scan, the scan would be stuck in mask Stage 0:
[10:58:02:778][ info ][ scanConsole ]: Starting scan!
[10:58:02:820][ info ][ Rd53aMaskLoop ]: ---> Mask Stage 0
The only thing that seemed to work was to reset the firmware by flashing it and rebooting.

https://gitlab.cern.ch/YARR/YARR/-/issues/64
Spaces in Name
2021-03-25T19:34:04+01:00
Timon Heim
Could lead to issues down the line, should this just work or should we check for them.

https://gitlab.cern.ch/YARR/YARR/-/issues/65
Add feature that allows FE class to query HwCtrl for specific features
2022-08-05T20:32:20+02:00
Timon Heim
Depending on FW or HW some FE configuration might need to be altered. Should implement a FE and HwCtrl agnostic channel for the FE to ask the HwCtrl if certain features are supported. Should probably use a string/JSON based exchange.
For example:
RD53A can be read out with different speeds and numbers of lanes depending on what the HW supports. This requires the FE config to be altered to the specific values.

https://gitlab.cern.ch/YARR/YARR/-/issues/67
Strip: Firmware LCB encoder and trickle memory
2021-04-15T12:59:29+02:00
Jia Jian Teoh
Demonstrate how to send commands to the FELIX (ITk-Strip) firmware LCB command encoder.
Branch [devel_FelixNetIO_StarChip_FWlcb](https://gitlab.cern.ch/YARR/YARR/-/tree/devel_FelixNetIO_StarChip_FWlcb).
Code is not optimized. Better design/restructuring is needed later.
For more info, refer to Sec 8.2.11 ITK STRIPS LCB ENCODER:
- https://gitlab.cern.ch/atlas-tdaq-felix/documents/-/blob/master/Phase2_FW_specs/FELIX_Phase2_firmware_specs.pdf
- https://gitlab.cern.ch/ezhivun/hcc-star-python-readout/
- for firmware help/support please contact Elena Zhivun - [ezhivun@bnl.gov](mailto:ezhivun@bnl.gov)
Tested using:
- FELIX-fw: FLX709(or 712)_STRIPS_2CH_CLKSELECT_GIT_firmware_strips_rm_4.10_rm-4.10_283_200713_12_51
https://cernbox.cern.ch/index.php/s/Y1ubzywZcEE0IWe
- FELIX-driver: https://atlas-project-felix.web.cern.ch/atlas-project-felix/user/dist/software/driver/tdaq_sw_for_Flx-4.5.0-2dkms.noarch.rpm
- FELIX-SW: https://atlas-project-felix.web.cern.ch/atlas-project-felix/user/dist/software/apps/4.x/felix-04.01.00.rc7-1.el7.cern.x86_64.rpm
Strips system tests
Jia Jian Teoh

https://gitlab.cern.ch/YARR/YARR/-/issues/71
RD53B NOCC and NOCC vs. threshold scans
2020-09-03T01:02:31+02:00
Maurice Garcia-Sciveres
Implement regular noise occupancy scan using regular readout. Send triggers without any injection and count hits coming out. Noise hits are low ToT so they should not be affected by the ToT bug. It would be good to fill 7/8 of the ToT memories for this scan and keep them filled to have low current. The 8th memory would then be the one collecting noise hits.
RD53B Implementation Advanced

https://gitlab.cern.ch/YARR/YARR/-/issues/72
RD53B FEscope scan
2020-09-03T01:03:03+02:00
Maurice Garcia-Sciveres
Implement a true PTOT threshold scan: fix injection and vary threshold. For each threshold value inject and read PTOT and PTOA.
For each hit, plot x=PTOA, y=threshold and x=PTOA+PTOT, y=threshold for individual pixels. This gives a scope plot of the 2nd stage output. Plotting this in a 2D histo with fine binning will be equivalent to dot display with infinite persistence on the scope. Plot also x=<PTOA>, y=threshold and x=<PTOA+PTOT>, y=threshold for all injections at a given threshold, which is equivalent to scope averaging mode.
RD53B Implementation Advanced

https://gitlab.cern.ch/YARR/YARR/-/issues/73
Fast and robust S-curve "fitter"
2023-12-18T23:31:01+01:00
Maurice Garcia-Sciveres
Calculate the S-curve mean and sigma without fitting. Use the bin-by-bin derivative of the S-curve, which is a Gaussian. The mean is
mean = sum{x.p(x)} = 0.5 + sum{i.(B(i+1)-B(i))}i=0,N-1 / sum{B(i+1)-B(i)}i=0,N-1,
where i is bin number, B(i) bin contents, N is number of bins, and I assume x = bin number + 0.5
mean = 0.5 + [ N.B(N) - sum{B(i)}i=1,N-1 ] / sum{B(i)}i=1,N-1
Noting that sum{B(i)}i=1,N-1 is total histogram contents, C, minus B(N),
mean = 0.5 + [ (N+1)B(N) - C ] / ( C - B(N) )
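As a cross-check of the idea, the defining sums from the first mean formula can be evaluated directly, without using the closed form. A minimal sketch: `scurveMoments` and `ScurveMoments` are hypothetical names, the input is the vector of bin contents B(0)..B(N), and x = bin number + 0.5 as assumed above. The same pass also accumulates the second moment needed for the RMS.

```cpp
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type: moments in units of bins.
struct ScurveMoments {
    double mean;
    double sigma;
};

// b[i] is the S-curve content of bin i. The weight of each step is the
// bin-to-bin difference b[i+1] - b[i], placed at x = i + 0.5.
std::optional<ScurveMoments> scurveMoments(const std::vector<double>& b) {
    if (b.size() < 2) return std::nullopt;
    double norm = 0.0, sumX = 0.0, sumX2 = 0.0;
    for (std::size_t i = 0; i + 1 < b.size(); ++i) {
        const double w = b[i + 1] - b[i];
        const double x = static_cast<double>(i) + 0.5;
        norm += w;
        sumX += x * w;
        sumX2 += x * x * w;
    }
    // A flat or falling curve has no usable normalisation.
    if (norm <= 0.0) return std::nullopt;
    const double mean = sumX / norm;
    const double var = sumX2 / norm - mean * mean;
    // Noisy bins can push the variance slightly negative; clamp before the sqrt.
    return ScurveMoments{mean, std::sqrt(var > 0.0 ? var : 0.0)};
}
```

For bin contents {0, 0, 50, 100, 100} the weights sit at x = 1.5 and x = 2.5 with equal size, giving a mean of 2.0 and a sigma of 0.5 bins.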
Similarly, sigma = RMS = sqrt[ sum(x^2.p(x)) - mean^2 ]
RD53B Implementation Advanced

https://gitlab.cern.ch/YARR/YARR/-/issues/79
ToT codes starting at 0 or 1
2022-04-28T02:33:35+02:00
Daniel Joseph Antrim
# What
`Rd53aDataProcessor` outputs `tot` codes starting at 1, not zero. In current implementation of `Rd53bDataProcessor`, the `tot` codes appear to start at 0.
We should probably move to the `Rd53a` interpretation to ensure that the analysis tools, etc are consistent.
Tagging @theim @yanght

https://gitlab.cern.ch/YARR/YARR/-/issues/82
Data format type information in stored raw data
2020-10-22T17:18:12+02:00
Daniel Joseph Antrim
As we get to including things like `ToA`, we should figure out how to indicate in the stored (raw) data what to expect in terms of what fields are present.
For example: we could add a single header at the top of the data files that contains a ~1 byte long field which specifies this type:
| Data Type | Type Indicator | Relevant For (?) |
| ------ | ------ | ------ |
| No timing info | 0x00 | Strips? |
| `ToT` | 0x01 | Fei4, Rd53a |
| `ToT` + `ToA` | 0x02 | Rd53b, ... |
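A reader for such a header could look roughly like this. It is a sketch of the proposal only: `HitFormat`, `parseHitFormat` and `timingBytesPerHit` are hypothetical names, and the 2-byte field width is an assumption taken from the "16 bits wide" suggestion in this issue, not a defined format.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Type indicators from the table above.
enum class HitFormat : std::uint8_t {
    NoTiming = 0x00,
    Tot = 0x01,
    TotToa = 0x02
};

// Decode the one-byte header field; unknown values are rejected.
std::optional<HitFormat> parseHitFormat(std::uint8_t indicator) {
    switch (indicator) {
        case 0x00: return HitFormat::NoTiming;
        case 0x01: return HitFormat::Tot;
        case 0x02: return HitFormat::TotToa;
        default:   return std::nullopt;
    }
}

// Bytes of timing payload per stored hit.
// Assumption: every timing field is 16 bits wide.
std::size_t timingBytesPerHit(HitFormat format) {
    constexpr std::size_t fieldBytes = 2;
    switch (format) {
        case HitFormat::NoTiming: return 0;
        case HitFormat::Tot:      return fieldBytes;
        case HitFormat::TotToa:   return 2 * fieldBytes;
    }
    return 0;
}
```

Rejecting unknown indicators lets an older reader fail cleanly on a file written with a newer type instead of misreading hit boundaries.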
The type indicator would more or less indicate the data lengths of the stored hits, which is needed for iterating over the data file. Simplest would be to assume the same width for each field that is "wide enough" to future-proof the format (e.g. `ToT` and `ToA` both 16 bits wide).

https://gitlab.cern.ch/YARR/YARR/-/issues/83
Query about TxCore shutdown
2022-07-21T16:43:29+02:00
Bruce Joseph Gallop
One of the features of the Strips ASICs is the output of HPR packets, which are sent every 1 ms.
The current scanConsole code does most of the shutdown of processing in reverse, but it calls TxCore::disableCmd and TxCore::disableRx after finishing the analysis. This means that HPRs that arrive after the data processor has been shut down are lost (also the memory leaks).
I was wondering whether this is something that would benefit from moving somewhere else. So, should disableRx be done before the data processor shutdown?

https://gitlab.cern.ch/YARR/YARR/-/issues/101
Build integration with tdaq
2023-10-26T18:04:05+02:00
Bruce Joseph Gallop
Eventually we want to build against tdaq libraries. This should always be optional so that it's not required for a test system.
The problem is that tdaq cmake involves custom cmake includes which seem to take over much of the process. Is there a way to make this optional and at the same time keep much of the build common? For instance, could there be a top-level cmake with tdaq integration that sets up library paths for the rest of the build?
Use TDAQ build system and packages

https://gitlab.cern.ch/YARR/YARR/-/issues/103
Implement JSON schema and validation
2021-03-25T20:23:28+01:00
Daniel Joseph Antrim
# What
Right now we don't have any well defined schema for any of the various JSON files that are used as input to various places in YARR. This can lead to issues (e.g. erroneously missing/misspelled registers for FrontEnd configs, incomplete scan configurations, etc...) and also to ill-defined requirements (e.g. does the FrontEnd config provide the minimum set of requirements necessary for this FrontEnd object?).
# Potential C++ Solution
Use something like [json-schema-validator](https://github.com/pboettch/json-schema-validator) to run a validation step on passed in JSON objects.
JSON schemas can be parsed via any language which supports JSON (all of them). The JSON schema spec is defined [here](https://json-schema.org/implementations.html).
# The Way It Works in C++
You define a (set of) JSON schema(s) and then do:
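```c++
#include <fstream>
#include <iostream>
#include <string>
#include <nlohmann/json.hpp>
#include <nlohmann/json-schema.hpp>

using nlohmann::json;
using nlohmann::json_schema::json_validator;
...
json config_to_validate;
json_validator validator;
std::string schema_filepath = ...;
// Parse the contents of the schema file, not the path string itself
std::ifstream schema_file(schema_filepath);
validator.set_root_schema(json::parse(schema_file));
try {
    validator.validate(config_to_validate);
} catch (const std::exception& e) {
    std::cout << "Whoops! Config fails schema validation check with error: " << e.what() << std::endl;
}
```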
# Potential Issue
Generally, the default is to use `variant` instead of `nlohmann::json`. I haven't checked if the schema validation will work against `variant` objects, and do not know if that would pose any issue -- at least at the usual runtime.
In any case,
- In CI, a JSON validation stage can be added and the YARR libs can be compiled against `nlohmann::json` in this stage
- The schema is independent of the JSON implementation, so depending on where the validation is run the `variant` vs `nlohmann::json` issue might not arise. For example, running the JSON validation stage from within a higher-level python interface prior to execution of scans (which potentially link against `variant`). From the python stage, there would be no need to link against a C++ library for the JSON validation that depends on `nlohmann::json`, since there is already the `jsonschema` module.

https://gitlab.cern.ch/YARR/YARR/-/issues/104
BookKeeper feList is vector of raw pointers
2021-03-26T18:49:58+01:00
Daniel Joseph Antrim
# What
The method [getFrontEnd](https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libYarr/include/AllChips.h#L13) returns an instance of `std::unique_ptr<FrontEnd>`, which is good. In most of the cases where `getFrontEnd` is called it treats the returned heap-allocated `FrontEnd` as intended. However, for scans where we rely mostly on the `Bookkeeper` object for handling the list of `FrontEnd` objects we more or less do this:
```c++
std::vector<FrontEnd*> feList;
...
feList.push_back(getFrontEnd("type").release());  // ownership handed over as a raw pointer
```
That is, we re-gain access to the raw pointer in order to satisfy `Bookkeeper`'s `std::vector<FrontEnd*>` container. We then rely on `Bookkeeper`'s destructor to clean up the memory/etc. Generally, it's 2017 (at least) and we should avoid using raw pointers where not absolutely necessary. I think this was the intent with `getFrontEnd` returning `std::unique_ptr`, however `scanConsole` did not adapt itself to handle that.
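For comparison, a container that keeps the `std::unique_ptr` could look like the sketch below. The `FrontEnd`, `getFrontEnd` and `FrontEndList` definitions here are stand-ins written for this example only (the real classes live in YARR and differ); the `alive` counter exists just to make the automatic clean-up visible.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for illustration; not the YARR FrontEnd class.
struct FrontEnd {
    explicit FrontEnd(std::string n) : name(std::move(n)) { ++alive; }
    FrontEnd(const FrontEnd&) = delete;
    FrontEnd& operator=(const FrontEnd&) = delete;
    ~FrontEnd() { --alive; }
    std::string name;
    static inline int alive = 0;  // number of live instances (C++17 inline variable)
};

// Stand-in factory with the same return type as described above.
std::unique_ptr<FrontEnd> getFrontEnd(const std::string& type) {
    return std::make_unique<FrontEnd>(type);
}

// Hypothetical owner: the list holds the unique_ptrs, so no release() and no delete.
class FrontEndList {
public:
    void add(std::unique_ptr<FrontEnd> fe) { feList.push_back(std::move(fe)); }
    std::size_t size() const { return feList.size(); }
    FrontEnd& at(std::size_t i) { return *feList.at(i); }

private:
    std::vector<std::unique_ptr<FrontEnd>> feList;
};
```

Callers that only need to use a chip get a reference from `at()`, while ownership stays in one place and the objects are destroyed when the list goes out of scope.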
# Fixes
We should rethink the memory access pattern for the `FrontEnd` objects that are passed around via `Bookkeeper`. As a given `FrontEnd` object should be exclusively owned by a single entity at a time (i.e. nobody should be attempting to configure the same chip at once), `std::unique_ptr` is probably the correct thing to do (following the intent of `getFrontEnd`). This will enforce the intent of single-access to each of the `FrontEnd` objects, make concurrency simpler, and simplify the clean-up (calls to `new` and `delete` don't need to be used anymore :)).

https://gitlab.cern.ch/YARR/YARR/-/issues/106
dbAccessor chipID: 0 when downloading module and chip configs
2021-04-09T11:39:34+02:00
Lingxin Meng
Hi,
when we test a new module I would use dbAccessor -C to register the module with the chip ID and SN, then use dbAccessor -D to retrieve the connectivity config and copy&paste the on-screen instructions to create the chip configs. In the newly created chip configs all the chipIDs are 0, even though they have correct values in the connectivity config. The problem is that the chip configs overwrite the correct chip ID in the module config.https://gitlab.cern.ch/YARR/YARR/-/issues/108RD53A/StarChip NetIO Summary2021-06-15T10:49:11+02:00Timon HeimRD53A/StarChip NetIO SummaryCould users/developers please state which branch they are currently using for RD53A and StarChips readout via NetIO and what kind of issues they encounter?
The goal is to identify all valid branches and find a strategy to merge them or at least identify tasks to be performed in order to merge them.
@isiral @mtrovato @zixu @mwensing @wittgen @bgallop @arnaezhttps://gitlab.cern.ch/YARR/YARR/-/issues/114Config handling with DB2021-07-15T17:49:58+02:00Timon HeimConfig handling with DBTo be filledhttps://gitlab.cern.ch/YARR/YARR/-/issues/116RD53A reg_read output file format2021-11-17T22:45:03+01:00Lingxin MengRD53A reg_read output file formatWe need an output file for the reg_read scans that's also compatible with prodDB. Maybe we can brainstorm a bit here how this should be best done.
@isiralhttps://gitlab.cern.ch/YARR/YARR/-/issues/117Rename default branch to "main"2021-10-06T17:33:02+02:00Daniel Joseph AntrimRename default branch to "main"I believe we should change the default branch name from the outdated `master` to `main`: see [github](https://github.blog/changelog/2020-10-01-the-default-branch-for-newly-created-repositories-is-now-main/) and [gitlab](https://about.gitlab.com/blog/2021/03/10/new-git-default-branch-name/) which now use this as the default by default.https://gitlab.cern.ch/YARR/YARR/-/issues/119RD53A FE specific analog scans disable pixels of other FEs2021-12-03T19:03:27+01:00Lingxin MengRD53A FE specific analog scans disable pixels of other FEsIf FE-specific analog scans, e.g. syn_analogscan or lindiff_analogscan, are executed, they will disable the pixels in the non-activated FEs. This is problematic since these scans are used for masking before source scans.https://gitlab.cern.ch/YARR/YARR/-/issues/122CMAKE and library overhaul2023-10-26T18:04:05+02:00Timon HeimCMAKE and library overhaulRelated to !413 and #118
Tasks to be performed on release candidate branch for v1.4: tbd
List of tasks to be performed:
- [x] resolve cyclic dependencies
- [ ] possibility to compile w/o any library except Yarr and Util
- [ ] bump cmake version
- [ ] bump tbb version
- [ ] external package handling (FELIX, Rogue), specifically compiling against FELIX from TDAQ library or cvfsv1.4https://gitlab.cern.ch/YARR/YARR/-/issues/123Star scans2023-10-05T18:37:05+02:00Bruce Joseph GallopStar scansThis is a summary of scans that are needed for strips and their current status. Most have been implemented in some form in YARR, but not all:
* Noise occupancy
* Initial counter based scan in devel
- [ ] Debug/verify for multiple ABCs
- [ ] Allow for scan with more than 255 hits per bin (send triggers in bursts)
- [ ] Variable trigger count (eg 100 at low threshold to 1e6 at high threshold) see !366
- [ ] Analysis to compare with itsdaq
* Strobe delay
- [x] Merge initial scan config !465
- [x] Analysis with feedback (strobe delay) to next scans !564
* Could this be based on two scurve fits with particular range cuts?
- [x] In depth analysis with plotting for comparison to itsdaq
* Pedestal trim
- [ ] Need scan config
- [ ] Analysis with feedback (trim) to next scans
* Or do we need a script so that info from "OccPixelThresholdTune" is carried over to subsequent scans?
* Response curve
- [x] Link to work already done
- [ ] Analysis / plotting (done in devel_SR1 !663, still needs to be ported to devel)
* nmask
- [X] Scan config in devel
- [ ] A generic plotting algorithm (2D scan plots in itsdaq) should produce triangles
* Trim
- [X] Note work already done (tuning scans in devel_FelixNetIO_StarChip)
- [ ] Port to devel with updates !667
- [ ] Implement scan config for scans similar to itsdaq
- [ ] Analysis / plotting (done in devel_SR1 !663 and !667)
* Digital (register/HPR) scans
- [x] See also !340 here, which adds some low level testing routines in the test_star binary
- [ ] Scan lpGBT phases while reading HCC registers/HPR
- [ ] Scan HCC phases while reading ABC registers/HPR/trigger data
Also other fixes:
- [x] !412 for configuring/resetting multiple HCCs.
- [ ] Processing of register data (via StarDataProcessor)
- [ ] Link to other issues
More plotting:
- [ ] Noise for whole stave on one page
- [ ] Similar for petal (needs more thought to layout)Strips integrationhttps://gitlab.cern.ch/YARR/YARR/-/issues/125Automatically record stdout/stderr in output directory2024-01-22T16:55:42+01:00Matthias SaimpertAutomatically record stdout/stderr in output directorynow I run all my testing sequence with `myTestingSequence.sh 2>&1 | tee log.txt` but IMO it would be great to include by default the stdout and stderr as separate files in the output directory of each scan.Timon HeimTimon Heimhttps://gitlab.cern.ch/YARR/YARR/-/issues/126Star mask bit flip2023-10-05T18:11:35+02:00Zhengcheng TaoStar mask bit flipIn the `StarMaskLoop` on devel branch, the bits in `ChannelRings` are flipped [here](https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libStar/include/StarMaskLoop.h#L57-60) when writing to ABCStar `MaskInput` registers. An example from StarMaskLoop log:
```
[12:11:09:691][ debug ][ StarMaskLoop ]: -> Mask stage 0
[12:11:09:691][ debug ][ StarMaskLoop ]: ChannelRings:
Ring content: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.
Ring content: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.
[12:11:09:691][ debug ][ StarMaskLoop ]: Apply masks
[12:11:09:691][ debug ][ StarMaskLoop ]: Apply masks:
write mask: [ 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff]
2nd row: 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
1sr row: 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
```
This was probably done on purpose and for nmask scans, but I found it a bit counterintuitive when adding the mask loop to an analog scan. For example
```
{
...
"nMaskedStripsPerGroup": 7,
"max": 8,
...
}
```
with `ABCs_TM = 0` actually means that, for every eight strips, only one is actually masked and the other seven will get pulses if they are enabled.
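To make the inversion concrete, here is a minimal Python sketch of the effect described above. It is not the YARR implementation: `build_ring` and `flip_bits` are hypothetical helpers, the position of the set bits inside each group is an assumption, and the 256 strips are taken from the 8 x 32-bit write mask shown in the log.

```python
N_STRIPS = 256  # 8 x 32-bit mask words, as in the log above


def build_ring(n_per_group, group_size, n_strips=N_STRIPS):
    """Hypothetical ring: the first n_per_group bits of every group are 1."""
    return [1 if (i % group_size) < n_per_group else 0 for i in range(n_strips)]


def flip_bits(bits):
    """Invert every bit, mimicking the flip applied before the register write."""
    return [1 - b for b in bits]


ring = build_ring(n_per_group=7, group_size=8)  # nMaskedStripsPerGroup=7, max=8
written = flip_bits(ring)

print("ring    :", "".join(map(str, ring[:8])))     # 7 of 8 bits set
print("written :", "".join(map(str, written[:8])))  # 1 of 8 bits set
```

Under these assumptions the stage-0 case from the log follows as well: an all-zero ring flips to all ones, i.e. eight words of `0xffffffff`.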
The bits in the masks will always have different/opposite meanings depending on the value of `TM` regardless, but I feel it might be less confusing if the meaning of the bits in `ChannelRings` are consistent with that in the ABCStar `MaskInput` registers. What do you think?https://gitlab.cern.ch/YARR/YARR/-/issues/130Closure test for all from/to json functions2022-01-25T18:39:09+01:00Timon HeimClosure test for all from/to json functionsShould implement a closure test for all from/to json functions. Specifically:
- Histo1d/2d/3d, Graphhttps://gitlab.cern.ch/YARR/YARR/-/issues/131Crashes / memory issues of scanConsole2022-08-01T08:17:43+02:00Matthias SaimpertCrashes / memory issues of scanConsoleI thought it would be a good idea to create an issue about this since now several groups are reporting similar things (tagging @gstark and @simobius, feel free to add your personal experiences here)
### Example
- 45hrs synFE source scan of #Paris10, a module which returns many readout errors when reading synFE (freq changes a lot but typically several per 5min with some bursts) started on Friday
- DAQ machine starts to be extremely slow and unresponsive at 22h36 on Saturday (after 30hrs of scan or so), FE are power cycled and HV is set to 0V by interlock but I cannot SSH the machine so cannot stop the scan, however monitoring is back (DAQ and DCS are on the same machine) so I guess this reduced the load of the machine?
- new loss of monitoring at 1h14 on Sunday (due to the unresponsiveness of the machine I guess?), this time the interlock cuts down the power of LV/HV. I still cannot SSH the machine so things are left like this (noise scan still running).
- HW reboot of the machine on Sunday at 20h11, apparently due to scanConsole (below is the backtrace of scanConsole, extracted by Saclay IT). I do not enclose the entire `/var/crash` repo here because it is 2 Gb but can share it with CERNbox if useful.
```
crash> bt
PID: 5019 TASK: ffff8ee78b2aa100 CPU: 2 COMMAND: "scanConsole"
#0 [ffff8eedbc483b40] machine_kexec at ffffffff9c2662c4
#1 [ffff8eedbc483ba0] __crash_kexec at ffffffff9c322a32
#2 [ffff8eedbc483c70] crash_kexec at ffffffff9c322b20
#3 [ffff8eedbc483c88] oops_end at ffffffff9c98d798
#4 [ffff8eedbc483cb0] no_context at ffffffff9c275d14
#5 [ffff8eedbc483d00] __bad_area_nosemaphore at ffffffff9c275fe2
#6 [ffff8eedbc483d50] bad_area_nosemaphore at ffffffff9c276104
#7 [ffff8eedbc483d60] __do_page_fault at ffffffff9c990750
#8 [ffff8eedbc483dd0] do_page_fault at ffffffff9c990975
#9 [ffff8eedbc483e00] page_fault at ffffffff9c98c778
[exception RIP: __list_add+15]
RIP: ffffffff9c5a668f RSP: ffff8eedbc483eb0 RFLAGS: 00010087
RAX: ffff8eed2ba21cd8 RBX: 0000000000000001 RCX: 0000000000000002
RDX: 0000000000000000 RSI: ffff8eed2ba21ce8 RDI: ffffdcff8112a3a0
RBP: ffff8eedbc483ec8 R8: 0000000000000001 R9: 000000000000028e
R10: ffff8eedbc7d9868 R11: 0000000000000001 R12: 0000000000000000
R13: ffff8eed2ba21ce8 R14: ffff8eedbc7d9800 R15: ffffdcff8112a380
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#10 [ffff8eedbc483ed0] free_pcppages_bulk at ffffffff9c3c7571
#11 [ffff8eedbc483f68] drain_pages at ffffffff9c3c78cb
#12 [ffff8eedbc483f98] drain_local_pages at ffffffff9c3c78f5
#13 [ffff8eedbc483fa8] flush_smp_call_function_queue at ffffffff9c316d13
#14 [ffff8eedbc483fd0] generic_smp_call_function_single_interrupt at ffffffff9c317413
#15 [ffff8eedbc483fe0] smp_call_function_interrupt at ffffffff9c2598bd
#16 [ffff8eedbc483ff0] call_function_interrupt at ffffffff9c99854a
--- ---
#17 [ffff8ee71d543a98] call_function_interrupt at ffffffff9c99854a
[exception RIP: get_page_from_freelist+679]
RIP: ffffffff9c3c8567 RSP: ffff8ee71d543b40 RFLAGS: 00000246
RAX: 0000000000000011 RBX: ffff8ee71d543b18 RCX: 000000000001fc25
RDX: 000000000001fcff RSI: 000000000000001f RDI: 0000000000000246
RBP: ffff8ee71d543c48 R8: fffffffffffffff2 R9: ffffffff9d240ae8
R10: 00000000000c2dc8 R11: 0000000000100000 R12: ffffffff9ccb2140
R13: ffffffff9c29b2ff R14: ffff8ee71d543b08 R15: ffff8eedbc7d9800
ORIG_RAX: ffffffffffffff03 CS: 0010 SS: 0000
#18 [ffff8ee71d543c50] __alloc_pages_nodemask at ffffffff9c3c8f04
#19 [ffff8ee71d543d80] alloc_pages_vma at ffffffff9c41cc49
#20 [ffff8ee71d543de8] handle_mm_fault at ffffffff9c3f6837
#21 [ffff8ee71d543eb0] __do_page_fault at ffffffff9c990653
#22 [ffff8ee71d543f20] do_page_fault at ffffffff9c990975
#23 [ffff8ee71d543f50] page_fault at ffffffff9c98c778
RIP: 00007f6f53677e20 RSP: 00007f6f508eac60 RFLAGS: 00010216
RAX: 00007f6f3055b440 RBX: 00007f6f3055b2d0 RCX: 00007f6f3055b440
RDX: 0000000000011978 RSI: 00007f6f305f1440 RDI: 0000000000000000
RBP: 0000000000012c00 R8: 00000000005f2000 R9: 0000000000096000
R10: 000000000000007e R11: 0000000000001000 R12: 00007f6f3055b3a0
R13: 0000000000012c00 R14: 00007f6f305f1440 R15: 00007f6f441fd260
ORIG_RAX: ffffffffffffffff CS: 0033 SS: 002b
```
### General
These crashes from YARR are now starting to become one of the bottlenecks for the QC of RD53A modules at Saclay. Full list we had below (`/var/crash`) since we started QC:
```
127.0.0.1-2021-09-01-11:01:12
127.0.0.1-2021-09-03-17:56:00
127.0.0.1-2021-09-21-18:26:47
127.0.0.1-2021-10-25-18:20:00
127.0.0.1-2021-10-30-15:50:16
127.0.0.1-2021-11-02-14:47:47
127.0.0.1-2021-11-08-09:39:41
127.0.0.1-2021-11-09-22:17:49
127.0.0.1-2021-12-01-12:22:15
127.0.0.1-2021-12-12-23:23:15
127.0.0.1-2021-12-21-06:53:00
127.0.0.1-2021-12-22-12:48:55
127.0.0.1-2022-01-13-03:58:35
127.0.0.1-2022-01-23-20:11:23
127.0.0.1-2022-01-24-16:21:46
127.0.0.1-2022-01-24-16:52:43
```
the last 2 crashes (2022-01-24) were due to using the `-l` option when running `scanConsole`, see https://gitlab.cern.ch/YARR/YARR/-/issues/125#note_5180707 but the ones before happened running `scanConsole` without the `-l` option.
the vast majority (if not all) of these crashes happened either running a new scan after a previous scan led to a segfault of scanConsole (with no useful msg to report here, no particular defect of the module being tested) or during a noise scan with readout errors like this:
`[ error ][Rd53aDataProcessor]: [1] Received data not valid: 0xc4e0`
I can post more backtrace / crash repos / etc if useful
Thank you for your help with this issuehttps://gitlab.cern.ch/YARR/YARR/-/issues/138Prescan config that's different than chip config2022-02-07T17:59:06+01:00Lingxin MengPrescan config that's different than chip config- What happens if the prescan config has different settings than the same register in the chip config?
- How long does the prescan config need to settle before the scan can/should be run?
For the ADC calibration of RD53A modules it's required to probe and read the values for Vcal_med/hi at 500/3500. These registers are in the chip config, as well as in the prescan config. When testing something I swapped these values in either case, e.g. Vcal_med at 500/3500 in chip/prescan config or vice versa. The result is that reg_readmux reads some value around 2000 ADC counts, which is in the middle of these two settings. I have the suspicion that the register is read before the prescan setting is fully set.
Basically, shall a sleep function (and for how long) be used for the prescan settings to take effect? Or are there more elegant ways?Timon HeimIsmet SiralTimon Heimhttps://gitlab.cern.ch/YARR/YARR/-/issues/140Axis labels for strips2023-10-05T18:15:16+02:00Bruce Joseph GallopAxis labels for stripsThere are several axis labels with something like "Number of Pixels". This is probably not appropriate for strips.
Maybe the appropriate name could be added to `FrontEndGeometry` and then passed in the same way as `setMapSize`.https://gitlab.cern.ch/YARR/YARR/-/issues/141Possibility to record average ToT of pixels during source scan2022-09-29T00:28:33+02:00Matthias SaimpertPossibility to record average ToT of pixels during source scanFollowing the nice discussion we had at the pixel module session this week (*), the point was made by @theim that recording the average ToT during a source scan may allow to identify "half-disconnected" or "fragile" bumps highlighted by crosstalk-based/threshold-based scans but not visible on source scans.
So I open this issue to discuss the possibility to add this in the output of noise scans. Not sure about the technical feasibility of this in terms of FW/SW. IMO ideally it should be off by default but could be turned on by a flag in the scan config.
(*) https://indico.cern.ch/event/1065545/contributions/4762712/attachments/2404768/4113921/RD53A_review_electricalDAQ_20220309.pdfHongtao YangHongtao Yanghttps://gitlab.cern.ch/YARR/YARR/-/issues/142RD53a tune scripts succeed even when the processed emulated data are reported...2022-03-22T22:56:16+01:00Matthias WittgenRD53a tune scripts succeed even when the processed emulated data are reported corruptWhen running
`scripts/test_tunings.sh` from the branch
`devel_rd53a_felixNetio_multichip_rebase_master`
the test tuning spits various decoding errors
```
[14:29:10:294][ error ][Rd53aDataProcessor]: [0] Received data not valid: 0x1
[14:29:10:294][ error ][Rd53aDataProcessor]: [0] Received data not valid: 0x1
[14:29:10:294][ error ][Rd53aDataProcessor]: [0] Received data not valid: 0x1
```
the CI pipeline succeeds.
The basic tests should count the errors encountered and fail the CI pipelines.https://gitlab.cern.ch/YARR/YARR/-/issues/144Histogram offset (base 1 vs base 0)2023-10-05T18:14:42+02:00Bruce Joseph GallopHistogram offset (base 1 vs base 0)Now we're getting closer to comparing strip histograms I realised I should raise the issue of how to count channels again.
In strips, we count from 0 (which matches the ASICs), but the current analyses build histograms with base 1.
Is this something that is likely to be changed for Pixels, or do we need to add a flag to the FrontEndGeometry class?https://gitlab.cern.ch/YARR/YARR/-/issues/146Preallocate event data vector2022-04-27T23:01:50+02:00Timon HeimPreallocate event data vectorIt was found that processing speed improves if the vector storing event data and hits is pre-allocated in a sensible way.
Need further study to understand this.https://gitlab.cern.ch/YARR/YARR/-/issues/147Read Star EFuses2022-04-28T13:06:40+02:00Bruce Joseph GallopRead Star EFusesSee implementation for RD53B in !370.
In HCCStar, we can use the fuse ID from HCC to set up the communication address,
allowing DAQ to pick appropriate addresses.https://gitlab.cern.ch/YARR/YARR/-/issues/148Refactor DataArchiver and HistogramArchiver to not write to file directly2022-04-28T18:33:56+02:00Timon HeimRefactor DataArchiver and HistogramArchiver to not write to file directlyHistogrammer or analyses should not write to file directly to stay compliant with a model where no file access is possible.
Specifically affects:
- ``DataArchiver`` (Histogrammer)
- ``HistogramArchiver`` (Analysis)
A potential way could be to collect file streams in a dedicated class to write them out there.https://gitlab.cern.ch/YARR/YARR/-/issues/150Scan progress monitor2022-05-03T00:24:57+02:00Timon HeimScan progress monitorCurrently there is no clear strategy how to monitor the progress of a scan. We are relying on printouts from loops, which is up to developers to determine and is not very clean for all potential use cases/scans.
A long time ago I experimented with a progress monitor system which uses the LoopStatus from the scan engine to report the progress in form of a simple progress bar. This worked but can be messy when combined with other printouts.
There is this solution for spdlog, which might be interesting:
https://github.com/michalber/spdmon
Cheers,
Timonhttps://gitlab.cern.ch/YARR/YARR/-/issues/153Generate default configs as post-commit hook?2022-05-24T19:01:08+02:00Timon HeimGenerate default configs as post-commit hook?Example configs stored in the ``configs/default`` folder (potentially misleading naming, as they are generated from the defaults, they are not THE defaults) can get out of date quickly.
It's good to have these configs to have an example to point to when looking something up as it can be cryptic to look at the code.
Perhaps these configs could be generated via a post-commit hook?
I don't think artifacts can be added to the repo, right?https://gitlab.cern.ch/YARR/YARR/-/issues/155Print default values when config not found2022-11-30T00:08:20+01:00Julien GiraudPrint default values when config not foundsuggest to expand this line:
https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libRd53b/Rd53bGlobalCfg.cpp#L426
to actually print the default values used when explicit config is not provided
tagging @msaimperhttps://gitlab.cern.ch/YARR/YARR/-/issues/156Strip HPR processing2022-07-26T11:45:52+02:00Bruce Joseph GallopStrip HPR processingBoth the HCC and ABCs regularly (1kHz) send HPR packets which
contain information about clock validity and command errors.
This should be read and changes noted.
Related to #83.Strips system testshttps://gitlab.cern.ch/YARR/YARR/-/issues/157Inter-segment cross-talk2022-07-21T17:04:22+02:00Bruce Joseph GallopInter-segment cross-talkEach lpGBT can address 4 CCR segments. It would be interesting to be able to
see the effect of triggering or readout on one CCR as a noise source into the
others.Strips system testshttps://gitlab.cern.ch/YARR/YARR/-/issues/160ITkpix ToT Memory Test2022-08-02T18:00:17+02:00Lingxin MengITkpix ToT Memory TestNot existing yet, to be developed.https://gitlab.cern.ch/YARR/YARR/-/issues/161Updates for PPA/PPB2022-08-20T00:25:07+02:00Bruce Joseph GallopUpdates for PPA/PPBAfter !542 some things still need updating for PPA/PPB.
In particular:
* Read register (StarRegDump.cpp)
* test_star (uses explicit register values for pulse input setup)https://gitlab.cern.ch/YARR/YARR/-/issues/164Output file doesn't contain frontEnd name (or ID)2022-08-16T15:10:09+02:00Bruce Joseph GallopOutput file doesn't contain frontEnd name (or ID)This came up in the context of !432, but I think is a more general issue. Currently the name is to be found in the histogram file name, but not inside.
At some point a reference to the configuration used might be useful as well, but that's for later.https://gitlab.cern.ch/YARR/YARR/-/issues/168Make sure scanConsole and write-register command return value follow the same...2023-10-19T17:36:54+02:00Elisabetta PianoriMake sure scanConsole and write-register command return value follow the same convention (<0 error)https://gitlab.cern.ch/YARR/YARR/-/issues/169Missing checkCom for StarChips2022-11-04T14:13:00+01:00Bruce Joseph GallopMissing checkCom for StarChipsCurrently checkCom is not implemented in the StarChips class.
This means that scanConsole will assume communication is possible.
The code should report quickly as it's implementation
A simple read register might be enough, but it could be useful to verify other things at this stage.
Things that could be part of this check:
* Read HPRs (check lock/sync etc)
* Read an HCC register (could be address register, but fuse check is separate)
* Write value associated with communications ID to an ABC register
* Read register and check input channel mapping (in other words checking we can feed back histogram data to the right ABC)
* Restore value
* Send trigger readout request
Much of this could be done in a single sequence (depending on the "restore register value" bit being well known).https://gitlab.cern.ch/YARR/YARR/-/issues/170Increase FrontEnd register size to 32bit to match Star registers2022-11-01T17:46:19+01:00Timon HeimIncrease FrontEnd register size to 32bit to match Star registersIncrease FrontEnd register size to 32bit to match Star registers.https://gitlab.cern.ch/YARR/YARR/-/issues/171Make the doxygen pages look better2022-11-02T15:30:41+01:00Bruce Joseph GallopMake the doxygen pages look betterDoxygen output is now available here:
https://yarr.web.cern.ch/doxygen/devel/html/index.html
Currently, many classes are not effectively documented.
Note that some things are more suitable to be documented via the docs folder.https://gitlab.cern.ch/YARR/YARR/-/issues/172Verify scan description2022-11-09T14:53:49+01:00Bruce Joseph GallopVerify scan descriptionIt might be useful for scan console to look at the description of the scan and warn about things being missing/out of order.
Examples of things to warn about might be:
* Multiple trigger loops
* Multiple data loops
* (Strips) register counter loop in the wrong place
* Parameter loop inside data/trigger loop
* Missing Trigger loop
This should help with the "Trigger is not enabled, will get stuck here!" error in StdDataLoop.https://gitlab.cern.ch/YARR/YARR/-/issues/173Differences between YARR and ITSDAQ2023-10-02T10:28:48+02:00Elise Maria Le Boulicaut EnnisDifferences between YARR and ITSDAQSome detailed studies were done to compare scan results between YARR and ITSDAQ. Some of the main findings possibly requiring action are listed below:
- The trimming algorithm is different between YARR and ITSDAQ. In YARR (!490 ), we scan over TrimDAC and BVT and we optimize the target BVT value such that a maximum number of channels can be trimmed to it. We then select the TrimDAC for each channel such that they get as close as possible to the target. There is also an option to optimize the trim range (BTRANGE), although in practice it is fixed to 6. In the ITSDAQ pedestal trim (code [here](https://gitlab.cern.ch/atlas-itk-strips-daq/itsdaq-sw/-/blob/master/macros/abc_star/NoiseTrimPlot.cpp)), BVT and BTRANGE are fixed to 15 and 6, respectively. TrimDAC values are scanned and the optimal ones are taken to be those that lead to an occupancy as close as possible to 50%. In principle, these two approaches should be equivalent, provided the trim target optimized in YARR is close to 15. It was found however, that these targets were higher than 15, leading to higher TrimDAC values. As a test, the YARR algorithm was modified such that the chosen trim target is as close as possible to 15 while still maximizing the number of trimmable channels (see [this commit](https://gitlab.cern.ch/arnaez/YARR/-/commit/4f3670a6f1cf7e9f72f2bd897e8f44ce81470914)). The results were then found to be very similar between YARR and ITSDAQ. Below are plots showing a comparison of optimal TrimDAC values before and after the modification in the YARR algorithm.
![trim_comparison_mod13_defaultAlg_YARR_007355_vs_ITSDAQ_20220902](/uploads/c90f12ec1e4c822e0132fe83b785ada0/trim_comparison_mod13_defaultAlg_YARR_007355_vs_ITSDAQ_20220902.png)
![trim_comparison_mod13_modifiedAlg_YARR_007399_vs_ITSDAQ_20220920](/uploads/5d4d8d5f5e0c00ee00b70c8bdb409a4d/trim_comparison_mod13_modifiedAlg_YARR_007399_vs_ITSDAQ_20220920.png)
It can also be noted that this modification led to smaller differences in the N-point gain results.
The question is then whether we want to keep the optimization of the target as originally implemented, or if we want to switch to the approach that is closer to ITSDAQ.
- In the N-point gain scan, the conversion of injected charge values from DAC to fC is different between YARR and ITSDAQ. In YARR, a conversion factor of 0.0195 is used (see [this line](https://gitlab.cern.ch/arnaez/YARR/-/blob/devel_SR1/src/libStar/StarAnalysis.cpp#L142)), whereas in ITSDAQ the conversion is hard-coded for each charge, based on simulation (see [here](https://gitlab.cern.ch/atlas-itk-strips-daq/itsdaq-sw/-/blob/master/macros/abc_star/ThreePointGain.cpp#L46)). As a test, the same "hard-coded" conversion was used in YARR. This led to a reduction in the differences in gain. Below are plots showing the gain distribution for the same module, first without modifying the trimming algorithm or the charge conversion, then modifying only the trimming, then modifying both the trimming and the charge conversion.
![Gain_13_ITSDAQ_5_vs_YARR_007357](/uploads/478febfdde59bece6a5ab8715ff62771/Gain_13_ITSDAQ_5_vs_YARR_007357.png)
![Gain_13_ITSDAQ_5_vs_YARR_007444](/uploads/f7eb7e0f4e1f8d370eebe8e3b764998a/Gain_13_ITSDAQ_5_vs_YARR_007444.png)
![Gain_13_ITSDAQ_5_vs_YARR_008194](/uploads/0f81acd27f8eea27ba720bfa412930fd/Gain_13_ITSDAQ_5_vs_YARR_008194.png)
The charge conversion issue is addressed in Zhengcheng's MR !584
- The noise occupancy scan has many differences between YARR and ITSDAQ. The YARR code is found in MR !497 and the ITSDAQ code is [here](https://gitlab.cern.ch/atlas-itk-strips-daq/itsdaq-sw/-/blob/master/macros/abc_star/NOPlot.cpp). The image below shows a comparison of the noise curves, which is the log of the mean relative occupancy vs threshold squared (note there seems to be a bug in the ENC calculation in ITSDAQ):
![comparison_NO_08.31.2022](/uploads/526d78418a22c0c8bdd307483b83ce9d/comparison_NO_08.31.2022.png)
Firstly, because this scan uses the results of a previous N-point gain scan, any differences there will translate to different mV to fC conversions. The second difference which significantly affects the scan is the triggering. ITSDAQ sends bursts of 248 triggers with zero spacing, which effectively corresponds to a short trigger burst at 40 MHz. YARR sends trigger commands one-by-one at a frequency of 15 kHz (configurable). Hence, the maximum number of triggers can be made much higher in ITSDAQ (64 million) compared to YARR (usually 1 million). The way that error bars are calculated for the noise curve is also different.
YARR:
![YARR](/uploads/33f56ac90a0b8a345e4e380f442eb568/YARR.png)
ITSDAQ:
![ITSDAQ_1](/uploads/9ed276435790c26251954fee3b30aa11/ITSDAQ_1.png)
![ITSDAQ_2](/uploads/0361058387d72f9329a42d243f2c4fa6/ITSDAQ_2.png)
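Regarding the error bars on the noise curve: one way to get asymmetric, per-channel binomial errors is sketched below. It is only an illustration; the Wilson score interval is my own choice here and is not claimed to be what ITSDAQ or YARR implements, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(hits, triggers, z=1.0):
    """Return (occupancy, err_low, err_high) for a single channel.

    Uses the Wilson score interval for a binomial proportion. z=1.0 gives
    roughly a 68% interval. This is an illustrative choice, not the formula
    used in ITSDAQ or YARR.
    """
    if triggers <= 0:
        raise ValueError("triggers must be positive")
    p = hits / triggers
    denom = 1.0 + z * z / triggers
    centre = (p + z * z / (2.0 * triggers)) / denom
    half = (z / denom) * math.sqrt(
        p * (1.0 - p) / triggers + z * z / (4.0 * triggers * triggers)
    )
    low = max(0.0, centre - half)
    high = min(1.0, centre + half)
    # Clamp at zero to absorb floating-point round-off at p = 0 or p = 1.
    return p, max(0.0, p - low), max(0.0, high - p)
```

Unlike a symmetric sqrt(p(1-p)/N) error, this stays meaningful for channels with zero hits, where the upper error is still non-zero. Plotting it would need asymmetric error support in the graph and fit classes.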
This leads to the following questions:
1. Do we want to implement a functionality in YARR to send bursts of triggers instead of one trigger at a time?
2. How do we want to calculate the error bars? Personally, I think it makes more sense to apply binomial statistics to each channel individually rather than the mean, because it's not necessarily a given that the mean will behave in the same way as a single channel. Also, we cannot at the moment use asymmetric error bars as in ITSDAQ, since this is not supported in `GraphErrors` or `lmcurve_tyd`. Perhaps it would be worth adding this?https://gitlab.cern.ch/YARR/YARR/-/issues/174Integration of Strips SR1 code2024-01-30T16:38:38+01:00Bruce Joseph GallopIntegration of Strips SR1 codeThere is currently a branch devel_SR1 (see !553), which implements several scans/analyses for
Strips testing at SR1. For instance this is the status to which #173 refers.
I don't think it's practical to merge as is; the main things that should be fixed are:
* Need to load calibration information from a previous test into the histogram clipboard for fit functions etc. (see StarJsonData and HistoFromDisk). This should instead be done by putting the calibration information into the FrontEnd configuration.
* In order to have a noise occupancy scan with a variable number of triggers, the analysis includes code to communicate directly with the StarCounterLoop to change the number of triggers to be sent. Instead the analysis should use the existing feedback mechanism to request more triggers, and find a way to communicate how many triggers were actually sent.
The plan is to split the above branch into smaller pieces to be merged in turn. For
instance, the following (subject to change as things go ahead):
- [x] Strobe delay analysis !564
- [x] Put calibration parameters into the configuration !584
- [ ] Initial throttle trigger loop, maybe rebased !366
- [ ] Make trigger throttle loop more robust
- [ ] Implement stripped down noise analysis (without the trigger throttle loop code) (depends on !584). Originally in !497.
- [ ] Port the n-point gain analysis (with updates to fill configuration in !584), see !689
- [ ] Port the trim analysis (original: !490, new: !667)
- [ ] Feedback configuration of Strobe Delay (not part of devel_SR1 as it's done externally), with correct chip mapping related to #169 (the input channel mapping check)
- [x] Loop action StarTrimDacLoop, should be possible to use StdParameterLoop instead
- [x] Use Fuse ID to set HCC communications ID !665
Other things that are in the branch that may or may not be merged (might not be exhaustive). Mostly the code has been presented in other merge requests:
- [ ] Addition to ScurveFitter to allow extra analysis !452 and !453
- [ ] Adding integral to histo (I think this is redundant) !513
- [ ] Mutex in TxCore (basically pairing command mask and send fifo, maybe add a function ```writeFIFOToCommand(mask, buffer)```) !556
- [ ] NetIO/TxCore writeFifo 8-bit. Not sure why bytes, rather than 16-bit words are required? !391
- [ ] Report of burst timing for StarCounterLoop
- [ ] Histogram file name changes ('_' vs '-' as separator) !455
- [ ] Addition of LoopStatus to OccupancyMaps !437
- [ ] HistoFromDisk which I think is not needed if calibration information is added to configuration !432
- [ ] Broadcast write in StdParameterLoop (and prescan) potentially invalidating register fields (!477)
- [ ] Some handling of fuse IDs (needed to set up HCC communication ID). Currently m_sn is in StarChips.h but never set (!665 including rename to make it clear SN is different)https://gitlab.cern.ch/YARR/YARR/-/issues/175Add links to devel in documentation2023-02-06T17:02:01+01:00Bruce Joseph GallopAdd links to devel in documentationWould it be useful to add the following links in the standard documentation:
* Future docs: https://yarr.web.cern.ch/yarr/devel/
* Developer docs: https://yarr.web.cern.ch/doxygen/devel/
* Coverage: https://yarr.web.cern.ch/yarr/devel/coverage/
Though some of these are already linked from the README.md file.https://gitlab.cern.ch/YARR/YARR/-/issues/177Global FrontEnd and local database are set twice in ScanConsole2022-11-14T20:52:41+01:00Zhengcheng TaoGlobal FrontEnd and local database are set twice in ScanConsoleThe code for setting up local database and global FE is duplicated in [`ScanConsoleImpl::initHardware`](https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libYarr/ScanConsoleImpl.cpp#L391-418) and [`ScanConsoleImpl::configure`](https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libYarr/ScanConsoleImpl.cpp#L231-258). I suppose this is not intentional?https://gitlab.cern.ch/YARR/YARR/-/issues/179Connectivity scan tool2024-03-08T15:36:28+01:00Timon HeimConnectivity scan toolTool which loops over all links trying to find all attached front-ends within certain constraints (type, link speed etc).Release v1.5.1Timon HeimLingxin MengTimon Heimhttps://gitlab.cern.ch/YARR/YARR/-/issues/180Add data transmission error counter to decoder2022-11-29T18:08:43+01:00Timon HeimAdd data transmission error counter to decoderAdd error counter to data processors and report on total decoding error count at the end of processing.https://gitlab.cern.ch/YARR/YARR/-/issues/181Develop read-adc script2023-01-26T05:31:18+01:00Emily Anne ThompsonDevelop read-adc scriptI added a script to read the internal ADC counts: https://gitlab.cern.ch/YARR/YARR/-/merge_requests/593 . This will likely be merged soon because it is urgently needed for module QC tools development.
So I am collecting here some ideas brought up in that MR on how we could improve this read-adc script implementation:
- Add possibility to convert ADC measurements to temperature (or create separate script to do that) (suggestion from @theim )
- Return list of ADC values instead of singular value. Returning a singular value works well with RD53b but not with other chips (ABC/HCC) (suggestion from @bgallop )
- Currently the script sets up the front-end so that the ADC can be read in a separate step. From the Strips point of view, it might be easier to do in one step (setup and read). (suggestion from @bgallop )
- Consider changing output format. Currently the script gives output to stdout, but we might consider returning a value instead (suggestion from @msaimper ). This would also apply to the read-register script.
Cheers,
Emilyhttps://gitlab.cern.ch/YARR/YARR/-/issues/182v1.4 (and devel) have incorrectly declared dependencies2023-01-27T18:00:03+01:00Giordon Holtsberg Starkv1.4 (and devel) have incorrectly declared dependenciesStemming from changes here https://gitlab.cern.ch/YARR/YARR/-/commit/f25315ea8e5d965679b84bdc9f6821ca95d9ab79 -- you need CMake > 3.14 in order to compile this. devtoolset-9 doesn't ship with that.https://gitlab.cern.ch/YARR/YARR/-/issues/183Trigger frequency limitation2023-02-27T03:42:48+01:00Elise Maria Le Boulicaut EnnisTrigger frequency limitationThis issue addresses the problem of data arriving late and therefore filling the wrong scan iteration. When running on strip staves, we noticed that this problem gets worse as the number of hybrids (i.e. Front Ends) increases. Decreasing the trigger frequency mitigates the issue.
The first plot below shows the maximum trigger frequency that can be run without significantly affecting the s-curves as a function of the number of FEs. The second plot shows the maximum frequency as a function of 1/(number of FEs)^2, fitted with a linear function, which demonstrates that the relationship is approximately 1/N^2.
![frequency_vs_N](/uploads/566158e1f00992719f3a8782f0e8d6b2/frequency_vs_N.png)
![frequency_vs_1overN2](/uploads/f63090ff19bd568c67ca82ba0460975b/frequency_vs_1overN2.png)
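For anyone wanting to reproduce the second plot with their own measurements, a minimal sketch of the straight-line fit against 1/N^2 is given here. The numbers are synthetic and generated from an assumed slope purely to exercise the code; they are not the measured values from the plots, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching (x, y) points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic stand-in data (NOT measurements): f_max = assumed_slope / N^2.
n_fes = [2, 4, 6, 8, 10]
assumed_slope = 400.0  # hypothetical value, arbitrary frequency units
f_max = [assumed_slope / n**2 for n in n_fes]

# Fit the maximum frequency against 1/N^2, as in the second plot.
slope, intercept = fit_line([1.0 / n**2 for n in n_fes], f_max)
```

With real data, an intercept compatible with zero and small residuals would support the 1/N^2 scaling; a clear offset would point to an additional FE-independent limit.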
I know there are efforts ongoing to address this problem (in particular by Alex and @ztao), but we discussed it would be a good idea to open this issue in order to have a benchmark.https://gitlab.cern.ch/YARR/YARR/-/issues/184Use of FrontEndCfg::toCharge in Star2023-02-08T23:09:46+01:00Bruce Joseph GallopUse of FrontEndCfg::toCharge in StarFollow up on !584.
FrontEndCfg::toCharge should be used to convert a global DAC value to a charge in electrons.
This is not used in libStar, but now that StarConversionTools exists, it might be possible. Question is whether an individual calibration is needed for each AbcStar, and if so how to map that to an API.https://gitlab.cern.ch/YARR/YARR/-/issues/185ITkPixV1.1 highest injection frequency2023-02-07T01:37:19+01:00Timon HeimITkPixV1.1 highest injection frequencyMeasure if threshold is affected by injection frequency, ideally with analog scans, which don't change the injection voltage.
Might be related to Parameter wait time as defined in scan.https://gitlab.cern.ch/YARR/YARR/-/issues/186Injection vs. number of pixels2023-02-07T01:39:00+01:00Timon HeimInjection vs. number of pixelsCheck what the maximum number of pixels is that can be injected into before the injected charge is falsified or decreased.
Check via threshold scan or analog scans; the threshold scan could also suffer from the changing injection voltage.https://gitlab.cern.ch/YARR/YARR/-/issues/189Load monitoring via ClipBoard2023-03-22T19:55:40+01:00Zhengcheng TaoLoad monitoring via ClipBoardAs discussed in the [meeting](https://indico.cern.ch/event/1267000/contributions/5320649/attachments/2616462/4522403/2023-03-22_yarr-update-feedback-MR-triggers-performance.pdf#page=9), it would be useful to have a monitoring tool for data loads at the data processors of different front ends.
This can potentially be done via `ClipBoard`. Each `ClipBoard` currently has a counter for total number of data objects in and another counter for data out. A separate monitoring thread could access these counters during scans with some configurable time interval and report the data throughput while scans are running or generate a report at the end with timestamps.https://gitlab.cern.ch/YARR/YARR/-/issues/190Automatically update chip config during scan2023-03-27T11:08:55+02:00Maria Giovanna FotiAutomatically update chip config during scanWe need a way to automatically update chip config, based on the results of a scan.
This feature already exists in Yarr, through function `ScanHelper::writeFeConfig` (called in `ScanConsoleImpl::cleanup`).
In the current state, anything that is written to `feCfg` will go into the updated chip config. In principle this does not include the prescan registers, because these are applied through the `GlobalFe`. However for strips, we cannot use the `GlobalFe` because it overwrites all sub-registers in a register.
Because of this, the `devel_SR1` branch has applied a hack (!477) which loops over FEs and writes the prescan via `feCfg`. This means that prescan values are written automatically to the updated chip config, which we don't want.
We see two possible solutions to this:
1. before applying the prescan, store the original register values in a list, then, at the end of the scan undo the changes relative to the prescan before calling `ScanHelper::writeFeConfig` to update the config.
2. create two separate `feCfg` objects, `feCfg_final` and `feCfg_tmp`, to be updated during the scan. The prescan values would only be applied to `feCfg_tmp`, while all other register configurations that should be propagated through the scan would be written to both `feCfg_final` and `feCfg_tmp`. Only `feCfg_final` is then used in `ScanHelper::writeFeConfig`.
In any case, either !477 or !598 is needed.
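To make the first option concrete, here is a rough sketch of the save-and-restore idea. It uses plain dictionaries as a stand-in for the real configuration objects, and `temporary_prescan` is a hypothetical name, so it only shows the bookkeeping and not how it would hook into `ScanHelper::writeFeConfig`.

```python
from contextlib import contextmanager


@contextmanager
def temporary_prescan(fe_cfg, prescan):
    """Apply prescan values to fe_cfg and undo them on exit.

    fe_cfg and prescan are plain dicts {register_name: value}; this is a
    simplified, hypothetical stand-in for the real config classes.
    """
    # Remember the original value of every register the prescan touches.
    originals = {name: fe_cfg[name] for name in prescan}
    fe_cfg.update(prescan)
    try:
        yield fe_cfg
    finally:
        # Undo only the prescan registers; other scan results are kept.
        fe_cfg.update(originals)


cfg = {"BVT": 15, "TestPulseEnable": 0}
with temporary_prescan(cfg, {"TestPulseEnable": 1}) as active:
    active["BVT"] = 17  # stands for a result of the scan that should be kept
# cfg is now {"BVT": 17, "TestPulseEnable": 0}
```

One caveat of this approach: if a scan deliberately tunes a register that is also set in the prescan, the restore step would overwrite the tuned value, which is an argument for the second option.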
@elebouli, @bgallop, @ztao, @theim, @otoldaiehttps://gitlab.cern.ch/YARR/YARR/-/issues/191LocalDB upload fails if config path is relative to connectivity2023-03-28T04:58:32+02:00Timon HeimLocalDB upload fails if config path is relative to connectivityLocalDB does not use the same path config as Yarr, leading to:
```
❯ bin/scanConsole -r configs/controller/specCfg-rd53b-16x1.json -c ~/20UPGR92201045/20UPGR92201045_L2_warm.json -s configs/scans/rd53b/std_digitalscan.json -p -W
[2023-03-27 19:56:43.919] [info] Configuring logger ...
[19:56:43:919][ info ][ ScanConsole ][31762]: #####################################
[19:56:43:919][ info ][ ScanConsole ][31762]: ## Welcome to YARR - ScanConsole ##
[19:56:43:919][ info ][ ScanConsole ][31762]: #####################################
[19:56:43:923][ info ][ ScanHelper ][31762]: Chip type: RD53B
[19:56:43:923][ info ][ ScanHelper ][31762]: Chip count 4
[19:56:43:924][ info ][ ScanHelper ][31762]: Loading chip #0
[19:56:43:926][ info ][ ScanHelper ][31762]: Loading config file: /home/theim/20UPGR92201045/L2_warm/0x14138_L2_warm.json
[19:56:44:063][ info ][ ScanHelper ][31762]: Loading chip #1
[19:56:44:064][ info ][ ScanHelper ][31762]: Loading config file: /home/theim/20UPGR92201045/L2_warm/0x14158_L2_warm.json
[19:56:44:165][ info ][ ScanHelper ][31762]: Loading chip #2
[19:56:44:166][ info ][ ScanHelper ][31762]: Loading config file: /home/theim/20UPGR92201045/L2_warm/0x1415a_L2_warm.json
[19:56:44:267][ info ][ ScanHelper ][31762]: Loading chip #3
[19:56:44:268][ info ][ ScanHelper ][31762]: Loading config file: /home/theim/20UPGR92201045/L2_warm/0x1414a_L2_warm.json
[19:56:44:454][ info ][ ScanConsole ][31762]: Scan Type/Config configs/scans/rd53b/std_digitalscan.json
[19:56:44:454][ info ][ ScanConsole ][31762]: Connectivity:
[19:56:44:454][ info ][ ScanConsole ][31762]: /home/theim/20UPGR92201045/20UPGR92201045_L2_warm.json
[19:56:44:454][ info ][ ScanConsole ][31762]: Target ToT: -1
[19:56:44:454][ info ][ ScanConsole ][31762]: Target Charge: -1
[19:56:44:454][ info ][ ScanConsole ][31762]: Output Plots: true
[19:56:44:454][ info ][ ScanConsole ][31762]: Output Directory: ./data/011728_std_digitalscan/
[19:56:44:454][ info ][ ScanConsole ][31762]: Timestamp: 2023-03-27_19:56:44
[19:56:44:454][ info ][ ScanConsole ][31762]: Run Number: 11728
[19:56:44:454][ info ][ ScanConsole ][31762]: #####################
[19:56:44:454][ info ][ ScanConsole ][31762]: ## Init Hardware ##
[19:56:44:454][ info ][ ScanConsole ][31762]: #####################
[19:56:44:454][ info ][ ScanConsole ][31762]: -> Opening controller config: configs/controller/specCfg-rd53b-16x1.json
[19:56:44:454][ info ][ ScanHelper ][31762]: Loading controller ...
[19:56:44:454][ info ][ ScanHelper ][31762]: Found controller of type: spec
[19:56:44:454][ info ][ SpecCom ][31762]: Opening SPEC with id #1
[19:56:44:454][ info ][ SpecCom ][31762]: Mapping BARs ...
[19:56:44:454][ info ][ SpecCom ][31762]: ... Mapped BAR0 at 0x7f99ab03a000 with size 1048576
[19:56:44:455][warning ][ SpecCom ][31762]: ... BAR4 not mapped (Mmap failed)
[19:56:44:455][ info ][ SpecCom ][31762]: ~~~~~~~~~~~~~~~~~~~~~~~~~~~
[19:56:44:455][ info ][ SpecCom ][31762]: Firmware Version: 0x4d9ff6d
[19:56:44:455][ info ][ SpecCom ][31762]: Firmware Identifier: 0x4030232
[19:56:44:455][ info ][ SpecCom ][31762]: FPGA card: PLDA XpressK7 325
[19:56:44:455][ info ][ SpecCom ][31762]: FE Chip Type: RD53A/B
[19:56:44:455][ info ][ SpecCom ][31762]: FMC Card Type: Ohio Card (Display Port)
[19:56:44:455][ info ][ SpecCom ][31762]: RX Speed: 640Mbps
[19:56:44:455][ info ][ SpecCom ][31762]: Channel Configuration: 16x1
[19:56:44:455][ info ][ SpecCom ][31762]: ~~~~~~~~~~~~~~~~~~~~~~~~~~~
[19:56:44:455][ info ][ SpecCom ][31762]: Flushing buffers ...
[19:56:44:455][ info ][ SpecCom ][31762]: Init success!
[19:56:44:455][ info ][ ScanHelper ][31762]: Loaded controller config:
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ {
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "cmdPeriod": 6.250000073038109e-9,
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "idle": {
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "word": 2863311530
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ },
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "pulse": {
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "interval": 500,
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "word": 0
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ },
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "rxActiveLanes": 1,
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "rxPolarity": 65535,
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "specNum": 1,
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "spiConfig": 541200,
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "sync": {
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "interval": 16,
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "word": 2172551550
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ },
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ "txPolarity": 0
[19:56:44:455][ info ][ ScanHelper ][31762]: ~~~ }
[19:56:44:455][ info ][ ScanConsole ][31762]: #######################
[19:56:44:455][ info ][ ScanConsole ][31762]: ## Loading Configs ##
[19:56:44:455][ info ][ ScanConsole ][31762]: #######################
[19:56:44:455][ info ][ ScanHelper ][31762]: Chip type: RD53B
[19:56:44:455][ info ][ ScanHelper ][31762]: Chip count 4
[19:56:44:455][ info ][ ScanHelper ][31762]: Loading chip config #0
[19:56:44:455][ info ][ Bookkeeper ][31762]: Added FE: Tx(0), Rx(2) under ID 0
[19:56:44:546][ info ][ ScanHelper ][31762]: Loading chip config #1
[19:56:44:546][ info ][ Bookkeeper ][31762]: Added FE: Tx(0), Rx(1) under ID 1
[19:56:44:637][ info ][ ScanHelper ][31762]: Loading chip config #2
[19:56:44:637][ info ][ Bookkeeper ][31762]: Added FE: Tx(0), Rx(0) under ID 2
[19:56:44:728][ info ][ ScanHelper ][31762]: Loading chip config #3
[19:56:44:728][ info ][ Bookkeeper ][31762]: Added FE: Tx(0), Rx(3) under ID 3
[19:56:44:846][ info ][ ScanConsole ][31762]: ####################
[19:56:44:846][ info ][ ScanConsole ][31762]: ## Set Database ##
[19:56:44:846][ info ][ ScanConsole ][31762]: ####################
[19:56:45:704][ info ][ Local DB ]: ------------------------------
[19:56:45:704][ info ][ Local DB ]: Function: Check config files
[19:56:45:706][ info ][ Local DB ]: -> Setting user config: /home/theim/.yarr/localdb/user.json
[19:56:45:706][ info ][ Local DB ]: -> Setting site config: /home/theim/.yarr/localdb/nakedsnail.dhcp.lbl.gov_site.json
[19:56:45:706][ error ][ Local DB ]: Not found RD53B in chip config file: L2_warm/0x14138_L2_warm.json
[19:56:45:706][ error ][ Local DB ]: Invalid configs for uploading data, aborting...
[19:56:45:706][ info ][ Local DB ]: ------------------------------
```
Hideyuki OideHideyuki Oidehttps://gitlab.cern.ch/YARR/YARR/-/issues/192Return status of scanConsole for chip configuration should change from 1 to 0.2023-04-05T18:33:13+02:00Kehang BaiReturn status of scanConsole for chip configuration should change from 1 to 0.Return status of scanConsole for chip configuration without a scan config is currently 1, but should change to 0 for consistency in status checks in QC tools.
Tagging @theim.https://gitlab.cern.ch/YARR/YARR/-/issues/193Injection delay tuning2023-04-06T23:00:22+02:00Lingxin MengInjection delay tuninghttps://gitlab.cern.ch/YARR/YARR/-/issues/194[Discussion] Config handling API development2023-04-11T12:19:59+02:00Hideyuki Oide[Discussion] Config handling API development[This snippet](https://gitlab.cern.ch/YARR/YARR/-/snippets/2594) provides a possible implementation of FE config administration on LocalDB, with the following features:
* identification of a "branch" of config revision by `serialNumber`, `stage` and `branch` name.
* Each `config` document points to the latest `config_revision` document, which contains the actual config (except `PixelConfig`).
```
{
  _id: ObjectId("6434ecce31379fc0ad231b26"),
  serialNumber: '20UPGXF0000013',
  stage: 'MODULE/INITIAL_WARM',
  branch: 'default',
  current_revision_id: ObjectId("6434ecce31379fc0ad231b2d")
}
```
* Each `config_revision` object contains:
* the actual FE config object,
* "diff" from the previous (parent) revision,
* pointer to `PixelConfig`
* user-specified tag list
* message on the commit
* (timestamp)
```
{
  _id: ObjectId("6434ef0fb49d2269a6e3549a"),
  parent_revision_id: null,
  config_data: {
    RD53B: {
      GlobalConfig: {
        AiRegionRow: 0,
        AuroraActiveLanes: 1,
        ...
        VrefIn: 1,
        VrefRsensBot: 0,
        VrefRsensTop: 0
      },
      Parameter: {
        ADCcalPar: [ 5.894350051879883, 0.1920430064201355, 4990 ],
        ChipId: 15,
        EnforceNameIdCheck: true,
        ...
        VcalPar: [ 0.46000000834465027, 0.20069999992847443 ]
      }
    }
  },
  diff: {
    RD53B: {
      GlobalConfig: {
        AiRegionRow: 0,
        ...
      }
    }
  },
  pix_config: ObjectId("6434ef0fb49d2269a6e35494"),
  message: 'commit with ITkPixV1.1Q13_Chip4.json',
  tags: []
}
```
* `PixelConfig` is not documented as `MongoDB` object, but it is stored in `GridFS`. Here I'd propose to use python `pickle` and save data as a `python` dictionary object, so that one can easily check identity of the config by `hashlib md5` which is a default parameter in each `GridFS` object.
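As a rough illustration of the `pickle` plus `md5` idea, the sketch below serializes a tiny stand-in pixel config and computes its digest locally with `hashlib`. The helper names and the `make_example` layout are hypothetical, and no `GridFS` call is shown, since how the digest is stored depends on the driver version.

```python
import hashlib
import pickle


def serialize_pixel_config(pixel_cfg):
    """Serialize with a fixed pickle protocol so the bytes are comparable."""
    return pickle.dumps(pixel_cfg, protocol=4)


def pixel_config_digest(pixel_cfg):
    """Return the md5 hex digest of the serialized pixel config."""
    return hashlib.md5(serialize_pixel_config(pixel_cfg)).hexdigest()


def make_example(tdac):
    # Hypothetical, tiny stand-in for a real PixelConfig (3 columns, 4 rows).
    return [{"Col": col, "TDAC": [tdac] * 4} for col in range(3)]


same = pixel_config_digest(make_example(0)) == pixel_config_digest(make_example(0))
different = pixel_config_digest(make_example(0)) != pixel_config_digest(make_example(1))
```

Two caveats worth keeping in mind: equal objects built in different ways are not guaranteed to pickle to identical bytes, so a matching digest proves identity but a mismatch does not strictly prove a difference; and `pickle` should only be used to load data from a trusted source.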
## List of API methods (may add more)
Quite analogous to `git` commands
* `__init__(hostname, port)` : connect to MongoDB server and setup the client
* `create_config(serial_number, stage, branch = 'default')` : create a new config instance. Serial number and stage are mandatory.
* `get_info( config_id )` : get the contents of `config` object
* `info( config_id )` : print the contents of `config` object:
```
config id = 6434b841e9f445e668522512
- Serial Number: 20UPGXF0000013
- Stage: MODULE/INITIAL_WARM
- Branch: default
- HEAD: 6434b842e9f445e668522517
```
* `copy_config(original_id, serial_number, stage, branch = 'default')`: from an `original` config, create a copy of it with a different set of identifiers of `(serial_number, stage, branch)`. The difference of this method from the `branch()` method below is that for `branch()` case `serial_number` and `stage` are fixed and only `branch` parameter can be different, so more constrained.
* `checkout(serial_number, stage, branch = 'default')` : returns the ID of the `config` specified by the identifiers.
* `branch(parent_id, new_branch, revision_id="HEAD")` : create a copy of `config` of the same `serial_number` and `stage` with a different `branch` name. The revision history up to the moment of copying is shared. Returns the ID of the created branch.
* `get_config(config_id, revision=None, add_pixel_cfg = False)`: returns the FE config of the `config` at the specified `revision` ID. When `add_pixel_cfg=True`, the corresponding `PixelConfig` object is inserted into the `config` from `GridFS`.
* `commit(config_id, fe_cfg, message="")`: revise the config with the specified `fe_cfg`. `message` can be put to book-keep the revision history.
```
new commit: 6434b842e9f445e668522517 --> 20UPGXF0000013 | MODULE/INITIAL_WARM | default
```
* `_get_prev_commit( revision_id )` : an API-internal method to reconstruct the revision history list.
* `get_revision_id(config_id, tag="HEAD")` : method to get the revision ID by a tag, e.g. `HEAD`, `HEAD^2` or any user-defined tags.
* `get_revision_history( config_id )` : get the full revision history as a list of revision IDs, in the descending order of the commit (latest commit is the first).
* `get_log( config_id, depth = 10 )` : show revision history of the branch down to the specified depth.
```
config id = 6434b841e9f445e668522512
- Serial Number: 20UPGXF0000013
- Stage: MODULE/INITIAL_WARM
- Branch: default
- HEAD: 6434b842e9f445e668522517
--------------------------------------
commit 6434b842e9f445e668522517: (HEAD) revision4 on branch default
commit 6434b842e9f445e668522515: another revision
commit 6434b841e9f445e668522514: revision2
commit 6434b841e9f445e668522513: this is the first revision.
--------------------------------------
```
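To show how the parent pointers give the revision history, here is a small in-memory sketch. It mimics `create_config`, `commit`, `branch` and `get_revision_history` with dictionaries instead of MongoDB collections; the class name `ConfigStore` and the integer IDs are hypothetical simplifications and not the code in the snippet.

```python
import copy
import itertools


class ConfigStore:
    """In-memory stand-in for the `config` and `config_revision` collections."""

    def __init__(self):
        self._ids = itertools.count(1)  # integers instead of ObjectIds
        self.configs = {}
        self.revisions = {}

    def create_config(self, serial_number, stage, branch="default"):
        config_id = next(self._ids)
        self.configs[config_id] = {
            "serialNumber": serial_number,
            "stage": stage,
            "branch": branch,
            "current_revision_id": None,
        }
        return config_id

    def commit(self, config_id, fe_cfg, message=""):
        config = self.configs[config_id]
        revision_id = next(self._ids)
        self.revisions[revision_id] = {
            "parent_revision_id": config["current_revision_id"],
            "config_data": copy.deepcopy(fe_cfg),
            "message": message,
        }
        config["current_revision_id"] = revision_id  # move HEAD
        return revision_id

    def branch(self, parent_id, new_branch):
        # Same serial number and stage; history up to now is shared.
        config_id = next(self._ids)
        self.configs[config_id] = dict(self.configs[parent_id], branch=new_branch)
        return config_id

    def get_revision_history(self, config_id):
        """Walk parent pointers from HEAD; the latest commit comes first."""
        history = []
        revision_id = self.configs[config_id]["current_revision_id"]
        while revision_id is not None:
            history.append(revision_id)
            revision_id = self.revisions[revision_id]["parent_revision_id"]
        return history
```

Because a branch only copies the HEAD pointer, committing on the new branch leaves the parent's history untouched while both still share the earlier revisions.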
## Supposed use-case
* The YARR scan's "before" config can be queried to MongoDB using this API. In most cases it should be just fetching the latest revision of the branch. The API operation can be integrated in `scanConsole` as pre/post processes.
* Supporting parallel branches for {cold, warm, LP} and its evolution by scans.
* Revision of configs after YARR scan finishes, or by `module-qc-tools`
* Creation of new branches for user's R&D works
* Tools for downloading/uploading configs with the production DB are not present in this API (yet).
* Readout `connectivity` is not administrated by this API (yet).
## Misc
* Pixel config data size by `python pickle` serialization is around 1.2 MB.
## Open points
* At which repository should this API live? Possible clients are `YARR`, `module-qc-analysis-tools` and `localdb-tools`.
* Some of YARR's `dbAccessor` features should be deprecated if this API is commissioned.
Inviting @theim @epianori @gstark for discussion.https://gitlab.cern.ch/YARR/YARR/-/issues/197Make read-register and write-register tools operational for Strips2023-10-05T18:09:39+02:00Alex ToldaievMake read-register and write-register tools operational for StripsThe `src/tools/read-register.cpp|write-register.cpp` do not work with the Strips chips right now, because `StripsChips.h` does not fully implement the `FrontEnd` methods that are used in those executables:
/// Reads the named register and writes it to the local object memory
virtual void readUpdateWriteNamedReg(std::string name) {}
/// Write to a register using a string name (most likely from json)
virtual void writeNamedRegister(std::string name, uint16_t value) = 0;
/// Reads a named register and returns the value of it
virtual uint16_t readNamedRegister(std::string name) {return 0;}
Let's implement them, with the following practical considerations in mind:
* The commands should be able to read the _full_ register, not just the sub-registers.
* And it would be nice to have a more consistent naming: in some places the HCC register names begin with the HCC_ prefix, in others they don't.
* Also, the sub-register enums are capitalized (like OPMODE, STOPHPR), whereas the chip config files and the HCC/ABC spec documents use different cases (OPmode, StopHPR). It probably makes sense to make the lookup case-insensitive.
* The `read-register` and `write-register` should have the debugging log mode.https://gitlab.cern.ch/YARR/YARR/-/issues/198Netio socket unsubscribe terminates the connection to Felixcore and crashes s...2023-05-23T11:40:44+02:00Alex ToldaievNetio socket unsubscribe terminates the connection to Felixcore and crashes scanConsole at long latencies or many FEsWhen Strips run on many FEs, sometimes we get a socket/network error at the end of the scan that terminates `scanConsole` prematurely. It is usually the "wrong file descriptor" error, but sometimes it is "could not connect to <felixcore ip>" or something like that.
Commenting out the [`m_sub_sockets[chn]->unsubscribe` in `NetioHandler::delChannel`](https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libNetioHW/NetioHandler.cpp#L157) fixes it and does not seem to cause other problems:
```
void NetioHandler::delChannel(uint64_t chn){
...
if(it!=m_channels.end()){
nlog->debug("### NetioHandler::delChannel({}) -> unsubscribe", chn);
m_channels.erase(it);
//SHIT: please do not unsubscribe: because felixcore/netio doesn't like it
m_sub_sockets[chn]->unsubscribe(chn, netio::endpoint(m_felixHost, m_felixRXPort));
delete m_sub_sockets[chn];
m_sub_sockets.erase(chn);
}
}
```
It looks like when one FE deletes its channel and calls `m_sub_sockets[chn]->unsubscribe`, Felixcore closes the whole TCP connection to `scanConsole`, or something like that. The other FEs then send the Netio `unsubscribe` to a closed connection and fail.
If the latency is short enough, you may not notice it, because the "unsubscribes" are sent to Felixcore before the connection has been shut down. But if the latency is long or you have many FEs, it can crash the `scanConsole` run at the very end.
I think that at some point I asked to un-comment this `unsubscribe`, because it was causing issues with AMAC OPC communication. But that problem is long gone, and AMAC OPC gets Netio by other means.
Still, I may be missing something about Netio. But if not, then I can open a merge request on this line.
https://gitlab.cern.ch/YARR/YARR/-/issues/200
LoopStatus class default construction broken
2023-06-06T16:28:41+02:00 Matthias Wittgen
This code in YARR is defective:
https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libYarr/include/LoopStatus.h#L41
```
LoopStatus()=default;
...
unsigned get(unsigned i) const { return statVec[i]; }
```
There's some test code using the default ctor `LoopStatus()`, which presumably leaves `statVec` empty, so any call to `get()` reads out of bounds.
When compiled with optimization, this happens to work for `get(i = 0)`.
I assume the default ctor should be deleted as well. The class interface as is would require checking the size before a call to `get()`.
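One possible shape for a stricter interface is sketched here. `LoopStatusSketch` is a hypothetical stand-in, not the actual `LoopStatus` class: it deletes the default constructor and uses bounds-checked access, so an out-of-range `get()` throws instead of reading past the end of the vector.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for LoopStatus; only the parts discussed here.
class LoopStatusSketch {
public:
    // No default construction: there is no empty state to guard against.
    LoopStatusSketch() = delete;

    explicit LoopStatusSketch(std::vector<unsigned> status)
        : statVec(std::move(status)) {}

    // at() throws std::out_of_range instead of invoking undefined behaviour.
    unsigned get(std::size_t i) const { return statVec.at(i); }

    std::size_t size() const { return statVec.size(); }

private:
    std::vector<unsigned> statVec;
};
```

Whether the extra bounds check is acceptable in the scan loop is a separate question; keeping `operator[]` and only deleting the default constructor would be the lighter variant.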
`size_t styleSize() const { return styleVec.size(); }` is never used.
https://gitlab.cern.ch/YARR/YARR/-/issues/201
Benchmarking notes
2023-06-08T19:43:28+02:00 Bruce Joseph Gallop
Is there a place to discuss the benchmarks that @wittgen introduced (currently "benchmark_rd53b")?
A couple of brief comments:
* Does it need to be RD53B specific? It should be possible to configure it based on a file.
* Are there any other benchmarks that should be done as well?
* Presumably a short run could be added to CI (e.g. a file with errors, doing fuzzing?)?
https://gitlab.cern.ch/YARR/YARR/-/issues/203
Star Reset query (HCCv1)
2023-08-07T18:04:26+02:00 Bruce Joseph Gallop
@keener, @dtrischu
We were looking at resets again elsewhere.
The last command of `StarChips::resetAll` is a fast command number `LCB::HCC_START_PRLP`. This is left over from HCCv0 (and set to number 15).
In fact for HCCv1 15 and 11 are merged into a toggle using command 11. I don't think this will cause an issue, as fast command 15 is presumably ignored (it says RESERVED in the spec).
Note that ITSDAQ doesn't use either of these commands (except for recently added HCC trigger reception tests), so we could decide to just remove it here.
https://gitlab.cern.ch/YARR/YARR/-/issues/204
Allow read-adc without MonitorV address
2023-10-19T19:33:45+02:00 Emily Anne Thompson
The read-adc executable requires the user to specify the MonitorV value that they want to read (https://gitlab.cern.ch/YARR/YARR/-/blob/master/src/tools/read-adc.cpp#L30). But it would be nice to just read the ADC with whatever is the current setting of MonitorV. I would suggest that MonitorV is included as an option instead of a required argument. This would be useful for using the calibrated ADC instead of the Vmux in QC-tools (https://gitlab.cern.ch/atlas-itk/pixel/module/module-qc-tools/-/merge_requests/108).
Tagging @mmarjano and @theim
https://gitlab.cern.ch/YARR/YARR/-/issues/205
Measure the performance of calibration scans
2024-03-27T20:49:43+01:00 Alex Toldaiev
A calibration scan consists of 3 parts:
* configure the FEs
* readout the calibration data (i.e. produce the `FrontEndData`)
* plot and analyze it
Ideally:
* the readout part of the calibration (the `HWController` push to `DataProcessor`, which pushes the `FrontEndData` to the calibration analysis) runs as fast as the triggering (e.g. 500 triggers / 10kHz = 0.05s).
* And the triggering is done at the HW limit frequency for the full occupancy data packets.
Then, we can push the readout to its HW limit, i.e. the HW limit of the triggering. And in the overall calibration, the limit will be the analysis part.
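As a back-of-the-envelope check of that budget, the triggering window is the number of triggers divided by the trigger frequency, and the readout keeps up if its measured time fits into that window. The helper names in this sketch are hypothetical and only illustrate the arithmetic; they are not YARR functions.

```cpp
#include <cstdint>

// Duration of one triggering window in seconds: nTriggers / frequency.
constexpr double triggerWindowSeconds(std::uint32_t nTriggers, double frequencyHz) {
    return static_cast<double>(nTriggers) / frequencyHz;
}

// True if a measured readout time fits into the triggering window,
// i.e. the readout is not the bottleneck for this iteration.
constexpr bool readoutKeepsUp(double readoutSeconds, std::uint32_t nTriggers,
                              double frequencyHz) {
    return readoutSeconds <= triggerWindowSeconds(nTriggers, frequencyHz);
}
```

For the standard setup quoted above, `triggerWindowSeconds(500, 10000.0)` gives the 0.05 s window that the readout of one iteration has to fit into.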
Factors:
* Felixcore or felix-star & rdma? (matters only if the network is indeed a bottleneck now)
* Number of triggers per iteration (should just scale, we do not send too many triggers for the calibrations)
* Trigger frequency (what is the HW limit for full occupancy packets? Do we need FW triggers to reach it?)
* Run from 1 YARR & many – if the network & HWhandler make a bottleneck (and to speed up the plotting)
* Run with and without the analysis, only saving the data to the disk
* Also, try to save the output data to a memory-mounted disk.
* Then, at some point, we could also try the hit counters.
So, we need to start from our standard setup: Felixcore, 500 SW triggers at 10kHz, no memory-mounted disk, with all of the analysis, and try to run from 1 YARR. Look at the profile and identify the current bottleneck. Then, according to what’s needed, try multiple YARRs, try a memory-mounted disk, etc. If everything is perfect, we will just increase the trigger frequency.
Together with the profile, we need to measure these times:
* The time to configure – Yarr already measures that, right?
* HWController’s handler (push) – this one must be within the triggering limit, otherwise it is the bottleneck
* StdDataLoop iteration & time between iterations
+ within it: DataProcessors parsing (pop)
* Also: analysis time & time to save to the disk
And more metrics:
* cache hits
* CPU occupancy?
* memory occupancy?
# Commands to use
## `perf` flamegraph profile
```
sudo perf record -F 99 -g -- <scanConsole command>
# or with /bin/time
sudo perf record -F 99 -g -- /bin/time <scanConsole command>
# it may work without sudo!
# I am not sure if it will save all stack frames then (the ones from inside linux too?)
# produces a perf.data file
# -F sets the sampling frequency to 99 samples per second -- increase it if more statistics are needed
# the flamegraph from the perf.data file:
sudo perf script | stackcollapse-perf.pl > out.perf-folded
cat out.perf-folded | flamegraph.pl > perf-kernel.svg
```
It needs `stackcollapse-perf.pl` and `flamegraph.pl` Perl [scripts](https://github.com/brendangregg/FlameGraph).
## Cache hits, CPU, memory
The cache hits, CPU, etc [cannot be obtained from perf](https://stackoverflow.com/questions/62550369/run-perf-stat-on-the-output-of-perf-record?rq=3) simultaneously with the `record` of the call stack profile. So, they will have to be run separately:
```
# CPU and cache hits:
perf stat <scanConsole>
# memory usage:
# TODO
```
## Additional time counters inside YARR
For HWController, [`NetioHandler`](https://gitlab.cern.ch/YARR/YARR/-/blob/master/src/libNetioHW/NetioHandler.cpp#L65) passes a lambda as the handler. Its scope will not allow for a time counter. Make a dedicated `NetioHandler` method for the handler?
In [`FelixRxCore::on_data`](https://gitlab.cern.ch/YARR/YARR/-/blob/master/src/libFelixClient/FelixRxCore.cpp#L131), it's straightforward.
In [`StdDataLoop`](https://gitlab.cern.ch/YARR/YARR/-/blob/master/src/libYarr/StdDataLoop.cpp#L36), we need the times `exec2 - exec1` and `exec1 - exec2` for the iteration time and between-iterations:
```
// src/libYarr/include/StdDataLoop.h
+#include <chrono>
+using Clock = std::chrono::steady_clock;
class StdDataLoop: public LoopActionBase, public StdDataAction {
...
+
+ // additional timings for calibrations performance
+ std::chrono::time_point<Clock> exec1_time;
+ std::chrono::time_point<Clock> exec2_time;
+ std::chrono::microseconds time_of_iteration{0}; // initialize with 0 (in-class initializers need braces, not parentheses)
+ std::chrono::microseconds time_between_iterations{0};
+ bool started_iterations = false;
};
// src/libYarr/StdDataLoop.cpp
+StdDataLoop::~StdDataLoop() {
+ SPDLOG_LOGGER_INFO(sdllog, "Time of iterations {} [us]", time_of_iteration.count());
+ SPDLOG_LOGGER_INFO(sdllog, "Time between iterations {} [us]", time_between_iterations.count());
+}
void StdDataLoop::execPart1() {
+ exec1_time = Clock::now();
+ if (started_iterations) {
+ time_between_iterations +=
+ std::chrono::duration_cast<std::chrono::microseconds>(exec1_time - exec2_time);
+ }
+ else started_iterations = true;
+
...
}
void StdDataLoop::execPart2() {
...
+
+ exec2_time = Clock::now();
+ time_of_iteration +=
+ std::chrono::duration_cast<std::chrono::microseconds>(exec2_time - exec1_time);
}
```
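The accumulate-into-a-counter bookkeeping in the diff above could also be wrapped in a small RAII helper, so that the same pattern can be reused for other sections of code. This is only a sketch: `ScopedAccumulator` and `timeRepeated` are hypothetical names, not part of YARR.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper (not part of YARR): adds the lifetime of this object
// to an external accumulator when it goes out of scope.
class ScopedAccumulator {
public:
    explicit ScopedAccumulator(std::chrono::microseconds &total)
        : m_total(total), m_start(Clock::now()) {}

    ~ScopedAccumulator() {
        m_total += std::chrono::duration_cast<std::chrono::microseconds>(
            Clock::now() - m_start);
    }

    // Non-copyable: each timer owns exactly one measurement.
    ScopedAccumulator(const ScopedAccumulator &) = delete;
    ScopedAccumulator &operator=(const ScopedAccumulator &) = delete;

private:
    std::chrono::microseconds &m_total;
    Clock::time_point m_start;
};

// Example use: accumulate the time spent inside a loop body.
template <typename Work>
std::chrono::microseconds timeRepeated(int iterations, Work work) {
    std::chrono::microseconds total{0};
    for (int i = 0; i < iterations; ++i) {
        ScopedAccumulator timer(total);  // stops when the iteration ends
        work();
    }
    return total;
}
```

Note that each measurement is truncated to whole microseconds before it is added, so very short sections would be better accumulated in `Clock::duration` and converted once at the end.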
And in the [processing](https://gitlab.cern.ch/YARR/YARR/-/blob/master/src/libStar/StarDataProcessor.cpp#L87): add up the time inside the `while` loop of `StarDataProcessor::process_core`.
https://gitlab.cern.ch/YARR/YARR/-/issues/206
TDAQ integration, discussion points
2023-10-27T16:01:53+02:00 Timon Heim
Other related open issues #101 and #122
As discussed in today's meeting, based on the talk from Zhengcheng: https://indico.cern.ch/event/1328277/contributions/5644572/attachments/2741636/4769051/20231026_yarr_tdaq_ztao.pdf
Discussion questions that came up:
- Make Yarr repo TDAQ aware vs. have a tdaqYarr repo that includes Yarr as library => tending towards using Yarr as library, generally high flexibility, need to make proper make structures
- DAL(c++) vs Python bindings as interface to tdaq?
- Check with FELIX if they can provide their packages with proper cmake import
- Check with micro services as they might act as middleman between tdaqYarr and tdaq
- Need to understand which parts of tdaq will be used for calibration
- Need to update/define API that is compatible with current SW-ROD -> need requirements for data taking to fully understand operation (monitoring, error handling, ...)
@bgallop @ztao @wittgen @spagan
https://gitlab.cern.ch/YARR/YARR/-/issues/207
Map front end Tx and Rx channel number to FELIX ID
2023-11-14T16:59:10+01:00 Zhengcheng Tao
Currently in the FELIX client controller [configuration file](https://gitlab.cern.ch/YARR/YARR/-/blob/master/configs/controller/felix_client.json), we need to specify the detectorID (`did`) and the connectorID (`cid`) (both are zeros by default). They help to convert the tx and rx channel numbers provided in the connectivity config to the 64-bit FELIX IDs (`fid`) that are used with the FELIX client for sending and subscribing to data. As a result, one scan console can only work with one logical FELIX device at a time (one physical FLX-712 card has two logical devices).
This restriction is not necessary if we provide directly the `fids` of the front ends, or specify `did` and `cid` per front end in the connectivity configs.
There is currently also a problem running with the second FELIX logical device (`-d 1` and `did=0; cid=1`). `FelixRxCore::on_data` has a bug that would mismatch the rx channel number [here](https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libFelixClient/FelixRxCore.cpp?ref_type=heads#L175) when receiving data. A temporary fix just for this is to change `uint32_t mychn = (fid >> 16) & 0xffffffff;` to `uint32_t mychn = (fid >> 16) & 0x000fffff;` so that the lowest bits of the connector ID are not included in the channel number.
Zhengcheng Tao
https://gitlab.cern.ch/YARR/YARR/-/issues/210
switchLPM segfault when on/off not supplied
2023-11-21T01:12:24+01:00 Charles Elliott Hultquist
Trying to switch low power mode on/off segfaults when "on" or "off" is not supplied as an argument. It would be more helpful to have some error message instead of a scary segfault.
https://gitlab.cern.ch/YARR/YARR/-/issues/211
LP digital scan should respect disabled core columns
2024-03-08T16:48:36+01:00 Lingxin Meng
We have a large number of ITkPix modules with issues where core columns have to be disabled, otherwise the communication fails.
Due to the LP config of the chips, where every EnCoreCol is set to 0, the LP digital scan doesn't know about bad core columns and thus runs over all core columns, causing the scan to fail.
https://gitlab.cern.ch/YARR/YARR/-/issues/212
Question, why `NetioTxCore::sendFifo()` trace log skips 1 byte in the FIFO?
2023-11-30T13:15:07+01:00 Alex Toldaiev
The trace log in [NetioTxCore::sendFifo()](https://gitlab.cern.ch/YARR/YARR/-/blob/devel/src/libNetioHW/NetioTxCore.cpp?ref_type=heads#L202-L205) skips the first byte to be sent:
```
void NetioTxCore::sendFifo(){
...
nlog->trace("FIFO[{}][{}]: ", elink, this_fifo.size()-1);
for(uint32_t i=1; i<this_fifo.size(); i++){
nlog->trace("{:02x}", this_fifo[i]&0xFF);
}
}
```
Why is that so? I was testing how @ztao's StarFelixTriggerLoop sets up the strips encoder, and it was a bit confusing that one byte would disappear in the fifo printout:
```
[NetioHW::TxCore] NetioTxCore::writeFifo elink=15 val=0x10000102
[NetioHW::TxCore] NetioTxCore::releaseFifo
[NetioHW::TxCore] NetioTxCore::sendFifo
[NetioHW::TxCore] FIFO[0][3]:
[NetioHW::TxCore] 00
[NetioHW::TxCore] 00
[NetioHW::TxCore] 01
[NetioHW::TxCore] 01
[NetioHW::TxCore] 02
[NetioHW::TxCore] 02
```
https://gitlab.cern.ch/YARR/YARR/-/issues/213
StdDataLoop can report missing data in the log while all data are actually received.
2023-12-04T19:00:09+01:00 Zhengcheng Tao
`StdDataLoop` can print a series of `Data taking loop timed out, only received xxx of yyy events for channel with id zzz` error messages even when all hits are actually received.
This can happen when `thereIsStillTime` is false but `receivingRxData` is still true because the trigger loop is not done yet: https://gitlab.cern.ch/YARR/YARR/-/blob/1784e5a6c4df613af6782c435c61f9acb0eb5a1a/src/libYarr/StdDataLoop.cpp#L223
Tagging @otoldaie
https://gitlab.cern.ch/YARR/YARR/-/issues/215
eyeDiagram standard output + logger functionality
2024-02-23T22:12:08+01:00 Matthias Saimpert
Would it be possible to add support for the `-l` option of `scanConsole` to `eyeDiagram`?
See the similar issue for scanConsole (resolved now): https://gitlab.cern.ch/YARR/YARR/-/issues/125
Maybe it would also be worth defining a standard output which could be dumped to a specific directory with the `-o` option, like `scanConsole`.
This is useful for logging purposes when running tens of eye diagrams per day :smile:
Thanks a lot!
Maria Mironova
https://gitlab.cern.ch/YARR/YARR/-/issues/216
Yarr Docker Container
2024-02-13T15:22:13+01:00 Timon Heim
Should deploy and register a docker container that includes a compiled version of Yarr compatible with the usage in the Microservices for ITk LLS. Should be added to CI and performed as part of master release.
@gbrandt can comment if there are specific requirements for the container.
Gerhard Immanuel Brandt, Matthias Wittgen
https://gitlab.cern.ch/YARR/YARR/-/issues/218
smol tiny point-like smoke test
2024-02-01T22:53:21+01:00 Lingxin Meng
Develop an even smaller test than the minimum health test for an initial check on modules in the wirebonding stage:
- potentially use the connectivity scan to check communication
- for ITkPixv2 read back the Iref trim - usage of database (tools) --> QC tool v3?
https://gitlab.cern.ch/YARR/YARR/-/issues/220
Implement core column test
2024-02-23T22:10:34+01:00 Maria Mironova
Implement tool for identifying core columns with non-responding pixels
WIP MR: https://gitlab.cern.ch/YARR/YARR/-/merge_requests/728
Release v1.5.1
Charles Elliott Hultquist
https://gitlab.cern.ch/YARR/YARR/-/issues/221
Fix read register issue
2024-02-23T22:11:06+01:00 Maria Mironova
Update read-register and write-register functionality to return success/fail of register read.
WIP MR: https://gitlab.cern.ch/YARR/YARR/-/merge_requests/724
Release v1.5.1
Maria Mironova
https://gitlab.cern.ch/YARR/YARR/-/issues/222
Inconsistent build and install directories if YARR is added as an external dependency to another project
2024-02-13T19:56:08+01:00 Zhengcheng Tao
YARR sets build output directory to `${CMAKE_BINARY_DIR}/lib` and `${CMAKE_BINARY_DIR}/bin` [here](https://gitlab.cern.ch/YARR/YARR/-/blob/9a259bda7b8712270ef34708bc16bb3e2a5bfb0e/CMakeLists.txt#L33-35), but installs from `${PROJECT_BINARY_DIR}/lib` and `${PROJECT_BINARY_DIR}/bin` [here](https://gitlab.cern.ch/YARR/YARR/-/blob/9a259bda7b8712270ef34708bc16bb3e2a5bfb0e/CMakeLists.txt#L145-147). In case YARR is no longer the top-level project, `${CMAKE_BINARY_DIR}` and `${PROJECT_BINARY_DIR}` differ, and installation can fail.
https://gitlab.cern.ch/YARR/YARR/-/issues/223
Felix-star support
2024-02-23T22:10:51+01:00 Maria Mironova
https://gitlab.cern.ch/YARR/YARR/-/merge_requests/721
Release v1.5.1
Angira Rastogi