framework.tex 20.3 KB
Newer Older
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
\chapter{The \corrybold Framework}
\label{ch:framework}

This chapter provides some crucial information about the framework, its executables and the available configuration parameters which are processed on a global level.

\section{The \texttt{corry} Executable}
\label{sec:executable}
The \corry executable \texttt{corry} functions as the interface between the user and the framework.
It is primarily used to provide the main configuration file, but also allows to add and overwrite options from the main configuration file.
This is both useful for quick testing as well as for batch processing of many runs to be reconstructed.

The executable handles the following arguments:
\begin{itemize}
\item \texttt{-c <file>}: Specifies the configuration file to be used for the reconstruction, relative to the current directory.
This is the only \textbf{required} argument, the simulation will fail to start if this argument is not given.
\item \texttt{-l <file>}: Specify an additional location such as a file to forward log output to. This is used as additional destination alongside the standard output and the location specified in the framework parameters described in Section~\ref{sec:framework_parameters}.
\item \texttt{-v <level>}: Sets the global log verbosity level, overwriting the value specified in the configuration file described in Section~\ref{sec:framework_parameters}.
Possible values are \texttt{FATAL}, \texttt{STATUS}, \texttt{ERROR}, \texttt{WARNING}, \texttt{INFO} and \texttt{DEBUG}, where all options are case-insensitive.
The module specific logging level introduced in Section~\ref{sec:logging_verbosity} is not overwritten.
20
\item \texttt{-o <option>}: Passes extra framework or module options which are added and overwritten in the main configuration file.
21
This argument may be specified multiple times, to add multiple options.
22
23
Options are specified as key/value pairs in the same syntax as used in the configuration files described in Chapter~\ref{ch:configuration_files}, but the key is extended to include a reference to a configuration section or instantiation in shorthand notation.
There are three types of keys that can be specified:
24
\begin{itemize}
25
26
27
\item Keys to set \textbf{framework parameters}. These have to be provided in exactly the same way as they would be in the main configuration file (a section does not need to be specified). An example to overwrite the standard output directory would be \texttt{corry -c <file> -o output\_directory="run123456"}.
\item Keys for \textbf{module configurations}. These are specified by adding a dot (\texttt{.}) between the module and the actual key as it would be given in the configuration file (thus \textit{module}.\textit{key}). An example to overwrite the deposited particle to a positron would be \texttt{corry -c <file> -o Tracking4D.timingCut="100ns"}.
\item Keys to specify values for a particular \textbf{module instantiation}. The identifier of the instantiation and the name of the actual key are split by a dot (\texttt{.}), in the same way as for keys for module configurations (thus \textit{identifier}.\textit{key}). The unique identifier for a module can contains one or more colons (\texttt{:}) to distinguish between various instantiations of the same module. The exact name of an identifier depends on the name of the detector and the optional input and output name. Those identifiers can be extracted from the logging section headers. An example to change the temperature of propagation for a particular instantiation for a detector named \textit{dut} could be \texttt{corry -c <file> -o Clustering4D:dut.timingCut="1us"}.
28
29
\end{itemize}
Note that only the single argument directly following the \texttt{-o} is interpreted as the option. If there is whitespace in the key/value pair this should be properly enclosed in quotation marks to ensure the argument is parsed correctly.
30
31
32
33
\item \texttt{-g <option>}: Passes extra detector options which are added and overwritten in the detector configuration file.
This argument can be specified multiple times, to add multiple options.
The options are parsed in the same way as described above for module options, but only one type of key can be specified to overwrite an option for a single detector.
These are specified by adding a dot (\texttt{.}) between the detector and the actual key as it would be given in the detector configuration file (thus \textit{detector}.\textit{key}). An example to change the orientation of a particular detector named \texttt{detector1} would be \texttt{corry -c <file> -g detector1.orientation=0deg,0deg,45deg}.
34
35
36
37
38
39
40
41
\end{itemize}

No interaction with the framework is possible during the reconstruction. Signals can however be send using keyboard shortcuts to terminate the run, either gracefully or with force. The executable understand the following signals:
\begin{itemize}
\item \texttt{CTRL+C} (SIGINT): Request a graceful shutdown of the reconstruction. This means the currently processed event is finished, while all other events requested in the configuration file are ignored. After finishing the event, the finalization stage is executed for every module to ensure all modules finish properly.
\item \texttt{CTRL+\textbackslash} (SIGQUIT): Forcefully terminates the framework. It is not recommended to use this signal as it will normally lead to the loss of all generated data. This signal should only be used when graceful termination is for any reason not possible.
\end{itemize}

Simon Spannagel's avatar
Simon Spannagel committed
42
43
\section{The Clipboard}

44
45
46
47
48
49
50
51
52
53
54
55
56
57
The clipboard is the framework's infrastructure for temporary storing information during the event processing.
Every module can access the clipboard and both read and write information.
Collections or individual elements on the clipboard are accessed via their name, and is internally stored as map.

The clipboard consists of two parts, a temporary storage and a persistent storage space.

\subsection{Temporary Data Storage}
The temporary data storage is only available during the processing of a single event.
It is automatically cleared at the end of the event processing and has to be populated with new data in the new event to be processed.
The temporary storage acts as the main data structure to communicate information between different modules and can hold multiple collections of \corry objects such as pixel hits, clusters or tracks.

\subsection{Persistent Storage}
The persistent storage is not cleared at the end of each event processing and can be used to store information used in multiple events.
Currently this storage only allows for the caching of double-precision floating point numbers.
Simon Spannagel's avatar
Simon Spannagel committed
58
59
60


\section{Global Framework Parameters}
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
\label{sec:framework_parameters}
The \corry framework provides a set of global parameters which control and alter its behavior. These parameters are inherited by all modules.
The currently available global parameters are:

\begin{itemize}
\item \parameter{detectors_file}: Location of the file describing the detector configuration described in Section~\ref{sec:detector_config}.
The only \textit{required} global parameter: the framework will fail to start if it is not specified.
\item \parameter{detectors_file_updated}: Location of the file that the (potentially) updated detector configuration should be written into. If this file does not already exist, it will be created. If the same file is given as for \parameter{detectors_file}, the file is overwritten with the updated values.
\item \parameter{histogram_file}: Location of the file where the ROOT output histograms of all modules will be written to. The file extension \texttt{.root} will be appended if not present. Directories within the ROOT file will be created automatically for all modules.
\item \parameter{number_of_events}: Determines the total number of events the framework should process, negative numbers allow processing of all data available.
After reaching the specified number of events, reconstruction is stopped.
Defaults to $-1$.
\item \parameter{number_of_tracks}: Determines the total number of tracks the framework should reconstruct, negative numbers indicate no limit on the number of reconstructed tracks.
After reaching the specified number of events, reconstruction is stopped.
Defaults to $-1$.
\item \parameter{run_time}: Determines the wall-clock time of data acquisition the framework should reconstruct up until. Negative numbers inidcate no limit on the time slice to reconstruct.
Defaults to $-1$.
\item \parameter{log_level}: Specifies the lowest log level which should be reported.
Possible values are \texttt{FATAL}, \texttt{STATUS}, \texttt{ERROR}, \texttt{WARNING}, \texttt{INFO} and \texttt{DEBUG}, where all options are case-insensitive.
Defaults to the \texttt{INFO} level.
More details and information about the log levels, including how to change them for a particular module, can be found in Section~\ref{sec:logging_verbosity}.
Can be overwritten by the \texttt{-v} parameter on the command line (see Section~\ref{sec:executable}).
\item \parameter{log_format}: Determines the log message format to display.
Possible options are \texttt{SHORT}, \texttt{DEFAULT} and \texttt{LONG}, where all options are case-insensitive.
More information can be found in Section~\ref{sec:logging_verbosity}.
\item \parameter{log_file}: File where the log output should be written to in addition to printing to the standard output (usually the terminal).
Another (additional) location to write to can be specified on the command line using the \texttt{-l} parameter (see Section~\ref{sec:executable}).
\item \parameter{library_directories}: Additional directories to search for module libraries, before searching the default paths.
\end{itemize}

91
92
93
94
95
96
\section{Modules and the Module Manager}
\label{sec:module_manager}
\corry is a modular framework and one of the core ideas is to partition functionality in independent modules which can be inserted or removed as required.
These modules are located in the subdirectory \textit{src/modules/} of the repository, with the name of the directory the unique name of the module.
The suggested naming scheme is CamelCase, thus an example module name would be \textit{OnlineMonitor}.
A \emph{specifying} part of a module name should precede the \emph{functional} part of the name, e.g.\ \textit{EventLoaderCLICpix2} rather than \textit{CLICpix2EventLoader}.
Simon Spannagel's avatar
Simon Spannagel committed
97
There are three different kind of modules which can be defined:
98
99
100
101
102
103
104
105
106
\begin{itemize}
    \item \textbf{Global}: Modules for which a single instance runs, irrespective of the number of detectors.
    \item \textbf{Detector}: Modules which are concerned with only a single detector at a time.
    These are then replicated for all required detectors.
    \item \textbf{DUT}: Similar to the Detector modules, these modules run for a single detector.
    However, they are only replicated if the respective detector is marked as DUT.
\end{itemize}
The type of module determines the constructor used, the internal unique name and the supported configuration parameters.
For more details about the instantiation logic for the different types of modules, see Section~\ref{sec:module_instantiation}.
107

Simon Spannagel's avatar
Simon Spannagel committed
108
109
110
111
112
113
114
115
116
117
118
119
120
121
\subsection{Module Status Codes}

The \command{run()} function of each module returns a status code to the module manager, indicating the status of the module.
These codes can be used to e.g. request an end of the current run, or to signal problems in data processing.
The following status codes are currently supported:

\begin{description}
    \item{\command{Success}}: Indicates that the module successfully finished processing all data. The framework continues with the next module.
    \item{\command{NoData}}: Indicates that the respective module did not find any data to process. The framework continues with the next module.
    \item{\command{DeadTime}}: Indicates that the detector handled by the respective module currently is in data acquisition dead time. The framework skips all remaining modules for this event and continues with the subsequent event. By this, measurements such as efficiency are not affected by known inefficiencies during detector dead times.
    \item{\command{EndRun}}: Allows the module to request a premature end of the run. This can e.g.\ be used by alignment modules to stop the run after they have accumulated enough tracks for the alignment procedure. The framework executes all modules for the current event and then enters the finalization stage.
    \item{\command{Failure}}: Indicates that there was a severe problem when processing data in the respective module. The framework skips all remaining modules for this event end enters the finalization stage.
\end{description}

Simon Spannagel's avatar
Simon Spannagel committed
122
123
124
125
126
127
\subsection{Execution Order}

Modules are executed in the order in which they appear in the configuration file, the sequence is repeated for every time frame or event.
If one module in the chain raises e.g.\ a \command{DeadTime} status, subsequent modules are not executed anymore.
This should be taken into account when choosing the order of modules in the configuration, since e.g.\ data from detectors catered by subsequent event loaders is not processed and hit maps are not updated if an earlier module requested to skip the rest of the module chain.

128
129
130
131
\subsection{Module instantiation}
\label{sec:module_instantiation}
Modules are dynamically loaded and instantiated by the Module Manager.
They are constructed, initialized, executed and finalized in the linear order in which they are defined in the configuration file; for this reason the configuration file should follow the order of the real process.
132
For each section in the main configuration file (see~\ref{ch:configuration_files} for more details), a corresponding library is searched for which contains the module (the exception being the global framework section).
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
Module libraries are always named following the scheme \textbf{libCorryvreckanModule\texttt{ModuleName}}, reflecting the \texttt{ModuleName} configured via CMake.
The module search order is as follows:
\begin{enumerate}
\item Modules already loaded before from an earlier section header
\item All directories in the global configuration parameter \parameter{library_directories} in the provided order, if this parameter exists.
\item \item The internal library paths of the executable, that should automatically point to the libraries that are built and installed together with the executable.
These library paths are stored in \dir{RPATH} on Linux, see the next point for more information.
\item The other standard locations to search for libraries depending on the operating system.
Details about the procedure Linux follows can be found in~\cite{linuxld}.
\end{enumerate}

If the loading of the module library is successful, the module is checked to determine if it is a global, detector or DUT module.
As a single module may be called multiple times in the configuration, with overlapping requirements (such as a module which runs on all detectors of a given type, followed by the same module but with different parameters for one specific detector, also of this type) the Module Manager must establish which instantiations to keep and which to discard.
The instantiation logic determines a unique name and priority, where a lower number indicates a higher priority, for every instantiation.
The name and priority for the instantiation are determined differently for the two types of modules:
\begin{itemize}
\item \textbf{Global}: Name of the module, the priority is always \emph{high}.
\item \textbf{Detector/DUT}: Combination of the name of the module and the name of detector this module is executed for.
If the name of the detector is specified directly by the \parameter{name} parameter, the priority is \emph{high}.
If the detector is only matched by the \parameter{type} parameter, the priority is \emph{medium}.
If the \parameter{name} and \parameter{type} are both unspecified and the module is instantiated for all detectors, the priority is \emph{low}.
\end{itemize}
In the end, only a single instance for every unique name is allowed.
If there are multiple instantiations with the same unique name, the instantiation with the highest priority is kept.
If multiple instantiations with the same unique name and the same priority exist, an exception is raised.
158

159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
\section{Logging and Verbosity Levels}
\label{sec:logging_verbosity}
\corry is designed to identify mistakes and implementation errors as early as possible and to provide the user with clear indications about the problem.
The amount of feedback can be controlled using different log levels which are inclusive, i.e.\ lower levels also include messages from all higher levels.
The global log level can be set using the global parameter \parameter{log_level}.
The log level can be overridden for a specific module by adding the \parameter{log_level} parameter to the respective configuration section.
The following log levels are supported:
\begin{itemize}
\item \textbf{FATAL}: Indicates a fatal error that will lead to direct termination of the application.
Typically only emitted in the main executable after catching exceptions as they are the preferred way of fatal error handling.
An example of a fatal error is an invalid configuration parameter.
\item \textbf{STATUS}: Important information about the status of the reconstruction.
Is only used for messages which have to be logged in every run such as initial information on the module loading, opened data files and the current progress of the run.
\item \textbf{ERROR}: Severe error that should not occur during a normal well-configured reconstruction.
Frequently leads to a fatal error and can be used to provide extra information that may help in finding the problem (for example used to indicate the reason a dynamic library cannot be loaded).
\item \textbf{WARNING}: Indicate conditions that should not occur normally and possibly lead to unexpected results.
The framework will however continue without problems after a warning.
A warning is for example issued to indicate that a calibration file for a certain detector cannot be found and that the reconstruction is therefore performed with uncalibrated data.
\item \textbf{INFO}: Information messages about the reconstruction process.
Contains summaries of the reconstruction details for every event and for the overall run.
Should typically produce maximum one line of output per event and module.
\item \textbf{DEBUG}: In-depth details about the progress of the reconstruction such as information on every cluster formed or on track fitting results.
Produces large volumes of output per event, and should therefore only be used for debugging the reconstruction.
\item \textbf{TRACE}: Messages to trace what the framework or a module is currently doing.
Unlike the \textbf{DEBUG} level, it does not contain any direct information about the physics but rather indicates which part of the module or framework is currently running.
Mostly used for software debugging or determining performance bottlenecks in the simulations.
\end{itemize}

\begin{warning}
    It is not recommended to set the \parameter{log_level} higher than \textbf{WARNING} in a typical reconstruction as important messages may be missed.
    Setting too low logging levels should also be avoided since printing many log messages will significantly slow down the simulation.
\end{warning}

The logging system supports several formats for displaying the log messages.
The following formats are supported via the global parameter \parameter{log_format} or the individual module parameter with the same name:
\begin{itemize}
\item \textbf{SHORT}: Displays the data in a short form.
Includes only the first character of the log level followed by the configuration section header and the message.
\item \textbf{DEFAULT}: The default format.
Displays system time, log level, section header and the message itself.
\item \textbf{LONG}: Detailed logging format.
Displays all of the above but also indicates source code file and line where the log message was produced.
This can help in debugging modules.
\end{itemize}
203

Simon Spannagel's avatar
Simon Spannagel committed
204
\section{Coordinate Systems}
205
206
207
208
209
210
211
212
\label{sec:coordinate_systems}

Local coordinate systems for each detector and a global frame of reference for the full setup are defined.
The global coordinate system is chosen as a right-handed Cartesian system, and the rotations of individual devices are performed around the geometrical center of their sensor.

Local coordinate systems for the detectors are also right-handed Cartesian systems, with the x- and y-axes defining the sensor plane.
The origin of this coordinate system is the center of the lower left pixel in the grid, i.e.\ the pixel with indices (0,0).
This simplifies calculations in the local coordinate system as all positions can either be stated in absolute numbers or in fractions of the pixel pitch.