Commit 39e88520 authored by Simon Spannagel

Manual: add the relevant section on config file parsing from the APSQ manual

parent 8c8568fe
Pipeline #1102790 passed with stages
in 15 minutes and 32 seconds
......@@ -119,6 +119,91 @@ my_switch = true
my_other_switch = 0
\end{minted}
\subsection{File format}
\label{sec:config_file_format}
Throughout the framework, a simplified version of TOML~\cite{tomlgit} is used as the standard format for configuration files.
The format is defined as follows:
\begin{enumerate}
\item All whitespace at the beginning or end of a line is stripped by the parser.
In the rest of this format specification the \textit{line} refers to the line with this whitespace stripped.
\item Empty lines are ignored.
\item Every non-empty line should start with either \texttt{\#}, \texttt{[} or an alphanumeric character.
Every other character should lead to an immediate parse error.
\item If the line starts with a hash character (\texttt{\#}), it is interpreted as a comment and all other content on the same line is ignored.
\item If the line starts with an open square bracket (\texttt{[}), it indicates a section header (also known as configuration header).
The line should contain a string with alphanumeric characters and underscores, indicating the header name, followed by a closing square bracket (\texttt{]}), to end the header.
After any number of ignored whitespace characters there could be a \texttt{\#} character.
If this is the case, the rest of the line is handled as specified in point~4.
Otherwise there should not be any other character (except the whitespace) on the line.
Any line that does not comply with these specifications should lead to an immediate parse error.
Multiple section headers with the same name are allowed.
All key-value pairs following this section header are part of this section until a new section header is started.
\item If the line starts with an alphanumeric character, the line should indicate a key-value pair.
The beginning of the line should contain a string of alphabetic characters, numbers, dots (\texttt{.}), colons (\texttt{:}) and underscores (\texttt{\_}), but it may only start with an alphanumeric character.
This string indicates the 'key'.
After any number of ignored whitespace characters, the key should be followed by an equals sign (\texttt{$=$}).
Any text between the \texttt{$=$} and the first \texttt{\#} character not enclosed within a pair of single or double quotes (\texttt{'} or \texttt{"}) is known as the non-stripped string.
Any character after the \texttt{\#} is handled as specified in point~4.
If the line does not contain any non-enclosed \texttt{\#} character, the value ends at the end of the line instead.
The 'value' of the key-value pair is the non-stripped string with all whitespace in front and at the end stripped.
The value may not be empty.
Any line that does not comply with these specifications should lead to an immediate parse error.
\item The value may consist of multiple nested dimensions which are grouped by pairs of square brackets (\texttt{[} and \texttt{]}).
The number of square brackets should be properly balanced, otherwise an error is raised.
Square brackets which should not be used for grouping should be enclosed in quotation marks.
Every dimension is split at every whitespace sequence and comma character (\texttt{,}) not enclosed in quotation marks.
Implicit square brackets are added to the beginning and end of the value, if these are not explicitly added.
A few situations require the explicit addition of outer brackets, such as matrices with only a single row, i.e. with dimension $1 \times N$.
\item The sections of the value which are interpreted as separate entities are named elements.
For a single value the element is on the zeroth dimension, for arrays on the first dimension and for matrices on the second dimension.
Elements can be forced by using quotation marks, either single or double quotes (\texttt{'} or \texttt{"}).
The number of both types of quotation marks should be properly balanced, otherwise an error is raised.
The conversion of the elements to the actual type is performed when accessing the value.
\item All key-value pairs defined before the first section header are part of a section with an empty (zero-length) header name.
\end{enumerate}
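The line-level rules above can be illustrated with a small classifier.
The following is a minimal sketch and not the parser used by the framework: the names \texttt{LineKind}, \texttt{ParsedLine}, \texttt{trim}, \texttt{find\_comment} and \texttt{parse\_line} are hypothetical, and the splitting of values into nested elements is deliberately left out.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical names for illustration only; this is not the framework's parser.
enum class LineKind { Empty, Comment, Header, KeyValue };

struct ParsedLine {
    LineKind kind;
    std::string name;  // header name or key
    std::string value; // only filled for key-value pairs
};

// Strip whitespace at the beginning and end of a line
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t begin = s.find_first_not_of(ws);
    if(begin == std::string::npos) {
        return "";
    }
    std::size_t end = s.find_last_not_of(ws);
    return s.substr(begin, end - begin + 1);
}

// Position of the first '#' not enclosed in single or double quotes, or npos
std::size_t find_comment(const std::string& s, std::size_t start) {
    char quote = 0;
    for(std::size_t i = start; i < s.size(); ++i) {
        if(quote != 0) {
            if(s[i] == quote) {
                quote = 0;
            }
        } else if(s[i] == '\'' || s[i] == '"') {
            quote = s[i];
        } else if(s[i] == '#') {
            return i;
        }
    }
    return std::string::npos;
}

bool is_alnum(char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0; }

// Classify a single line; throws std::runtime_error on a parse error
ParsedLine parse_line(const std::string& raw) {
    std::string line = trim(raw);
    if(line.empty()) {
        return {LineKind::Empty, "", ""};
    }
    if(line[0] == '#') {
        return {LineKind::Comment, "", ""};
    }
    if(line[0] == '[') {
        std::size_t i = 1;
        while(i < line.size() && (is_alnum(line[i]) || line[i] == '_')) {
            ++i;
        }
        if(i >= line.size() || line[i] != ']') {
            throw std::runtime_error("invalid section header");
        }
        // Only whitespace or a comment may follow the closing bracket
        std::string rest = trim(line.substr(i + 1));
        if(!rest.empty() && rest[0] != '#') {
            throw std::runtime_error("unexpected text after section header");
        }
        return {LineKind::Header, line.substr(1, i - 1), ""};
    }
    if(is_alnum(line[0])) {
        std::size_t i = 0;
        while(i < line.size() && (is_alnum(line[i]) || line[i] == '.' || line[i] == ':' || line[i] == '_')) {
            ++i;
        }
        std::string key = line.substr(0, i);
        while(i < line.size() && (line[i] == ' ' || line[i] == '\t')) {
            ++i;
        }
        if(i >= line.size() || line[i] != '=') {
            throw std::runtime_error("key is not followed by '='");
        }
        // The value ends at the first unquoted '#' or at the end of the line
        std::size_t hash = find_comment(line, i + 1);
        std::size_t length = (hash == std::string::npos) ? std::string::npos : hash - (i + 1);
        std::string value = trim(line.substr(i + 1, length));
        if(value.empty()) {
            throw std::runtime_error("value may not be empty");
        }
        return {LineKind::KeyValue, key, value};
    }
    throw std::runtime_error("line starts with an invalid character");
}
```

In this sketch, a call such as \texttt{parse\_line("[my\_module]  \# note")} yields a header named \texttt{my\_module}, while a line such as \texttt{key =} raises an error because the value is empty.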
\subsection{Accessing parameters}
\label{sec:accessing_parameters}
Values are accessed via the configuration object.
In the following example, the key is a string called \parameter{key}, the object is named \parameter{config} and the type \parameter{TYPE} is a valid C++ type the value should represent.
The values can be accessed via the following methods:
\begin{minted}[frame=single,framesep=3pt,breaklines=true,tabsize=2,linenos]{c++}
// Returns true if the key exists and false otherwise
config.has("key")
// Returns the number of keys found from the provided initializer list:
config.count({"key1", "key2", "key3"});
// Returns the value in the given type, throws an exception if the key does not exist or a conversion to TYPE is not possible
config.get<TYPE>("key")
// Returns the value in the given type or the provided default value if it does not exist
config.get<TYPE>("key", default_value)
// Returns an array of elements of the given type
config.getArray<TYPE>("key")
// Returns a matrix: an array of arrays of elements of the given type
config.getMatrix<TYPE>("key")
// Returns an absolute (canonical if it should exist) path to a file
config.getPath("key", true /* check if path exists */)
// Returns an array of absolute paths
config.getPathArray("key", false /* do not check if paths exist */)
// Returns the value as literal text including possible quotation marks
config.getText("key")
// Set the value of key to the default value if the key is not defined
config.setDefault("key", default_value)
// Set the value of the key to the default array if key is not defined
config.setDefaultArray<TYPE>("key", vector_of_default_values)
// Creates an alias named new_key for the already existing old_key, or throws an exception if old_key does not exist
config.setAlias("new_key", "old_key")
\end{minted}
Conversions to the requested type use the \parameter{from_string} and \parameter{to_string} methods provided by the framework string utility library.
These conversions largely follow standard C++ parsing, with one important exception.
If (and only if) the value is retrieved as a C/C++ string and the string is fully enclosed by a pair of \texttt{"} characters, these are stripped before returning the value.
Strings can thus be provided with or without quotation marks.
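The string exception described above can be sketched as follows.
This is an illustration under stated assumptions, not the framework's implementation: \texttt{from\_string\_sketch} is a hypothetical stand-in for the real \texttt{from\_string}, other types are converted with a plain \texttt{std::istringstream}, and the quote check only inspects the first and last character.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the framework's from_string; illustration only.
// Generic case: standard C++ stream parsing, quotes are NOT stripped.
template <typename T> T from_string_sketch(const std::string& text) {
    std::istringstream stream(text);
    T result;
    if(!(stream >> result)) {
        throw std::invalid_argument("cannot convert '" + text + "'");
    }
    // Reject trailing non-whitespace content after the parsed value
    char extra;
    if(stream >> extra) {
        throw std::invalid_argument("unexpected trailing text in '" + text + "'");
    }
    return result;
}

// String case: a value fully enclosed by a pair of '"' loses those quotes.
// Simplification: only the first and last character are inspected.
template <> inline std::string from_string_sketch<std::string>(const std::string& text) {
    if(text.size() >= 2 && text.front() == '"' && text.back() == '"') {
        return text.substr(1, text.size() - 2);
    }
    return text;
}
```

With this sketch, both \texttt{"hello world"} (with quotes) and \texttt{hello} (without) are returned as plain strings, whereas requesting an integer from a quoted value such as \texttt{"42"} fails because the quotes are only removed for strings.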
\begin{warning}
It should be noted that a conversion from string to the requested type is a comparatively heavy operation.
For performance-critical sections of the code, one should consider fetching the configuration value once and caching it in a local variable.
\end{warning}
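The caching advice from the warning above might look as follows in practice.
The class \texttt{ConfigSketch} is a hypothetical, minimal stand-in for the configuration object (it only mimics \texttt{get<TYPE>} and counts conversions), and the key \texttt{threshold} as well as the function \texttt{sum\_above\_threshold} are made up for this example.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical minimal stand-in for the configuration object; illustration only.
class ConfigSketch {
public:
    void set(const std::string& key, const std::string& value) { values_[key] = value; }

    // Converts the stored text on every call, which is the costly step
    template <typename T> T get(const std::string& key) const {
        ++conversions_;
        auto it = values_.find(key);
        if(it == values_.end()) {
            throw std::out_of_range("missing key: " + key);
        }
        std::istringstream stream(it->second);
        T result;
        if(!(stream >> result)) {
            throw std::invalid_argument("cannot convert value of key: " + key);
        }
        return result;
    }

    // Number of string-to-type conversions performed so far
    std::size_t conversions() const { return conversions_; }

private:
    std::map<std::string, std::string> values_;
    mutable std::size_t conversions_ = 0;
};

// Sum all samples above the configured threshold
double sum_above_threshold(const ConfigSketch& config, const std::vector<double>& samples) {
    // Fetch and convert once, outside the loop, and reuse the local copy
    const double threshold = config.get<double>("threshold");
    double sum = 0.0;
    for(double sample : samples) {
        if(sample > threshold) {
            sum += sample;
        }
    }
    return sum;
}
```

Calling \texttt{config.get<double>("threshold")} inside the loop would give the same result but repeat the string conversion for every sample; in this sketch the conversion counter stays at one regardless of the number of samples.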
\section{Main configuration}
\label{sec:main_config}
The main configuration consists of a set of sections specifying the modules used.
......
......@@ -89,3 +89,12 @@
month = dec,
day = {12}
}
@online{tomlgit,
title = {TOML},
subtitle = {Tom's Obvious, Minimal Language},
author = {Tom Preston-Werner},
url = {https://github.com/toml-lang/toml},
  publisher = {GitHub},
journal = {Github repository},
commit = {e5d623ecdc26327699157381bde3ccd6ed8c67de}
}