This note gives a set of guidelines and recommendations for coding
in C++ for the ATLAS experiment.
There are several reasons for maintaining and following a set of programming
guidelines. First, by following some rules, one can avoid some
common errors and pitfalls in C++ programming, and thus have more
reliable code. But even more important: a computer program should
not only tell the machine what to do, but it should also tell /other people/
what you want the machine to do. (For much more elaboration on this idea,
look up references on ``literate programming,'' such as [fn:knuth84].)
This is obviously important any time
you have many people working on a given piece of software,
and such considerations would naturally lead to code that is easy
to read and understand. Think of writing ATLAS code as another
form of publication, and take the same care as you would writing
up an analysis for colleagues.
This document is derived from the original ATLAS C++ coding standard,
[[https://cds.cern.ch/record/685315][ATL-SOFT-2002-001]] [fn:atlas-old], which was last revised in 2003. This was itself
derived from work done by the CERN ``Project support team''
and SPIDER project, as documented in [[http://pst.web.cern.ch/PST/HandBookWorkBook/Handbook/Programming/CodingStandard/c++standard.pdf][CERN-UCO/1999/207]] [fn:pst].
These previous guidelines have been significantly revised
to take into account the evolution of the C++ language [fn:cxx],
current practices in ATLAS, and the experience gained
over the past decade.
Some additional useful information on C++ programming may be
found in [fn:meyers1], [fn:meyers2], and [fn:gof].
This note is not intended to be a fixed set of rigid rules.
Rather, it should evolve as experience warrants.
* Naming
This section contains guidelines on how to name objects in a program.
** Naming of files
- *Each class should have one header file, ending with ``.h'', and
one implementation file, ending with ``.cxx''.* [<<source-naming>>]
Some exceptions: Small classes used as helpers for another class should
generally not go in their own file, but should instead be placed with the
larger class. Sometimes several very closely related classes
may be grouped together in a single file; in that case, the files should
be named after whichever is the ``primary'' class. A number of related
small helper classes (not associated with a particular larger class)
may be grouped together in a single file, which should be given
a descriptive name. An example of the latter could be a set of classes
used as exceptions for a package.
For classes in a namespace, the namespace should not be included
in the file name. For example, the header for =Trk::Track=
should be called =Track.h=.
Implementation (.cxx) files that would be empty may be omitted.
The use of the ``.h'' suffix for headers is long-standing ATLAS practice;
however, it is unfortunate since language-sensitive editors will then default to using
``C'' rather than ``C++'' mode for these files. For emacs, it can help
to put a line like this at the start of the file:
#+BEGIN_EXAMPLE
// This file is really -*- C++ -*-.
#+END_EXAMPLE
** Meaningful names
- *Choose names based on pronounceable English words, common
abbreviations, or acronyms widely used in the experiment, except*
It is of course acceptable to use stdio functions if you're calling
an external library that requires them.
Admittedly, formatting using C++-style streams is more cumbersome
than a C-style format list. If you want to use =printf= style formatting,
see ``CxxUtils/StrFormat.h''.
- *Do not use the ellipsis notation for function arguments.* [<<no-ellipsis>>]
Functions with an unspecified number of arguments should not be used
because they are a common cause of bugs that are hard to find. However,
using =catch(...)= to catch any exception is acceptable, although it should
generally not be used outside of framework code.
#+BEGIN_EXAMPLE
// Avoid defining functions like:
void error(int severity, ...);  // "severity" followed by a
                                // zero-terminated list of char*s
#+END_EXAMPLE
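A minimal sketch of one type-safe alternative to the =error= function above, assuming C++11 is available: the messages are passed as a =std::initializer_list=, so the compiler checks the type of every argument and no terminating sentinel is needed. The name =formatError= and its output format are illustrative assumptions, not part of any ATLAS interface.

```cpp
#include <initializer_list>
#include <sstream>
#include <string>

// Hypothetical replacement for a variadic error(int severity, ...).
// Every message is type-checked, and the list carries its own length.
std::string formatError(int severity,
                        std::initializer_list<std::string> messages)
{
  std::ostringstream out;
  out << "severity " << severity << ":";
  for (const std::string& msg : messages) {
    out << ' ' << msg;
  }
  return out.str();
}
```

A call such as =formatError(2, {"bad", "track"})= yields the string "severity 2: bad track".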
- *Do not use preprocessor macros to take the place of functions, or for defining constants.* [<<no-macro-functions>>]
Use templates or inline functions rather than preprocessor macros.
#+BEGIN_EXAMPLE
// NOT recommended: a function-like macro
#define SQUARE(x) x*x
// Better to define an inline function:
inline int square(int x) {
  return x*x;
}
#+END_EXAMPLE
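The following sketch illustrates why the =SQUARE= macro is fragile and shows typed alternatives for both halves of the rule. The template =square= and the constant =maxHits= are illustrative names chosen for this example; =constexpr= assumes C++11.

```cpp
// The macro form, repeated here only to show the pitfall: SQUARE(1 + 2)
// expands textually to 1 + 2*1 + 2, which evaluates to 5 rather than 9.
#define SQUARE(x) x*x

// A function template evaluates its argument exactly once and works
// for any type that supports operator*.
template <typename T>
inline T square(T x)
{
  return x * x;
}

// A typed constant instead of something like "#define MAX_HITS 100"
// (hypothetical name).
constexpr int maxHits = 100;
```

Unlike the macro, the template and the constant obey the normal scoping and type rules of the language.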
- *Do not declare related numerical values as const. Use enum declarations.* [<<use-enum>>]
The enum construct allows a new type to be defined and hides the
numerical values of the enumeration constants.
#+BEGIN_EXAMPLE
enum State {halted, starting, running, paused};
#+END_EXAMPLE
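As a sketch of how such an enumeration might be used, the function below maps each =State= value to a printable name. The function =stateName= is a hypothetical helper written for this example.

```cpp
#include <string>

enum State {halted, starting, running, paused};

// Hypothetical helper: callers work with the symbolic names only and
// never depend on the underlying numerical values.
std::string stateName(State s)
{
  switch (s) {
    case halted:   return "halted";
    case starting: return "starting";
    case running:  return "running";
    case paused:   return "paused";
  }
  return "unknown";
}
```

Because the =switch= lists every enumerator explicitly, many compilers can warn if a new state is added later without being handled here.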
- *Do not use NULL to indicate a null pointer; use the nullptr keyword instead.* [<<nullptr>>]
=nullptr= is new in C++11. If your code must compile with older
versions, use the constant 0 instead.
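One practical reason to prefer =nullptr= is overload resolution. The sketch below uses two hypothetical overloads named =describe= to show which one is selected.

```cpp
#include <string>

// Two hypothetical overloads, used only to show which one is chosen.
std::string describe(int)         { return "int"; }
std::string describe(const char*) { return "pointer"; }
```

Here =describe(0)= selects the integer overload, while =describe(nullptr)= selects the pointer overload. A call =describe(NULL)= may select the integer overload or be rejected as ambiguous, depending on how the implementation defines =NULL=.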
- *Do not use const char** *or built-in arrays ``[]''; use* =std::string= *instead.* [<<use-std-string>>]
One thing to be aware of, though: C++ will implicitly convert
a =const char*= to a =std::string=; however, this may add significant
overhead if used in a loop. For example:
#+BEGIN_EXAMPLE
void do_something (const std::string& s);
...
for (int i=0; i < lots; i++) {
  ...
  do_something ("hi there!");
}
#+END_EXAMPLE
Each time through the loop, this will make a new =std::string= copy
of the literal. Better to move the conversion to =std::string= outside
of the loop:
#+BEGIN_EXAMPLE
std::string myarg = "hi there!";
for (int i=0; i < lots; i++) {
  ...
  do_something (myarg);
}
#+END_EXAMPLE
- *Avoid using union types.* [<<avoid-union-types>>]
Unions can be an indication of a non-object-oriented design that is
hard to extend. The usual alternative to unions is inheritance and
dynamic binding. The advantage of having a derived class representing
each type of value stored is that the set of derived classes can be
extended without rewriting any code. Because code with unions is only
slightly more efficient, but much more difficult to maintain, you
should avoid it.
Unions may be used in some low-level code and in places where
efficiency is particularly important.
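A minimal sketch of the inheritance-based alternative, using hypothetical class names (=Value=, =IntValue=, =TextValue=) and assuming C++11. A new kind of value can be added as another derived class without changing =joinValues=.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class: one interface for every kind of stored value.
class Value
{
public:
  virtual ~Value() {}
  virtual std::string toString() const = 0;
};

class IntValue : public Value
{
public:
  explicit IntValue(int v) : m_value(v) {}
  virtual std::string toString() const override
  {
    return std::to_string(m_value);
  }
private:
  int m_value;
};

class TextValue : public Value
{
public:
  explicit TextValue(const std::string& v) : m_value(v) {}
  virtual std::string toString() const override { return m_value; }
private:
  std::string m_value;
};

// Dynamic binding selects the right toString() for each element.
std::string joinValues(const std::vector<std::unique_ptr<Value>>& values)
{
  std::string result;
  for (const std::unique_ptr<Value>& v : values) {
    if (!result.empty()) result += ",";
    result += v->toString();
  }
  return result;
}
```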
- *Do not use asm (the assembler macro facility of C++).* [<<no-asm>>]
For those rare use cases where an =asm= might be needed, the use
of the =asm= should be encapsulated and made available in a low-level
package (such as CxxUtils).
- *Do not use the keyword struct for types used as classes.* [<<no-struct>>]
The =class= keyword is identical to =struct= except that by default its
contents are private rather than public. =struct= may be used for types
that are deliberately written as non-object-oriented PODs (plain old data,
i.e., C structs). In that case it is a good indication that the code is
intentionally not object-oriented.
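The distinction can be sketched as follows, with hypothetical type names: =HitPosition= is a deliberately plain aggregate, while =HitCounter= has behaviour and hidden state and therefore uses =class=.

```cpp
// A deliberately plain aggregate: no invariants, all members public.
struct HitPosition
{
  double x;
  double y;
  double z;
};

// A type with behaviour and private state uses the class keyword.
class HitCounter
{
public:
  HitCounter() : m_count(0) {}
  void add(const HitPosition&) { ++m_count; }
  int count() const { return m_count; }
private:
  int m_count;
};
```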
- *Do not use static objects at file scope. Use an anonymous namespace instead.*
[<<anonymous-not-static>>]
The use of =static= to signify that something is private to a source file
is obsolete; furthermore, it cannot be used for types. Use an anonymous
namespace instead.
For entities which are not public but are also not really part of a class,
prefer putting them in an anonymous namespace to putting them in a class.
That way, they won't clutter up the header file.
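A minimal sketch of a source file that keeps its helpers in an anonymous namespace. The names =Range=, =inRange=, and =classifyAdc= are hypothetical and chosen only for illustration.

```cpp
#include <string>

namespace {

// Visible only inside this source file.  Unlike "static", an anonymous
// namespace can also hold types.
struct Range
{
  int low;
  int high;
};

bool inRange(const Range& r, int value)
{
  return value >= r.low && value <= r.high;
}

} // anonymous namespace

// Hypothetical public function that uses the file-private helpers.
std::string classifyAdc(int adc)
{
  const Range valid = {0, 1023};
  return inRange(valid, adc) ? "ok" : "out of range";
}
```

Only =classifyAdc= would need a declaration in the header; =Range= and =inRange= never appear there.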
- *Do not declare your own typedef for booleans. Use the bool type of C++ for booleans.* [<<use bool>>]
The =bool= type was not implemented in C. Programmers usually got around
the problem by typedefs and/or const declarations. This is no longer
needed, and must not be used in ATLAS code.
- *Avoid pointer arithmetic.*
Pointer arithmetic reduces readability, and is extremely error prone.
It should be avoided outside of low-level code.
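For contrast, the sketch below shows the same summation written with pointer arithmetic and with a standard container and algorithm. Both function names are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Pointer-arithmetic version, shown for contrast: nothing checks that
// "n" matches the real length of the array.
double sumWithPointers(const double* data, std::size_t n)
{
  double total = 0;
  for (const double* p = data; p != data + n; ++p) {
    total += *p;
  }
  return total;
}

// Preferred: the container knows its own size, and the standard
// algorithm hides the iteration.
double sumWithContainer(const std::vector<double>& data)
{
  return std::accumulate(data.begin(), data.end(), 0.0);
}
```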
** Readability and maintainability
- *Code should compile with no warnings.* [<<no-warnings>>]
Many compiler warnings can indicate potentially serious problems
with your code. But even if a particular warning is benign, it should
be fixed, if only to prevent other people from having to spend time
examining it in the future.
Warnings coming from external libraries should be reported to whoever
is maintaining the ATLAS wrapper package for the library. Even if the
library itself can't reasonably be fixed, it may be possible to put
a workaround in the wrapper package to suppress the warning.
See [fn:warnings] for help on how to get rid of many common types of warning.
If it is really impossible to get rid of a warning, that fact should
be documented in the code.
- *Keep functions short.* [<<short-functions>>]
Short functions are easier to read and reason about. Ideally, a single
function should not be bigger than can fit on one screen (i.e., not more
than 30--40 lines).
- *Avoid excessive nesting of indentation.* [<<excessive-nesting>>]
It becomes difficult to follow the control flow in a function when it
becomes deeply nested. If you have more than 4--5 indentation levels,
consider splitting off some of the inner code into a separate function.
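As a sketch of this kind of refactoring, the selection logic below has been split out of the loop into its own function, so that neither function needs more than two levels of indentation. The names and the threshold values are arbitrary assumptions made for this example.

```cpp
#include <vector>

// Plain data holder used only for this illustration.
struct Candidate
{
  double pt;
  double eta;
};

// Hypothetical helper split out of the loop body: each condition is
// tested at the same indentation level and returns early on failure.
bool passesSelection(const Candidate& c)
{
  if (c.pt < 20.0) return false;
  if (c.eta < -2.5 || c.eta > 2.5) return false;
  return true;
}

int countSelected(const std::vector<Candidate>& candidates)
{
  int n = 0;
  for (const Candidate& c : candidates) {
    if (passesSelection(c)) ++n;
  }
  return n;
}
```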
- *Avoid duplicated code.* [<<avoid-duplicate>>]
This statement has a twofold meaning.
The first and most evident is that one must avoid simply cutting and
pasting pieces of code. When similar functionality is needed in
different places, it should be collected in functions and reused.
The second meaning is at the design level, and is the concept of code reuse.
Reuse of code has the benefit of making a program easier to understand
and to maintain. An additional benefit is better quality because code
that is reused gets tested much better.
Code reuse, however, is not the end-all goal, and in particular, it is
less important than encapsulation. One should not use inheritance to
reuse a bit of code from another class.
- *Document in the code any cases where clarity has been sacrificed for performance.* [<<document-changes-for-performance>>]
Optimize code only when you know you have a performance problem. This
means that during the implementation phase you should write code that
is easy to read, understand, and maintain. Do not write cryptic code,
just to improve its performance.
Very often bad performance is due to bad design. Unnecessary copying
of objects, creation of large numbers of temporary objects,
improper inheritance, and a poor choice of algorithms, for example, can be
rather costly and are best addressed at the architecture and design
level.
- *Avoid using typedef to create aliases for classes.* [<<avoid-typedef>>]
Typedefs are a serious impediment in large systems. While they
simplify code for the original author, a system filled with typedefs
can be difficult to understand. If the reader encounters a class =A=, he
or she can find an =#include= with ``A.h'' in it to locate a description of =A=;
but typedefs carry no context that tells a reader where to find a
definition. Moreover, most of the generic characteristics obtained
with typedefs are better handled by object-oriented techniques, like
polymorphism.
A typedef statement, which is declared within a class as private or
protected, is used within a limited scope and is therefore acceptable.
Typedefs may be used to provide names expected by STL algorithms
(=value_type=, =const_iterator=, etc.) or to shorten cumbersome
STL container names.
Typedefs may also be used to identify particular uses of integral types.
For example, the auxiliary store code uses integers to identify particular
auxiliary data items. But rather than declaring these as an integer type
directly, a typedef =auxid_t= is used. This makes explicit what variables
of that type are meant to be.
The =typedef= definitions used by the xAOD code are also an exception.
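The use of a typedef to mark a particular use of an integral type can be sketched as follows. This is a hypothetical illustration modelled on the =auxid_t= idea described above, not the actual auxiliary store code; the class =NameRegistry= and the underlying type chosen for =auxid_t= are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed underlying type; the point is that the name documents intent.
typedef std::size_t auxid_t;

// Hypothetical registry that hands out one identifier per name.
class NameRegistry
{
public:
  // Returns the identifier assigned to "name", registering it if needed.
  auxid_t findOrAdd(const std::string& name)
  {
    for (auxid_t id = 0; id < m_names.size(); ++id) {
      if (m_names[id] == name) return id;
    }
    m_names.push_back(name);
    return m_names.size() - 1;
  }

  const std::string& name(auxid_t id) const { return m_names.at(id); }

private:
  std::vector<std::string> m_names;
};
```

A signature such as =name(auxid_t id)= tells the reader what kind of number is expected, which a bare integer type would not.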
- *Code should use the standard ATLAS units for time, distance, energy, etc.* [<<atlas-units>>]
As a reminder, energies are expressed in MeV and lengths in mm.
Please use the symbols defined in =GaudiKernel/SystemOfUnits.h=.
#+BEGIN_EXAMPLE
#include "GaudiKernel/SystemOfUnits.h"
float pt_thresh = 20 * Gaudi::Units::GeV;
float ip_cut = 0.1 * Gaudi::Units::cm;
#+END_EXAMPLE
** Portability
- *All code must comply with the 2011 version of the ISO C++ standard (C++11)*. [<<standard-cxx>>]
A draft of the standard which is essentially identical to the final version
may be found at [fn:cxx].
Some code may need to remain compatible with C++98 for a limited time.
At some point, compatibility with C++14 will also be required.
- *Make non-portable code easy to find and replace.* [<<limit-non-portable-code>>]
Non-portable code should preferably be factored out into a low-level package
in Control, such as CxxUtils. If that is not possible, an =#ifdef= may
be used.
However, =#ifdefs= can make a program completely unreadable. In
addition, if the problems being solved by the =#ifdef= are not solved
centrally by the release tool, then you end up solving the same problem
over and over. Therefore, the use of =#ifdef= should be limited.
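A minimal sketch of confining a platform check to a single low-level helper, so that client code contains no preprocessor conditionals. The functions =pathSeparator= and =joinPath= are hypothetical; =_WIN32= is the macro predefined by compilers targeting Windows.

```cpp
#include <string>

// Hypothetical low-level helper: the only place where the platform
// check appears.
inline char pathSeparator()
{
#ifdef _WIN32
  return '\\';
#else
  return '/';
#endif
}

// Client code stays free of preprocessor conditionals.
std::string joinPath(const std::string& dir, const std::string& file)
{
  return dir + pathSeparator() + file;
}
```

If the platform-specific behaviour ever changes, only =pathSeparator= needs to be touched.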
- *Headers supplied by the implementation (system or standard libraries header
files) must go in* =<>= *brackets; all other headers must go in* "" *quotes.* [<<system-headers>>]
#+BEGIN_EXAMPLE
// Include only standard headers with <>
#include <iostream> // OK: standard header
#include <MyFyle.hh> // NO: nonstandard header
// Include any header with ""
#include "stdlib.h" // NO: better to use <>
#include "MyPackage/MyFyle.h" // OK
#+END_EXAMPLE
- *Do not specify absolute directory names in include directives. Instead, specify only the terminal package name and the file name.* [<<include-path>>]
Absolute paths are specific to a particular machine and will likely
fail elsewhere.
The ATLAS convention is to include the package name followed by the file name.
Watch out: if you list the package name twice, compilation will work
with CMT, but it may not work with other build systems.