Re-work CTA log line format
We have agreed to make the following changes to CTA logging:
Tasks
- Update the CTA logging code to support JSON output
- Add a switch in the config to toggle between the present CTA format or JSON format
- Update the system tests and unit tests
Update the CTA logging code to support JSON output
The support for JSON should be added as a supplement to the present log format.
Most of the frontend logging code can be found at ./common/log. For cta-taped the logging code may be found in TODO.
As discussed, the JSON logs should have the following properties:
- One JSON object per line (each starts with { and ends with })
- All values should be strings. No integers/floats, due to the way JSON parsers may handle large numbers.
- There should be two timestamp values:
  - time: The epoch time in seconds and the nanosecond part, separated by a dot (.). Example: "time":"1694677443.820410000"
  - local_time: A human-readable ISO 8601 datetime. Example: "local_time":"2023-10-06T14:44:18+01:00"
Example log line:
{"time":"1694677443.820410000", "local_time":"2023-09-14T09:44:03+01:00", "severity":"warning", "pname":"cta-taped", "pid":"3435", "tid":"3491", "message":"In Scheduler::authorizeAdmin(): success.", "user":"tape-local", "host":"tpsrv315.cern.ch"}
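To make the properties above concrete, here is a minimal Python sketch of a formatter that produces such a line. It only illustrates the format, not the CTA implementation (the actual logging code lives in the CTA sources). The function name format_json_log_line and its parameters are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def format_json_log_line(time_ns, tz, severity, pname, pid, tid, message, **params):
    """Return one JSON object on a single line, with every value as a string.

    time_ns is the event time in nanoseconds since the epoch and tz is the
    timezone used for the human-readable local_time field.
    """
    seconds, nanos = divmod(time_ns, 1_000_000_000)
    local = datetime.fromtimestamp(seconds, tz=tz)
    record = {
        # Epoch seconds and zero-padded nanoseconds, separated by a dot.
        "time": f"{seconds}.{nanos:09d}",
        # ISO 8601 with a UTC offset, truncated to whole seconds.
        "local_time": local.isoformat(timespec="seconds"),
        "severity": severity,
        "pname": pname,
        # Numeric fields are converted to strings on purpose.
        "pid": str(pid),
        "tid": str(tid),
        "message": message,
    }
    # Extra key/value parameters are stringified as well.
    record.update({key: str(value) for key, value in params.items()})
    # json.dumps escapes embedded newlines, so the result stays on one line.
    return json.dumps(record)


if __name__ == "__main__":
    print(format_json_log_line(
        1694677443_820410000, timezone(timedelta(hours=1)),
        "warning", "cta-taped", 3435, 3491,
        "In Scheduler::authorizeAdmin(): success.",
        user="tape-local", host="tpsrv315.cern.ch",
    ))
```

Both timestamps are derived from the same nanosecond value, so time and local_time always describe the same instant.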
Add a switch in the config to toggle between the present CTA format or JSON format
The CTA frontend output file path is specified in the CTA config file at ./xroot_plugins/cta-frontend-xrootd.conf.example:
# CTA Frontend log URL
cta.log.url file:/var/log/cta/cta-frontend.log
There should be an option right before/after it, which should be used to toggle between the present log format and the JSON format. Perhaps something like cta.log.format <default/json>, which defaults to default if the option is not set in the conf file.
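The intended behaviour of the switch can be sketched as follows. This is a Python illustration only, assuming the option ends up being named cta.log.format as suggested above, and that the config file uses whitespace-separated "key value" lines with # comments, as in the snippet shown. The function name read_log_format is hypothetical.

```python
ALLOWED_FORMATS = ("default", "json")


def read_log_format(conf_text):
    """Return the log format selected in the config text.

    Falls back to "default" when cta.log.format (the proposed option name)
    is not set, and rejects values other than "default" or "json".
    """
    log_format = "default"
    for raw_line in conf_text.splitlines():
        # Drop comments and surrounding whitespace (assumed config syntax).
        line = raw_line.split("#", 1)[0].strip()
        parts = line.split()
        if not parts or parts[0] != "cta.log.format":
            continue
        if len(parts) != 2 or parts[1] not in ALLOWED_FORMATS:
            raise ValueError(f"invalid cta.log.format line: {raw_line!r}")
        log_format = parts[1]
    return log_format
```

Failing loudly on an unknown value, rather than silently falling back to the present format, is a design choice of this sketch and not something agreed in the discussion.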
Similarly, for the tape server there is the file ./tapeserver/cta-taped.sysconfig, where the output file may be specified using CTA_TAPED_OPTIONS="--log-to-file=/var/log/cta/cta-taped.log". A similar option to toggle default/json format here would be good.
Update the system tests and unit tests
Our system and unit tests partially rely on the log output. These will have to be adjusted such that they handle both the present CTA format and the new JSON format.
Tests can be found in ./tests
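One way for a test helper to cope with both formats is to try the JSON form first and fall back to a plain substring match. The Python sketch below illustrates that idea. The helper names are hypothetical, and since the present CTA format is not described here, the fallback simply treats the line as opaque text.

```python
import json


def parse_json_log_line(line):
    """Return the fields of a JSON log line as a dict, or None otherwise."""
    text = line.strip()
    # JSON log lines start with "{" and end with "}".
    if not (text.startswith("{") and text.endswith("}")):
        return None
    try:
        record = json.loads(text)
    except json.JSONDecodeError:
        return None
    return record if isinstance(record, dict) else None


def line_has_message(line, expected):
    """Check whether a log line carries the expected message text.

    JSON lines are matched on their "message" field only. Any other line is
    treated as the present format and matched as a plain substring.
    """
    record = parse_json_log_line(line)
    if record is not None:
        return expected in str(record.get("message", ""))
    return expected in line
```

Matching on the message field rather than on the raw JSON text avoids false positives when the expected string happens to appear in another field, such as user or host.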
Background
With the transfer of log-writing from rsyslog to the cta-frontend + cta-taped processes themselves at CERN, we will have to re-examine the log format that is output by CTA.
In production we presently rely on rsyslog to:
1. Receive the CTA log message
2. Add an epoch timestamp with nanosecond resolution
3. Write the log entry to the file
4. Transfer the log message to a log aggregation instance
Correct timestamps are critical for our monitoring and alerting setup. So, with steps 1-3 now going away we will need to either change the way we use rsyslog, or change the way CTA outputs logs.
Thus far rsyslog has had no understanding of the CTA logs: it has parsed neither the fields such as severity nor the timestamp. Instead, messages have been consumed and written 'as-is', with the timestamps written by rsyslog based on the point in time when rsyslog receives the log message. This approach has issues, because:
- The timestamp registered on the producing machine may not be correct when CTA writes the logs. For instance, what happens if rsyslog stops temporarily and CTA continues writing logs? With the present setup the log entry time will be the point in time of rsyslog restarting.
- The log aggregation server uses the time it receives the log entry, which means that the rsyslog-generated timestamp differs from the CTA one, due to network delays, local buffering, etc.