API: add /pipelines/stats route
The `/api/pipelines/stats` route can be queried by GET or POST:

- GET
  - `pipeline_id`: single pipeline ID
  - `production_id`: single production ID
- POST
  - `pipeline_ids`: list of pipeline IDs
  - `production_ids`: list of production IDs
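As a sketch of how a client might build these requests (the base URL is a hypothetical placeholder; only the parameter names come from this MR):

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; the real deployment host is not specified here.
BASE = "https://example.cern.ch/api/pipelines/stats"

# GET: single IDs go in the query string.
get_url = BASE + "?" + urlencode({"pipeline_id": 2423007})

# POST: lists of IDs go in a JSON body.
post_body = json.dumps({"pipeline_ids": [2423007], "production_ids": [458]})

print(get_url)
print(post_body)
```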
It returns JSON like this (example for https://lhcb-analysis-productions.web.cern.ch/2423007/#XicpStStToXicpPiPi):

```python
{2423007: {458: {'full_input_size': 203498026247347,    # i.e. 203.498 TB
                 'full_est_output_size': 1126734635401, # i.e. 1.126 TB
                 'test_input_size': 40085679472,        # 40.08 GB
                 'test_output_size': 1466438294,        # 1.46 GB
                 'test_total_processed': 1644273}}}     # 1.6M events
```
where sizes are in bytes. It can be used to aggregate statistics about:
- How many events were processed across all production test jobs.
- How much data was processed across all tests.
- How much data was produced across all tests.
- How much data will be processed by the full production.
- Estimated output size expected to be created by the full production.
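For instance, the aggregations above could be done with a short script (a sketch: the sample response is the example from this MR, and the `total` helper is hypothetical):

```python
# Sample response, taken from the example in this MR description:
# {pipeline_id: {production_id: {statistic: value, ...}}}
response = {2423007: {458: {'full_input_size': 203498026247347,
                            'full_est_output_size': 1126734635401,
                            'test_input_size': 40085679472,
                            'test_output_size': 1466438294,
                            'test_total_processed': 1644273}}}

def total(response, key):
    """Sum one statistic (bytes or events) over every job of every pipeline."""
    return sum(job[key] for jobs in response.values() for job in jobs.values())

print(total(response, "test_total_processed"))  # events processed across all tests
print(total(response, "full_input_size"))       # bytes to be read by the full production
```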
If a single Analysis Production has more jobs than one person can comfortably examine by eye, these stats should come in handy.
Relates to lhcb-dpa/project#139
Edited by Ryunosuke O'Neil