cta-statistics-update can fail for catalogues in postgres
Summary
The cta-statistics update
command line tool can fail with the following error when run against a postgres DB:
Aborting: virtual void cta::statistics::DatabaseStatisticsService::updateStatisticsPerTape(): executeNonQuery failed: executeNonQuery failed for SQL statement UPDATE TAPE TAPE_TO_UPDATE SET(DIRTY,NB_MASTER_FILES,MASTER_DATA_IN_BYTES,NB_...: can not execute sql, another query is in progress
The command works for us at CERN because we use an Oracle DB, where we can have multiple active cursors in the same session.
Related community post: https://cta-community.web.cern.ch/t/current-usage-of-cta-statistics-tools/173
Steps to reproduce
- Set up CTA with a postgres database for the catalogue.
- Execute
/usr/bin/cta-statistics-update /etc/cta/cta-catalogue.conf
with some quantity of dirty tapes.
What is the current bug behaviour?
The command fails with:
Aborting: virtual void cta::statistics::DatabaseStatisticsService::updateStatisticsPerTape(): executeNonQuery failed: executeNonQuery failed for SQL statement UPDATE TAPE TAPE_TO_UPDATE SET(DIRTY,NB_MASTER_FILES,MASTER_DATA_IN_BYTES,NB_...: can not execute sql, another query is in progress
What is the expected correct behaviour?
The update command should succeed with something along the lines of:
Updating tape statistics in the catalogue... Updated catalogue tape statistics in 0.040164 seconds, 1 tape(s) have been updated
Relevant logs and/or screenshots
Possible causes
It appears that this is caused by the fact that we execute the UPDATE query for each VID result while fetching these from the DB.
In doing so we keep the cursor for the VID result set active while the secondary cursor for the UPDATE created.
See void DatabaseStatisticsService::updateStatisticsPerTape()
in CTA/statistics/DatabaseStatisticsService.cpp.
The observed error is a safeguard on our end specifically for postgresql.
One possible solution would be to fetch all the results into RAM first (Steve estimates this would be ~16MB at worst for 10.000 tapes), and then execute the ordered update afterwards. The update query here is ordered to avoid the possibility of a deadlock.