Add a basic script to validate metadata content
In the context of recent examples of inconsistencies in the in-file metadata content due to interference of jobs running in containers (ATLASG-2686, AFT-711), this MR adds a script to check the sanity of the metadata:
- check that the number of events from `EventStreamInfo` equals the number of entries in `DataHeader`
- check whether the `/TagInfo` metadata contains inconsistent information
- check the presence of `FileMetaData`
- check the uniqueness of run/lumiblock/event number per event and against the summary in the `FileMetaData`
This is of course non-exhaustive, so I'm open to suggestions regarding other checks.
I've added this to run in one of the ART tests. Please let me know if this is sufficient and correct.
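The counting and uniqueness checks listed above can be illustrated without any Athena dependencies. This is a minimal sketch, not the script added in this MR: the function names `counts_match`, `find_duplicate_events` and `runs_not_in_summary`, and the plain-tuple input format, are assumptions made for illustration. The actual script would read these values from the file's in-file metadata and event data.

```python
from collections import Counter


def counts_match(n_event_stream_info, n_data_header):
    """Return True if the EventStreamInfo event count equals the DataHeader entry count."""
    return n_event_stream_info == n_data_header


def find_duplicate_events(events):
    """Return the (run, lumiblock, event) triples that occur more than once.

    `events` is an iterable of (run, lumiblock, event) tuples, one per event
    (hypothetical input format).
    """
    counts = Counter(events)
    return sorted(key for key, n in counts.items() if n > 1)


def runs_not_in_summary(events, summary_runs):
    """Return run numbers seen in the events but absent from the metadata summary."""
    return sorted({run for run, _, _ in events} - set(summary_runs))
```

A validation script could report an error and return a non-zero exit code whenever `counts_match` is false or either of the other two functions returns a non-empty list.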
Activity
Thanks @maszyman,
I think it's a very good idea to have such a script check for metadata sanity. As a first step we can add it to the ART tests (as you did), and later, once we have found no false positives, even to the official file validation in production.
Peter
added Database Derivation main review-pending-level-1 labels
CI Result SUCCESS (hash 2e864e8a)
Projects: Athena, AthSimulation, AthGeneration, AthAnalysis
Stages: externals, cmake, make, tests
Full details available on this CI monitor view. Check the JIRA CI status board for known problems.
Athena: number of compilation errors 0, warnings 0
AthSimulation: number of compilation errors 0, warnings 0
AthGeneration: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-EL9 8211] (remote access info)
removed review-pending-level-1 label
added review-approved label
mentioned in commit a1bd0bbc
```python
if (
    len(
        set(
            item[5:6]
            for item in tag_info["project_name"]
            if item.startswith("data")
        )
    )
    > 1
):
    logging.error("/TagInfo contains values from different data taking periods")
    return 1
if (
    "data_year" in tag_info
    and isinstance(tag_info["data_year"], list)
    and len(set(tag_info["data_year"])) > 1  # compare the set's size, not the set itself
```
mentioned in merge request !70717 (merged)
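The `/TagInfo` consistency check quoted in the snippet above can be sketched as a self-contained function. This is an illustrative sketch under stated assumptions, not the merged implementation: the function name `check_tag_info`, the handling of a single-string `project_name`, and the second error message are hypothetical, and the `item[5:6]` slice is copied from the snippet as-is. The `data_year` test compares `len(set(...))` with 1, because comparing a `set` directly with an integer raises `TypeError` in Python 3.

```python
import logging


def check_tag_info(tag_info):
    """Return 1 if /TagInfo mixes data-taking periods or years, else 0."""
    project_names = tag_info.get("project_name", [])
    # Assumption: a single value may be stored as a plain string
    if isinstance(project_names, str):
        project_names = [project_names]

    # Same slice as in the quoted snippet, applied to "data..." project names
    periods = {item[5:6] for item in project_names if item.startswith("data")}
    if len(periods) > 1:
        logging.error("/TagInfo contains values from different data taking periods")
        return 1

    data_year = tag_info.get("data_year")
    if isinstance(data_year, list) and len(set(data_year)) > 1:
        # Hypothetical message; the original snippet is cut off before this point
        logging.error("/TagInfo contains more than one data_year value")
        return 1

    return 0
```

Returning an integer status keeps the function usable as a building block for a script whose exit code signals failure to the ART test harness.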