Fix `cta-catalogue-schema-verify` checking of NOT NULL constraints in Postgres
This issue was reported by the dCache team on CTA Community.
After upgrade to CTA schema v10 on a PostgreSQL DB, cta-catalogue-schema-verify
returns these errors:
[root@tpm102 ~]# cta-catalogue-schema-verify /etc/cta/cta-catalogue.conf
Schema version : 10.0
Checking indexes...
SUCCESS
Checking tables, columns and constraints...
ERROR: IN TABLE DISK_SYSTEM, CONSTRAINT DISK_SYSTEM_DIN_NN is missing in the schema but defined in the catalogue database.
ERROR: IN TABLE DISK_SYSTEM, CONSTRAINT DISK_SYSTEM_DISN_NN is missing in the schema but defined in the catalogue database.
ERROR: IN TABLE VIRTUAL_ORGANIZATION, CONSTRAINT VIRTUAL_ORGANIZATION_DIN_NN is missing in the schema but defined in the catalogue database.
FAILED
This is probably due to this workaround in the CTA code:
std::list<std::string> SchemaMetadataGetter::getConstraintNames(const std::string& tableName) {
std::list<std::string> constraintNames = m_sqliteDatabaseMetadataGetter->getConstraintNames(tableName);
if(m_dbType == cta::rdbms::Login::DbType::DBTYPE_POSTGRESQL){
//If the database to compare is POSTGRESQL, we cannot compare NOT NULL CONSTRAINT names
//indeed, POSTGRESQL can not give the NOT NULL constraint names
removeObjectNameMatches(constraintNames,cta::utils::Regex("(^NN_)|(_NN$)"));
}
return constraintNames;
}
See also this note in the Postgres documentation: Not-Null Constraints
It looks like there is a difference between creating the constraint in the schema directly:
CREATE TABLE DISK_SYSTEM(
...
DISK_INSTANCE_NAME VARCHAR(100) CONSTRAINT DISK_SYSTEM_DIN_NN NOT NULL,
and adding it into the schema later via a changeset:
ALTER TABLE DISK_SYSTEM ADD CONSTRAINT DISK_SYSTEM_DIN_NN CHECK (DISK_INSTANCE_NAME IS NOT NULL);
Possible solutions
Naming of NOT NULL
constraints should be consistent, regardless of whether the schema was created from scratch or modified by a Liquibase changeset.
This means either:
- Ensure that all
NOT NULL
constraints in Postgres are named. However, the documentation says that "creating an explicit not-null constraint is more efficient".
Or:
- Ensure that all
NOT NULL
constraints in Postgres are anonymous. Don't create these constraints by use ofALTER TABLE ... ADD CONSTRAINT ...
. Instead useALTER TABLE ... ALTER COLUMN ... SET NOT NULL
.
The trade-off is between being more efficient vs. having all constraints labelled with a name.