Skip to content

Draft: feat: add materialized view for processing paths optimization

Summary

This PR implements a processing paths optimization for the LHCb bookkeeping system by embedding computed path columns directly in the processing table, eliminating expensive hierarchical queries and Oracle materialized view restrictions.

Changes Made

Database Schema (database_schema.sql)

  • Processing Table Enhancement: Added procpath (VARCHAR2(4000)) and path_depth (NUMBER) columns
  • Constraints:
    • Unique constraint on procpath to ensure path uniqueness
    • Indexes for efficient path-based and depth-based lookups
  • Automatic Maintenance Triggers:
    • processing_before_insert_update: Computes paths automatically on insert/update
    • processing_after_update: Updates descendant paths when parent nodes change
    • Built-in cycle detection and path length validation
    • Prevents '/' characters in processing pass names

Code Updates

  • FileQueryUtils.py:
    • Direct join to processing table instead of materialized view
    • Removed complex CTE hierarchical queries (CTE_PROCPATHS_*)
    • Simplified query building using pre-computed procpath column
  • StorageUsageDumpAgent.py:
    • Updated to use processing table directly for path lookups
    • Removed inline CTE for processing paths

Design Decisions

Why Embedded Columns Instead of Materialized Views?

  • Oracle compatibility: Materialized views with REFRESH ON COMMIT don't support CONNECT BY queries
  • Performance: Pre-computed indexed columns are faster than runtime hierarchical queries
  • Reliability: No materialized view refresh issues or commit restrictions
  • Simplicity: Direct table joins without external view dependencies

Trigger-Based Maintenance

  • Automatic updates: Paths computed transparently on data changes
  • Cascade updates: When a parent node changes, all descendants are updated
  • Data integrity: Cycle detection prevents infinite loops
  • Validation: Path length limits and character restrictions enforced

Performance Benefits

  • Eliminated expensive queries: No more runtime SYS_CONNECT_BY_PATH calculations
  • Indexed access: Fast lookups via procpath and composite indexes
  • Direct joins: Simple table joins instead of complex hierarchical queries
  • Automatic maintenance: Triggers handle path updates efficiently

Technical Details

This approach embeds the full hierarchical path directly in each processing row, maintained automatically by triggers. Queries can now join directly to the processing table and access the pre-computed procpath column, eliminating the need for expensive CONNECT BY queries or problematic materialized views.

BEGINRELEASENOTES *BookkeepingSystem NEW: Added processing path optimization with embedded procpath/path_depth columns in processing table, eliminating expensive hierarchical queries and improving query performance through automatic trigger-based maintenance ENDRELEASENOTES

🤖 Generated with Claude Code

Edited by Chris Burr

Merge request reports

Loading