[PGSCHED] design scheduler DB structure for archive job queue
Currently all info from the objectstore related to any job has been put into one row of the new database schema.
I suggest to go through all the columns to understand more globally:
- origin of the information per column (clarifies the worlflow)
- availability when filled/fill order (helps prevent updates [1][2] and rather do inserts when needed)
- which filter fields we use to filter rows systematically for selection ? (helps understand which fields should be indexed)
- which we systematically select for full job execution after insertion ?
- systematic update ? (helps to recognise where updates are really necessary and where we can insert a new row instead [1][2])
-
mountID
vsTapePool
concepts could direct the Archive and Retrieve jobs in separate tables (execution
related tables rather thanqueue
related) - we design a queuing system -
throughput is essential
! - anything else you might think of ?
These should help us design a Archive related table(s) in the new PGSCHED DB schema.
The current tables have the following form (will try to make mermaid diagrams later to make the picture clearer):
TABLE: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
REPACK | repack_reqid | status | create_time | mount_policy | vid | is_complete | is_no_recall | is_move | is_expand_started | is_expand_finished | is_add_copies | create_username | create_host | user_provided_files | user_provided_bytes | total_files_to_retrieve | total_bytes_to_retrieve | total_files_to_archive | total_bytes_to_archive | retrieved_files | retrieved_bytes | archived_files | archived_bytes | repack_finished_time | last_expanded_fseq | buffer_url | failed_to_create_archive_req | failed_to_archive_files | failed_to_archive_bytes | failed_to_retrieve_files | failed_to_retrieve_bytes | subreq_pb | destinfo_pb | ||||||||||||||||||||||||
RETRIEVE | job_id | retrieve_reqid | status | creation_time | mount_policy | vid | mount_id | starttime | priority | storage_class | retrieve_min_req_age | active_copy_nb | size_in_bytes | archive_file_id | checksumblob | requester_name | requester_group | dst_url | disk_instance | disk_file_path | disk_file_id | disk_file_gid | disk_file_owner_uid | disk_system_name | repack_reqid | is_repack | retrieve_error_report_url | failure_report_url | failure_report_log | activity | is_failed | is_verify | srr_activity | srr_mount_policy | srr_time | srr_host | srr_username | lt_create | lt_first_selected | lt_completed | rr_repack | retrievejob_pb | info_pb | ||||||||||||||
ARCHIVE | job_id | archive_reqid | status | creation_time | mount_policy | vid | mount_id | start_time | priority | storage_class | min_archive_request_age | copy_nb | size_in_bytes | archive_file_id | checksumblob | requester_name | requester_group | src_url | disk_instance | disk_file_path | disk_file_id | disk_file_gid | disk_file_owner_uid | repack_reqid | is_repack | archive_error_report_url | archive_report_url | failure_report_log | failure_log | repack_dest_vid | is_reportdecided | total_retries | max_total_retries | retries_within_mount | max_retries_within_mount | last_mount_with_failure | total_report_retries | max_report_retries | tape_pool | repack_filebuf_url | repack_fseq |
[1] https://malisper.me/postgres-heap-only-tuples/
[2] https://www.postgresql.org/docs/current/storage-hot.html
Related also to issue #604.
Edited by Jaroslav Guenther