
Draft: feat(wms): New LHCb workflows

  • CWL converter

    • Implementation:

      • Here is a minimal example showing how the converted workflow is structured and why:
      class: Workflow
      cwlVersion: v1.2
      
      requirements:
      - class: MultipleInputFeatureRequirement
      - class: StepInputExpressionRequirement
      - class: ResourceRequirement
        coresMin: 1
      
      # Inputs are provided either by the transformation (e.g. `input-data`)
      # or computed once the CWL is on the worker node (e.g. `number-of-processors`);
      # see the execution sketch after this example
      inputs:
      - id: input-data
        type: File
      - id: pool-xml-catalog
        type: File
      - id: run-number
        type: int?
      - id: number-of-processors
        type: int
      - id: output-prefix
        type: string
      - id: histogram
        type: boolean
      - id: output-type
        type: string?
      outputs:
      - id: output-data
        label: Output Data
        outputSource:
        - DaVinci_1/dstpidmctuple.root
        linkMerge: merge_flattened
        type: File[]
      - id: others
        label: Others
        outputSource:
        - DaVinci_1/others
        linkMerge: merge_flattened
        type: File[]
      - id: pool-xml-catalog-out
        label: Pool XML Catalog
        outputSource: DaVinci_1/pool-xml-catalog-out
        type: File
      
      steps:
      - id: DaVinci_1
        in:
        - id: input-data
          source: input-data
        - id: pool-xml-catalog
          source: pool-xml-catalog
        - id: run-number
          source: run-number
        - id: number-of-processors
          source: number-of-processors
        - id: output-prefix
          source: output-prefix
          valueFrom: $(inputs["output-prefix"])_1
        - id: histogram
          source: histogram
        - id: output-type
          source: output-type
        out:
        - id: dstpidmctuple.root
        - id: others
        - id: pool-xml-catalog-out
        run:
          id: _:ccc42b11-d9ff-4093-9d09-bf4460719ed1
          class: CommandLineTool
          inputs:
          - id: input-data
            type: File
            inputBinding:
              prefix: --lfn-paths
          - id: pool-xml-catalog
            type: File
            inputBinding:
              prefix: --pool-xml-catalog
          - id: run-number
            type: int?
            inputBinding:
              prefix: --run-number
          - id: number-of-processors
            type: int
            inputBinding:
              prefix: --number-of-processors
          - id: output-prefix
            type: string
            inputBinding:
              prefix: --output-prefix
          - id: histogram
            type: boolean
            inputBinding:
              prefix: --histogram
          - id: output-type
            type: string?
            inputBinding:
              prefix: --output-type
          outputs:
          - id: dstpidmctuple.root
            type: File
            outputBinding:
              glob: '*dstpidmctuple.root'
          - id: others
            type: File[]
            outputBinding:
              glob:
              - prodConf*.json
              - prodConf*.py
              - summary*.xml
              - prmon*
              - DaVinci*.log
          - id: pool-xml-catalog-out
            type: File
            outputBinding:
              glob: pool_xml_catalog.xml
          requirements:
          - class: InitialWorkDirRequirement
            listing:
            # These parameters should come from a production/transformation and be common to all jobs of the same transformation.
            # Some of them can be overridden (passed as input parameters), such as `number-of-events`.
            - entryname: configuration.json
              entry: |
                {
                  "application": {
                    "name":"DaVinci",
                    "version":"v65r2",
                    "event_timeout":null,
                    "system_config":"x86_64_v2-el9-gcc13+detdesc-opt",
                    "extra_packages":["AnalysisProductions.v1r2781"],
                    "nightly":null
                  },
                  "input": {
                    "mc_tck":"",
                    "number_of_events":-1,
                    "max_number_of_events":null,
                    "cpu_work_per_event":100 
                  },
                  "output": {
                    "types":["dstpidmctuple.root"],
                    "histogram":false
                  },
                  "options":{
                    "entrypoint":"PIDEffPbPb2024.dv:main",
                    "extra_options": {
                      "dddb_tag":"dddb-20240427",
                      "data_type":"Upgrade",
                      "conddb_tag":"sim10-2024.W45.W47-v00.00-mu100",
                      "input_type":"ROOT",
                      "simulation":"True",
                      "input_stream":"ionraw",
                      "input_process":"Hlt2",
                      "input_raw_format":"0.5"
                    },
                    "extra_args":[]
                  },
                  "db_tags":{
                    "dddb_tag":null,
                    "conddb_tag":null,
                    "dq_tag":null
                  },
                  "run_metadata":null
                }
      
          # The new Gaudi application, based on Typer, is completely offline (no interaction with, or mention of, LHCbDirac)
          baseCommand:
          - gaudi-app
          arguments:
          - configuration.json
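
      As an illustration of how the converted workflow could be executed on a worker node, here is a minimal sketch assuming the reference runner cwltool is available. The file names, the example input values and the way `number-of-processors` is computed are illustrative assumptions, not part of this MR:

      # Minimal sketch: execute the converted workflow with the reference CWL runner.
      # Assumptions: the converter wrote the workflow to "workflow.cwl", cwltool is
      # installed on the worker node, and number-of-processors is derived naively
      # from os.cpu_count().
      import json
      import os
      import subprocess

      job_inputs = {
          "input-data": {"class": "File", "path": "00012345_00006789_1.dst"},  # hypothetical replica path
          "pool-xml-catalog": {"class": "File", "path": "pool_xml_catalog.xml"},
          "run-number": 271409,                         # hypothetical run number
          "number-of-processors": os.cpu_count() or 1,  # computed on the worker node
          "output-prefix": "00012345_00006789",
          "histogram": False,
      }

      with open("job_inputs.json", "w") as f:
          json.dump(job_inputs, f, indent=2)

      # cwltool resolves the step, stages configuration.json via
      # InitialWorkDirRequirement and invokes `gaudi-app configuration.json`.
      subprocess.run(["cwltool", "workflow.cwl", "job_inputs.json"], check=True)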

      Note:

    • Unit tests

    • Integration tests:

      • Run the CWL converter on all active transformations once per day.
      • Run the generated CWL workflows to extract the prodConf.json files they produce and compare them with the prodConf.json generated by the associated jobDescription.xml (see the comparison sketch below).
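
      A minimal sketch of that comparison step, assuming the two prodConf.json files end up in known directories; the paths, the ignored keys and the pytest-style layout are illustrative assumptions:

        # Minimal sketch: compare the prodConf.json produced by the CWL run with
        # the one produced from the associated jobDescription.xml.
        import json
        from pathlib import Path

        IGNORED_KEYS = {"run_metadata"}  # hypothetical: fields expected to differ per run


        def load_prodconf(directory: str) -> dict:
            """Load the single prodConf*.json found in `directory` and drop volatile keys."""
            path = next(Path(directory).glob("prodConf*.json"))
            data = json.loads(path.read_text())
            return {k: v for k, v in data.items() if k not in IGNORED_KEYS}


        def test_prodconf_matches():
            cwl_conf = load_prodconf("cwl_output")        # written by the CWL execution
            legacy_conf = load_prodconf("legacy_output")  # written from jobDescription.xml
            assert cwl_conf == legacy_conf, "prodConf.json mismatch between CWL and legacy workflow"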
  • Pre-processing

    • Implementation
    • Unit tests
  • Post-processing

    • Implementation: Contains the logic that currently lives in the workflow modules executed after GaudiApplication (a sketch of such a hook is given below).
    • Unit tests
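
    For illustration only, a minimal sketch of what a post-processing hook run by the JobWrapper could look like. The step names and signatures below are hypothetical placeholders; the real logic is the one currently spread over the modules that follow GaudiApplication:

      # Minimal sketch of a post-processing hook run after the CWL execution.
      # The step names are hypothetical placeholders for the existing logic.
      from typing import Callable


      def analyse_xml_summary(job_dir: str) -> None:
          """Placeholder: check summary*.xml for event counts and the step status."""


      def upload_outputs(job_dir: str) -> None:
          """Placeholder: upload the declared output files and register their replicas."""


      def send_bookkeeping_report(job_dir: str) -> None:
          """Placeholder: send the bookkeeping/accounting records for the job."""


      POST_PROCESSING_STEPS: list[Callable[[str], None]] = [
          analyse_xml_summary,
          upload_outputs,
          send_bookkeeping_report,
      ]


      def post_process(job_dir: str) -> None:
          """Run every post-processing step in order; fail fast on the first error."""
          for step in POST_PROCESSING_STEPS:
              step(job_dir)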

BEGINRELEASENOTES

*WorkloadManagement

CHANGE: Convert LHCb workflows into CWL and pre/post-process them in the JobWrapper

ENDRELEASENOTES
