Created Aug 03, 2021 by Laurent Petre (@lpetre)

Structure the analysis scripts

Summary

At the moment, we are using a disparate set of analysis scripts with different data formats and behaviors. We should unify them in order to simplify their usage and their porting to the analysis suite.

The proposal is the following:

  • All script executables do only one task and do it well (KISS philosophy). A typical example is the S-bit rate analysis, which can be factored into the following steps:

    1. Raw results -> output thresholds (with an option to define the allowed noise levels)
    2. Plotting of the raw results (with an option to overlay the derived thresholds)
    3. Conversion of the output thresholds into VFAT configuration files

    These three executables can run independently and be chained together via Bash commands/scripts.

  • The Python scripts have the following structure to ease the port to the analysis framework when the time comes:

    def my_helper_function_1():
        pass
    
    def my_helper_function_2():
        pass
    
    def my_main_function():
        pass
    
    def my_tool_1():
        # Create an argument parser
        # Call the functions with the right parameters
        pass

    def my_tool_2():
        # Create an argument parser
        # Call the functions with the right parameters
        pass

    # EDIT
    # if __name__ == "__main__":
    #     # Create an argument parser
    #     # Call the functions with the right parameters

    The Python functions are converted into executable programs with the right entry points in the pyproject.toml file (see the sketch below). This is an intermediate solution until a better and more uniform CLI system is provided.
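
    For illustration, a minimal PEP 621-style sketch of such entry points; the command and module names below are placeholders, not an agreed-upon convention:

    [project.scripts]
    my-tool-1 = "my_package.my_module:my_tool_1"
    my-tool-2 = "my_package.my_module:my_tool_2"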

  • The input and output paths are given as arguments to the scripts and are not inferred from the location of the scripts themselves or of other files. This makes it easy to rerun on the same data with different parameters (see the sketch below).
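
    As an illustration, a fleshed-out sketch of a tool such as my_tool_1 above; the option and argument names are hypothetical:

    import argparse

    def my_tool_1():
        # All paths are passed explicitly; nothing is inferred from the script location.
        parser = argparse.ArgumentParser(description="Derive S-bit rate thresholds")
        parser.add_argument("input_file", help="raw results (GZIP-compressed CSV)")
        parser.add_argument("output_file", help="derived thresholds (GZIP-compressed CSV)")
        parser.add_argument("--max-noise-rate", type=float, default=0.0,
                            help="allowed noise level")
        args = parser.parse_args()
        # Hand over to the analysis function, e.g. my_main_function(args.input_file, ...)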

  • The manipulated files are, at the moment, CSV files, ideally GZIP-compressed. They should be manipulated with pandas in case the dataframe storage format changes later (HDF5?).

    • We need to agree on a delimiter. A reduced list of potential characters is ;, :, |.
    • ; has been chosen by @cgalloni in cmsgemos; use it in the absence of other proposals.
  • A line in the data file must be self-consistent. In the S-bit rate example, a single line in the output file is enough to figure out to which VFAT in the whole system a threshold must be applied. In this case, it means that the fed, slot, optohybrid, and vfat must all be defined (see the pandas sketch below).
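
    A minimal sketch of reading and writing such a file with pandas, assuming the hypothetical file name thresholds.csv.gz and a threshold column alongside the fed, slot, optohybrid and vfat columns mentioned above:

    import pandas as pd

    # Read the semicolon-delimited CSV; GZIP compression is inferred from the extension.
    thresholds = pd.read_csv("thresholds.csv.gz", sep=";")

    # Each line is self-consistent: it fully identifies the VFAT the threshold applies to.
    for row in thresholds.itertuples():
        print(row.fed, row.slot, row.optohybrid, row.vfat, row.threshold)

    # Write the dataframe back with the same delimiter and compression.
    thresholds.to_csv("thresholds.csv.gz", sep=";", index=False, compression="gzip")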

@aaravind @cgalloni

Edited Aug 24, 2021 by Laurent Petre