Draft: working EMPFlowGA_flows+loose.json

Open Ivan Oleksiyuk requested to merge ioleksiy/training-dataset-dumper:EMPFlowGA into main
1 unresolved thread

Description

Adding a config to dump EMPFlow jets with:

  • Ghost associated "tracks" (with r22default selection)
  • Ghost associated tracks with r22loose selection "tracks_loose"
  • Collection of all Pflow objects "flows"
  • Separate collections of "charged" and "neutral" Pflow objects !?!
  • Tracks left after overlap removal with flows "tracks_OR" and "tracks_OR_loose" !?!

Each collection has a limit of 80 constituents !?! (instead of the default 40)

!?! - points that are still up for discussion. The size of this dump will be huge, as we have twice the number of constituents and way too many collections. Some studies still have to be run to decide whether "charged", "neutral", "tracks_OR" and "tracks_OR_loose" can be removed. The "tracks" collection might be replaced with "tracks_loose" in the future.
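
As a rough illustration of the size concern, the sketch below counts constituent slots per jet for the collections listed above. It assumes every collection is stored as a fixed-length padded array per jet and that all constituents cost the same, which is a simplification; the single-collection baseline is a hypothetical reference point, not a measured dump.

```python
# Rough per-jet slot count (assumption: fixed-length padded arrays per jet).
# Collection names are taken from the list above; the limit of 80 is the
# proposed value, 40 is the stated default.
proposed = {
    name: 80
    for name in [
        "tracks", "tracks_loose", "flows",
        "charged", "neutral", "tracks_OR", "tracks_OR_loose",
    ]
}

# Hypothetical baseline for comparison: one "tracks" collection at the default limit.
baseline = {"tracks": 40}


def slots_per_jet(collections):
    """Sum the maximum number of constituents over all collections."""
    return sum(collections.values())


ratio = slots_per_jet(proposed) / slots_per_jet(baseline)
print(f"proposed: {slots_per_jet(proposed)} slots per jet")
print(f"baseline: {slots_per_jet(baseline)} slots per jet")
print(f"ratio: {ratio:.1f}x")
```

The real file size also depends on how many variables are stored per constituent and on compression, so this only indicates the scaling.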

The config should be used with: dump-single-btag

Review checklist:

  • CI Passing
  • Comments addressed
  • Source branch is up to date with target
Edited by Ivan Oleksiyuk

Merge request reports

Members who can merge are allowed to add commits.
Requires 1 approval from eligible users.
Merge blocked: 3 checks failed
All required approvals must be given.
Merge request must not be draft.
Merge request must be rebased, because a fast-forward merge is not possible.

Merge details

  • The source branch is 74 commits behind the target branch.
  • 1 commit will be added to main.
  • Source branch will be deleted.

Activity

  • Thanks @ioleksiy. I don't love the filename (we should probably avoid +). Can we call this EMPFlow_GN3dev.json or something?

  • To be fair, we do have one other config file that has - in the name, so apparently it works. But I do seem to remember the grid having some issues naming datasets with - in the name, so I would assume we could get issues from + as well. Maybe best to avoid those and stick with names that would be a valid variable name in Python.

    Edited by Dan Guest
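
The naming rule suggested above can be checked with a few lines of Python. This is only a sketch of one way to test a config filename; the helper name `has_identifier_stem` and the example `my-config.json` are made up for illustration and are not part of the repository.

```python
import keyword
from pathlib import Path


def has_identifier_stem(filename: str) -> bool:
    """Return True if the filename without its extension is a valid,
    non-keyword Python identifier (letters, digits, underscores only)."""
    stem = Path(filename).stem
    return stem.isidentifier() and not keyword.iskeyword(stem)


# The first two names come from this thread; the third is hypothetical.
for name in ["EMPFlowGA_flows+loose.json", "EMPFlow_GN3dev.json", "my-config.json"]:
    print(name, has_identifier_stem(name))
```

With this check, `EMPFlowGA_flows+loose.json` is rejected because of the `+`, while `EMPFlow_GN3dev.json` passes.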
  • Ivan Oleksiyuk changed this line in version 2 of the diff


  • Author Contributor

    Hi, I renamed the file. Sorry for the wait. I also removed the OR collections (as we discussed, OR is not needed for training) and reduced the number of constituents back to a maximum of 40 (as I understand it, this is our default for now, although as far as I am aware it is not based on any studies).

  • Nicole Michelle Hartman approved this merge request


  • added 1 commit

    • 8ada9ce7 - renamed file, removed OR, reduced to 40 constituents


  • Ivan Oleksiyuk reset approvals from @hartman by pushing to the branch


  • Ivan Oleksiyuk added 19 commits


  • Samuel Van Stroud mentioned in merge request !744 (merged)

