JDL Splitting options

37a0bc4b · Haakon Andre Reme-Ness · Costin Grigoras · 19fd7fdd · 37a0bc4b
Commit 37a0bc4b authored 7 months ago by Haakon Andre Reme-Ness Committed by Costin Grigoras 7 months ago
--- a/docs/jdl_syntax.md
+++ b/docs/jdl_syntax.md
+# JDL syntax reference
+## Different split options
+A job is split into subjobs according to the strategy defined in the JDL. The files provided by
+InputData or InputCollection are distributed across subjobs that all run the same executable.
+Each strategy has its own optional and mandatory fields.
+### Split
+``` 
+    The job is split only if this field is defined.
+    usage: Split="[strategy]"
+```
+---
+### SplitArguments
+``` 
+    Redundant field: for split jobs, its value is added to the job's Arguments.
+    usage: SplitArguments="[arguments for executable]"
+```
+---
+### **Split strategy options**
+---
+### production
+``` 
+    Duplicates the job a number of times equal to the defined End-Start.
+    #alien_counter# starts counting at the provided Start.
+    usage: Split="production:[Start]-[End]"
+```
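+A minimal sketch of a production split, assuming the usual JDL conventions of `field = "value";`
+statements and `#` comment lines. The `--seed` argument is hypothetical and stands in for whatever
+the executable accepts; only the split-related fields are shown.
+```
+    # Hypothetical fragment: duplicate the job over the range 1-10
+    Split = "production:1-10";
+    # #alien_counter# is replaced per subjob, starting at Start (here 1)
+    Arguments = "--seed #alien_counter#";
+```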
+---
+### file
+``` 
+    Divides inputdata files based on the full LFN path, resulting in one file per subjob as LFNs are unique.
+    usage: Split="file"
+```
+---
+### directory
+``` 
+    Divides inputdata files based on the lowest directory in the LFN path.
+    Example: /alice/cern.ch/user/a/alice/LHC22f3.xml --> /alice/cern.ch/user/a/alice
+    usage: Split="directory"
+    optional:
+            SplitMaxInputFileNumber
+            SplitMaxInputFileSize
+```
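+As an illustration, a directory split with an optional cap on files per subjob could look like
+the fragment below. The value 10 is arbitrary, and the `field = "value";` statement form and `#`
+comment lines are assumed from common JDL usage; fields unrelated to splitting are omitted.
+```
+    # Hypothetical fragment: group input files by their lowest directory
+    Split = "directory";
+    # Optional: at most 10 input files per subjob (arbitrary example value)
+    SplitMaxInputFileNumber = "10";
+```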
+---
+### parentdirectory
+``` 
+    Divides inputdata files based on the parent of the lowest directory in the LFN path.
+    Example: /alice/cern.ch/user/a/alice/LHC22f3.xml --> /alice/cern.ch/user/a
+    usage: Split="parentdirectory"
+    optional:
+            SplitMaxInputFileNumber
+            SplitMaxInputFileSize
+```
+---
+### se
+``` 
+    Divides inputdata files based on the Storage Elements the files are stored on.
+    usage: Split="se"
+    mandatory:
+            SplitMaxInputFileNumber
+    optional:
+            SplitMinInputFileNumber
+```
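+A sketch of a storage element split with its mandatory and optional limits. The numbers are
+arbitrary example values, and the statement form and `#` comment lines are assumed from common
+JDL usage; fields unrelated to splitting are omitted.
+```
+    # Hypothetical fragment: group input files by Storage Element
+    Split = "se";
+    # Mandatory for the se strategy: upper limit of input files per subjob
+    SplitMaxInputFileNumber = "50";
+    # Optional: subjobs with fewer input files than this are merged
+    SplitMinInputFileNumber = "5";
+```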
+---
+### af (under development)
+``` 
+    Analysis Facility split, meant for cases where all files share a Storage Element; jobs are forced to run on that site.
+    usage: Split="af"
+    mandatory:
+            SplitMaxInputFileNumber/SplitMaxInputFileSize
+    optional:
+            ForceOnlySEInput
+            MaxInputMissingThreshold
+```
+---
+### SplitMaxInputFileNumber
+``` 
+    Sets the maximum number of inputdata files per subjob.
+    usage: SplitMaxInputFileNumber="[number]"
+```
+---
+### SplitMaxInputFileSize
+``` 
+    Sets the maximum combined size of inputdata files per subjob.
+    usage: SplitMaxInputFileSize="[number]"
+```
+---
+### SplitMinInputFileNumber
+``` 
+    Sets the minimum number of inputdata files per subjob. Used by the storage element split
+    to merge subjobs with fewer inputdata files than the limit.
+    usage: SplitMinInputFileNumber="[number]"
+```
+---
+### ForceOnlySEInput (under development)
+``` 
+    Used by the Analysis Facility split to force the job to use only inputdata files located on the site given in the Requirements of the JDL.
+    Other files are ignored for the job. Has a default threshold of missing files before it fails.
+    usage: ForceOnlySEInput="[true/false]"
+```
+---
+### MaxInputMissingThreshold (under development)
+``` 
+    Sets the percentage of missing files allowed before an af split fails.
+    usage: MaxInputMissingThreshold="[percentage]"
+```
+---
+### **#alien# pattern**
+In the final JDL, this pattern is replaced by a value based on the subjob or on a counter.
+### counter
+``` 
+    An increasing subjob counter; the number of digits can be defined (see options).
+    usage: #alien_counter# --> 1, 2, 3...
+    options:
+           #alien_counter_[number of digits]i# --> #alien_counter_03i# = 001, 002, 003... 
+```
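+For illustration, the zero-padded counter could be used to give each subjob its own output
+directory. The output path is hypothetical, and the statement form and `#` comment lines are
+assumed from common JDL usage.
+```
+    # Hypothetical fragment: one output directory per subjob
+    Split = "production:1-20";
+    # Three-digit counter, so the directories would be named 001, 002, 003...
+    OutputDir = "/alice/cern.ch/user/a/alice/output/#alien_counter_03i#";
+```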
+---
+### file patterns
+``` 
+    Replaces this pattern with a value based on either the first or the last of the inputdata files in the subjob.
+    If neither first nor last is given, first is the default.
+    usage: #alien[first/last][option]# 
+    options:
+           dir --> /alice/cern.ch/user/a/alice/LHC22f3.xml = alice
+           fulldir --> /alice/cern.ch/user/a/alice/LHC22f3.xml = /alice/cern.ch/user/a/alice/LHC22f3.xml
+           filename/[pattern to be replaced]/[new value]/ --> filename/.xml/.new/ --> /alice/cern.ch/user/a/alice/LHC22f3.xml = LHC22f3.new
+    example:
+            #alienlastdir#
+            #alienfilename/.root//#
+```
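+A sketch of how a file pattern might be combined with a file split. The `--tag` argument and the
+second input file are hypothetical; the statement form, the `LF:` prefix, and `#` comment lines
+are assumed from common JDL usage.
+```
+    # Hypothetical fragment: one subjob per input file
+    InputData = {"LF:/alice/cern.ch/user/a/alice/LHC22f3.xml",
+                 "LF:/alice/cern.ch/user/a/alice/LHC22f4.xml"};
+    Split = "file";
+    # Replaces ".xml" with nothing; following the filename rule above,
+    # the first subjob would presumably receive "--tag LHC22f3"
+    Arguments = "--tag #alienfilename/.xml//#";
+```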
+---
+### OrderBy
+``` 
+    Orders inputdata files in the JDL based on a given strategy (usually alphabetical by default).
+    usage: OrderBy = "options"
+    options:
+           random --> Shuffle all files randomly
+           size --> Order by size, matching largest with smallest and so forth
+           epn --> Order by epn
+           tf --> Order by timeframes
+           alphabetical --> order by name
+```
+---
\ No newline at end of file