
JDL Splitting options

Merged Haakon Andre Reme-Ness requested to merge hremenes/jalien-docs:master into master
1 file
+ 229
0
# JDL syntax reference
## Different split options
A job is split into smaller subjobs according to the strategy defined in the JDL. The files
provided by InputData or InputCollection are distributed across subjobs that all run the same executable.
Each strategy has its own optional or mandatory fields.
### Split
```
Will only split the job if this field is defined.
usage: Split="[strategy]"
```
---
### SplitArguments
```
Redundant field; for split jobs its value is added to the Arguments of the job.
usage: SplitArguments="[arguments for executable]"
```
---
### **Split strategies options**
---
### production
```
Duplicates the job a number of times equal to End-Start.
The #alien_counter# counter begins at the Start value provided.
usage: Split="production:[Start]-[End]"
```
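
As a rough illustration of how the strategy string might be read, the Python sketch below parses `production:[Start]-[End]` into its two bounds. The function name `parse_production` and the check that End is not smaller than Start are assumptions of this sketch, not part of JAliEn; how the bounds translate into subjobs is as described above.

```python
import re

def parse_production(split):
    """Return (start, end) from a 'production:[Start]-[End]' strategy string."""
    match = re.fullmatch(r"production:(\d+)-(\d+)", split.strip())
    if match is None:
        raise ValueError(f"not a production split: {split!r}")
    start, end = int(match.group(1)), int(match.group(2))
    # Assumption of this sketch: a range running backwards is rejected.
    if end < start:
        raise ValueError("End must not be smaller than Start")
    return start, end

# Hypothetical usage:
bounds = parse_production("production:1-10")  # (1, 10)
```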
---
### file
```
Divides inputdata files based on the full LFN path, resulting in one file per subjob as LFNs are unique.
usage: Split="file"
```
---
### directory
```
Divides inputdata files based on the lowest directory in the LFN path.
Example: /alice/cern.ch/user/a/alice/LHC22f3.xml --> /alice/cern.ch/user/a/alice
usage: Split="directory"
optional:
SplitMaxInputFileNumber
SplitMaxInputFileSize
```
---
### parentdirectory
```
Divides inputdata files based on the parent of the lowest directory in the LFN path.
Example: /alice/cern.ch/user/a/alice/LHC22f3.xml --> /alice/cern.ch/user/a
usage: Split="parentdirectory"
optional:
SplitMaxInputFileNumber
SplitMaxInputFileSize
```
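
The grouping performed by the directory and parentdirectory strategies above can be sketched in Python as follows. This is only an illustration of the path logic shown in the examples; `group_by_directory` and its `levels_up` parameter are hypothetical names, and the second LFN in the usage lines is made up.

```python
import posixpath
from collections import defaultdict

def group_by_directory(lfns, levels_up=1):
    """Group LFNs by an ancestor directory of each file.

    levels_up=1 mimics Split="directory",
    levels_up=2 mimics Split="parentdirectory".
    """
    groups = defaultdict(list)
    for lfn in lfns:
        key = lfn
        for _ in range(levels_up):
            key = posixpath.dirname(key)  # strip one path component
        groups[key].append(lfn)
    return dict(groups)

lfns = [
    "/alice/cern.ch/user/a/alice/LHC22f3.xml",
    "/alice/cern.ch/user/a/another/LHC22f3.xml",  # made-up path
]
by_dir = group_by_directory(lfns)                  # two groups, one per directory
by_parent = group_by_directory(lfns, levels_up=2)  # one group: /alice/cern.ch/user/a
```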
---
### se
```
Divides inputdata files based on which Storage Elements files are stored on.
usage: Split="se"
mandatory:
SplitMaxInputFileNumber
optional:
SplitMinInputFileNumber
```
---
### af (under development)
```
Analysis Facility split, meant for cases where all files share a Storage Element; forces jobs to run on that site.
usage: Split="af"
mandatory:
SplitMaxInputFileNumber/SplitMaxInputFileSize
optional:
ForceOnlySEInput
MaxInputMissingThreshold
```
---
### SplitMaxInputFileNumber
```
Sets a maximum limit for number of inputdata files per subjob
usage: SplitMaxInputFileNumber="[number]"
```
---
### SplitMaxInputFileSize
```
Sets a maximum limit for combined size of inputdata files per subjob
usage: SplitMaxInputFileSize="[number]"
```
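
To make the effect of SplitMaxInputFileNumber and SplitMaxInputFileSize concrete, here is a minimal Python sketch that cuts one group of files into subjob-sized chunks. It assumes sizes are plain numbers in a single unit, that files are taken in their given order, and that a file exceeding the size limit on its own still gets its own chunk; none of these details are confirmed by the text above, and `chunk_files` is a hypothetical name.

```python
def chunk_files(files, max_number=None, max_size=None):
    """Split (lfn, size) pairs into consecutive chunks within both limits."""
    chunks, current, current_size = [], [], 0
    for lfn, size in files:
        over_count = max_number is not None and len(current) + 1 > max_number
        over_size = max_size is not None and current_size + size > max_size
        # Close the current chunk only if it already holds at least one file.
        if current and (over_count or over_size):
            chunks.append(current)
            current, current_size = [], 0
        current.append(lfn)
        current_size += size
    if current:
        chunks.append(current)
    return chunks

files = [("a", 4), ("b", 4), ("c", 4), ("d", 10), ("e", 1)]  # made-up data
by_count = chunk_files(files, max_number=2)  # [['a', 'b'], ['c', 'd'], ['e']]
by_size = chunk_files(files, max_size=8)     # [['a', 'b'], ['c'], ['d'], ['e']]
```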
---
### SplitMinInputFileNumber
```
Sets a minimum limit for the number of inputdata files per subjob, used by the storage element split
to merge subjobs with fewer inputdata files than the limit
usage: SplitMinInputFileNumber="[number]"
```
---
### ForceOnlySEInput (under development)
```
Used by Analysis Facility to force only inputdata files located on the site provided in the Requirements of the JDL to be used.
Other files are ignored for the job. Has a default threshold of missing files before it fails.
usage: ForceOnlySEInput="[true/false]"
```
---
### MaxInputMissingThreshold (under development)
```
Sets a percentage value of missing files before an af split fails
usage: MaxInputMissingThreshold="[percentage]"
```
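
One possible reading of this threshold is sketched below in Python. Since the feature is under development, the details are assumptions: the function name `af_split_may_proceed` is hypothetical, and whether a value exactly at the threshold still passes is a guess.

```python
def af_split_may_proceed(total_files, files_on_site, max_missing_percent):
    """Return True while the share of missing files stays within the threshold."""
    if total_files <= 0:
        raise ValueError("total_files must be positive")
    missing_percent = 100.0 * (total_files - files_on_site) / total_files
    # Assumption: reaching the threshold exactly is still accepted.
    return missing_percent <= max_missing_percent

ok = af_split_may_proceed(100, 95, 10)        # 5% missing
too_many = af_split_may_proceed(100, 80, 10)  # 20% missing
```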
---
### **#alien# pattern**
In the final JDL, this pattern is replaced by a value based on the subjob or on a counter.
### counter
```
An increasing subjob counter; the number of digits can optionally be defined.
usage: #alien_counter# --> 1,2,3....
options:
#alien_counter_[number of digits]i# --> #alien_counter_03i# = 001, 002, 003...
```
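
The counter substitution described above can be imitated with a short Python sketch. It only reproduces the two forms listed (plain and zero-padded); `expand_counter` is a hypothetical helper, not the JAliEn implementation.

```python
import re

_COUNTER = re.compile(r"#alien_counter(?:_(\d+)i)?#")

def expand_counter(text, counter):
    """Replace #alien_counter# and #alien_counter_NNi# with the counter value."""
    def substitute(match):
        width = match.group(1)
        if width is None:
            return str(counter)
        # "03" means: pad with zeros to three digits.
        return str(counter).zfill(int(width))
    return _COUNTER.sub(substitute, text)

plain = expand_counter("out_#alien_counter#.root", 2)       # 'out_2.root'
padded = expand_counter("out_#alien_counter_03i#.root", 2)  # 'out_002.root'
```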
---
### file patterns
```
Replace this pattern with a value based on either the first or last of the inputdata files in the subjob.
Default if not provided is first.
usage: #alien[first/last][option]#
options:
dir --> /alice/cern.ch/user/a/alice/LHC22f3.xml = alice
fulldir --> /alice/cern.ch/user/a/alice/LHC22f3.xml = /alice/cern.ch/user/a/alice/LHC22f3.xml
filename/[pattern to be replaced]/[new value] --> filename/.xml/.new/ --> /alice/cern.ch/user/a/alice/LHC22f3.xml = LHC22f3.new
example:
#alienlastdir#
#alienfilename/.root//#
```
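
The Python sketch below mirrors the three options listed above on a list of input LFNs. It is an illustration only: `expand_file_patterns` is a hypothetical name, and the sketch assumes the pattern to be replaced is a literal string (not a regular expression) and that every occurrence of it is replaced.

```python
import posixpath
import re

_PATTERN = re.compile(
    r"#alien(first|last)?(fulldir|dir|filename)(?:/([^/#]*)/([^/#]*)/)?#"
)

def expand_file_patterns(text, lfns):
    """Expand #alien[first/last][option]# using the subjob's input LFNs."""
    def substitute(match):
        which, option, old, new = match.groups()
        lfn = lfns[-1] if which == "last" else lfns[0]  # default is first
        if option == "fulldir":
            return lfn
        if option == "dir":
            return posixpath.basename(posixpath.dirname(lfn))
        name = posixpath.basename(lfn)
        # Assumption: literal replacement of every occurrence.
        return name.replace(old, new) if old else name
    return _PATTERN.sub(substitute, text)

lfns = ["/alice/cern.ch/user/a/alice/LHC22f3.xml"]
d = expand_file_patterns("#aliendir#", lfns)                  # 'alice'
f = expand_file_patterns("#alienfilename/.xml/.new/#", lfns)  # 'LHC22f3.new'
```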
---
### OrderBy
```
Order inputdata files in the JDL based on a given strategy (usually alphabetical by default)
usage: OrderBy = "options"
options:
random --> Shuffle all files randomly
size --> Order by size, matching largest with smallest and so forth
epn --> Order by epn
tf --> Order by timeframes
alphabetical --> Order by name
```
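
The size option is described only briefly, so the Python sketch below shows one plausible reading of "matching largest with smallest": the largest file is followed by the smallest, then the second largest by the second smallest, and so on. The function name `order_by_size` and this exact interleaving are assumptions.

```python
def order_by_size(files):
    """Order (lfn, size) pairs as largest, smallest, second largest, ..."""
    ranked = sorted(files, key=lambda item: item[1], reverse=True)
    ordered = []
    low, high = 0, len(ranked) - 1
    while low <= high:
        ordered.append(ranked[low])       # next largest
        if low != high:
            ordered.append(ranked[high])  # paired with next smallest
        low += 1
        high -= 1
    return [lfn for lfn, _ in ordered]

files = [("a", 1), ("b", 5), ("c", 3), ("d", 9)]  # made-up sizes
order = order_by_size(files)  # ['d', 'a', 'b', 'c']
```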
---