!30
Nomad nodes
Merged
Ghost User requested to merge nomad_nodes into master 5 years ago
docs/nomad/access.md
0 → 100644
# Nomad
Nomad is a workload orchestrator that we use for automating tasks.
## Access
### UI
To access Nomad's UI, point your browser to:
https://lxsoftadm.cern.ch:4646
And get the admin ACL token:
```
tbag show nomad_submit_token --hg lxsoft/adm
```
### CLI
TODO
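
A minimal sketch of CLI access, not a tested procedure. It assumes the Nomad HTTP API is served at the same address as the UI and that the token retrieved with `tbag` above is accepted by the CLI. `NOMAD_ADDR` and `NOMAD_TOKEN` are the standard environment variables read by the `nomad` binary; the helper name `nomad_env` is hypothetical.

```shell
# Hypothetical helper: export the settings the nomad CLI reads from the environment.
# $1 is the ACL token, e.g. the value obtained with tbag in the UI section.
nomad_env() {
  # Assumption: the API listens on the same host and port as the UI.
  export NOMAD_ADDR="https://lxsoftadm.cern.ch:4646"
  export NOMAD_TOKEN="$1"
}

# Example usage (needs the nomad binary and network access to the cluster):
#   nomad_env "<token>"
#   nomad node status
```

Passing the token as an argument keeps it out of the script itself; prefer pasting it interactively over storing it in a file.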
### Monitoring
There's some basic monitoring available here:
https://kojimon.web.cern.ch/d/_ffC7H-ik/nomad?refresh=10s&orgId=1&from=now-7d&to=now
## Cronjobs
All cronjobs' definitions can be found here:
https://gitlab.cern.ch/linuxsupport/cronjobs
## Troubleshooting
If jobs are not starting due to placement failures (no resources on any node),
the cluster might have gotten itself into a broken state. I've also seen problems
with `docker pull` after S3 failures. Here are some things you can do to try to recover:
!!! danger "Important Note"
    Do this only to one server at a time, keeping the others running. Otherwise
    you risk losing the entire cluster, which would be a real PITA to recover.
1. Restart Nomad: `service nomad restart`
2. Stop Nomad, restart the Docker daemon, then start Nomad again
3. Stop Nomad, delete the client DB, restart Nomad:
   `service nomad stop && rm -f /var/lib/nomad/client/state.db && service nomad start`
   (Do this especially if Nomad crashes on startup. There have been some bugs in
   the past that prevented Nomad from restarting cleanly.)
4. Reboot the node
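
The escalation steps above can be sketched as a small helper that prints the command for a given step, and only executes it when asked. This is an illustration under assumptions, not part of the existing tooling: the function name `nomad_recover` is hypothetical, `service docker restart` assumes the Docker daemon is managed as a service named `docker`, and `reboot` stands in for whatever reboot procedure the node normally follows.

```shell
# Hypothetical helper: print (default) or run one recovery step on the local node.
# Usage: nomad_recover STEP [--run]
nomad_recover() {
  step="$1"
  mode="${2:-}"
  case "$step" in
    1) cmd="service nomad restart" ;;
    # Assumption: the Docker daemon is managed as the "docker" service.
    2) cmd="service nomad stop && service docker restart && service nomad start" ;;
    3) cmd="service nomad stop && rm -f /var/lib/nomad/client/state.db && service nomad start" ;;
    # Assumption: a plain reboot is acceptable for this node.
    4) cmd="reboot" ;;
    *) echo "unknown step: $step" >&2; return 1 ;;
  esac
  if [ "$mode" = "--run" ]; then
    sh -c "$cmd"      # execute the step; requires root on the node
  else
    echo "$cmd"       # dry run: only show what would be executed
  fi
}
```

Calling `nomad_recover 3` only prints the command, so it can be reviewed first; `nomad_recover 3 --run` executes it. As noted above, run it on one server at a time.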