Commit a28887e6 authored by Alex Iribarren's avatar Alex Iribarren
Merge branch 'nomad_nodes' into 'master'

Nomad nodes

See merge request !30
parents 5612df1d 2a4d53fc
# Nomad
Nomad is a workload orchestrator that we use for automating tasks.
## Access
### UI
To access Nomad's UI, point your browser to:
https://lxsoftadm.cern.ch:4646
And get the admin ACL token:
```
tbag show nomad_submit_token --hg lxsoft/adm
```
### CLI
TODO
### Monitoring
There's some basic monitoring available here:
https://kojimon.web.cern.ch/d/_ffC7H-ik/nomad?refresh=10s&orgId=1&from=now-7d&to=now
## Cronjobs
All cronjobs' definitions can be found here:
https://gitlab.cern.ch/linuxsupport/cronjobs
## Troubleshooting
If jobs are not starting due to placement failures (no resources on any node),
the cluster might have gotten itself into a broken state. I've also seen problems
with `docker pull` after S3 failures. Here are some things you can do to try to recover:
!!! danger "Important Note"
    Do this on only one server at a time, keeping the others running. Otherwise
    you risk losing the entire cluster, which would be a real PITA to recover.
1. Restart Nomad: `service nomad restart`
2. Stop Nomad, restart the Docker daemon, restart Nomad
3. Stop Nomad, delete the client DB, restart Nomad:
   `service nomad stop && rm -f /var/lib/nomad/client/state.db && service nomad start`
   (Do this especially if Nomad crashes on startup. There have been some bugs in
   the past that prevented Nomad from restarting cleanly.)
4. Reboot the node
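
The escalation above can be sketched as a small shell helper. This is only an illustration: the function names `run` and `recover_step` are made up for this example, the `DRY_RUN` switch is an assumption added for safety, and step 2 assumes the Docker daemon is managed as a service called `docker`. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the escalating recovery steps, meant for ONE node at a time.
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# recover_step N: perform step N (1-4) from the list above.
recover_step() {
    case "$1" in
        1) run service nomad restart ;;
        2) run service nomad stop &&
           run service docker restart &&   # assumption: service is named "docker"
           run service nomad start ;;
        3) run service nomad stop &&
           run rm -f /var/lib/nomad/client/state.db &&
           run service nomad start ;;
        4) run reboot ;;
        *) echo "unknown step: $1" >&2
           return 1 ;;
    esac
}
```

After sourcing the file, `recover_step 3` prints the three commands of step 3. Setting `DRY_RUN=0` makes the helper actually execute them, so try the steps in order and stop at the first one that fixes the node.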
# Mirroring a new repo
A recurring request from users is to mirror an external repo,
e.g. [RQF1444976](https://cern.service-now.com/service-portal/view-request.do?n=RQF1444976).
New mirrors are added in the [reposync](https://gitlab.cern.ch/linuxsupport/cronjobs/reposync/) GitLab repo.
Ideally, as a first step, add the repo definition to the `dev.repos.d` folder, e.g. [tpc.repo](https://gitlab.cern.ch/linuxsupport/cronjobs/reposync/blob/57ed4f6f1f1727ecfcfdf94bc1c02989503b1014/prod.repos.d/tpc.repo).
If we want to modify the path (e.g. we only have an IP from upstream), we can edit the `dev.repos.yaml` file to define what we want to remove from the upstream URL (pathcut) and how we want the path to be created in our mirror (pathroot). E.g. [this commit](https://gitlab.cern.ch/linuxsupport/cronjobs/reposync/commit/78659add3b5f851cf3d6fe523318961a0e93cb91#46921bca268c7e0fcd91e1636e7426a2e311b009).
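
To make the idea of pathcut and pathroot concrete, here is a minimal sketch of the transformation as plain string handling. It is not reposync's actual implementation: the helper `mirror_path`, the example IP, and the `example-vendor` root are all hypothetical, and it assumes pathcut is a literal prefix of the upstream URL. Check the linked commit for the real `dev.repos.yaml` syntax.

```shell
#!/bin/sh
# mirror_path URL PATHCUT PATHROOT
# Hypothetical helper: drop the PATHCUT prefix from URL, then prepend PATHROOT.
mirror_path() {
    url="$1"
    pathcut="$2"
    pathroot="$3"
    rest="${url#"$pathcut"}"   # remove the pathcut prefix, if present
    rest="${rest#/}"           # avoid a doubled slash when joining
    printf '%s/%s\n' "${pathroot%/}" "$rest"
}

# Upstream only offers an IP, so we cut it and use a readable root instead.
mirror_path "http://192.0.2.10/el7/x86_64/" "http://192.0.2.10" "example-vendor"
# prints: example-vendor/el7/x86_64/
```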
Once we have tested that the repository gets mirrored as we wish, we can move the introduced changes:
`dev.repos.yaml` -> `prod.repos.yaml`
`dev.repos.d/my-new-mirror.repo` -> `prod.repos.d/my-new-mirror.repo`
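
The `.repo` file part of that move can be sketched as follows, assuming the command is run from the root of a reposync checkout. The function name `promote_mirror` is invented for this example. The matching entries in `dev.repos.yaml` still have to be moved to `prod.repos.yaml` by hand, since this sketch does not try to edit YAML.

```shell
#!/bin/sh
# promote_mirror NAME: move dev.repos.d/NAME.repo to prod.repos.d/NAME.repo.
# Refuses to run if the source is missing or the destination already exists.
promote_mirror() {
    name="$1"
    src="dev.repos.d/$name.repo"
    dst="prod.repos.d/$name.repo"
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "$dst already exists" >&2
        return 1
    fi
    mv "$src" "$dst"
}
```

For example, `promote_mirror my-new-mirror` moves `dev.repos.d/my-new-mirror.repo` into `prod.repos.d/`; review the result with `git status` before committing.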
......@@ -42,6 +42,9 @@ nav:
- 'Adding users to koji': koji/addingusers.md
- 'Untagging policy': koji/untagging.md
- 'Creating a tag': koji/addnewtag.md
- 'Nomad':
- 'Access': nomad/access.md
- 'Mirroring new repos': nomad/mirroring.md
- 'AIMS2':
- 'AIMS2 ': aims2/aims2.md
- 'AIMS2 Architecture': aims2/aims2architecture.md
......