diff --git a/docs/aims2/aims2.md b/docs/aims2/aims2.md index 0c5f6af6ef73476689878e35d3de51fc5f706a41..a52f31f2d766c134386b7595100b5b8587910f5f 100644 --- a/docs/aims2/aims2.md +++ b/docs/aims2/aims2.md @@ -6,108 +6,11 @@ AIMS2 allows the user to perform remote PXE (Preboot eXecution Environment) inst The previous incarnation of AIMS failed to meet the new modern requirements of IT/FIO and other Linux users at CERN and has prompted a rethink of what Linux Support is providing as a remote installation service. AIMS2 tries to provide viable and effective solutions to some of the short comings of previous versions of AIMS, as expressed by its users and maintainers, whilst trying to minimise the changes to existing individual or group workflows. Some of the new features of AIMS2 include arbitrary PXE boot media management, removal of AFS dependencies, better host/device install tracking, traceability, device authorisation, Kerberos authentication and more. -## Related resources - -### Databases - -**Production** - -* Databases: `dbod-aims.cern.ch:6601` -* User: `aims` -* Retrieve credentials with: `tbag show --hg "aims" aims_psqldb_pwd` -* Owner: `aims-admins` - -**Test** - -* Databases: `dbod-aimstest.cern.ch:6616` -* User: `aims` -* Retrieve credentials with: `tbag show --hg "aims" aims_psqldbtest_pwd` -* Owner: `aims-admins` - -**Please be aware when connecting directly to the database. It is case sensitive and might have issues with passwords that contain `@` character.** - -If you change the Database password, please do the following: - -* Update the service account password and update the Teigi stored value: `tbag set --hg aims aims_psqldb_pwd --binary <new_password>` or `tbag set --hg aims aims_psqldbtest_pwd --binary <new_password>` -* Force a Puppet run on each of the AIMS hosts - -Please check [the account management for Oracle accounts](https://cern.service-now.com/service-portal?id=kb_article&n=KB0000829) for further information. 
- -### Storage - -* CephFS shares: - * `aims.cern.ch` - * `AIMS Service` tenant - * `aims_share` CephFS share - * `aims_id` share-id - * `flax`/`Meyrin CephFS` cluster - * `remote_path`: `'/volumes/_nogroup/cebaadff-7d3d-4a1f-88d0-6eb773325901'` - * `access_key`: `tbag show --hg aims flax.aims_id.secret` - * Size 500GB - * [Metrics](https://filer-carbon.cern.ch/grafana/d/000000111/cephfs-detail?from=now-90d&orgId=1&refresh=1m&to=now&viewPanel=15&var-cluster=flax&var-share=cebaadff-7d3d-4a1f-88d0-6eb773325901) - * `aimstest.cern.ch` - * `AIMS Service` tenant - * `aims_share_test` CephFS share - * `aims_id` share-id - * `dwight`/`Geneva CephFS Testing` cluster - * `remote_path`: `'/volumes/_nogroup/4272339e-0efb-4d0f-b17d-43d4d6ec600b'` - * `access_key`: `tbag show --hg aims dwight.aims_id.secret` - * Size 200GB - * [Metrics](https://filer-carbon.cern.ch/grafana/d/000000111/cephfs-detail?from=now-90d&orgId=1&refresh=1m&to=now&viewPanel=15&var-cluster=dwight&var-share=4272339e-0efb-4d0f-b17d-43d4d6ec600b) - -!!! note "" - `aims.cern.ch`, i.e. production, has backups configured by our CephFS colleagues. See [RQF1823187](https://cern.service-now.com/service-portal?id=ticket&table=u_request_fulfillment&n=RQF1823187). If you need any assistance, contact them or open a Service Now ticket. - - - -### Service accounts - -* User: `aims` -* Retrieve credentials with: `tbag show --hg "aims" aims_password` -* Owner: Alex Iribarren -* Purpose: connecting to LANDB and LDAP - -**Password changing procedure:** - -* Annouce with an OTG that the service will be degraded for 5-10min. No new issued installations will work. 
-* Update the service account password and update the Teigi stored value: `tbag set --hg aims aims_password --binary <new_password>` -* Force a Puppet run on each of the AIMS hosts - ---- - -* User: `linux_private` -* Retrieve credentials with: `tbag show espassword --hg aims` -* Owner: Ulrich Schwickerath -* Purpose: Sending logs to our ES instance - -### Egroups - -* `aims2-upload`: Users with permissions to upload images to AIMS servers -* `aims-admins`: AIMS administrators -* `aims2-cc-admins`: Administrators for hosts on buildings 513, 613, 9994 (SafeHost), 9918 (Wigner), 773 (Network Hub) or 6045 (LHCb containers) - -**Note these egroups can only be replaced for AIMS if updating `CONF` table on the database. Keys correspond to `EGROUP_UPLOAD`, `EGROUP_AIMSSUPPORT` and `EGROUP_SYSADMINS`.** - -### LB Alias - -* `aims.cern.ch`: used for prod instances. -* `aimstest.cern.ch`: used for test instances. -* `aimsdev.cern.ch`: used for dev instances and debugging. Might be empty at any given point. It helps isolating specific nodes. - -### Related GitLab projects - -* AIMS2 hostgroup: <https://gitlab.cern.ch/ai/it-puppet-hostgroup-aims> - * As of July 2021, these are the existing nodes for the hostgroup: - * **prod**: `aims01`, `aims02`, `aims03` - * **test**: `aimstest01`, `aimstest02`, `aimstest03` -* AIMS2 applications: <https://gitlab.cern.ch/linuxsupport/rpms/aims2> -* Other AIMS2 related components: <https://gitlab.cern.ch/linuxsupport/aims> - ## Documentation -### How-to +### AIMS2 Resources -If you would like to know the quickest/simplest way to enable your hosts/devices for installation, please take a look at the [How-to](./aims2how2.md) guide. It will guide you step-by-step through the installation procedure, using clear and simple examples. +See [our resources page](./resources.md) to learn which resources this service uses, how to get access to them, and whom to contact about them. 
### AIMS2 Client @@ -117,7 +20,8 @@ For a more comprehensive guide to the features available with aims2 please refer For more information on how an `aims2` server is configured and how the service works please refer to the [aims2server](./aims2server.md) documentation. This documentation is mainly intended for those within Linux Support. -### Original AIMS2 presentation +### Ancient references Although this is outdated, on Wednesday 1st October 2008, a presentation titled ["Introduction to AIMS2"](https://twiki.cern.ch/twiki/pub/LinuxSupport/Aims2/aims2.ppt) was presented. +If you still need older references, see <https://twiki.cern.ch/twiki/bin/view/LinuxSupport/Aims2how2>. That page is most likely no longer relevant, and our current docs should be enough to understand how the service works. diff --git a/docs/aims2/aims2diagrams.md b/docs/aims2/aims2diagrams.md index 01254804bc2fdb805bed8aa29401d8a5238a74f6..9ea12f19540211cb4d1f81ded821956541fecb29 100644 --- a/docs/aims2/aims2diagrams.md +++ b/docs/aims2/aims2diagrams.md @@ -12,7 +12,7 @@ For registered vs unregistered use cases. Be aware that items below have links embedded for related documentation. Check them for further details. You may not be able to see the embedded content in GitLab's Markdown representation. -If you feel like editing the diagram yourself, it is attached in the `assets/drawio/` directory of this documentation's Git repo. +If you feel like editing the diagram yourself, it is attached in the `assets/drawio/` directory of this documentation's Git repo. You can load this diagram on CERNBox and edit it with DrawIO. 
<div class="mxgraph" style="max-width:100%;border:1px solid transparent;" data-mxgraph="{"highlight":"#0000ff","nav":true,"resize":true,"toolbar":"zoom layers lightbox","edit":"_blank","xml":"<mxfile host=\"cernbox.cern.ch\" modified=\"2021-09-01T12:26:36.668Z\" agent=\"5.0 (X11)\" version=\"14.4.7\" etag=\"qZiRBiWGgZZmoBe-btXE\" type=\"embed\"><diagram id=\"e8V_kJZfMdfp9TGMViHE\" name=\"Page-1\">7V1bc5vIEv41qjrnQSruoEdLtjapk2xcSbY2eUohGEmsESiAYnt//Zkrl5kRAmkkbO9maxMxghFM377u6W5G5nz79Fvm7zYf0xDEI0MLn0bm7cgwDM0x4D9o5JmM6JrrkpF1FoV0rBr4Ev0N2Il0dB+FIG+cWKRpXES75mCQJgkIisaYn2XpY/O0VRo3f3Xnr4Ew8CXwY3H0zygsNmTUs7Vq/B2I1hv2y7pGv9n67GQ6kG/8MH2sDZl3I3OepWlBPm2f5iBGq8fWhVy3OPBteWMZSIpOF2g2ueSXH+/p09E7K57Z42bpPgkBukIbmbPHTVSALzs/QN8+QgrDsU2xjeGRDj+u0qSgFNMNdBzF8TyN0wwOJGkCx2ehn2/wdOj8vMjSB8DOGBnmneY4jlF+wxbYgiP0VkFWgKeDD6yXywgZEKRbUGTP8BR2gUtXnvHelB4/1gjJztnUiGgykvmUedbl3NX6wg90ieXLbXuKVzv2lyC+T/OoiNIEjgVwGQBcyBlapAhy7AfuhCJFM/hxtJaefkO/WKZFkW458kHihD7wVoGMbE7ggeVKEZF0s0klS7cFKllTCZUMVwWV3H+p1IlKOkclS5QlZVT6IwfZp+VfSJ0bGl5QRhknLpBSiX7Bj2v08X2WJlGA1W2wiRKA5iUnwTlr55F7iKPkgUy1KQpkQG7QLRiLIE73YZgG+eQRLCcByJJJsIHjS7hAP+Dd+pBlFp/vbm4/3k0wrcvZvvrZGtBH/bGMfTg//o7xl1FbVsp6r4WzVl4AAilnLT3bsjU1nGV5TcYqzWWdsQwJY0378xU8rLFWK6st0gxs/YRnmxUZrrFId1awWlkBooMd+hg8w7lCkJnHGWJJuOfDshzwg4c15qlP+wJOA5jdpTbaHoKLVitDzkWhs3RsR5F+mja5yJSop9LQ1LlId5SwkWhXBLMyh4+SIUWizcFus/iCyA+yXxEkLm9wGC+EfuHnBWS5a9Ath3wWJWvEJzoH84bQC3oJwihJPV1CUk1GUgYgzsEFjkC/m/cfEc3+s4PLaGgZ2MVwGf/7L1iQkE7jpNFyuoKFE6RR4ueYAlVACJ06ephmxSZdp4kf31WjsybdqnM+pGjJMbX+AkXxTP0df1+kTVoi8X7+Rq/HB9/RwcRmh7dP9S9vn+lRXvhZcYM8VUTR2M/zKGDDiyg+5Gs1HCvyvOgh20kG1yTdZwE9izooRc1uuZqcshmI/SL61Zz+HCKJztHtu/n9edLEMftSAyZwZMyuAU/zPDXMbmsd1JQMv6hhdWEZhSVEjFN/duhyz+cSnUFUyyHFU2qUgypLUEo1yqQEjszLGI0mkmuO/0h1k+fpM4SH1pkfRqB6HBpjYMO3UQZnJ7eVIAkWAgs41ECN6/ZpjUJWkwQUj2n2kE92gRqOcJwmR1TqsM4Stsxyaep8pXdfv97f8vC16fWsowJeUMOy8MT9U77f7eDiwcNst83hP360zY3uMFevr06by6P3lmjHmRmLhSp8YTfJ5EiMlC4zUuZFHY/7LP0V5YiJoWYG2x3UuyBvpyMmIMWS4yR9nARQUo0FG0HUhLjTXGAKPSx/+EjEIT0Mx9+iBUfy
8r+ZBjUDFLQehDYvRujVCjgHnAZ3utQUQUyDU92mTE71i3GAqM0tQZvfI1oVlBv8ZcxY4VWvsyx2pGqdD4nV/6LgAUOqdlHCGjDd8XEgqgLJv0GMjA0+34yQrfOh0UnW48eo2IyfIb4aP4g/dlSY7AsK0zU8cItz1yztcjiokzCJHpwgOiAJReBdh/RPUfGt9rkG6OFRhefRAYPzp7oBCvC8Tv2eOqBv8FWNajWqyJAIG+uM++kv3KcREg3GFKZ3QMOyKcjt06sqegsTWdwmQRnfYRORZxYmguT1n2un7dAJecsN85sRZmNPDX4gM1ZcWa5pN8DeYbvhrXGl9eK4kldVfMCoM1fymyK8wXqZXHnIRt7cv4cD0PmL4b1oqwyhSM2Pxmi//QgEFVwJeFmQJqtoPS6AvyUD3DxHTaLXahIRH34gN1+XjqOxsW0UhiTiA/Lob4SpKK9TKsDJ7dnIvm3GXKZtVpLmHtC5RuUWRkMy3FbzqU0g/taVsLeuTzSjwS9jfj87Xa1yUHA8dIRrOhnd6fnqjVv0U7TdiYrKFhWVdSAedh1FZfNxVH4vtauicqzTFNXpugSKe5RADBzHmzQ/ArqJopBsvyYpSv0xFrvNM52s5+6rOX1BKkRvdac66ZD648h0iOk05d5UwoZj3WtOy4DmpdWJaQrq5HfIEnBkmaaF6ApfIcrP07MBkNDBvV9ApknwiKGdDpvKZLcaixNteoXoPJujtvCz95++0IVPsxDvgUGa73cCEfq5ql3Fo7sj6nLRhioj79jGoYqsL8cbgiuZkawM4/e6XTziElRewHc2X4tLwG9Encje57KyXFW5XFB3yltM5Ybu/tsdFYt2I4cjS5x9o0bNJwG+xe4JkHl6JJU01vHcuFFLOufIMIEe2sCVxZGmjmv6iuJIDod5prKg7BXjSCwTYFgzc6r/fqJwSpCwcQB7nCmwHr9ZVgaCFfvQnsn/kMWlSx+9QFcbC2LzC0ZWwm6vCaUSiWlBqcbUMBtLO1bj+DrSSc8BqaI2EPO6KDDaYmD64rGQeUUsJEny+ONu8R4Jzs3n+TvHehGa9YrR1O7auK55zXYaV17fmeIzFZjF4eKiB1SxqDqFmfhw/YHAxQBKnT6jVKlXl6tVImKUjCYXfnnOMU58cYpkykW6dW0qVml4Ej2icAuXrtFv2X5Jo4qnJrasyRQ9YHZDBhXC7EuRp0wvUk+eHrkst2AHkhAlFRsaSWHYoOBNw2LOy+GEhHYeI7y6+xwd+PD/MFqtAF4J4mfFqU/iDzuUKoOWHY6jxSAMctQJO7K9j5LBVjGqG+zFIO1+GCRt0eSCphNFM9kkBXTdYZ6M7Zrmkue83unH5W8p4FlBFRuyzEn7Ml6d6Q6BPUrcoPfDDXKN0derMyUbseYBrK48omiKWPD3lEj/yo9jVLmCbi3F8p4/SKnzmpwgs327bwy9IMvVGwIwtpXAuDEfrFew9yeQczoIdj8lzNlIr6c6tZ5bLwtplnI6RtuyDieslnFEXPHRPcgiuKCIKc+IxMhk9lyRPYD+ubib7XWLw5weOZVUc96mjwky70TIVtF6n2HzToKs0RbV6h8o7xTm8hN0ZS0VTyMj7KgZfEWKaJ9kYB3lBQYXbMZlxv/G0V9+3AAEdkLEbIicBOFk6CDKMdChYIVIe75LEwJ2kuctKvZCN9P1IWOIVAjAIiBKrkwJ+koQCEG/SKLM/ctkLwefpu3pj1fW/tW2xonaf2rJZbRK9phairQ9S7kot3wvk+rR5pQ12NhcDMVDZnuBbbZJt8t93t8/G2QbxOYyJx1JWbSsOEFNVbQI38RsWroTj9Q1Ump+nAE/fK4UK1bcknD2Wb7yILQwuT0KxxRjH9IgqooSLxaDGwZp9YmK9kJRmnfM6VGIolj0USWKOr91xecKbNDQR5SUUQyER0CCjQyDPwEUsILY+jPASYTiqVq1a11GUwjgQhAiimlcxg+RDNdKJ+h3l0EO9boKBsRQXYU/
DgEuZIL4LT2puMKpE/v1owvnQPZ0hS48y2poLDX7a2ODc1gd56pYo8El58R/m4WNizH8fxmnS3RnPpHIRZ4FPF925bZGzuibSMng088HLu3xjCFNIrN7nTYK4QFvwk7NuxquT4Al2ZTUD7BMPxXTd79Pt5jbxKrtXbU5GexRpX1bPmU+Lps92rfl0j18eBq/zc5QuuVyxJZ0hpP5QEIK0Sm42zIETmigqRrdnZ971MMRk2VM6AQNkabruyeCwej3DCrN7z7/fjuDRL0hWO8wnHrRO0i6IjILHTRcSQNANSkqh5DFPAZ+gnKZtdAH2zTpW+10HF8UGQACvuiMKFRW2rcjiou2WLBtdyJxnaVNFvr3cOoCHSyx/hJR6QfU3Jmoz1+ezLX6Dn0oY3Kq1ZFRRtobVZ3UMc2HNvk3GVg1lOnR+kKUP12Md1DoQDFGRUbIau7GiJxNTP8Tzr8IsIcN5S+JViAvmHj+IO5vPtntRD09K93jvKak/W4e71XuuLMCaa+D6qVALt+I7mBvMk5XzczZ3VyVriodGKarZE6OFG1cthElyZjU5vu8wPW4g6Q/2RdKfzL6kFupaeK7Dtuy5BOZ/lOTkHJwO4Ulxg5N8AsijhdCcPeqBBcbhIrNfqrsT82ZSJoYvoFVd2T554pWXVziDv3uxf5wdbTHtYpDe5IBRIrFxE+StMCpAz+MmnGLwaqQmMiMPBeJC4Ps7hcg4WEZnTI6LzweQwCtysfisjscCfdfuiMdzd8tgY9a/NRZs7XvBbxdyM9HcV1ZB3c1bnYPg1eauTDJoXP8Uy1TdAXVGEybH1h1T6c9pct1v7usDy4DOtdvc/h18fUenv4tSkARDkp0naWmdqL6kRyT1xKJcTpzwWXiMCz19eXnjw5bJu/Q7K5BejS7Ygj8u6QB5WvbPnfbZWWsTUxdP3PHXGmytTtIP3OVDHzuPuHRsvGpsoZG4uqLjhqqXHjtQmDK6VIXAqMZIx4bL0gmdK1Lq7HLSwn3frPeW/SWUtOgawfKTpTbBkdc/nvIwqRKD+exfbj5/XZWpgOfm4grAUgrG/0nDW3gP3SG2jj5owZSuVzJpSd7uZwEUPFlFScZ5lfTZOjEpJXrtjthsZBGeugB/Xiu3eL2uh1+80BRtxO+DxJrBHGoLl7nW4z2vsA2Wgrpjyv4Hnmzc8gsPj6vjBnQWhcQRKSEeBMhR7GR39oxPZZMVJUhLf0cJ7LjAhryihLIAz/3ICeKbkWLGM/IyPVRQm6x8dGEBSDNV5/T/YgrlO46f5srfbniC/dFdVo8Hx7Ve8/Kyzc1l6u8sc6DR2xqh0dd0+YUl2q26A3S6KoyKsaVajp7+vHqTEqjPklhyztu49pRVrgpVv3KMHdbEecJCjjfQS2+irDSxTgyLwBKzNI75sm9Nk3jtXfLQprGds+MRpS1gnpTt6hoC32w9wexlXA+ZM/o3g82dagMdYxSaBFXDGSt3lyKPre/5w7cNdEd1JgoqVrTJrrd6NOhTzTNOmItpEVrHZ2gEy2NpFfjZToE6Br3PnXvgkE32XsyREuDX/jG2Zkaeu9ocv5sdBxCNgexVq0EbXRaB4Ktn8DTsR1Ey3+fpcE+A1scI6nK62jtH3Y8SMyks69C+gywen7a32CVop4AI9YQIEWtB9AwHuzjRLxio+oea0Gp60xDDRHNPGgzfejYkrUkfJGPyn5YS4BvNsQsj5kKc8y7Pz/f/VYZ2DJdtc2uXiU11fxgGj0MssrXX70EgyxsUMjSSWXpFpeqmXs1G66nOHcKitwO1NSeazNt7s0bDss2UBzw003OOLv6kQCe7Zx5AXsUVbV33qCdDk7MCTBeSQ2nBCUOVMLZm1HNqdd2wdl8x1yleiZuGXPgGPLNpge6vDaQJIgqyg4UCWAcIoDxzyGA4w1IALG7T83fwd19Mlb3TPwZ6FXgW8P/oFdNI08jq4ZIet8c32sSgNIbgWoU+lRkJF1VM93A1d6A0odhaYFoAraZ
Q7u94rspf0Grx5H+CWwipPE6miGp3JOyCv8ulFNYRZfI5GBWmn3TL3OvMssvzUhPXdFKH3oD3aUaFX3d4B6FQbrdQYHAhrjwH7DYBUQLYLn1MVfsNs85kg34cQsFGEptXn3MSDMw+NenHUA9wVBXQiTRyz3eO0V/7cnuaj08TBQFbVAErwFYIxQbyEPrzYgGaOgFuJxi5TMFg35w4/8iR3g/dQngnYTlrDmUanyL8HexIunaUSaI030oeUPdEi7JD7jUaAkWn+9ubj/e9XxF3bSesfMPqkewOHtnSCqDLFnE+oQEmh556bP6fv/jBtDoGbI4rElXjqWPBAerbaohm4/rWnta+ptlIv69lg57sUw9ymJKmIj1/74ME90LanHE95vVyjfHH+02cTndo2uvoRiqYVj53El8TO9YV8BROv8+KgkMN08oiusSo9O1QfL6TsxaVfFqcE0CeM7OGO8JeOieS6vmhl5JXfL6SFi7eX81HaNKee0u2YM0jXK4bBhvKmtsIit15V92dJpQw6EM7cbVglGodPgjxA7oov8D</diagram></mxfile>"}"></div> <script type="text/javascript" src="https://cernbox.cern.ch/byoa/drawio/js/viewer-static.min.js"></script> diff --git a/docs/aims2/aims2how2.md b/docs/aims2/aims2how2.md deleted file mode 100644 index 4089a55e68ec09b59d185be76792263f1c03688b..0000000000000000000000000000000000000000 --- a/docs/aims2/aims2how2.md +++ /dev/null @@ -1,3 +0,0 @@ -# How to - -TODO: This section has not yet been reviewed and is located on <https://twiki.cern.ch/twiki/bin/view/LinuxSupport/Aims2how2>. diff --git a/docs/aims2/aims2server.md b/docs/aims2/aims2server.md index f20dad0c9d5692f6f506f6afbf3276ec67651ba6..7317266a94c1bae9dbdd4ac9a795907f28a13159 100644 --- a/docs/aims2/aims2server.md +++ b/docs/aims2/aims2server.md @@ -4,21 +4,22 @@ AIMS2 servers have two main roles. The first is to service client requests, hand ## Server logic -When the user adding a new image is authorised, the image blob is inserted directly in the AIMS2 database. Then `httpd` takes care of building the uploaded image and store it on disk under its corresponding directoy according to the server TFTP structure; it also builds the corresponding configuration files for the registered interfaces. All boot images, bootloaders and PXE configurations are shared between all hostgroup nodes thanks to CephFS shares. +When the user adding a new image is authorised, the image blob is stored on a CephFS share. 
`httpd` takes care of storing it on disk under its corresponding directory according to the server TFTP structure; it also builds the corresponding configuration files for the registered interfaces. All boot images, bootloaders and PXE configurations are shared between all hostgroup nodes thanks to CephFS shares. ## TFTP Structure -The following structure shows how an AIMS2 server is configured (Root path `/tftboot`). +The following structure shows how an AIMS2 server is configured (Root path `/aims_share/tftpboot`). ```bash [root@aims01 tftpboot]# ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/' . + |-dnsmasq |-aims |---boot - |-----FEDORA27_X86_64 + |-------CS8_X86_64 |-------initrd |-------vmlinuz - |-----FEDORA28_X86_64 + |-----CC7_X86_64 |-------initrd |-------vmlinuz |-----HWREG_AUTOINSTALL @@ -27,7 +28,6 @@ The following structure shows how an AIMS2 server is configured (Root path `/tft |-----arm64 |-------arm64-efi |-----bios - |-----lgcy |-----uefi |-------themes |---------not @@ -36,9 +36,9 @@ The following structure shows how an AIMS2 server is configured (Root path `/tft |-----arm64 |-----bios |-------aims - |-----lgcy - |-------aims + |-----netbootxyz |-----uefi + |-aimshttpupload |-hwreg |---loader |-----bios @@ -47,7 +47,7 @@ The following structure shows how an AIMS2 server is configured (Root path `/tft |---pxelinux.cfg ``` -Please note that `aims/config/<TYPE>` and `aims/boot/` are dynamic directories, that is, their contents is maintained by the aims2 syncronisation daemon (`/usr/sbin/aims2sync`). You will not be able to manually add a configuration file (01-*) to the `aims/config/*` directories unless it is registered in the aims2 database. PXE menus configurations are not affected. Directories in `aims/boot` are maintained by the daemon. Any directory not known to the database is removed. 
+Please note that `aims/config/<TYPE>` and `aims/boot/` are dynamic directories, that is, their contents are maintained by `aims2server`. ## SOAP Interface @@ -76,10 +76,9 @@ use aims2server::db; ```perl # These are our callable methods. Anything else will croak below. - for my $Method qw(AddHost RemoveHost GetHostByName AddImage RemoveImage GetImageByName ListAllImages EnablePXE DisablePXE HostHistory GetKickstartFile UpdateKickstartFile) + for my $Method qw(AddHost RemoveHost GetHostByName AddImage RemoveImage GetImageByName ListAllImages EnablePXE DisablePXE HostHistory GetKickstartFile UpdateKickstartFile Permission) -use subs qw(new SetUser AddImage RemoveImage GetImageByName AddHost RemoveHost GetHostByName EnablePXE DisablePXE GetKickstartFile UpdateKic -kstartFile RemoveKickstartFile); +use subs qw(new SetUser AddImage RemoveImage GetImageByName AddHost RemoveHost GetHostByName EnablePXE DisablePXE GetKickstartFile UpdateKickstartFile RemoveKickstartFile Permission); ``` ## Authentication @@ -88,15 +87,11 @@ Kerberos authentication setup is done through the hostgroup. ## Database -`aims2server` database supported is provided by Oracle. +The `aims2server` database is provided by DBoD. The database configuration is set at `/etc/aims2.conf`, through the [AIMS2 hostgroup](https://gitlab.cern.ch/ai/it-puppet-hostgroup-aims). -### Accessing the database via SQLPlus - -To access the database directory, the following can be used from `aiadm`: - -`sqlplus64 aims2@DATABASE`. Credentials can be checked on `/etc/aims2.conf` on AIMS nodes. Please be aware of which environment you want to check. +Check [our resources page](./resources.md) to get the access details and credentials. ## Server-side configuration @@ -106,7 +101,7 @@ You can use the `aims2config` tool from within an AIMS host to see, update or cr ## Cleanup Daemon -To maintain the server clean of dead hosts, old configurations and our `dnsmasq` workaround, each server has a daemon running. 
This daemon maintains a persistent connection to the database +To keep the server clean of PXE configurations that do not match the boot mode a node has in the database, each server runs a daemon. This daemon maintains a persistent connection to the database. * Starting the daemon: `service aims2cleanup start` @@ -118,3 +113,9 @@ During normal operation the daemon will maintain a connection to the database. I The daemon will log output to `/var/log/aims/aims2cleanup.log` +### Old PXE configurations cleanup + +Since we do not want to keep the PXE configuration files of a node forever (the node would otherwise always try to install whatever its configuration file says), we have a cleanup mechanism [here](https://gitlab.cern.ch/ai/it-puppet-hostgroup-aims/-/blob/af8603c4275cc8288b76c2377a568013cf42a145/code/manifests/aims.pp#L324-340). It runs a daily cleanup using the [aims2oldhostsremoval](https://gitlab.cern.ch/linuxsupport/rpms/aims2/-/blob/master/src/aims2oldhostsremoval) script to remove content that is no longer relevant. + +We run the script from a cron job defined in the Puppet configuration so that it does not run on all nodes at the same time. Running it on a single node is enough, as all of them share the filesystem. + diff --git a/docs/aims2/aims2workflows.md b/docs/aims2/aims2workflows.md index 3979bbb7a8a7d2c5ef9507767cbbf51a20f12a05..4b86978cf9e298250b7ae797b8978c2d8cee09af 100644 --- a/docs/aims2/aims2workflows.md +++ b/docs/aims2/aims2workflows.md @@ -101,4 +101,4 @@ The workflow implies the following: * [LOS-900: Having a test environment for DHCP](https://its.cern.ch/jira/browse/LOS-900) * This would be great as it would avoid depending on anyone to test change by change and agreeing on dates and times to do so. - * Otherwise sorry for the next souls that need to test changes on these workflows + * Otherwise sorry for the next souls that need to test changes on these workflows. 
diff --git a/docs/aims2/replacerebootaimsserver.md b/docs/aims2/replacerebootaimsserver.md index dded306d525c5676663808da9d96f4ab4bbcc255..2f7fed965ae17d71714e88a083fcd51413f57293 100644 --- a/docs/aims2/replacerebootaimsserver.md +++ b/docs/aims2/replacerebootaimsserver.md @@ -10,7 +10,7 @@ In case you need to replace any of the servers, the procedure is as simple as it sudo roger update --appstate intervention --message "Replacing node" --all_alarms false ``` -* Wait for the slave to replace the master +* Wait for the Load Balanced alias to remove the desired node * Delete the previous instance * Make sure you do not have anything worth keeping from these instances such as any test or temporary file diff --git a/docs/aims2/resources.md b/docs/aims2/resources.md new file mode 100644 index 0000000000000000000000000000000000000000..01da618dbbd78a724a37a169ed17c961471dbbb3 --- /dev/null +++ b/docs/aims2/resources.md @@ -0,0 +1,99 @@ +# Related resources + +## Useful Mattermost channel + +Ask Arne Wiebalck to add you to <https://mattermost.web.cern.ch/it-dep/channels/ironic-internal>. This is the meeting point for Ironic/Procurement/AIMS2 conversations and helps speeding up certain tests or debugging. + +## Databases + +**Production** + +* Databases: `dbod-aims.cern.ch:6601` +* User: `aims` +* Retrieve credentials with: `tbag show --hg "aims" aims_psqldb_pwd` +* Owner: `aims-admins` + +**Test** + +* Databases: `dbod-aimstest.cern.ch:6616` +* User: `aims` +* Retrieve credentials with: `tbag show --hg "aims" aims_psqldbtest_pwd` +* Owner: `aims-admins` + +**Please be aware when connecting directly to the database. 
It is case sensitive and might have issues with passwords that contain `@` character.** + +If you change the Database password, please do the following: + +* Update the service account password and update the Teigi stored value: `tbag set --hg aims aims_psqldb_pwd --binary <new_password>` or `tbag set --hg aims aims_psqldbtest_pwd --binary <new_password>` +* Force a Puppet run on each of the AIMS hosts + +Please check [the account management for Oracle accounts](https://cern.service-now.com/service-portal?id=kb_article&n=KB0000829) for further information. + +## Storage + +* CephFS shares: + * `aims.cern.ch` + * `AIMS Service` tenant + * `aims_share` CephFS share + * `aims_id` share-id + * `flax`/`Meyrin CephFS` cluster + * `remote_path`: `'/volumes/_nogroup/cebaadff-7d3d-4a1f-88d0-6eb773325901'` + * `access_key`: `tbag show --hg aims flax.aims_id.secret` + * Size 500GB + * [Metrics](https://filer-carbon.cern.ch/grafana/d/000000111/cephfs-detail?from=now-90d&orgId=1&refresh=1m&to=now&viewPanel=15&var-cluster=flax&var-share=cebaadff-7d3d-4a1f-88d0-6eb773325901) + * `aimstest.cern.ch` + * `AIMS Service` tenant + * `aims_share_test` CephFS share + * `aims_id` share-id + * `dwight`/`Geneva CephFS Testing` cluster + * `remote_path`: `'/volumes/_nogroup/4272339e-0efb-4d0f-b17d-43d4d6ec600b'` + * `access_key`: `tbag show --hg aims dwight.aims_id.secret` + * Size 200GB + * [Metrics](https://filer-carbon.cern.ch/grafana/d/000000111/cephfs-detail?from=now-90d&orgId=1&refresh=1m&to=now&viewPanel=15&var-cluster=dwight&var-share=4272339e-0efb-4d0f-b17d-43d4d6ec600b) + +!!! note "" + `aims.cern.ch`, i.e. production, has backups configured by our CephFS colleagues. See [RQF1823187](https://cern.service-now.com/service-portal?id=ticket&table=u_request_fulfillment&n=RQF1823187). If you need any assistance, contact them or open a Service Now ticket. 
+ + +## Service accounts + +* User: `aims` +* Retrieve credentials with: `tbag show --hg "aims" aims_password` +* Owner: Alex Iribarren +* Purpose: connecting to LANDB and LDAP + +**Password changing procedure:** + +* Announce with an OTG that the service will be degraded for 5-10 min. No newly issued installations will work. +* Update the service account password and update the Teigi stored value: `tbag set --hg aims aims_password --binary <new_password>` +* Force a Puppet run on each of the AIMS hosts + +--- + +* User: `linux_private` +* Retrieve credentials with: `tbag show espassword --hg aims` +* Owner: Ulrich Schwickerath +* Purpose: Sending logs to our ES instance + +## Egroups + +* `aims2-upload`: Users with permissions to upload images to AIMS servers +* `aims-admins`: AIMS administrators +* `aims2-cc-admins`: Administrators for hosts in buildings 513, 613, 9994 (SafeHost), 9918 (Wigner), 773 (Network Hub) or 6045 (LHCb containers) + +**Note that these egroups can only be replaced for AIMS by updating the `CONF` table in the database. Keys correspond to `EGROUP_UPLOAD`, `EGROUP_AIMSSUPPORT` and `EGROUP_SYSADMINS`.** + +## LB Alias + +* `aims.cern.ch`: used for prod instances. +* `aimstest.cern.ch`: used for test instances. +* `aimsdev.cern.ch`: used for dev instances and debugging. Might be empty at any given point. It helps isolate specific nodes. 
+ +## Related GitLab projects + +* AIMS2 hostgroup: <https://gitlab.cern.ch/ai/it-puppet-hostgroup-aims> + * As of July 2021, these are the existing nodes for the hostgroup: + * **prod**: `aims01`, `aims02`, `aims03` + * **test**: `aimstest01`, `aimstest02`, `aimstest03` +* AIMS2 applications: <https://gitlab.cern.ch/linuxsupport/rpms/aims2> +* Other AIMS2 related components: <https://gitlab.cern.ch/linuxsupport/aims> diff --git a/docs/aims2/syncaimstestwithprod.md b/docs/aims2/syncaimstestwithprod.md index 2715356dbc4d5a0d5918e43fa4c4070e75a8bab2..c4a11f99b2411ff5c9a1a5ff95ed21fe3762714e 100644 --- a/docs/aims2/syncaimstestwithprod.md +++ b/docs/aims2/syncaimstestwithprod.md @@ -1,6 +1,8 @@ # Sync AIMS and AIMSTEST -In this section we are assuming you have an already configured DBeaver instance, including TNS names and all (in `/eos/project/o/oracle/public/admin/tnsnames.ora`). +## Using DBeaver and `rsync` + +In this section we are assuming you have an already configured DBeaver instance, including TNS names and all (in `/eos/project/o/oracle/public/admin/tnsnames.ora`). You can run it locally or use [CERN Terminal Servers](https://remotedesktop.web.cern.ch/remotedesktop/). For AIMS and AIMSTEST credentials check `tbag showkeys --hg aims` or `cat /etc/aims2.conf` from within an AIMS prod or test node accordingly: @@ -17,3 +19,7 @@ tbag show aims_psqldb_pwd_admin --hg aims * There may be some conflicts from key collisions. Delete them by hand if you want, or directly empty table data from test instance prior to migration * You now directly use the Database export and make it point to the test instance. Check the "ON CONFLICT DO UPDATE SET" setting. 1. Rsync the boot directories to sync all the PXE images: `[root@aims01 ~]# rsync -arv /aims_share/tftpboot/aims/boot/ root@aimstest01:/aims_share/tftpboot/aims/boot` + +## Other methods + +Feel free to document your own method here, for example a simpler one that uses only command-line tools. 
\ No newline at end of file diff --git a/docs/aims2/troubleshooting.md b/docs/aims2/troubleshooting.md index cad5c0c3e53cc1d1d70aedbf53305549837c6906..401846b37c69db46c6ff8254d66d438984a3634c 100644 --- a/docs/aims2/troubleshooting.md +++ b/docs/aims2/troubleshooting.md @@ -45,15 +45,24 @@ find / -type d -wholename "/var/log/aims2sync.log*" | xargs zgrep "IPXETESTNETBO ### Sample logs and its meaning -`aims2sync` +`aims2server` Entries refer mostly to interface configurations synced to disk, i.e. `/tftpboot/aims/config/.../...` or to synced images that are ready to use. ```bash -Aug 23 22:00:24 aims01.cern.ch aims2sync[17241]: ADD pxe conf for 01-0c-c4-7a-37-8d-e7 / MAC 0c:c4:7a:37:8d:e7(PZPY2NXJFBG) [bios] +Apr 01 17:41:38 aims01.cern.ch server.cgi[3213235]: 188.185.120.186 - ADD pxe conf for 01-a4:bf:01:5e:fb:c1 / MAC a4:bf:01:5e:fb:c1 (RALLY-2225-JCYS) [uefi] ``` -`tftpd` +Entries like the following come from our monitoring. See <https://kojimon.web.cern.ch> +``` +Apr 01 17:41:47 aims01.cern.ch httpd[3095698]: ::1 - - [01/Apr/2022:17:41:47 +0200] "GET /server-status/?auto HTTP/1.1" 200 825 459 "-" "Go-http-client/1.1" +``` + +You may see many other log entries, but they are self-explanatory. + +Bear in mind that, as of April 2022, DB debug-level logging is enabled so that the queries being run show up in the logs. It can be disabled if desired, but it has proven useful for debugging past issues. + +`xinetd` Entries refer to TFTP transactions with the clients, IP corresponds to client's IP and can be checked on <https://network.cern.ch>. Note `in.tftp` comes from the `xinetd` unit. 
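When chasing a specific node, it can help to reduce the `server.cgi` "ADD pxe conf" entries shown earlier in this section to just the MAC address, the host name and the boot mode. The following is only a sketch: `parse_pxe_add` is a hypothetical helper, not something shipped with AIMS2, and it assumes the log line layout matches the sample entry in this page.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of AIMS2): reads log lines on stdin and,
# for each "ADD pxe conf" entry, prints "<MAC> <HOSTNAME> <BOOTMODE>".
# Assumes the line layout of the sample entry in this page.
parse_pxe_add() {
  sed -nE 's#.*ADD pxe conf for [^ ]+ / MAC ([0-9A-Fa-f:]{17}) ?\(([^)]*)\) \[([a-z0-9]+)\].*#\1 \2 \3#p'
}

# Sample entry copied from the docs above.
sample='Apr 01 17:41:38 aims01.cern.ch server.cgi[3213235]: 188.185.120.186 - ADD pxe conf for 01-a4:bf:01:5e:fb:c1 / MAC a4:bf:01:5e:fb:c1 (RALLY-2225-JCYS) [uefi]'

printf '%s\n' "$sample" | parse_pxe_add
```

The same function can be fed from whatever log source you already use for these entries; lines that do not match the pattern are silently skipped.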
diff --git a/mkdocs.yml b/mkdocs.yml index 7fd4ff808fc7cd1d2416213778dae51ca975f826..655615d407d820fab9ba3edb71f6c971b88cd693 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -57,6 +57,7 @@ nav: - 'Troubleshooting': nomad/troubleshooting.md - 'AIMS2': - 'AIMS2 ': aims2/aims2.md + - 'AIMS2 resources': aims2/resources.md - 'AIMS2 Architecture': aims2/aims2architecture.md - 'AIMS2 diagrams': aims2/aims2diagrams.md - 'AIMS2 PXE workflows': aims2/aims2workflows.md