- Mar 18, 2025
-
-
Nacho Barrientos authored
-
- Mar 17, 2025
-
-
Nacho Barrientos authored
-
Nacho Barrientos authored
-
Nacho Barrientos authored
-
Nacho Barrientos authored
-
Nacho Barrientos authored
-
- Mar 14, 2025
-
-
Nacho Barrientos authored
-
- Mar 04, 2025
-
-
Nacho Barrientos authored
-
- Mar 03, 2025
-
-
Nacho Barrientos authored
This patch forces Helm to abort if `kubernetes.clusterName` is not set when `*.fluentbit.enabled` is `true`.
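A minimal values sketch that would satisfy this check, assuming the key paths named in the message (`kubernetes.clusterName`, plus the `metrics.fluentbit.enabled` and `logs.fluentbit.enabled` flags that appear elsewhere in this history). The cluster name is a placeholder, not a value from the chart:

```yaml
# Illustrative values override; "my-cluster" is a placeholder.
kubernetes:
  clusterName: my-cluster
metrics:
  fluentbit:
    enabled: true   # with either flag enabled, clusterName must be set
logs:
  fluentbit:
    enabled: true
```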
-
Nacho Barrientos authored
Moreover, if Prometheus remote write is enabled and `metrics.prometheus.server.remoteWrite.username` and `metrics.prometheus.server.remoteWrite.password` are not set, the chart will now fail to install.
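As a rough illustration, the values below set both credentials. The `username` and `password` paths come from the message above; the `enabled` toggle is an assumed name for how remote write is switched on and may differ in the actual chart. The credential values are placeholders:

```yaml
metrics:
  prometheus:
    server:
      remoteWrite:
        enabled: true            # assumed toggle name, not confirmed by the source
        username: my-user        # placeholder
        password: my-password    # placeholder
```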
-
Nacho Barrientos authored
With this patch the chart will fail to install if either `tenant.name` or `tenant.password` is not set when they are needed. Example:

```
λ helm -n monitoring diff upgrade kubernetes-monitoring . -f values.yaml -f ibarrien.yaml
Error: Failed to render chart: exit status 1: Error: execution error at (cern-it-monitoring-kubernetes/templates/fluentbit-metrics/configmap.yaml:22:13): Tenand password is required
```

If no fluentbit forwarding is configured, i.e.:

```yaml
metrics:
  fluentbit:
    enabled: false
logs:
  fluentbit:
    enabled: false
```

then the chart installs fine, as they are not required in that case.
-
- Feb 26, 2025
-
-
Nacho Barrientos authored
-
- Feb 14, 2025
-
-
Nacho Barrientos authored
-
- Jan 27, 2025
-
-
Guillermo Facundo Colunga authored
In commit c6c3db28 we added support for Lua scripts provided via user values. This commit adds a default value and documents this in the values file.

Reported-at: https://its.cern.ch/jira/browse/MONIT-4106
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@gmail.com>
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
Guillermo Facundo Colunga authored
In commit 8c054c8 we added support for Lua scripts provided via user values. This commit adds a default value and documents this in the values file.

Reported-at: https://its.cern.ch/jira/browse/MONIT-4106
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@gmail.com>
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
- Jan 23, 2025
-
-
Guillermo Facundo Colunga authored
-
- Jan 21, 2025
-
-
Guillermo Facundo Colunga authored
In commit 2d80829 we added support for custom Prometheus server relabelings. This commit adds the default value for this option to the values.yaml file.

Reported-at: https://its.cern.ch/jira/browse/MONIT-4105
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@gmail.com>
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
- Nov 27, 2024
-
-
Guillermo Facundo Colunga authored
Some users reported that the current implementation of the metrics flow causes errors in the Fluent Bit components that scrape metrics from the local Prometheus and forward them to OpenTelemetry. This commit inverts the paradigm: Prometheus now does the remote write into the local Fluent Bit. Fluent Bit then does a remote write into the monitoring infrastructure, as it did previously.

Reported-at: https://its.cern.ch/jira/browse/MONIT-4077
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@gmail.com>
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
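The inverted flow could look roughly like the sketch below. It is not the chart's actual configuration: the service name, hosts, ports and URIs are placeholders, and both the use of Fluent Bit's `prometheus_remote_write` input (available only in recent Fluent Bit releases) and the choice of output plugin are assumptions.

```yaml
# Prometheus side (prometheus.yml fragment): push samples to the local Fluent Bit.
remote_write:
  - url: http://fluentbit-metrics.monitoring.svc:8080/api/prom/push   # placeholder service and path
---
# Fluent Bit side (YAML configuration format): receive the remote write and forward it.
pipeline:
  inputs:
    - name: prometheus_remote_write
      listen: 0.0.0.0
      port: 8080
      uri: /api/prom/push
  outputs:
    - name: prometheus_remote_write     # assumed output; the chart may use a different plugin
      match: '*'
      host: monitoring.example.org      # placeholder endpoint
      port: 443
      uri: /api/v1/write
      tls: on
```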
-
- Nov 12, 2024
-
-
Guillermo Facundo Colunga authored
Description of the change:
- Added option to enable alertmanager.
- If the option is selected then alertmanager is deployed.
- Can be deployed with or without an ingress.
- Can be deployed in HA or non-HA mode.

Impact:
- No existing value is modified and alertmanager comes disabled by default.

Testing:
- Installed locally using helm install and custom values.
- Enabled the alertmanager with ingress and connected to it.
- Created a firing Prometheus rule and saw it in alertmanager.

JIRA Refs:
- MONIT-4063
-
- Oct 28, 2024
-
-
Guillermo Facundo Colunga authored
After requests from users, this commit adds the Pushgateway component to the cluster. It adds a deployment that runs a replica of the Prometheus Pushgateway and defines the supporting resources: a service and an ingress to access the Pushgateway, and a ServiceMonitor to scrape its metrics.

Testing:
- Installed locally via helm install.

JIRA Refs:
- MONIT-4040
-
- Oct 23, 2024
-
-
Author: Nacho Barrientos <nacho.barrientos@cern.ch>
-
- Oct 16, 2024
-
-
Guillermo Facundo Colunga authored
After adding a component to write the logs from Kafka to OpenSearch, we found that the format in which the Kubernetes clusters send their logs is not ideal. This commit modifies that format so that only the relevant fields are kept.

JIRA Refs:
- MONIT-4021
-
- Oct 09, 2024
-
-
Guillermo Facundo Colunga authored
After some feedback from the Kubernetes team, we noticed the need to point all the images in the chart to the internal CERN registry. For example, only those images are exposed to the TN. This commit changes all the images to use the internal CERN registry.

JIRA Refs:
- MONIT-4028
-
- Oct 04, 2024
-
-
Guillermo Facundo Colunga authored
Until now, every component that was not a daemonset and needed to be scheduled on a node had its own config option for a node selector. This commit adds a new global option that sets a node selector for all components that do not declare one. If a component declares its own node selector, that one always takes priority.

JIRA Refs:
- MONIT-4003
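The precedence rule can be pictured with a values sketch like the one below. All key names (`global.nodeSelector`, `componentA`, `componentB`) and labels are hypothetical, since the message does not give the actual paths:

```yaml
# Hypothetical key names and labels, for illustration only.
global:
  nodeSelector:
    example.org/role: monitoring   # applied to components without their own selector
componentA:
  nodeSelector:
    example.org/role: metrics      # declared per component, so it overrides the global one
componentB: {}                     # declares nothing, so it falls back to the global selector
```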
-
Guillermo Facundo Colunga authored
To make it easy to check which chart version users of the OpenTelemetry flow are running, this commit overrides the User-Agent header with the chart name and version.
-
- Oct 01, 2024
-
-
When Fluent Bit loses its connection to the metrics endpoint (OpenTelemetry), unsent metrics are dropped and permanently lost. This commit enables the local buffering feature in Fluent Bit, allowing metrics to be temporarily stored in a local buffer. When the connection is reestablished, the buffered metrics are retried and sent to the endpoint. This improves reliability by minimizing metric loss during intermittent connection issues.

Proposed-by: Nacho Barrientos <nacho.barrientos@cern.ch>
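For orientation, filesystem buffering in Fluent Bit is generally configured along the lines below (YAML configuration format). This is a sketch, not the configuration shipped by the chart: the path, hosts, ports and size limits are placeholders, and the input and output plugins are assumptions based on the flow described in this history.

```yaml
service:
  storage.path: /var/lib/fluent-bit/buffer   # placeholder directory for on-disk chunks
  storage.sync: normal
  storage.backlog.mem_limit: 50M             # memory budget when replaying backlog chunks
pipeline:
  inputs:
    - name: prometheus_scrape                # assumed input plugin
      host: 127.0.0.1
      port: 9090
      storage.type: filesystem               # buffer this input's data on disk
  outputs:
    - name: opentelemetry                    # assumed output plugin
      match: '*'
      host: otel.example.org                 # placeholder endpoint
      port: 443
      tls: on
      storage.total_limit_size: 500M         # cap on disk usage for this output's queue
      retry_limit: no_limits                 # keep retrying until the endpoint is back
```

With `storage.type: filesystem`, chunks that cannot be delivered stay on disk and are retried once the endpoint is reachable again, which matches the behaviour the message describes.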
-
- Sep 03, 2024
-
-
Guillermo Facundo Colunga authored
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
- Aug 16, 2024
-
-
Guillermo Facundo Colunga authored
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
- Aug 15, 2024
-
-
Guillermo Facundo Colunga authored
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
- Aug 02, 2024
-
-
Guillermo Facundo Colunga authored
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
Guillermo Facundo Colunga authored
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
Guillermo Facundo Colunga authored
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
- Jul 25, 2024
-
-
Borja Garrido Bear authored
-
Borja Garrido Bear authored
-
- Jul 11, 2024
-
-
Guillermo Facundo Colunga authored
Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
Guillermo Facundo Colunga authored
This commit updates the values file to add a new field named `metrics.prometheus.server.serviceMonitors`, a list that tells the Helm chart which service monitors to create at cluster creation time. It also adds a template to generate this new resource.
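A values entry for this field might look like the sketch below. Only the `metrics.prometheus.server.serviceMonitors` path comes from the message; the shape of each list item is an assumption that mirrors the Prometheus Operator `ServiceMonitor` spec, and the names and labels are placeholders:

```yaml
metrics:
  prometheus:
    server:
      serviceMonitors:
        # Item fields are assumed, modelled on the ServiceMonitor spec.
        - name: my-app                 # placeholder
          selector:
            matchLabels:
              app: my-app              # placeholder label
          endpoints:
            - port: metrics            # named port on the target Service
              interval: 30s
```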
-
Guillermo Facundo Colunga authored
As part of this commit the following changes have been made:
* removed templates/namespace in favor of `.Release.Namespace`
* use `enabled` flags per component
* add metrics server dependency (it was missing)
* add all default values, with their descriptions, to the readme file
-
- Jul 02, 2024
-
-
This commit adds a new component to forward metrics: a Fluent Bit instance dedicated to reading metrics from the local Prometheus and sending them to a configurable destination. By default it sends metrics to OTEL, but it can be configured with multiple flows.

Author: Borja Garrido Bear <borja.garrido.bear@cern.ch>
Signed-off-by: Borja Garrido Bear <borja.garrido.bear@cern.ch>, Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-
- Jun 25, 2024
-
-
Guillermo Facundo Colunga authored
- In the Fluent Bit daemon set it had the wrong indentation.
- The key was misspelled everywhere as commomLabels. Changed to commonLabels.

Signed-off-by: Guillermo Facundo Colunga <guillermo.facundo.colunga@cern.ch>
-