1. 20 Jan, 2022 2 commits
  2. 28 Oct, 2021 1 commit
  3. 27 Oct, 2021 1 commit
    • Miciah Masters's avatar
      Change default balancing algorithm to "leastconn" · b1a50405
      Miciah Masters authored
      Configure OpenShift router to use the "leastconn" balancing algorithm for
      non-passthrough routes by default.
      
      This was the default algorithm for non-passthrough routes before it was
      changed to "random" in OpenShift 4.8.  The "random" algorithm is expected
      to provide better behavior across reloads or multiple router pod replicas.
      However, it incurs significant memory overhead for each backend that uses
      it, and so we need to change the default back to "leastconn" in order to
      avoid excessive memory usage until we can fix the issue in HAProxy.
      
      This commit fixes bug 2007581.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=2007581
      
      * pkg/operator/controller/ingress/deployment.go (desiredRouterDeployment):
      Change the default balancing algorithm for non-passthrough routes to
      "leastconn", and allow an override to set "random".
      * pkg/operator/controller/ingress/deployment_test.go
      (TestDesiredRouterDeployment): Update to expect "leastconn".
      * test/e2e/operator_test.go
      (TestLoadBalancingAlgorithmUnsupportedConfigOverride): Update to expect
      "leastconn" as the default setting and verify that the override allows
      setting "random".  Add a check that passthrough routes use "source".
      b1a50405
  4. 28 Sep, 2021 1 commit
  5. 10 Sep, 2021 1 commit
  6. 03 Sep, 2021 1 commit
    • Miciah Masters's avatar
      Configure router to use "source" for passthrough · 647edc6c
      Miciah Masters authored
      Configure OpenShift router to use the "source" balancing algorithm for
      passthrough routes in order to provide some session-affinity.
      
      This was the behavior for passthrough routes before OpenShift 4.8, and
      changing it was unintentional.
      
      Follow-up to commit e83b057c.
      
      This commit fixes bug 1997407.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1997407
      
      * pkg/operator/controller/ingress/deployment.go
      (RouterTCPLoadBalancingAlgorithmEnvName): New const for the
      "ROUTER_TCP_BALANCE_SCHEME" environment variable.
      (desiredRouterDeployment): Set ROUTER_TCP_BALANCE_SCHEME to "source".
      * pkg/operator/controller/ingress/deployment_test.go
      (TestDesiredRouterDeployment): Verify that desiredRouterDeployment sets
      ROUTER_TCP_BALANCE_SCHEME appropriately.
      647edc6c
  7. 26 Aug, 2021 1 commit
  8. 10 Jun, 2021 1 commit
  9. 07 Jun, 2021 2 commits
  10. 04 Jun, 2021 1 commit
  11. 03 Jun, 2021 5 commits
    • Miciah Masters's avatar
      Add local-with-fallback unsupported config override · fcc0f4a2
      Miciah Masters authored
      
      
      * pkg/operator/controller/ingress/load_balancer_service.go
      (desiredLoadBalancerService): Use the new shouldUseLocalWithFallback
      function to determine whether to set the "local-with-fallback" annotation.
      (shouldUseLocalWithFallback): New function.  Check the service's external
      traffic policy and the ingresscontroller's unsupported config overrides and
      return a Boolean value indicating whether "local-with-fallback" should be
      enabled, or an error if the override could not be parsed.
      * pkg/operator/controller/ingress/load_balancer_service_test.go
      (TestShouldUseLocalWithFallback): New test.  Verify that
      shouldUseLocalWithFallback behaves as expected.
      * pkg/operator/controller/ingress/nodeport_service.go
      (ensureNodePortService): Check the error value from desiredNodePortService.
      (desiredNodePortService): Add error return value.  Use the new
      shouldUseLocalWithFallback function (which can return an error) to
      determine whether to set the "local-with-fallback" annotation.
      * pkg/operator/controller/ingress/nodeport_service_test.go
      (TestDesiredNodePortService): Check the error value from
      desiredNodePortService.
      * test/e2e/operator_test.go
      (TestLocalWithFallbackOverrideForLoadBalancerService)
      (TestLocalWithFallbackOverrideForNodePortService): New tests.  Verify that
      the operator does not set the "local-with-fallback" annotation on a
      LoadBalancer or NodePort service if the localWithFallback unsupported
      config override is set to "false".
      Co-authored-by: default avatarcandita <cholman@redhat.com>
      fcc0f4a2
    • Ryan Fredette's avatar
      go mod vendor & go mod tidy · fd3945b1
      Ryan Fredette authored
      fd3945b1
    • Ryan Fredette's avatar
      Use ingress operator config region for AWS session if available · aed9291e
      Ryan Fredette authored
      AWS API connections in certain regions require that the session specify
      the correct region, otherwise STS authentication can fail. This commit
      addresses this by setting the region in the session explictly to match
      the region set in the platform status
      aed9291e
    • Miciah Masters's avatar
      Set the "local-with-fallback" service annotation · 35860222
      Miciah Masters authored
      Set the "traffic-policy.network.alpha.openshift.io/local-with-fallback"
      annotation on LoadBalancer- and NodePort-type services that use the "Local"
      external traffic policy.
      
      This commit is related to bug 1960284.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1960284
      
      * pkg/operator/controller/ingress/load_balancer_service.go
      (localWithFallbackAnnotation): New const.  Define the annotation key for
      the "local-with-fallback" service annotation.
      (desiredLoadBalancerService): Set the "local-with-fallback" annotation when
      using the "Local" external traffic policy.
      (managedLoadBalancerServiceAnnotations): New variable.  Define the keys for
      annotations that the operator manages on LoadBalancer-type services, which
      now include the "local-with-fallback" annotation.
      (loadBalancerServiceChanged): Fix the update logic to properly handle
      annotations with empty values using go-cmp and
      managedLoadBalancerServiceAnnotations.
      * pkg/operator/controller/ingress/load_balancer_service_test.go
      (TestDesiredLoadBalancerService): Verify that the "local-with-fallback"
      annotation is set when appropriate.
      (TestLoadBalancerServiceChanged): Verify that loadBalancerServiceChanged
      properly handles updates to the "local-with-fallback" annotation.
      * pkg/operator/controller/ingress/nodeport_service.go
      (desiredNodePortService): Set the "local-with-fallback" annotation.
      (managedNodePortServiceAnnotations): New variable.  Define the annotation
      keys that the operator manages on NodePort-type services.
      (nodePortServiceChanged): Check if any of the annotations in
      managedNodePortServiceAnnotations have changed.  Update managed annotations
      if they have changed.
      * pkg/operator/controller/ingress/nodeport_service_test.go
      (TestDesiredNodePortService): Verify that the expected annotations are set.
      (TestNodePortServiceChanged): Add a test case to verify that
      nodePortServiceChanged properly handles updates to the
      "local-with-fallback" annotation.
      35860222
    • Stephen Greene's avatar
      Ingress: Mount router stats secret as a volume · 7b3f4031
      Stephen Greene authored
      https://github.com/openshift/router/pull/291 gave the router
      the ability to use stats credentials contained in separate username
      and password files for security reasons.
      This commit modifies the Ingress deployment so that the router-stats
      secret is mounted into router pods, and the filenames for the username
      and password files are passed to the Router binary via new
      environment variables.
      
      pkg/operator/controller/ingress/deployment.go:
      
      Add a new volume and volume mount for the router stats secret.
      Replace the STATS_USERNAME and STATS_PASSWORD environment variables with
      the new STATS_USER_NAME_FILE and STATS_PASSWORD_FILE environment
      variables.
      
      pkg/operator/controller/ingress/deployment_test.go:
      
      Update `TestDeploymentConfigChanged` to reflect the new stats-auth
      volume.
      
      Update `TestDesiredRouterDeployment` to verify that the expected
      volumes exist with the proper secret references.
      
      This commit is in support of Bug 1955822.
      7b3f4031
  12. 02 Jun, 2021 2 commits
  13. 01 Jun, 2021 1 commit
    • Miciah Masters's avatar
      desiredNodePortService: Add port 1936 · d6dcbfd7
      Miciah Masters authored
      When creating the NodePort-type service for an ingresscontroller that
      specifies the "NodePortService" endpoint publishing strategy type, include
      a "metrics" port with port 1936 so that external load balancers can use the
      /healthz/ready endpoint.
      
      This commit fixes bug 1881210.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1881210
      
      * pkg/operator/controller/ingress/nodeport_service.go
      (ensureNodePortService): Pass an argument to desiredNodePortService to tell
      it not to add a "metrics" port if the service already exists and omits the
      port.
      (desiredNodePortService): Add a "wantMetricsPort" parameter.  Add port 1936
      to the service's ports spec if wantMetricsPort is true.
      * pkg/operator/controller/ingress/nodeport_service_test.go
      (TestDesiredNodePortService): Add a test case where wantMetricsPort is
      true.  Verify that desiredNodePortService returns the expected service.
      (TestNodePortServiceChanged): Update the service used for tests to be
      consistent with what desiredNodePortService returns.
      * test/e2e/operator_test.go
      (TestNodePortServiceEndpointPublishingStrategy): Verify that the operator
      creates the NodePort-type service with the expected ports.  Verify that the
      operator does not add the "metrics" port if it is is removed from the
      service.
      d6dcbfd7
  14. 27 May, 2021 2 commits
  15. 26 May, 2021 2 commits
    • Stephen Greene's avatar
      Update bindata for Canary daemonset changes · 0fb9e908
      Stephen Greene authored
      Run make update and commit the output.
      0fb9e908
    • Stephen Greene's avatar
      canary: Add priority class to canary daemonset · e2e2b612
      Stephen Greene authored
      OpenShift components are required to specify a
      PriorityClassName for daemonsets/deployments.
      
      This commit ensures that the Canary daemonset has the
      `system-cluster-critical` priority class set.
      
      assets/canary/daemonset.yaml:
      Set the `priorityClassName` field to be `system-cluster-critical`.
      
      pkg/operator/controller/canary/daemonset.go:
      Update the Canary daemonset when `priorityClassName` changes.
      
      pkg/operator/controller/canary/daemonset_test.go:
      Add a test to ensure that `priorityClassName` is set correctly by
      `desiredCanaryDaemonSet`.
      Add a test to verify that the Canary daemonset is updated
      when `priorityClassName` is mutated.
      
      This commit is in support of Bug 1954892.
      e2e2b612
  16. 20 May, 2021 2 commits
  17. 18 May, 2021 1 commit
  18. 13 May, 2021 1 commit
  19. 12 May, 2021 1 commit
  20. 11 May, 2021 7 commits
    • Stephen Greene's avatar
      ingress: Fix up openshift-ingress namespace reconciliation · 0faf81ce
      Stephen Greene authored
      Commit `ea085e72` added logic to reconcile the Ingress namespace during
      cluster upgrades. This follow-up commit fixes some mistakes introduced
      by that commit.
      
      pkg/operator/controller/ingress/namespace.go:
      
      Fix a typo: `opensift.io/cluster-monitoring`.
      We are, after all, in the computing business, and not the baking
      business.
      
      Check if annotation/label map values exist since comparing a
      non-existent map value to the empty string will always return true
      in this case.
      
      pkg/operator/controller/ingress/namespace_test.go:
      
      Add unit test to prove that a map value that does not exist
      is equivalent to the empty string.
      
      Rename the `original` namespace to `desired`.
      Switch ordering of `desired` and `mutated` in `routerNamespaceChanged`
      calling sites to better reflect the intent of the unit tests.
      0faf81ce
    • OpenShift Merge Robot's avatar
      Merge pull request #607 from Miciah/BZ1955854-compute-available-and-degraded-from-default-ingress · b7f990e5
      OpenShift Merge Robot authored
      Bug 1955854: Compute Available and Degraded from default ingress
      b7f990e5
    • Clayton Coleman's avatar
      test/e2e: TestHTTPHeaderBufferSize must wait for old pod to delete · 3a15611f
      Clayton Coleman authored
      The addition of minReadySeconds slows the rate at which config changes
      rollout - the old logic in the test made an update to the IC and assumed
      the new pod being ready == high likelihood the new pod and router config
      was live, but instead the old pod was actually just as likely to get
      the request.
      
      Wait until the old pod is gone before continuing.
      3a15611f
    • Miheer Salunke's avatar
      BZ https://bugzilla.redhat.com/1901648 · 9bb8cbb1
      Miheer Salunke authored
      The following use case is fixed now by this fix:
      
      1. Create Ingress with a custom domain
      2. Ingress status gets updated by OpenShift Ingress controller with router canonical hostname
      3. Use external-dns to sync with Route 53
      
      The problem was the canonical router hostname didn't exist in the DNS. It is not created by OpenShift. OpenShift creates this *.apps.<cluster_name>.<base_domain> DNS record and not this one apps.<cluster_name>.<base_domain>. So canonical router hostname was not right. Now this fix sets it to router-default.apps.<cluster_name>.<base_domain>
      9bb8cbb1
    • Miciah Masters's avatar
      status: Set Available/Degraded from default ingress · 3c1bcf7b
      Miciah Masters authored
      Use only the default ingresscontroller for computing the clusteroperator's
      "Available" and "Degraded" status conditions.  The status of other ingress-
      controllers can be monitored using alerts; the clusteroperator should only
      report status as required for basic cluster functionality, and core
      components should depend only on the default ingresscontroller.
      
      This commit is related to bug 1955854.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1955854
      
      * pkg/operator/controller/status/controller.go (Reconcile): Pass
      state.IngressControllers to computeOperatorStatusVersions.
      (computeOperatorDegradedCondition, computeOperatorAvailableCondition): Use
      only the default ingresscontroller to compute status conditions.
      3c1bcf7b
    • Miciah Masters's avatar
      alerts: Add ingresscontroller degraded/unavailable · 434b24eb
      Miciah Masters authored
      Add an ingress_controller_conditions Prometheus metric that reports the
      status conditions of ingresscontrollers, and add Prometheus rules to raise
      alerts if an ingresscontroller is unavailable or degraded.
      
      This commit is related to bug 1955854.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1955854
      
      * cmd/ingress-operator/start.go (start): Call StartMetricsListener from the
      operator package, and call RegisterMetrics from the ingress and canary
      controller packages.
      * manifests/0000_90_ingress-operator_03_prometheusrules.yaml: Add alerts
      using the new ingress_controller_conditions metric to warn if an
      ingresscontroller is degraded or unavailable.
      * pkg/manifests/bindata.go: Regenerate.
      * pkg/operator/controller/canary/metrics.go (registerCanaryMetrics): Rename
      from this...
      (RegisterMetrics): ...to this.
      (StartMetricsListener): Move from here...
      * pkg/operator/metrics.go: ...to here.  New file.
      * pkg/operator/controller/ingress/metrics.go (ingressControllerConditions):
      New variable.  Define a "ingress_controller_conditions" Prometheus gauge.
      (metricsList): New variable.  Define the list of metrics for this
      controller.  Currently this list comprises ingressControllerConditions.
      (reportedConditions): New variable.  Define the ingresscontroller status
      conditions that are published in the ingress_controller_conditions metric.
      (SetIngressControllerConditionsMetric): New function.  Update the new
      "ingress_controller_conditions" gauge with the status conditions of the
      given ingresscontroller.
      (RegisterMetrics): New function.  Register metricsList with Prometheus.
      * pkg/operator/controller/ingress/status.go (syncIngressControllerStatus):
      Call SetIngressControllerConditionsMetric after updating status conditions.
      434b24eb
    • Miciah Masters's avatar
      canary: Delete extra "on" in log message · 313ea5fd
      Miciah Masters authored
      * pkg/operator/controller/canary/metrics.go (StartMetricsListener): Delete
      trailing "on" in the "starting metrics listener on " log message.
      313ea5fd
  21. 06 May, 2021 2 commits
    • Clayton Coleman's avatar
      Ingress rollouts should specify minReadySeconds · d8af5ea7
      Clayton Coleman authored
      Ingress components may run directly behind load balancers and thus
      need to delay deployment rollout long enough for the load balancers
      to see the new process. Without minReadySeconds, the upgrade process
      will immediately delete the old pod as soon as the new pod is ready,
      which means that the LB may briefly see no pods, which then triggers
      on some LB the behavior of including all hosts (generally if all
      hosts are unhealthy the LB will allow requests to go to any host
      as opposed to failing closed).
      
      In the future we may wish to allow minReadySeconds to be customized
      by customers since not all load balancers will take less than 30s
      to converge. However, 30s is a reasonable start.
      d8af5ea7
    • OpenShift Merge Robot's avatar
      Merge pull request #577 from Miciah/BZ1900819-specify-topology-spread-constraints · f98fde81
      OpenShift Merge Robot authored
      Bug 1900819: Specify topology spread constraints
      f98fde81
  22. 05 May, 2021 1 commit
  23. 04 May, 2021 1 commit
    • Miciah Masters's avatar
      Specify topology spread constraints · 52ed327c
      Miciah Masters authored
      Specify topology spread constraints on router deployments to encourage the
      scheduler to spread router pod replicas across availability zones.
      
      Add a to-do to replace the hard anti-affinity rule with soft anti-affinity
      once the descheduler has support for enforcing soft anti-affinity rules.
      
      This commit fixes bug 1900819.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1900819
      
      * pkg/operator/controller/ingress/deployment.go (desiredRouterDeployment):
      Rename needDeploymentHash to configureAffinity as we now always need to
      compute the deployment hash but do not always specify an affinity policy.
      Specify spec.template.spec.topologySpreadConstraints.  Specify max skew 1,
      unsatisfiable-constraint action "ScheduleAnyway", topology key
      "topology.kubernetes.io/zone", and a label selector on the deployment hash
      label so as to specify a preference for spreading replicas of the same
      generation out but not prevent scheduling replicas of different generations
      on the same node or scheduling more replicas than there are availability
      zones.  Add to-do for changing to soft anti-affinity.
      (hashableDeployment): Hash spec.template.spec.topologySpreadConstraints.
      (cmpMatchExpressions, zeroOutDeploymentHash): New functions, factored out
      of hashableDeployment.
      * pkg/operator/controller/ingress/deployment_test.go (checkDeploymentHash):
      New function, factored out of TestDesiredRouterDeployment.
      (TestDesiredRouterDeployment): Use checkDeploymentHash.  Expect the
      deployment hash label always to be set.
      (TestDeploymentConfigChanged): Add test cases for
      spec.template.spec.topologySpreadConstraints.
      52ed327c