Details

    • Story Points:
      5

      Description

      Per https://jira.mesosphere.com/browse/COPS-4491, customers run into problems when they have a working (albeit dangerous-looking) cluster that still contains dead CockroachDB nodes left over from replacing DC/OS master nodes in a dynamic Exhibitor setup.

      CockroachDB does not forget dead nodes. However, our CockroachDB dcos-check asserts that there are no underreplicated and no unavailable ranges on all nodes known to CockroachDB, whether they are dead or alive.

      The way forward is to restrict the existing CockroachDB dcos-check so that it only asserts that there are no underreplicated and no unavailable ranges on currently live nodes.
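
      A minimal sketch of the restricted check, for discussion only. The `NodeStatus` type, its field names, and `live_ranges_healthy` are hypothetical and not taken from the existing dcos-check; how liveness and range counts are actually obtained from CockroachDB is left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class NodeStatus:
    """Hypothetical per-node status record; field names are assumptions."""
    node_id: int
    is_live: bool
    ranges_underreplicated: int
    ranges_unavailable: int


def live_ranges_healthy(nodes: Iterable[NodeStatus]) -> bool:
    """Return True if no live node reports underreplicated or unavailable ranges.

    Dead nodes are skipped entirely, so stale range counts reported for them
    cannot fail the check.
    """
    live = [n for n in nodes if n.is_live]
    return all(
        n.ranges_underreplicated == 0 and n.ranges_unavailable == 0
        for n in live
    )
```

      Note that this sketch passes vacuously when no node is live, so it is not sufficient on its own.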

      Then introduce a second dcos-check that asserts that there are exactly N live CockroachDB nodes, where N is the number of expected master nodes.

      Since both checks are executed simultaneously on node-poststart, both conditions must be met for DC/OS to report healthy.
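
      A rough sketch of the node-count check and of how the two results combine, assuming the live node IDs and the expected number of masters are already available. The function names and parameters here are hypothetical, not existing dcos-check code.

```python
from typing import Iterable


def live_node_count_ok(live_node_ids: Iterable[int], expected_masters: int) -> bool:
    """Return True if the number of distinct live CockroachDB nodes equals N,
    the expected number of master nodes."""
    return len(set(live_node_ids)) == expected_masters


def cluster_reports_healthy(ranges_ok: bool, node_count_ok: bool) -> bool:
    """Both checks must pass for the node to be reported healthy."""
    return ranges_ok and node_count_ok
```

      For example, with three expected masters and only two live CockroachDB nodes, the node-count check fails and the overall result is unhealthy even if the range check passes.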

            People

            • Assignee:
              Unassigned
              Reporter:
              timweidner Tim Weidner (Inactive)
              Team:
              Mesosphere
              Watchers:
              Tim Weidner (Inactive)