As reported in https://jira.mesosphere.com/browse/COPS-4491, customers run into problems when they have a working (albeit dangerous-looking) cluster that still contains dead CockroachDB nodes left over from replacing DC/OS master nodes in a dynamic Exhibitor setup.
CockroachDB does not forget dead nodes. However, our CockroachDB dcos-check requires that there are no underreplicated and no unavailable ranges on any node known to CockroachDB, whether that node is dead or alive.
The way to go here is to restrict the existing CockroachDB dcos-check so that it only considers underreplicated and unavailable ranges on currently live nodes.
Then introduce a second dcos-check that asserts that we have N live CockroachDB nodes, where N is equal to the number of expected master nodes.
Since both checks are executed simultaneously on node-poststart, both conditions must be met in order for DC/OS to report healthy.
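
The two conditions described above can be sketched roughly as follows. This is only an illustration of the intended logic, not the actual dcos-check implementation: the node records are plain dictionaries, and the field names `is_live`, `ranges_underreplicated` and `ranges_unavailable`, as well as the function names, are assumptions made for this example.

```python
from typing import Dict, List, Union

# Hypothetical node record; the field names are assumptions for this sketch.
Node = Dict[str, Union[bool, int]]


def live_nodes(nodes: List[Node]) -> List[Node]:
    """Return only the nodes that are currently reported as live."""
    return [n for n in nodes if n["is_live"]]


def check_ranges(nodes: List[Node]) -> bool:
    """Pass if no live node has underreplicated or unavailable ranges.

    Dead nodes are ignored entirely, whatever their range counters say.
    """
    return all(
        n["ranges_underreplicated"] == 0 and n["ranges_unavailable"] == 0
        for n in live_nodes(nodes)
    )


def check_live_node_count(nodes: List[Node], expected_masters: int) -> bool:
    """Pass if the number of live nodes equals the expected master count."""
    return len(live_nodes(nodes)) == expected_masters


def healthy(nodes: List[Node], expected_masters: int) -> bool:
    """Both checks must pass for the cluster to count as healthy."""
    return check_ranges(nodes) and check_live_node_count(nodes, expected_masters)


# Three live masters plus one dead node left over from a master replacement.
example = [
    {"is_live": True, "ranges_underreplicated": 0, "ranges_unavailable": 0},
    {"is_live": True, "ranges_underreplicated": 0, "ranges_unavailable": 0},
    {"is_live": True, "ranges_underreplicated": 0, "ranges_unavailable": 0},
    {"is_live": False, "ranges_underreplicated": 7, "ranges_unavailable": 2},
]
```

In this sketch the range check passes vacuously when there are no live nodes at all, which is one reason the second check on the live node count is needed alongside it.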