Details

    • Type: Task
    • Status: Open
    • Priority: Medium
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: zookeeper
    • Labels:

      Description

      Currently, our instructions say to stop the dcos-exhibitor systemd unit, and with it the ZooKeeper instance on the master node, before taking a ZooKeeper backup from that node.

      The backup procedure we provide with dcos-zk is designed so that ZooKeeper SHOULD be able to restore from a backup that was taken live, without stopping the ZooKeeper instance.

      We haven't tested this thoroughly yet, which is why our instructions say otherwise.

      To test this with reasonable confidence, we need an E2E test that writes to ZooKeeper constantly (for example, from a separate thread) while the backup is taken live.
      If ZooKeeper can then restore from this backup (without necessarily verifying every write), we can be reasonably confident that the backup can be taken live.
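
The test shape described above could be sketched roughly as follows. This is a minimal sketch, not the actual E2E test: `backup_under_write_load`, `write_once`, and `take_backup` are hypothetical names introduced here for illustration, and the sketch assumes that the ZooKeeper write and the backup step can each be wrapped in a plain Python callable.

```python
import threading
from typing import Callable, List


def backup_under_write_load(
    write_once: Callable[[int], None],
    take_backup: Callable[[], None],
) -> int:
    """Call take_backup() while write_once(i) runs in a loop on another thread.

    Both callables are hypothetical placeholders: write_once would perform one
    ZooKeeper write, take_backup would run the live backup procedure.
    Returns the number of writes that completed.
    """
    stop = threading.Event()
    started = threading.Event()
    errors: List[BaseException] = []
    count = 0

    def writer() -> None:
        nonlocal count
        try:
            while not stop.is_set():
                write_once(count)
                count += 1
                started.set()  # signal that at least one write has landed
        except Exception as exc:  # surface writer failures to the caller
            errors.append(exc)
            started.set()

    thread = threading.Thread(target=writer, daemon=True)
    thread.start()
    try:
        # Wait until writes are flowing before starting the backup.
        started.wait(timeout=10)
        take_backup()
    finally:
        stop.set()
        thread.join(timeout=10)

    if errors:
        raise errors[0]
    return count
```

In a real test, `write_once` might create a znode through a ZooKeeper client and `take_backup` would invoke the backup procedure on the master node. The restore step and a check that ZooKeeper comes back healthy would follow the call, and a non-zero return value confirms that writes actually overlapped with the backup.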


              People

              • Assignee: Unassigned
              • Reporter: timweidner Tim Weidner (Inactive)
              • Team: Mesosphere
              • Watchers: Tim Weidner (Inactive)
