  1. DC/OS
  2. DCOS_OSS-4351

IPVS real server information is not updated

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: High
    • Resolution: Won't Do
    • Affects Version/s: DC/OS 1.11.2
    • Fix Version/s: None
    • Component/s: navstar
    • Labels:

      Description

      We found that an API service sometimes returns timeouts after we updated it yesterday.

      Everything about this API service looks OK on the Marathon page. We then logged in to the server on which the service instance is running and checked its IPVS table.

      TCP  11.137.55.233:8080 wlc
        -> 9.0.7.9:8080                 Masq    1      0          0
        -> 9.0.9.6:8080                 Masq    1      0          0

      9.0.9.6 is the address of the container that is currently running, but 9.0.7.9 is the address of a container that was killed when Marathon updated the API service.

      So whenever the virtual server forwards a connection to 9.0.7.9, it times out.

       

      We tried deleting the entry with ipvsadm, but after a while the dead server came back. Maybe navstar keeps persistent state for the IPVS table and the nodes in the cluster keep syncing it from each other.
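
      For anyone trying to reproduce the check, here is a rough sketch in Python of what we did by hand: parse the IPVS table text, flag real servers that are not in a known set of live container addresses, and build the matching ipvsadm delete command. The function names and the `live` set are hypothetical (in practice the live addresses would have to come from Marathon or Mesos), and the sample text is just the table pasted above. As noted, deleting the entry by hand only helps temporarily, because the dead server reappears.

```python
import re

# Matches a virtual-service line such as "TCP  11.137.55.233:8080 wlc".
VIRTUAL_RE = re.compile(r"^(TCP|UDP)\s+(\S+:\d+)\s+(\S+)")
# Matches a real-server line such as "-> 9.0.7.9:8080  Masq  1  0  0".
REAL_RE = re.compile(r"^->\s+(\S+:\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")

# The table from this report (hypothetical variable name).
SAMPLE = """\
TCP  11.137.55.233:8080 wlc
  -> 9.0.7.9:8080                 Masq    1      0          0
  -> 9.0.9.6:8080                 Masq    1      0          0
"""


def parse_ipvs_table(text):
    """Return {(proto, vip): [real_server, ...]} from ipvsadm-style text."""
    table = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = VIRTUAL_RE.match(line)
        if match:
            current = (match.group(1), match.group(2))
            table[current] = []
            continue
        match = REAL_RE.match(line)
        if match and current is not None:
            table[current].append(match.group(1))
    return table


def stale_backends(table, live):
    """List (proto, vip, real) triples whose real server is not in `live`."""
    return [
        (proto, vip, real)
        for (proto, vip), reals in table.items()
        for real in reals
        if real not in live
    ]


def delete_command(proto, vip, real):
    """Build the ipvsadm argument list that removes one real server."""
    flag = "-t" if proto == "TCP" else "-u"
    return ["ipvsadm", "-d", flag, vip, "-r", real]


if __name__ == "__main__":
    # Assumption: 9.0.9.6:8080 is the only container that is still running.
    live = {"9.0.9.6:8080"}
    for entry in stale_backends(parse_ipvs_table(SAMPLE), live):
        print(" ".join(delete_command(*entry)))
```

      The script only prints the command and does not run it, so it is safe to try on any host.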

       

      So why is the killed real server's information not cleaned up across the cluster?

        Attachments

        1. container can't ping.png (206 kB)
        2. container was exited.png (228 kB)
        3. dcos-mesos-slave log.png (974 kB)
        4. host information.png (209 kB)
        5. ipvs table.png (330 kB)
        6. marathon status page1.png (485 kB)
        7. marathon status page2.png (140 kB)
        8. mesos status page.png (414 kB)
        9. pidof process.png (192 kB)
        10. vips.txt (23 kB)


            People

            • Assignee:
              dgoel Deepak Goel
              Reporter:
              sisphyus Sisphyus
              Team:
              DELETE Networking Team
              Watchers:
              Deepak Goel, Sisphyus

              Dates

              • Created:
                Updated:
                Resolved:
