  1. DC/OS
  2. DCOS_OSS-1391

dcos-metrics segfaults when statistics is missing from the Mesos agent /containers endpoint

    Details

      Description

      dcos-metrics polls the Mesos agent /containers endpoint and expects each container object to have a statistics object as a child. If that object is missing, the whole process can segfault and crash.

      I triaged with Gastón Kleiman and we determined that the /containers endpoint ought to serve statistics, but may warn and not serve them in some cases. We further determined that all the affected containers belonged to Metronome, which should probably also be fixed.

      H/T Fabian Baier who noticed this issue and determined the cause.

      dcos-metrics should handle missing statistics gracefully.
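
One way to handle missing statistics gracefully is sketched below. This is an illustration only, not the actual dcos-metrics code: the type names, the function `containersWithStatistics`, and the choice of JSON fields are all assumptions. The idea is to decode `statistics` into a pointer, so that an absent or null object becomes `nil`, and to skip those containers instead of dereferencing them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resourceStatistics is a hypothetical, trimmed-down view of the statistics
// object. The fields chosen here are assumptions made for illustration.
type resourceStatistics struct {
	CPUsLimit   float64 `json:"cpus_limit"`
	MemRSSBytes uint64  `json:"mem_rss_bytes"`
}

// agentContainer is a hypothetical model of one entry in the /containers
// response. Statistics is a pointer so that a missing or null object
// decodes to nil rather than a zero-valued struct.
type agentContainer struct {
	ContainerID string              `json:"container_id"`
	Statistics  *resourceStatistics `json:"statistics"`
}

// containersWithStatistics decodes the payload and returns only the
// containers that carry statistics, plus the IDs of the ones it skipped.
func containersWithStatistics(payload []byte) (kept []agentContainer, skipped []string, err error) {
	var all []agentContainer
	err = json.Unmarshal(payload, &all)
	if err != nil {
		return nil, nil, fmt.Errorf("decoding containers response: %w", err)
	}
	for _, c := range all {
		if c.Statistics == nil {
			// Missing statistics: record the ID and move on instead of
			// dereferencing a nil pointer.
			skipped = append(skipped, c.ContainerID)
			continue
		}
		kept = append(kept, c)
	}
	return kept, skipped, nil
}

func main() {
	// Sample payload invented for this example: the second container
	// has no statistics object.
	payload := []byte(`[
		{"container_id": "a", "statistics": {"cpus_limit": 1.1, "mem_rss_bytes": 4096}},
		{"container_id": "b"}
	]`)
	kept, skipped, err := containersWithStatistics(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(kept), "with statistics; skipped:", skipped)
}
```

Returning the skipped IDs lets the caller log a warning per container, which would also help spot patterns such as all affected containers belonging to one framework.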

            People

            • Assignee:
              philip Philip Norman (Inactive)
              Reporter:
              philip Philip Norman (Inactive)
              Team:
              DELETE Cluster Ops Team
              Watchers:
              Cathy Daw, Jan-Philip Gehrcke (Inactive), Kevin Klues (Inactive), Marco Monaco, Philip Norman (Inactive)

