dcos-metrics polls the Mesos agent's /containers endpoint and expects each container object to have a statistics object as a child. If that object is missing, the whole process can segfault and crash.
I triaged with Gastón Kleiman and we determined that the /containers endpoint ought to serve statistics, but in some cases it may warn and omit them. We further determined that all of the affected containers belonged to Metronome, which should probably be fixed as well.
H/T Fabian Baier, who noticed this issue and determined the cause.
dcos-metrics should handle missing statistics gracefully.
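One possible shape for the fix, as a minimal sketch and not the actual dcos-metrics code: decode statistics into a pointer field so that a missing object becomes nil, then check for nil and skip the container instead of dereferencing it. The type names, the helper splitByStatistics, and the trimmed-down field list are all hypothetical and chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resourceStatistics is a hypothetical, trimmed-down stand-in for the
// statistics object; the real payload carries many more fields.
type resourceStatistics struct {
	CPUsUserTimeSecs float64 `json:"cpus_user_time_secs"`
	MemRSSBytes      uint64  `json:"mem_rss_bytes"`
}

// agentContainer is a hypothetical stand-in for one entry in the
// /containers response. Statistics is a pointer so that a missing or
// null object decodes to nil instead of a zero-valued struct.
type agentContainer struct {
	ContainerID string              `json:"container_id"`
	Statistics  *resourceStatistics `json:"statistics,omitempty"`
}

// splitByStatistics decodes a /containers response body and separates
// containers that carry statistics from those that do not. It returns
// the usable containers and the IDs of the skipped ones, so the caller
// can log the skipped IDs rather than crash on a nil dereference.
func splitByStatistics(body []byte) ([]agentContainer, []string, error) {
	var containers []agentContainer
	if err := json.Unmarshal(body, &containers); err != nil {
		return nil, nil, fmt.Errorf("decoding containers response: %w", err)
	}

	var usable []agentContainer
	var skipped []string
	for _, c := range containers {
		if c.Statistics == nil {
			// Missing statistics: record the container and move on.
			skipped = append(skipped, c.ContainerID)
			continue
		}
		usable = append(usable, c)
	}
	return usable, skipped, nil
}

func main() {
	// Made-up sample payload: the second container has no statistics.
	body := []byte(`[
		{"container_id": "a", "statistics": {"cpus_user_time_secs": 1.5, "mem_rss_bytes": 1024}},
		{"container_id": "b"}
	]`)

	usable, skipped, err := splitByStatistics(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range usable {
		fmt.Printf("container %s: rss=%d bytes\n", c.ContainerID, c.Statistics.MemRSSBytes)
	}
	for _, id := range skipped {
		fmt.Printf("warning: container %s has no statistics, skipping\n", id)
	}
}
```

Returning the skipped IDs instead of dropping them silently keeps the problem visible in the logs, which would help confirm whether only Metronome containers are affected.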