Affects Version/s: None
Fix Version/s: DC/OS 1.13.0
Sprint: Observability Team Sprint 32, Observability Team Sprint 33, Observability Team Sprint 34, Observability Team Sprint 35, Observability Team Sprint 36, Observability Team Sprint 37
Parent Initiative: D2IQ-44281 - [DC/OS] Instrument and Transmit Metrics for Critical Components (via dcos-telegraf) to enable operator visibility to customer workload from a single service
We should adapt Telegraf's prometheus input plugin to discover Mesos tasks.
The plugin already has built-in Kubernetes service discovery. We should add a mechanism which works as follows:
If the task has the label DCOS_METRICS_FORMAT=prometheus, then the Prometheus plugin should add a target for every portConfig entry that carries the DCOS_METRICS_PORT=true label.
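A rough sketch of the discovery rule, for illustration only. It is written in Python for brevity, whereas the plugin itself would be Go. The task shape (`labels`, `host`, `port_configs`, `number`) and the `/metrics` path are assumptions made for this example, not the actual Mesos or Telegraf data model:

```python
from typing import Dict, List

METRICS_FORMAT_LABEL = "DCOS_METRICS_FORMAT"
METRICS_PORT_LABEL = "DCOS_METRICS_PORT"


def discover_targets(task: Dict) -> List[str]:
    """Return the scrape URLs for one task.

    The dict layout used here is hypothetical; only the two label
    names come from the ticket.
    """
    labels = task.get("labels", {})
    # Tasks that do not opt in via the format label are skipped entirely.
    if labels.get(METRICS_FORMAT_LABEL) != "prometheus":
        return []

    targets = []
    for port in task.get("port_configs", []):
        # Only ports explicitly marked as metrics ports become targets.
        if port.get("labels", {}).get(METRICS_PORT_LABEL) == "true":
            # The /metrics path is the usual Prometheus convention, assumed here.
            targets.append(f"http://{task['host']}:{port['number']}/metrics")
    return targets
```

A task with two ports, of which only one is labeled DCOS_METRICS_PORT=true, would yield a single target under this sketch.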
This target should ensure that all metrics are tagged with:
- service_name (i.e. the framework name)
Metrics must also be tagged with any labels on the task which are prefixed DCOS_METRICS_, with the prefix stripped from the tag name. For example, any metric from a task labeled DCOS_METRICS_FOO=bar must be tagged FOO=bar.
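The tagging rule could look roughly like the following. This is a minimal Python sketch, not the plugin code; the function name `tags_for_task` and its arguments are hypothetical. It applies the rule literally, so a DCOS_METRICS_FORMAT label on the task would also produce a FORMAT tag. Whether such control labels should be excluded is not specified in this ticket.

```python
from typing import Dict

LABEL_PREFIX = "DCOS_METRICS_"


def tags_for_task(service_name: str, task_labels: Dict[str, str]) -> Dict[str, str]:
    """Build the tag set applied to every metric scraped from a task."""
    # service_name (the framework name) is always present.
    tags = {"service_name": service_name}
    for key, value in task_labels.items():
        # Keep only prefixed labels, and ignore a label that is the bare prefix.
        if key.startswith(LABEL_PREFIX) and len(key) > len(LABEL_PREFIX):
            tags[key[len(LABEL_PREFIX):]] = value
    return tags
```

For example, `tags_for_task("marathon", {"DCOS_METRICS_FOO": "bar", "OTHER": "x"})` keeps `service_name` and `FOO` and drops `OTHER`.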
This code should eventually be contributed back up to the Telegraf repository in the influxdata organisation. However, this ticket may be considered done if the code is only in the dcos organisation, as long as another ticket tracks progress on contributing that code upstream.