Another question: I'm getting the alert below. Can anyone suggest where to start untangling this?
[FIRING:1] InstanceDown (job "prod-federate")
@channel Alert details:
Alert: Instance (prometheus.prod.internal.мими:80) of job prod-federate has been down for more than 3 minutes. - critical
Description: Instance is down for more than 1 minute
Details:
• alertname: InstanceDown
• env: prod
• instance: prometheus.prod.internal.мими:80
• job: prod-federate
• k8s_cluster: prod
• severity: critical