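Prometheus with Alertmanager supports several notification channels, so alerts of different severity can be sent through different channels, which makes alerting more effective.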
Key Alertmanager configuration
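global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: 'zhouzhifei@163.com'
  smtp_auth_username: 'zhouzhifei@163.com'
  smtp_auth_password: 'password'
  smtp_require_tls: false
route:
  group_by: ['alertname', 'cluster']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 5m
  # Fallback receiver for alerts that match none of the child routes
  receiver: default
  routes:
    # severity=warning goes to email
    - receiver: warning
      match:
        severity: warning
    # severity=critical goes to the DingTalk webhook
    - receiver: critical
      match:
        severity: critical
receivers:
  # The top-level route refers to `default`, so a receiver with that name
  # has to be defined. The original excerpt omits it; this empty entry is
  # an assumption added so that the configuration loads.
  - name: 'default'
  - name: 'warning'
    email_configs:
      - to: 'zhouzhifei@yunwei.com'
        send_resolved: true
  - name: 'critical'
    webhook_configs:
      - url: 'http://abcdocker-dingding-hook:8060'
        send_resolved: true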
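The routing tree above can be read as "first matching child route wins, otherwise use the top-level receiver". The following is a minimal Python sketch of that lookup, not Alertmanager's actual implementation. The names `ROUTES`, `DEFAULT_RECEIVER` and `pick_receiver` are hypothetical, and the sketch covers only exact `match` entries (no `match_re`, `continue`, or nested routes).

```python
# Simplified model of the route tree in the configuration above.
# Child routes are checked in order; the first whose `match` labels all
# equal the alert's labels decides the receiver.
ROUTES = [
    {"receiver": "warning", "match": {"severity": "warning"}},
    {"receiver": "critical", "match": {"severity": "critical"}},
]
DEFAULT_RECEIVER = "default"  # the top-level `receiver` in the config


def pick_receiver(labels):
    """Return the receiver name for an alert with the given label set."""
    for route in ROUTES:
        # A missing label is compared as an empty string.
        if all(labels.get(k, "") == v for k, v in route["match"].items()):
            return route["receiver"]
    return DEFAULT_RECEIVER


if __name__ == "__main__":
    print(pick_receiver({"alertname": "mem", "severity": "warning"}))
    print(pick_receiver({"alertname": "mem", "severity": "critical"}))
    print(pick_receiver({"alertname": "mem"}))
```

Under these assumptions, an alert labelled `severity: warning` is delivered by email, one labelled `severity: critical` goes to the webhook, and anything else falls back to `default`.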
Key alerting rules configuration
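# Note: node_exporter 0.16 and later expose these metrics as
# node_memory_MemAvailable_bytes and node_memory_MemTotal_bytes.
groups:
  - name: test-rule
    rules:
      # "内存报警" means "memory alert"; fires when memory usage exceeds 10%
      - alert: "内存报警"
        expr: 100 - ((node_memory_MemAvailable * 100) / node_memory_MemTotal) > 10
        for: 1s
        labels:
          severity: warning
        annotations:
          summary: "Service name: {{$labels.alertname}}"
          description: "Service 500 alert: {{ $value }}"
          value: "{{ $value }}"
  - name: test-rule2
    rules:
      # Same expression with a 40% threshold, routed as critical
      - alert: "内存报警"
        expr: 100 - ((node_memory_MemAvailable * 100) / node_memory_MemTotal) > 40
        for: 1s
        labels:
          severity: critical
        annotations:
          summary: "Service name: {{$labels.alertname}}"
          description: "Service 500 alert: {{ $value }}"
          value: "{{ $value }}"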
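To see what the two rule expressions compute, here is a small Python sketch of the same arithmetic. The function names and the sample byte values are hypothetical, and the `for: 1s` hold time is ignored. It only illustrates which severities would be active for a given memory reading.

```python
def memory_used_percent(mem_available, mem_total):
    """Mirror of the rule expression: 100 - (available * 100 / total)."""
    return 100 - (mem_available * 100) / mem_total


def active_severities(used_percent):
    """Severities whose thresholds (10% and 40%, as in the rules) are exceeded."""
    fired = []
    if used_percent > 10:
        fired.append("warning")
    if used_percent > 40:
        fired.append("critical")
    return fired


if __name__ == "__main__":
    # Hypothetical host with 8 GiB total memory.
    total = 8 * 1024**3
    for available_gib in (7.5, 6, 4):
        used = memory_used_percent(available_gib * 1024**3, total)
        print(available_gib, used, active_severities(used))
```

Because the two rules are evaluated independently, a reading above 40% satisfies both expressions. In this sketch, 50% usage yields both `warning` and `critical`, which with the routing shown earlier would mean both an email and a webhook notification.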