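Normally, when Prometheus is used to monitor machines (CPU, memory, disk, and so on), a node exporter is installed on each machine and its metrics are scraped by Prometheus. In a k8s environment, we can let k8s manage the exporter and automate the monitoring.
node exporter targets host nodes and has to run on every node, so a DaemonSet controller is the most sensible choice. It uses Host Network mode, which exposes a port directly on the host.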
Deploying node exporter
Using YAML
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: node-exporter
      namespace: monitor
      labels:
        name: node-exporter
    spec:
      selector:
        matchLabels:
          name: node-exporter
      template:
        metadata:
          labels:
            name: node-exporter
        spec:
          # share the host's PID, IPC and network namespaces
          hostPID: true
          hostIPC: true
          hostNetwork: true
          containers:
          - name: node-exporter
            image: prom/node-exporter:v1.3.1
            ports:
            - containerPort: 9100
            resources:
              requests:
                cpu: 0.15
            securityContext:
              privileged: true
            args:
            - --path.procfs
            - /host/proc
            - --path.sysfs
            - /host/sys
            # deprecated name since node_exporter 1.3.0; the newer flag is
            # --collector.filesystem.mount-points-exclude
            - --collector.filesystem.ignored-mount-points
            # mount points to skip (plain regex, no inner quotes)
            - '^/(sys|proc|dev|host|etc)($|/)'
            - '--web.listen-address=:9100'
            volumeMounts:
            - name: dev
              mountPath: /host/dev
            - name: proc
              mountPath: /host/proc
            - name: sys
              mountPath: /host/sys
            - name: rootfs
              mountPath: /rootfs
          # also schedule onto master nodes
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Exists"
            effect: "NoSchedule"
          volumes:
          - name: proc
            hostPath:
              path: /proc
          - name: dev
            hostPath:
              path: /dev
          - name: sys
            hostPath:
              path: /sys
          - name: rootfs
            hostPath:
              path: /
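Because the DaemonSet uses the host network, each node should answer on port 9100 once the pod is running. The following is a minimal sketch for checking that from a machine that can reach the node. The helper names `fetch_metrics` and `parse_metric_names` are made up for this illustration, and the parser only handles the simple `name{labels} value` lines of the Prometheus text format.

```python
import urllib.request


def parse_metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like `name{label="v"} 1.0` or `name 1.0`.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def fetch_metrics(node_ip, port=9100, timeout=5):
    """Download the raw /metrics page from a node exporter (assumes plain HTTP)."""
    url = f"http://{node_ip}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `parse_metric_names(fetch_metrics("<node-ip>"))` with a real node address should return a set that includes names such as `node_cpu_seconds_total` if the exporter is up.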
Configuring Prometheus
For details on how Prometheus handles labels, see this article.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: prometheus-config
      namespace: monitor
    data:
      web.yml: |
        basic_auth_users:
          admin: $2y$05$UKSS18ztdsUNoEuXYScr2OE1TCMe1hWnmD6JuwUi/uPTJayHIakae
      prometheus.yml: |
        global:
          scrape_interval: 15s
          evaluation_interval: 15s
          # global labels
          external_labels:
            env: prod
            dept: ops
            project: datong

        scrape_configs:
        - job_name: 'kubernetes-node'
          kubernetes_sd_configs:
          - role: node
          relabel_configs:
          - source_labels: [__address__]
            regex: '(.*):10250'
            replacement: '${1}:9100'
            target_label: __address__
            action: replace
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
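This relies on the metadata Prometheus gets from service discovery. The node role returns each kubelet's address (port 10250), and the relabel rule rewrites the port to 9100 so that the target becomes the node exporter on the same host. Nodes are therefore picked up for monitoring automatically. The labelmap rule additionally copies the Kubernetes node labels onto each target.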
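To make the effect of the two relabel rules concrete, here is a small Python sketch that imitates them on a plain dictionary of labels. It is an illustration only, not Prometheus code: the function names are invented, Python's `re` syntax differs from the RE2 syntax Prometheus uses, and only the `${N}` replacement form is handled. It does follow the Prometheus rule that a relabel regex must match the whole value.

```python
import re


def relabel_replace(labels, source_labels, regex, replacement, target_label,
                    separator=";"):
    """Imitate `action: replace`: rewrite target_label if the regex matches."""
    # Prometheus joins the source label values with the separator first.
    value = separator.join(labels.get(name, "") for name in source_labels)
    # Relabel regexes are anchored, so the whole value has to match.
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)
    # Turn `${1}` into Python's `\g<1>` group reference.
    template = re.sub(r"\$\{(\w+)\}", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


def relabel_labelmap(labels, regex):
    """Imitate `action: labelmap` with the default replacement (first group)."""
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match is not None:
            result[match.group(1)] = value
    return result


def apply_node_rules(labels):
    """Apply the two rules from the scrape config in order."""
    labels = relabel_replace(labels, ["__address__"], r"(.*):10250",
                             "${1}:9100", "__address__")
    return relabel_labelmap(labels, r"__meta_kubernetes_node_label_(.+)")
```

Given a discovered target such as `{"__address__": "192.168.1.10:10250"}` (an example address), `apply_node_rules` returns the same labels with `__address__` set to `192.168.1.10:9100`. A target whose address does not end in `:10250` is left unchanged.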