外部 prometheus监控k8s集群资源

news/2024/6/23 18:14:55

prometheus监控k8s集群资源

  • 一,通过CADvisior 监控pod的资源状态
    • 1.1 授权外边用户可以访问prometheus接口。
    • 1.2 获取token保存
    • 1.3 配置prometheus.yml 启动并查看状态
    • 1.4 Grafana 导入仪表盘
  • 二,通过kube-state-metrics 监控k8s资源状态
    • 2.1 部署 kube-state-metrics
    • 2.2 配置prometheus.yml
    • 2.3 Grafana 导入仪表盘
    • 2.4 Grafana没有数据,添加路由转发

二进制安装的prometheus,监控k8s集群信息。

监控指标实现方式举例
Pod资源利用率cAdvisor容器CPU、内存利用率
K8s资源状态kube-state-metricscontroller控制器、Node、Namespace、Pod、ReplicaSet、service等

一,通过CADvisior 监控pod的资源状态

1.1 授权外边用户可以访问prometheus接口。

apiVersion: v1
kind: ServiceAccount
metadata:name: prometheusnamespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:name: prometheus
rules:
- apiGroups:- ""resources:- nodes- services- endpoints- pods- nodes/proxyverbs:- get- list- watch
- apiGroups:- "extensions"resources:- ingressesverbs:- get- list- watch
- apiGroups:- ""resources:- configmaps- nodes/metricsverbs:- get
- nonResourceURLs:- /metricsverbs:- get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:name: prometheus
roleRef:apiGroup: rbac.authorization.k8s.iokind: ClusterRolename: prometheus
subjects:
- kind: ServiceAccountname: prometheusnamespace: kube-system
kubectl apply -f rbac.yaml

1.2 获取token保存

kubectl get secrets -n kube-system |grep prometheus #查看toekn name
name:prometheus-token-vgxhckubectl describe secret prometheus-token-vgxhc -n kube-system > token.k8s
#kubectl get secrets -n kube-system -o yaml prometheus-token-vgxhc |grep token
scp token.k8s prometheus #拷贝到prometheus服务器prometheus的目录下

我的token放在 /opt/monitor/prometheus/token.k8s

1.3 配置prometheus.yml 启动并查看状态

vim prometheus.yml

  - job_name: kubernetes-nodes-cadvisormetrics_path: /metricsscheme: httpskubernetes_sd_configs:- role: nodeapi_server: https://172.18.0.0:6443bearer_token_file: /opt/monitor/prometheus/token.k8s tls_config:insecure_skip_verify: truebearer_token_file: /opt/monitor/prometheus/token.k8s tls_config:insecure_skip_verify: truerelabel_configs:# 将标签(.*)作为新标签名,原有值不变- action: labelmapregex: __meta_kubernetes_node_label_(.*)# 修改NodeIP:10250为APIServerIP:6443- action: replaceregex: (.*)source_labels: ["__address__"]target_label: __address__replacement: 172.18.0.0:6443# 实际访问指标接口 https://NodeIP:10250/metrics/cadvisor 这个接口只能APISERVER访问,故此重新标记标签使用APISERVER代理访问- action: replacesource_labels: [__meta_kubernetes_node_name]target_label: __metrics_path__regex: (.*)replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor 
./promtool check config prometheus.yml 
重启prometheus 或 kill -HUP PrometheusPid

在prometheus的target页面查看
http://172.18.0.0:9090
在这里插入图片描述

1.4 Grafana 导入仪表盘

导入3119 仪表盘
在这里插入图片描述
在这里插入图片描述完成pod资源监控

二,通过kube-state-metrics 监控k8s资源状态

2.1 部署 kube-state-metrics

apiVersion: v1
kind: ServiceAccount
metadata:name: kube-state-metricsnamespace: kube-systemlabels:kubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:name: kube-state-metricslabels:kubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups: [""]resources:- configmaps- secrets- nodes- pods- services- resourcequotas- replicationcontrollers- limitranges- persistentvolumeclaims- persistentvolumes- namespaces- endpointsverbs: ["list", "watch"]
- apiGroups: ["apps"]resources:- statefulsets- daemonsets- deployments- replicasetsverbs: ["list", "watch"]
- apiGroups: ["batch"]resources:- cronjobs- jobsverbs: ["list", "watch"]
- apiGroups: ["autoscaling"]resources:- horizontalpodautoscalersverbs: ["list", "watch"]
- apiGroups: ["networking.k8s.io", "extensions"]resources:- ingresses verbs: ["list", "watch"]
- apiGroups: ["storage.k8s.io"]resources:- storageclasses verbs: ["list", "watch"]
- apiGroups: ["certificates.k8s.io"]resources:- certificatesigningrequestsverbs: ["list", "watch"]
- apiGroups: ["policy"]resources:- poddisruptionbudgets verbs: ["list", "watch"]---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:name: kube-state-metrics-resizernamespace: kube-systemlabels:kubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups: [""]resources:- podsverbs: ["get"]
- apiGroups: ["extensions","apps"]resources:- deploymentsresourceNames: ["kube-state-metrics"]verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1 
kind: ClusterRoleBinding
metadata:name: kube-state-metricslabels:kubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcile
roleRef:apiGroup: rbac.authorization.k8s.iokind: ClusterRolename: kube-state-metrics
subjects:
- kind: ServiceAccountname: kube-state-metricsnamespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:name: kube-state-metricsnamespace: kube-systemlabels:kubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcile
roleRef:apiGroup: rbac.authorization.k8s.iokind: Rolename: kube-state-metrics-resizer
subjects:
- kind: ServiceAccountname: kube-state-metricsnamespace: kube-system---apiVersion: apps/v1
kind: Deployment
metadata:name: kube-state-metricsnamespace: kube-systemlabels:k8s-app: kube-state-metricskubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcileversion: v1.3.0
spec:selector:matchLabels:k8s-app: kube-state-metricsversion: v1.3.0replicas: 1template:metadata:labels:k8s-app: kube-state-metricsversion: v1.3.0annotations:scheduler.alpha.kubernetes.io/critical-pod: ''spec:priorityClassName: system-cluster-criticalserviceAccountName: kube-state-metricscontainers:- name: kube-state-metricsimage: harbor.cpit.com.cn/monitor/kube-state-metrics:v1.8.0ports:- name: http-metricscontainerPort: 8080- name: telemetrycontainerPort: 8081readinessProbe:httpGet:path: /healthzport: 8080initialDelaySeconds: 5timeoutSeconds: 5- name: addon-resizerimage: harbor.cpit.com.cn/monitor/addon-resizer:1.8.6resources:limits:cpu: 1000mmemory: 500Mirequests:cpu: 1000mmemory: 500Mienv:- name: MY_POD_NAMEvalueFrom:fieldRef:fieldPath: metadata.name- name: MY_POD_NAMESPACEvalueFrom:fieldRef:fieldPath: metadata.namespacevolumeMounts:- name: config-volumemountPath: /etc/configcommand:- /pod_nanny- --config-dir=/etc/config- --container=kube-state-metrics- --cpu=100m- --extra-cpu=1m- --memory=100Mi- --extra-memory=2Mi- --threshold=5- --deployment=kube-state-metricsvolumes:- name: config-volumeconfigMap:name: kube-state-metrics-config
---
# Config map for resource configuration.
apiVersion: v1
kind: ConfigMap
metadata:name: kube-state-metrics-confignamespace: kube-systemlabels:k8s-app: kube-state-metricskubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcile
data:NannyConfiguration: |-apiVersion: nannyconfig/v1alpha1kind: NannyConfiguration---apiVersion: v1
kind: Service
metadata:name: kube-state-metricsnamespace: kube-systemlabels:kubernetes.io/cluster-service: "true"addonmanager.kubernetes.io/mode: Reconcilekubernetes.io/name: "kube-state-metrics"annotations:prometheus.io/scrape: 'true'
spec:ports:- name: http-metricsport: 8080targetPort: http-metricsprotocol: TCP- name: telemetryport: 8081targetPort: telemetryprotocol: TCPselector:k8s-app: kube-state-metrics

部署

kubectl apply -f kube-state-metrics.yaml
kubectl get pods -n kube-system

在这里插入图片描述
pod的正常运行

2.2 配置prometheus.yml

- job_name: kubernetes-service-endpointskubernetes_sd_configs:- role: endpointsapi_server: https://192.168.0.0:6443bearer_token_file: /opt/monitor/prometheus/token.k8stls_config:insecure_skip_verify: truebearer_token_file: /opt/monitor/prometheus/token.k8stls_config:insecure_skip_verify: trueService没配置注解prometheus.io/scrape的不采集relabel_configs:- action: keepregex: truesource_labels:- __meta_kubernetes_service_annotation_prometheus_io_scrape重命名采集目标协议- action: replaceregex: (https?)source_labels:- __meta_kubernetes_service_annotation_prometheus_io_schemetarget_label: __scheme__重命名采集目标指标URL路径- action: replaceregex: (.+)source_labels:- __meta_kubernetes_service_annotation_prometheus_io_pathtarget_label: __metrics_path__重命名采集目标地址- action: replaceregex: ([^:]+)(?::\d+)?;(\d+)replacement: $1:$2source_labels:- __address__- __meta_kubernetes_service_annotation_prometheus_io_porttarget_label: __address__将K8s标签(.*)作为新标签名,原有值不变- action: labelmapregex: __meta_kubernetes_service_label_(.+)生成命名空间标签- action: replacesource_labels:- __meta_kubernetes_namespacetarget_label: kubernetes_namespace生成Service名称标签- action: replacesource_labels:- __meta_kubernetes_service_nametarget_label: kubernetes_service_name
./promtool check config prometheus.yml 
重启prometheus 或 kill -HUP PrometheusPid

在prometheus的target页面查看
http://172.18.0.0:9090
在这里插入图片描述

2.3 Grafana 导入仪表盘

Grafana导入k8s集群资源对象监控仪表盘 6417

在这里插入图片描述
完成k8s集群资源对象监控仪表盘监控

2.4 Grafana没有数据,添加路由转发

ip route
ip route add 172.40.0.0/16 via 172.18.2.30 dev eth0
ip route

#172.40.1.208:kube-state-metrics pod 集群内部ip
#172.18.2.30:k8s master 节点ip

然后在查看Grafana仪表盘。


https://www.xjx100.cn/news/3092702.html

相关文章

特殊token的特殊用途

特殊token的特殊用途 特殊voc设计传统的特殊token 用途特殊用途例子特殊voc设计 普通token1 。。。。普通token1000,特殊token1,,,,,特殊token100 ,特殊指示token1,,,特殊指示token100 传统的特殊token 用途 在您提供的示例中,有1000个普通 token(从普通 token …

【代码随想录】刷题笔记Day33

前言 Day33虽说是一个月,但是从第一篇开始实际上已经过了8个月了,得抓紧啊 46. 全排列 - 力扣(LeetCode) 前面组合就强调过差别了,这道题是排序,因此每次要从头到尾扫,结合used数组 class So…

在工作中被架空时怎么办?

在工作中被架空时,可以采取以下措施来处理: 保持冷静:不要因为被架空而情绪激动或做出过激的回应,这可能会进一步恶化情况。确认事实:首先要确认自己是否真的被架空了。有时候,这可能只是暂时的困难或误解…

C语言进阶之冒泡排序

✨ 猪巴戒:个人主页✨ 所属专栏:《C语言进阶》 🎈跟着猪巴戒,一起学习C语言🎈 目录 前情回顾 1、回调函数 2、冒泡排序 3、库函数qsort cmp(sqort中的比较函数,需要我们自定义) …

matlab设置背景颜色

matlab默认的背景颜色是纯白RGB(255,255,255),纯白太刺眼,看久了,眼睛会酸胀、疼痛,将其改成豆沙绿RGB(205,123,90),或者给出浅绿色RGB(128,255,255), 颜色就会柔和很多,眼睛感觉更舒适。     下面介绍在…

pygame播放视频并实现音视频同步

一、前言 在我接触pygame时最新的pygame已经不支持movie模块,这就导致在pygame播放视频变成一个问题,网上搜了下解决方案有两个: 一是使用opencv播放视频,再结合pygame.mixer来播放音频 二是使用moviepy播放视频,再…

正则笔记(持续更新)

1. java 正则替换 指定字符及其之前的字符 System.out.println("em_4b6add2cfb415db2".replaceFirst("\\.*._",""));//结果 -> e4b6add2cfb415db22. java 正则替换 指定字符及其之后的字符 String name "name.keyword^1.0" ; St…

sqlmap的使用笔记及示例

sqlmap的使用笔记 文章目录 sqlmap的使用笔记1. 目标2. 脱库2.1. 脱库(补充) 3. 其他3.1. 其他(补充) 1. 目标 操作作用必要示例-u指定URL,检测注入点sqlmap -u http://example.com/?id1-m指定txt,里面有…