为了更好的扩缩容,自定义一些指标,作为 HPA 的标准,在本文中采用 nsq 的队列长度作为指标
准备工作,准备一个 K8S 集群,然后安装下面两个组件
kubectl apply -f kube-prometheus-0.4.0/manifests/
将 prometheus-service-name、namespace 改成实际的名称
kubectl create namespace custom-metrics
helm install prometheus-adapter stable/prometheus-adapter -n custom-metrics --set prometheus.url=http://<prometheus-service-name>.<namespace>.svc
指标的格式是三行
第一行是 HELP 开头的基础信息
第二行是 TYPE 开头的指标名称和指标类型
第三行是具体的指标值
# HELP myapp_processed_ops_total The total number of processed events
# TYPE myapp_processed_ops_total counter
myapp_processed_ops_total 5
Counter: 累积量度,代表一个单调递增的计数器
Gague: 通常用于测量值,例如温度或当前内存使用情况,但也可以用于上升和下降的“计数”,例如正在运行的 goroutine 的数量。
Histogram: 直方图对观察值(通常是请求持续时间或响应大小之类的东西)进行采样,并将其计数在可配置的存储桶中。
Summary: 采样观察值(通常是请求持续时间和响应大小之类的东西)
应用暴露指标通常有两种方式,一种是通过自身提供 metrics url 接口,第二种是 exporter 将指标导出成 Prometheus 的指标
通常在应用层暴露 http 接口,就像多提供一个健康检查的接口一样
当使用的中间件(数据库、消息队列)没有提供接口时,可以使用这种方式,产用外接一个 exporter 的方式,获取它的指标信息
使用 prometheus 提供的 golang client,可以简单的增加指标。prometheus 同时还支持 python、java、ruby 的客户端
这样就可以拿到指标了
package main
import (
"net/http"
"time"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
"github.com/prometheus/client_golang/prometheus/promhttp"
)
func recordMetrics() {
go func() {
for {
opsProcessed.Inc()
time.Sleep(2 * time.Second)
}
}()
}
var (
opsProcessed = promauto.NewCounter(prometheus.CounterOpts{
Name: "myapp_processed_ops_total",
Help: "The total number of processed events",
})
)
func main() {
recordMetrics()
http.Handle("/metrics", promhttp.Handler())
http.ListenAndServe(":2112", nil)
}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
namespace: monitoring
name: pm
labels:
k8s-app: pm-exporter
spec:
jobLabel: app
podMetricsEndpoints:
- port: metrics
interval: 15s
selector:
matchLabels:
app: app-for-pm
namespaceSelector:
matchNames:
- default
需要在 pod 的 template 中添加
annotations:
prometheus.io/scrape: "true"
prometheus.io/path: "/metrics"
prometheus.io/port: "port" # pod实际的端口
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: sm
namespace: monitoring
labels:
k8s-app: sm-exporter
spec:
jobLabel: app
endpoints:
- port: metrics
interval: 15s
selector:
matchLabels:
app: sm
namespaceSelector:
matchNames:
- default
需要在 Service 上指定名称,端口
spec:
ports:
- port: 9117
protocol: TCP
targetPort: 9117
name: metrics
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: nsq-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: nsq-deployment # 需要扩缩容的deployment名字
minReplicas: 60
maxReplicas: 150
metrics:
- type: Object
object:
metric:
name: "metrics-name" # 需要监控的指标
selector:
matchLabels:
"topic": "topic-name"
describedObject:
apiVersion: "v1"
kind: Service
name: nsq-exporter-svc # 从哪里获取指标
target:
type: Value
value: 10
---
apiVersion: v1
kind: Pod
metadata:
name: nsq-exporter
labels:
app: nsq-exporter
spec:
containers:
- name: nsq-exporter
image: nsq-prometheus-exporter:latest
imagePullPolicy: IfNotPresent
ports:
- containerPort: 9117
name: metrics
readinessProbe:
tcpSocket:
port: 9117
initialDelaySeconds: 3
periodSeconds: 10
env:
- name: NSQD_ADDR
value: "http://nsqd-2.nsqd:4151/stats"
args: ["-nsqd.addr", "$(NSQD_ADDR)"]
---
apiVersion: v1
kind: Service
metadata:
labels:
app: nsq-exporter
name: nsq-exporter-svc
spec:
ports:
- port: 9117
protocol: TCP
targetPort: 9117
name: metrics
selector:
app: nsq-exporter
type: ClusterIP # ClusterIP or LoadBalancer
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: nsq
namespace: monitoring
labels:
k8s-app: nsq-exporter
spec:
jobLabel: app
endpoints:
- port: metrics
interval: 15s
selector:
matchLabels:
app: nsq-exporter
namespaceSelector:
matchNames:
- default
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: video-post-vid-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: app-deployment # 需要扩缩容的deployment名字
minReplicas: 60
maxReplicas: 150
metrics:
- type: Object
object:
metric:
name: "nsq_channel_depth" # 需要监控的指标
selector:
matchLabels:
"topic": "topic-name"
describedObject:
apiVersion: "v1"
kind: Service
name: nsq-exporter-svc # 从哪里获取指标
target:
type: Value
value: 10