LOKI使用测试
环境准备
本地为mac,采用kind搭建k8s集群,配置文件为:
# kind-cfg.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraPortMappings:
- containerPort: 5601
hostPort: 5601
listenAddress: "0.0.0.0"
protocol: tcp
- role: worker
- role: worker
启动集群:
# 创建k8s集群
kind create cluster --name=localk8s --config=kind-cfg.yaml
# control-plane污点处理
kubectl taint nodes --all node-role.kubernetes.io/control-plane:NoSchedule-
promtail+loki+grafana配置
loki官方提供了很多种部署方案。我这边选择simple-scalable
方案,适用于日志体量较大,但不会达到巨大程度的k8s集群进行数据收集。通过helm进行快速配置:
- 添加grafana helm repo:
helm repo add grafana https://grafana.github.io/helm-charts helm repo update
- 下载loki和promtail的helm安装包:
helm pull grafana/loki-simple-scalable --untar helm pull grafana/promtail --untar
- 对loki helm包的
values.yaml
进行修改:
loki:
# ...
auth_enabled: false # 避免promtail发送到writer时出现401错误
monitoring:
selfMonitoring:
enabled: false # 不监控和采集loki组件自身产生的日志
grafanaAgent:
installOperator: false # 不监控和采集loki组件自身产生的日志
write:
replicas: 3 # 需配置
persistence:
size: 10Gi # pvc 需配置
storageClass: null # 看是否需要修改
read:
replicas: 3 # 需配置
persistence:
size: 10Gi # pvc 需配置
storageClass: null # 看是否需要修改
gateway:
replicas: 1 # nginx 需配置
minio:
enabled: true # 开启s3存储特性
accessKey: enterprise-logs # 登录accessKey
secretKey: supersecret # 登录secretKey
buckets:
- name: chunks
policy: readwrite # 修改bucket policy,否则writer无法将收集的数据写入
purge: false
- name: ruler
policy: readwrite # 是否需要修改待定
purge: false
- name: admin
policy: readwrite # 是否需要修改待定
purge: false
persistence:
size: 5Gi # 需配置
resources:
requests:
cpu: 100m # 需配置
memory: 128Mi # 需配置
- 对promtail helm包的
values.yaml
进行修改,可参考文档
# -- Resource requests and limits
resources: {}
# limits:
# cpu: 200m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
# -- Affinity configuration for pods
affinity: {}
# -- Tolerations for pods. By default, pods will be scheduled on master/control-plane nodes.
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule
config:
snippets:
scrapeConfigs: |
- job_name: kubernetes-pods
relabel_configs:
- source_labels:
- __meta_kubernetes_namespace # 对不需要采集日志的namespace进行丢弃
regex: ^(loki|kube-system)$ # 正则匹配
action: drop
- 新建k8s namespace,安装helm包:
kubectl create ns loki # 本地helm包 helm upgrade --install loki ./loki-simple-scalable -n loki # 本地helm包 helm upgrade --install promtail ./promtail -n loki # grafana直接用官方的 不作修改 helm upgrade --install grafana grafana/grafana -n loki
部署完成后,应该可以看到:
$ kubectl get pod -n loki
NAME READY STATUS RESTARTS AGE
grafana-5d69687778-2m6mc 1/1 Running 0 3h4m
loki-gateway-55fccf8654-8zlvq 1/1 Running 0 3h23m
loki-grafana-agent-operator-684b478b77-qsxhk 1/1 Running 0 3h23m
loki-logs-6znhx 2/2 Running 0 178m
loki-logs-dvzgg 2/2 Running 0 3h22m
loki-logs-q6t4p 2/2 Running 0 3h22m
loki-minio-858c5777bb-q5whm 1/1 Running 0 3h23m
loki-read-0 1/1 Running 0 158m
loki-read-1 1/1 Running 0 158m
loki-read-2 1/1 Running 0 159m
loki-write-0 1/1 Running 0 157m
loki-write-1 1/1 Running 0 158m
loki-write-2 1/1 Running 0 159m
promtail-dslwb 1/1 Running 0 3h6m
promtail-g7g2n 1/1 Running 0 3h6m
promtail-vt27w 1/1 Running 0 3h6m
- 将grafana forward到3001端口,登录 http://localhost:3001
kubectl port-forward -n loki grafana-5d69687778-2m6mc 3001:3000 # 获取密码 账号为admin kubectl get secret -n loki grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
- 添加loki类型的datasource,指定url为 http://loki-gateway (即k8s svc loki-gateway)
- 在
Dashboard
或者Explore
界面,在log browser
选择对应的筛选条件:
- 查询样例:
我在本地创建了一个测试镜像
version
,具体功能为每10秒输出一次信息。在namespace version
部署了两个pod,一个叫version
,一个叫version2
,通过指定container筛选条件,可以查找到两个pod的日志,选择unique labels
选项,可以看到每条日志所在的node、namespace、podName等信息
按照查询语法,可以搜索日志(json格式)里的字段信息:
部署分析
loki-logs
和promtail
作为DaemonSet
会部署到所有需要收集日志的node上(包括backend和pdb的node),其他的组件可以固定在特定的node上,避免对计算节点产生过多性能影响。
# DaemonSet
$ kubectl get ds -n loki
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
loki-logs 3 3 3 3 3 <none> 4h7m
promtail 3 3 3 3 3 <none> 3h51m
# StatefulSet
$ kubectl get sts -n loki
NAME READY AGE
loki-read 3/3 4h10m
loki-write 3/3 4h10m
# ReplicaSet
$ kubectl get rs -n loki
NAME DESIRED CURRENT READY AGE
grafana-5d69687778 1 1 1 3h52m
loki-gateway-55fccf8654 1 1 1 4h11m
loki-grafana-agent-operator-684b478b77 1 1 1 4h11m
loki-minio-858c5777bb 1 1 1 4h11m
# Deployment
$ kubectl get deploy -n loki
NAME READY UP-TO-DATE AVAILABLE AGE
grafana 1/1 1 1 3h52m
loki-gateway 1/1 1 1 4h11m
loki-grafana-agent-operator 1/1 1 1 4h11m
loki-minio 1/1 1 1 4h11m
# Service
$ kubectl get svc -n loki
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana ClusterIP 10.96.193.187 <none> 80/TCP 3h59m
loki-gateway ClusterIP 10.96.123.104 <none> 80/TCP 4h18m
loki-memberlist ClusterIP None <none> 7946/TCP 4h18m
loki-minio ClusterIP 10.96.24.177 <none> 9000/TCP 4h18m
loki-read ClusterIP 10.96.215.176 <none> 3100/TCP,9095/TCP 4h18m
loki-read-headless ClusterIP None <none> 3100/TCP,9095/TCP 4h18m
loki-write ClusterIP 10.96.50.235 <none> 3100/TCP,9095/TCP 4h18m
loki-write-headless ClusterIP None <none> 3100/TCP,9095/TCP 4h18m
# Service Account
$ kubectl get sa -n loki
NAME SECRETS AGE
default 0 4h18m
grafana 0 4h
loki 0 4h18m
loki-grafana-agent 0 4h18m
loki-grafana-agent-operator 0 4h18m
loki-minio 0 4h18m
loki-minio-update-prometheus-secret 0 4h18m
promtail 0 4h1m
PREVIOUSStress Testing Tool - Jmeter
NEXTgolang对fork仓库的使用