For log collection, the best-known stack is ELK, an acronym for three systems:

  • E: Elasticsearch, the full-text search engine
  • L: Logstash, for log collection
  • K: Kibana, for data visualization

On Kubernetes, the log collector is usually replaced with F: Fluentd, which gives us EFK.


Elasticsearch

The three systems are deployed in different ways. Elasticsearch handles full-text search and also stores the data, so it is stateful and is deployed as a StatefulSet. Since we already have NFS-based persistent storage, the volumes can simply be requested from the StorageClass through PVCs at deployment time.
Persistent storage: Using NFS for persistent storage in Kubernetes - SPEX

Fluentd

The EFK stack collects data in push mode, so Fluentd only needs to collect logs and push them to the Elasticsearch service. Because we mainly want to collect Kubernetes logs, an instance has to run on every node, so Fluentd is deployed as a DaemonSet, with a nodeSelector so that only the nodes labeled for collection are targeted.

Kibana

Kibana provides the web UI and is itself stateless. The remaining question is how to expose it: the simplest option is a NodePort, while Nginx-Ingress is the nicer one.

Although the stack is read as EFK, we deploy it in EKF order: first the full-text search engine, then the visualization layer, and finally the log collection service.
All required images have already been pushed to the internal Harbor registry to speed up deployment and improve reliability.
The original images can be found at: Docker @ Elastic


1. Namespace

All EFK installation steps are performed in a brand-new namespace.

kubectl create ns logging

2. Elasticsearch cluster

We use a 3-node cluster.

2.1 Service

Create the headless Service:

kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node
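
Assuming the manifest above is saved as elasticsearch-svc.yaml (the file name is just an example), it can be applied and checked like this:

kubectl apply -f elasticsearch-svc.yaml
# CLUSTER-IP should show None for a headless service
kubectl get svc elasticsearch -n logging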

2.2 StatefulSet

2.2.1 Cluster bootstrapping: cluster.initial_master_nodes

The latest Elasticsearch release at the time of writing is 7.3.0.
Starting with 7.0, Elasticsearch changed its cluster bootstrapping model, see: [Bootstrapping a cluster | Elasticsearch Reference [master] | Elastic](https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-discovery-bootstrap-cluster.html)
In short, node.name must exactly match the node entries listed in cluster.initial_master_nodes, and in our case those entries are DNS addresses:

  • node.name = metadata.name
  • cluster.initial_master_nodes = metadata.name.servicename.namespacename.svc.cluster.local

The good news is that, within the same namespace, this can be shortened to metadata.name.servicename, so we only need to append a suffix to metadata.name to bootstrap the cluster.
Note that the cluster addresses are hard-coded here, so adding nodes to the cluster later cannot be done simply with kubectl scale.
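
For example, for the first Pod of the StatefulSet below (es-cluster-0 in the logging namespace) the names work out as follows:

# node.name = metadata.name plus the service-name suffix
es-cluster-0.elasticsearch
# the full DNS name that this short form resolves to inside the cluster
es-cluster-0.elasticsearch.logging.svc.cluster.local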

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: harbor.s.com/elasticsearch/elasticsearch:7.3.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
          - name: cluster.name
            value: k8s-logs
          - name: METANAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: node.name
            value: "$(METANAME).elasticsearch"
          - name: discovery.seed_hosts
            value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
          - name: cluster.initial_master_nodes
            value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: harbor.s.com/library/busybox:latest
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: harbor.s.com/library/busybox:latest
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: harbor.s.com/library/busybox:latest
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true             
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: managed-nfs-storage 
      resources:
        requests:
          storage: 500Mi
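
The StatefulSet can now be applied and the rollout watched (the file name elasticsearch-statefulset.yaml is just an example):

kubectl apply -f elasticsearch-statefulset.yaml
# the three pods should come up one after another: es-cluster-0, es-cluster-1, es-cluster-2
kubectl get pods -n logging -l app=elasticsearch -w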

2.3 Verification

# port-forward to one of the es pods
kubectl port-forward es-cluster-0 9200:9200 -n logging

# in a new terminal, check cluster health
curl http://localhost:9200/_cat/health
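
If the cluster formed correctly, the node list should show all three members; as an additional check (still through the port-forward):

curl "http://localhost:9200/_cat/nodes?v"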

3. Kibana

In Kibana 7.3, the environment variable used to point Kibana at Elasticsearch changed to ELASTICSEARCH_HOSTS, which differs from earlier versions.
Kibana can be deployed with a NodePort Service:

apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
  type: NodePort
  selector:
    app: kibana

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: harbor.s.com/elasticsearch/kibana:7.3.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_HOSTS
            value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
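
After applying the manifest, look up the randomly assigned NodePort and open Kibana at http://<any-node-ip>:<node-port> (a quick sketch, assuming the manifest is saved as kibana-nodeport.yaml):

kubectl apply -f kibana-nodeport.yaml
# note the port that 5601 is mapped to
kubectl get svc kibana -n logging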

A better option is to expose Kibana through an Ingress. Here the domain kibana.s.com is used; either add an internal DNS record for it, or add an entry to your local hosts file.

apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
  type: ClusterIP
  clusterIP: None
  selector:
    app: kibana

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: harbor.s.com/elasticsearch/kibana:7.3.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_HOSTS
            value: http://elasticsearch:9200
        ports:
        - containerPort: 5601

---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana-ingress
  namespace: logging
spec:
  rules:
  - host: kibana.s.com
    http:
      paths:
      - backend:
          serviceName: kibana
          servicePort: 5601
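
With the Ingress applied, kibana.s.com has to resolve to the node running the Nginx-Ingress controller. For a quick test without internal DNS, a hosts entry is enough (the IP below is a placeholder):

# replace <ingress-node-ip> with the address of the node running nginx-ingress
echo "<ingress-node-ip>  kibana.s.com" >> /etc/hosts
# Kibana should now answer on the domain
curl -I http://kibana.s.com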

4. Fluentd

4.1 Log source configuration

<source>
  @id fluentd-containers.log
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag raw.kubernetes.*
  format json
  read_from_head true
</source>

Some of the parameters above are explained below:

  • id: a unique identifier for this log source, which can be used for further filtering and routing of the structured log data.
  • type: a built-in Fluentd directive. tail means Fluentd keeps reading from the last position via tail; the alternative, http, collects data through GET requests.
  • path: a tail-specific parameter telling Fluentd to collect every log under /var/log/containers, the directory Docker uses on Kubernetes nodes to store the stdout logs of running containers (a sample line is shown after this list).
  • pos_file: the checkpoint file; if Fluentd restarts, it resumes collection from the position recorded here.
  • tag: a custom string used to match log sources against outputs or filters; Fluentd routes log data by matching source and destination tags.
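
For reference, each line in those files is a JSON record written by Docker's json-file log driver, which is why format json is used above; an illustrative example:

{"log":"some application output\n","stream":"stdout","time":"2019-08-16T04:12:34.000000000Z"}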

4.2 Output (routing) configuration

<match **>
  @id elasticsearch
  @type elasticsearch
  @log_level info
  include_tag_key true
  type_name fluentd
  host "#{ENV['OUTPUT_HOST']}"
  port "#{ENV['OUTPUT_PORT']}"
  logstash_format true
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 5s
    retry_forever
    retry_max_interval 30
    chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
    queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
    overflow_action block
  </buffer>
</match>

Notes on the parameters above:

  • match: identifies a destination tag, followed by a regular expression matching the log sources. Here we want to capture all logs and send them to Elasticsearch, so it is set to **.
  • id: a unique identifier for this output.
  • type: the identifier of a supported output plugin. We output to Elasticsearch, so it is set to elasticsearch, a built-in Fluentd plugin.
  • log_level: the minimum log level to capture. It is set to info, so logs at this level and above (INFO, WARNING, ERROR) are routed to Elasticsearch.
  • host/port: the Elasticsearch address; authentication can also be configured here, but our Elasticsearch needs no authentication, so host and port are sufficient.
  • logstash_format: Elasticsearch builds inverted indexes over the log data for searching; with logstash_format set to true, Fluentd forwards the structured log data in logstash format.
  • buffer: Fluentd buffers data when the destination is unavailable, for example during a network failure or when Elasticsearch is down; buffering also helps reduce disk I/O.

4.3 ConfigMap

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      localtime
      tag raw.kubernetes.*
      format json
      read_from_head true
    </source>
    # Detect exceptions in the log output and forward them as one log entry.
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>
  system.input.conf: |-
    # Logs from systemd-journal for interesting services.
    <source>
      @id journald-docker
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "docker.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag docker
    </source>
    <source>
      @id journald-kubelet
      @type systemd
      filters [{ "_SYSTEMD_UNIT": "kubelet.service" }]
      <storage>
        @type local
        persistent true
      </storage>
      read_from_head true
      tag kubelet
    </source>
  forward.input.conf: |-
    # Takes the messages sent over TCP
    <source>
      @type forward
    </source>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      host elasticsearch
      port 9200
      logstash_format true
      request_timeout    30s
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>
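
Apply the ConfigMap (assuming it is saved as fluentd-configmap.yaml):

kubectl apply -f fluentd-configmap.yaml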

4.4 DaemonSet

4.4.1 Preparing the image

An arbitrary image is chosen here.

# pull
docker pull monotek/fluentd-elasticsearch
# tag (the default version is latest)
docker tag monotek/fluentd-elasticsearch harbor.s.com/elasticsearch/fluentd-elasticsearch
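
Since all images are pulled from the internal Harbor (see the note at the top), the re-tagged image also has to be pushed there; a minimal sketch, assuming push access to harbor.s.com:

# push to the internal harbor
docker push harbor.s.com/elasticsearch/fluentd-elasticsearch
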
4.4.2 RBAC & DaemonSet

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd-es
  namespace: logging
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd-es
  apiGroup: ""
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    k8s-app: fluentd-es
    version: v2.4.0
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-es
      version: v2.4.0
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v2.4.0
      # This annotation ensures that fluentd does not get evicted if the node
      # supports critical pod annotation based priority scheme.
      # Note that this does not guarantee admission on the nodes (#40573).
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: fluentd-es
      containers:
      - name: fluentd-es
        image: harbor.s.com/elasticsearch/fluentd-elasticsearch
        env:
        - name: FLUENTD_ARGS
          value: --no-supervisor -q
        resources:
          limits:
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /data/docker/containers
          readOnly: true
        - name: config-volume
          mountPath: /etc/fluent/config.d
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /data/docker/containers
      - name: config-volume
        configMap:
          name: fluentd-config
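
Apply the RBAC objects and the DaemonSet (the file name fluentd-es-ds.yaml is just an example):

kubectl apply -f fluentd-es-ds.yaml
# pods are only scheduled on nodes carrying the label set in the next step
kubectl get ds fluentd-es -n logging
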
4.4.3 Label

# view current node labels
kubectl get nodes --show-labels

# label the nodes whose logs should be collected
kubectl label node nodename beta.kubernetes.io/fluentd-ds-ready=true
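
Once the label is in place, the DaemonSet schedules one fluentd-es Pod per labeled node, and logstash-* indices should start appearing in Elasticsearch/Kibana:

kubectl get pods -n logging -l k8s-app=fluentd-es -o wide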

Acknowledgements

1. kubernetes 日志架构 (Kubernetes logging architecture) - www.qikqiak.com (阳明的博客)
