Kubernetes Container Orchestration: Deploying and Managing Cloud-Native Applications

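Kubernetes is the standard orchestration platform for modern cloud-native applications, providing container management, service discovery, load balancing, and autoscaling. This article walks through the core Kubernetes concepts and how they are applied in practice.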

Kubernetes Core Concepts

Basic Resource Objects

# Pod - the smallest deployable unit
apiVersion: v1
kind: Pod
metadata:
  name: web-app-pod
  labels:
    app: web-app
    version: v1.0
  annotations:
    description: 'Web application pod'
spec:
  containers:
    - name: web-container
      image: nginx:1.21-alpine
      ports:
        - containerPort: 80
          name: http
      env:
        - name: NODE_ENV
          value: 'production'
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: url
      resources:
        requests:
          memory: '128Mi'
          cpu: '100m'
        limits:
          memory: '256Mi'
          cpu: '200m'
      livenessProbe:
        httpGet:
          path: /health
          port: 80
        initialDelaySeconds: 30
        periodSeconds: 10
      readinessProbe:
        httpGet:
          path: /ready
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 5
      volumeMounts:
        - name: config-volume
          mountPath: /etc/config
        - name: data-volume
          mountPath: /data
  volumes:
    - name: config-volume
      configMap:
        name: app-config
    - name: data-volume
      persistentVolumeClaim:
        claimName: data-pvc
  restartPolicy: Always
  nodeSelector:
    disktype: ssd
  tolerations:
    - key: 'node-type'
      operator: 'Equal'
      value: 'compute'
      effect: 'NoSchedule'

---
# Service - service discovery and load balancing
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
  labels:
    app: web-app
spec:
  selector:
    app: web-app
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800

---
# Deployment - application deployment management
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app-deployment
  labels:
    app: web-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
        version: v1.0
    spec:
      containers:
        - name: web-container
          image: myapp:v1.0
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: 'production'
          resources:
            requests:
              memory: '256Mi'
              cpu: '200m'
            limits:
              memory: '512Mi'
              cpu: '500m'
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
      imagePullSecrets:
        - name: registry-secret

---
# ConfigMap - configuration management
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    server.port=3000
    database.host=postgres-service
    database.port=5432
    redis.host=redis-service
    redis.port=6379
  nginx.conf: |
    server {
        listen 80;
        server_name localhost;
        
        location / {
            proxy_pass http://web-app-service:80;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
        
        location /health {
            access_log off;
            return 200 "healthy\n";
        }
    }

---
# Secret - sensitive data management
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: cG9zdGdyZXM= # base64 encoded 'postgres'
  password: cGFzc3dvcmQ= # base64 encoded 'password'
  url: cG9zdGdyZXNxbDovL3Bvc3RncmVzOnBhc3N3b3JkQHBvc3RncmVzLXNlcnZpY2U6NTQzMi9teWRi

---
# PersistentVolumeClaim - persistent storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd
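
The PersistentVolumeClaim above refers to a StorageClass named fast-ssd that none of the manifests define. A minimal sketch of such a class might look like the following; the provisioner (the AWS EBS CSI driver) and the gp3 volume type are assumptions made purely for illustration, so substitute whatever your cluster's storage backend provides.

```yaml
# Hypothetical StorageClass backing the data-pvc claim.
# Assumption: the cluster runs the AWS EBS CSI driver; replace the
# provisioner and parameters with those of your own storage backend.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```

WaitForFirstConsumer delays provisioning until a Pod that uses the claim is scheduled, which helps place the volume in the same zone as the node.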

Advanced Resource Objects

# Ingress - external access management
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: 'true'
    nginx.ingress.kubernetes.io/limit-rps: '100'
    cert-manager.io/cluster-issuer: 'letsencrypt-prod'
spec:
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app-service
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 8080

---
# HorizontalPodAutoscaler - automatic scaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max

---
# NetworkPolicy - network security policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-app-netpol
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: frontend
        - podSelector:
            matchLabels:
              app: nginx
      ports:
        - protocol: TCP
          port: 3000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: database
      ports:
        - protocol: TCP
          port: 5432
    - to: []
      ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
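
The policy above isolates only Pods labeled app: web-app. A common companion, sketched here, is a namespace-wide default-deny policy, so that Pods without an explicit allow rule neither accept nor send traffic. The policy name and the production namespace are assumptions, and enforcement depends on a CNI plugin that implements NetworkPolicy.

```yaml
# Hypothetical default-deny policy: selects every Pod in the namespace
# and allows no ingress or egress until other policies open specific paths.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production # assumption: the namespace used elsewhere in this article
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```

Because NetworkPolicies are additive, the web-app policy keeps working alongside this one: traffic is allowed if any policy selecting the Pod permits it.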

Application Deployment Strategies

1. Blue-Green Deployment

# Blue-green deployment configuration (Argo Rollouts)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: web-app-rollout
spec:
  replicas: 5
  strategy:
    blueGreen:
      activeService: web-app-active
      previewService: web-app-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: web-app-preview
      postPromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: web-app-active
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-container
          image: myapp:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: '256Mi'
              cpu: '200m'
            limits:
              memory: '512Mi'
              cpu: '500m'

---
apiVersion: v1
kind: Service
metadata:
  name: web-app-active
spec:
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 3000

---
apiVersion: v1
kind: Service
metadata:
  name: web-app-preview
spec:
  selector:
    app: web-app
  ports:
    - port: 80
      targetPort: 3000

2. Canary Deployment

# Canary deployment configuration (Argo Rollouts)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: web-app-canary
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: { duration: 1m }
        - setWeight: 20
        - pause: { duration: 1m }
        - setWeight: 50
        - pause: { duration: 2m }
        - setWeight: 100
      canaryService: web-app-canary
      stableService: web-app-stable
      trafficRouting:
        nginx:
          stableIngress: web-app-stable-ingress
          annotationPrefix: nginx.ingress.kubernetes.io
          additionalIngressAnnotations:
            canary-by-header: X-Canary
      analysis:
        templates:
          - templateName: success-rate
          - templateName: latency
        startingStep: 2
        args:
          - name: service-name
            value: web-app-canary
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-container
          image: myapp:v2.0
          ports:
            - containerPort: 3000

---
# Analysis template
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      interval: 30s
      count: 5
      successCondition: result[0] >= 0.95
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status!~"5.."}[2m])) /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[2m]))
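
The canary Rollout above also references an AnalysisTemplate named latency, which is not defined anywhere in this article. One possible sketch is shown below. It assumes the application exports the http_request_duration_seconds_bucket histogram with a service label, and that a 95th-percentile latency of 0.5 s is an acceptable threshold; both are illustrative assumptions, not values taken from a real deployment.

```yaml
# Hypothetical "latency" AnalysisTemplate for the canary Rollout.
# Assumptions: the histogram metric carries a "service" label and
# a p95 latency of at most 0.5 seconds counts as healthy.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: latency
spec:
  args:
    - name: service-name
  metrics:
    - name: p95-latency
      interval: 30s
      count: 5
      failureLimit: 1
      successCondition: result[0] <= 0.5
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            histogram_quantile(0.95,
              sum(rate(http_request_duration_seconds_bucket{service="{{args.service-name}}"}[2m])) by (le)
            )
```

With failureLimit: 1, a single failed measurement is tolerated; a second one fails the analysis, which causes the Rollout to abort and shift traffic back to the stable version.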

Service Mesh Integration

1. Istio Service Mesh

# Istio Gateway
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: web-app-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - myapp.example.com
      tls:
        httpsRedirect: true
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: myapp-tls
      hosts:
        - myapp.example.com

---
# Virtual Service
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: web-app-vs
spec:
  hosts:
    - myapp.example.com
  gateways:
    - web-app-gateway
  http:
    - match:
        - headers:
            canary:
              exact: 'true'
      route:
        - destination:
            host: web-app-service
            subset: canary
          weight: 100
    - route:
        - destination:
            host: web-app-service
            subset: stable
          weight: 90
        - destination:
            host: web-app-service
            subset: canary
          weight: 10
      fault:
        delay:
          percentage:
            value: 0.1
          fixedDelay: 5s
      retries:
        attempts: 3
        perTryTimeout: 2s

---
# Destination Rule
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: web-app-dr
spec:
  host: web-app-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        maxRequestsPerConnection: 10
    loadBalancer:
      simple: LEAST_REQUEST
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 30s
  subsets:
    - name: stable
      labels:
        version: v1.0
      trafficPolicy:
        connectionPool:
          tcp:
            maxConnections: 50
          http:
            http1MaxPendingRequests: 25
            maxRetries: 3
    - name: canary
      labels:
        version: v2.0
      trafficPolicy:
        connectionPool:
          tcp:
            maxConnections: 25
          http:
            http1MaxPendingRequests: 10
            maxRetries: 2

---
# Service Entry - access to external services
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-api
spec:
  hosts:
    - api.external.com
  ports:
    - number: 443
      name: https
      protocol: HTTPS
  location: MESH_EXTERNAL
  resolution: DNS

2. Security Policies

# PeerAuthentication - mTLS configuration
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT

---
# AuthorizationPolicy - access control
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: web-app-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: web-app
  rules:
    - from:
        - source:
            principals: ['cluster.local/ns/frontend/sa/frontend-sa']
      to:
        - operation:
            methods: ['GET', 'POST']
            paths: ['/api/*']
    - from:
        - source:
            namespaces: ['monitoring']
      to:
        - operation:
            methods: ['GET']
            paths: ['/metrics', '/health']
    - from:
        - source:
            requestPrincipals: ['*']
      when:
        - key: request.headers[authorization]
          values: ['Bearer *']
      to:
        - operation:
            methods: ['GET', 'POST', 'PUT', 'DELETE']

---
# RequestAuthentication - JWT validation
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth
  namespace: production
spec:
  selector:
    matchLabels:
      app: web-app
  jwtRules:
    - issuer: 'https://auth.example.com'
      jwksUri: 'https://auth.example.com/.well-known/jwks.json'
      audiences:
        - 'myapp-api'
      forwardOriginalToken: true
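
The AuthorizationPolicy above restricts only the workloads labeled app: web-app; other workloads in the namespace remain open unless they have policies of their own. A common baseline, sketched here under the assumption that every service in the production namespace should be denied by default, is an allow-nothing policy with an empty spec. The policy name is hypothetical.

```yaml
# Hypothetical namespace-wide baseline: an ALLOW policy with no rules
# matches no request, so traffic is denied unless another policy allows it.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: production
spec: {}
```

Requests to web-app that match one of the rules in web-app-authz are still allowed, because a request is admitted as soon as any ALLOW policy matches it.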

Monitoring and Observability

1. Prometheus Monitoring

# ServiceMonitor - Prometheus service discovery
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: web-app-monitor
  labels:
    app: web-app
spec:
  selector:
    matchLabels:
      app: web-app
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
      honorLabels: true

---
# PrometheusRule - alerting rules
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: web-app-rules
spec:
  groups:
    - name: web-app.rules
      rules:
        - alert: HighErrorRate
          expr: |
            (
              sum(rate(http_requests_total{app="web-app",status=~"5.."}[5m])) /
              sum(rate(http_requests_total{app="web-app"}[5m]))
            ) > 0.05
          for: 5m
          labels:
            severity: critical
            app: web-app
          annotations:
            summary: 'High error rate detected'
            description: 'Error rate is {{ $value | humanizePercentage }} for app {{ $labels.app }}'

        - alert: HighLatency
          expr: |
            histogram_quantile(0.95,
              sum(rate(http_request_duration_seconds_bucket{app="web-app"}[5m])) by (le)
            ) > 0.5
          for: 5m
          labels:
            severity: warning
            app: web-app
          annotations:
            summary: 'High latency detected'
            description: '95th percentile latency is {{ $value }}s for app {{ $labels.app }}'

        - alert: PodCrashLooping
          expr: |
            rate(kube_pod_container_status_restarts_total{pod=~"web-app-.*"}[15m]) > 0
          for: 5m
          labels:
            severity: critical
            app: web-app
          annotations:
            summary: 'Pod is crash looping'
            description: 'Pod {{ $labels.pod }} is restarting frequently'
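
The ServiceMonitor in this section scrapes a Service port named metrics, but web-app-service as defined earlier exposes only ports named http and https, so nothing would be scraped. One way to close the gap is a dedicated Service like the sketch below. The Service name is hypothetical, and it assumes the application serves /metrics on its main container port 3000.

```yaml
# Hypothetical Service exposing a named "metrics" port for the ServiceMonitor.
# Assumption: the application serves /metrics on container port 3000.
apiVersion: v1
kind: Service
metadata:
  name: web-app-metrics
  labels:
    app: web-app # matched by the ServiceMonitor's selector
spec:
  selector:
    app: web-app
  ports:
    - name: metrics # referenced by "port: metrics" in the ServiceMonitor
      port: 3000
      targetPort: 3000
```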

2. Log Collection

# Fluentd DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      serviceAccountName: fluentd
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: 'elasticsearch.logging.svc.cluster.local'
            - name: FLUENT_ELASTICSEARCH_PORT
              value: '9200'
            - name: FLUENT_ELASTICSEARCH_SCHEME
              value: 'http'
          resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluentd-config
              mountPath: /fluentd/etc
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluentd-config
          configMap:
            name: fluentd-config
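
The DaemonSet above runs under a ServiceAccount named fluentd, which needs read access to Pod and Namespace metadata so that log records can be enriched with Kubernetes labels. A minimal RBAC sketch, assuming cluster-wide read access is acceptable in your environment, could look like this:

```yaml
# Hypothetical RBAC objects for the fluentd ServiceAccount.
# Assumption: read-only, cluster-wide access to pods and namespaces is acceptable.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
  - apiGroups: ['']
    resources: ['pods', 'namespaces']
    verbs: ['get', 'list', 'watch']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system
```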

Operations Automation

1. GitOps Workflow

# ArgoCD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/web-app-k8s
    targetRevision: HEAD
    path: manifests
    helm:
      valueFiles:
        - values-production.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  revisionHistoryLimit: 10

2. Backup and Restore

# Velero Backup
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: web-app-backup
  namespace: velero
spec:
  includedNamespaces:
    - production
  includedResources:
    - deployments
    - services
    - configmaps
    - secrets
    - persistentvolumeclaims
  labelSelector:
    matchLabels:
      app: web-app
  storageLocation: default
  volumeSnapshotLocations:
    - default
  ttl: 720h0m0s

---
# Scheduled Backup
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: web-app-daily-backup
  namespace: velero
spec:
  schedule: '0 2 * * *' # every day at 02:00
  template:
    includedNamespaces:
      - production
    labelSelector:
      matchLabels:
        app: web-app
    storageLocation: default
    ttl: 168h0m0s # retain for 7 days
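
The manifests above cover only the backup half of the workflow. A restore from the web-app-backup Backup might be requested with a Restore object such as the sketch below; the Restore name is hypothetical, and it assumes the backup completed successfully and that volumes should be recreated from their snapshots.

```yaml
# Hypothetical Restore that replays the web-app-backup Backup.
# Assumption: persistent volumes should be restored from their snapshots.
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: web-app-restore
  namespace: velero
spec:
  backupName: web-app-backup
  includedNamespaces:
    - production
  restorePVs: true
```

Rehearsing a restore into a scratch cluster from time to time is a reasonable way to confirm that the backups are actually usable.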

Summary

Kubernetes container orchestration gives modern applications a solid foundation for deployment and management:

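Resource management: core resources such as Pod, Service, and Deployment
Deployment strategies: advanced patterns such as blue-green and canary deployments
Service mesh: traffic management and security policies provided by Istio
Observability: Prometheus monitoring and log collection
Operations automation: GitOps and backup and restore strategies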

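Once you have mastered Kubernetes, you can build scalable, highly available cloud-native applications. It is a core technology of the cloud-native era and well worth studying and practicing in depth.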