Kubernetes Audit Logging: Complete Setup and Analysis Guide

Configure Kubernetes API server audit logging to track who did what, and when, with policy examples, backend options, log analysis patterns, and security alerting integration.

Luca Berton

Why Audit Logging Matters

Kubernetes audit logs answer four questions for every API request:

  • Who made the request (user, service account, or anonymous)
  • What they did (verb: get, create, patch, delete)
  • When it happened (timestamp with microsecond precision)
  • On what resource (pods, secrets, configmaps, etc.)
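
As a minimal sketch of how those four answers map onto a single audit event, the Python below pulls them out of one JSON record. The field names (`user.username`, `verb`, `requestReceivedTimestamp`, `objectRef`) follow the `audit.k8s.io/v1` Event schema; the sample event and the helper name `summarize_event` are made up for illustration.

```python
import json

# A trimmed, made-up audit event in the audit.k8s.io/v1 format.
SAMPLE_EVENT = json.dumps({
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "Metadata",
    "stage": "ResponseComplete",
    "verb": "get",
    "user": {"username": "system:serviceaccount:default:demo"},
    "objectRef": {"resource": "secrets", "namespace": "default", "name": "db-password"},
    "requestReceivedTimestamp": "2024-01-01T12:00:00.000000Z",
})


def summarize_event(line: str) -> dict:
    """Return the who/what/when/resource answers for one audit log line."""
    event = json.loads(line)
    ref = event.get("objectRef") or {}  # absent for non-resource URLs
    parts = (ref.get("resource"), ref.get("namespace"), ref.get("name"))
    return {
        "who": event.get("user", {}).get("username", "<unknown>"),
        "what": event.get("verb"),
        "when": event.get("requestReceivedTimestamp"),
        "resource": "/".join(part for part in parts if part),
    }


print(summarize_event(SAMPLE_EVENT))
```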

Without audit logging, you cannot detect:

  • Unauthorized secret access
  • Privilege escalation attempts
  • Accidental or malicious resource deletion
  • Compliance violations

Audit Policy Structure

The audit policy defines what to log and at what detail level:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Don't log read-only requests to certain endpoints
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: ""
        resources: ["endpoints", "services", "services/status"]

  # Log secret, configmap, and token review access at Metadata level (who accessed, not the content)
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
      - group: "authentication.k8s.io"
        resources: ["tokenreviews"]

  # Log all write operations with request body
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
      - group: "apps"
      - group: "batch"

  # Log everything else at metadata level
  - level: Metadata
    omitStages:
      - RequestReceived
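
The API server evaluates rules in order, and the first rule that matches a request decides its level, which is why the narrow rules sit above the catch-all. The sketch below models that behaviour in Python under simplifying assumptions: the rules are a simplified transcription of the policy, only `users`, `verbs`, and `resources` entries are matched, and subresource and wildcard handling is left out. `audit_level` is a hypothetical helper, not part of any Kubernetes client.

```python
from typing import Optional

# A simplified transcription of the example policy into plain Python data.
RULES = [
    {"level": "None", "users": ["system:kube-proxy"], "verbs": ["watch"],
     "resources": [{"group": "", "resources": ["endpoints", "services", "services/status"]}]},
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["secrets", "configmaps"]}]},
    {"level": "Request", "verbs": ["create", "update", "patch", "delete"],
     "resources": [{"group": ""}, {"group": "apps"}, {"group": "batch"}]},
    {"level": "Metadata"},
]


def _matches(rule: dict, user: str, verb: str, group: str, resource: str) -> bool:
    if "users" in rule and user not in rule["users"]:
        return False
    if "verbs" in rule and verb not in rule["verbs"]:
        return False
    if "resources" in rule:
        # An entry with no "resources" list covers every resource in the group.
        return any(
            entry.get("group", "") == group
            and (not entry.get("resources") or resource in entry["resources"])
            for entry in rule["resources"]
        )
    return True


def audit_level(user: str, verb: str, group: str, resource: str) -> Optional[str]:
    """Return the level of the first matching rule, or None if no rule matches."""
    for rule in RULES:
        if _matches(rule, user, verb, group, resource):
            return rule["level"]
    return None


# A secret update hits the Metadata rule before the broader Request rule.
print(audit_level("alice", "update", "", "secrets"))
print(audit_level("alice", "create", "apps", "deployments"))
print(audit_level("system:kube-proxy", "watch", "", "endpoints"))
```

Moving the `secrets` rule below the write rule in `RULES` would change the first result to `Request`, which is the ordering mistake to watch for when editing a policy.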

Audit Levels

| Level | What Is Logged |
|---|---|
| None | Nothing; events matching the rule are not logged |
| Metadata | Request metadata (user, timestamp, resource, verb) |
| Request | Metadata + request body |
| RequestResponse | Metadata + request body + response body |

Warning: Request and RequestResponse record object bodies, so applying either level to secrets would write secret values into the audit log. Use Metadata for sensitive resources.
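
One way to guard against this is to lint the policy before deploying it. The following is a rough Python sketch, not an official tool: it takes rules already loaded into dictionaries (for example by a YAML parser) and flags any rule that could cover `secrets` at a level that records bodies. `risky_secret_rules` is a hypothetical helper, and it ignores rule order, so a hit means "review this rule", not "this definitely leaks".

```python
BODY_LEVELS = {"Request", "RequestResponse"}


def _may_cover_secrets(rule: dict) -> bool:
    entries = rule.get("resources")
    if not entries:
        # No resource filter: the rule applies to every resource,
        # unless it is scoped to non-resource URLs only.
        return "nonResourceURLs" not in rule
    for entry in entries:
        if entry.get("group", "") != "":
            continue  # secrets live in the core ("") group
        names = entry.get("resources")
        if not names or "secrets" in names or "*" in names:
            return True
    return False


def risky_secret_rules(rules: list) -> list:
    """Return the indexes of rules that could record secret bodies."""
    return [
        i for i, rule in enumerate(rules)
        if rule.get("level") in BODY_LEVELS and _may_cover_secrets(rule)
    ]


rules = [
    {"level": "Metadata", "resources": [{"group": "", "resources": ["secrets"]}]},
    {"level": "RequestResponse", "resources": [{"group": "", "resources": ["secrets"]}]},
    {"level": "Request", "resources": [{"group": "apps"}]},
]
print(risky_secret_rules(rules))
```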

Enabling Audit Logging

Log Backend (File)

# kube-apiserver manifest
spec:
  containers:
    - command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
        - --audit-log-path=/var/log/kubernetes/audit.log
        - --audit-log-maxage=30
        - --audit-log-maxbackup=10
        - --audit-log-maxsize=100
      volumeMounts:
        - name: audit-policy
          mountPath: /etc/kubernetes/audit-policy.yaml
          readOnly: true
        - name: audit-logs
          mountPath: /var/log/kubernetes
  volumes:
    - name: audit-policy
      hostPath:
        path: /etc/kubernetes/audit-policy.yaml
        type: File
    - name: audit-logs
      hostPath:
        path: /var/log/kubernetes
        type: DirectoryOrCreate
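
The three rotation flags bound how much disk the file backend can consume: `--audit-log-maxsize` is the size in megabytes at which a file is rotated, `--audit-log-maxbackup` is the number of rotated files kept, and `--audit-log-maxage` is the number of days they are kept. A back-of-the-envelope estimate in Python, assuming the active file and every retained backup are all at full size (the function name is illustrative):

```python
def max_audit_log_disk_mb(maxsize_mb: int, maxbackup: int) -> int:
    """Upper bound on disk use: the active file plus all retained backups."""
    return maxsize_mb * (maxbackup + 1)


# Values from the manifest above: 100 MB files, 10 backups.
print(max_audit_log_disk_mb(maxsize_mb=100, maxbackup=10))  # 1100
```

If the API server writes more than this within 30 days, the backup count rather than the age limit decides how far back the files on the node reach.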

Webhook Backend (Real-time)

Send audit events to an external service:

apiVersion: v1
kind: Config
clusters:
  - name: audit-webhook
    cluster:
      server: https://audit-collector.internal:9443/audit
      certificate-authority: /etc/kubernetes/pki/webhook-ca.crt
contexts:
  - name: default
    context:
      cluster: audit-webhook
current-context: default

# kube-apiserver flags
- --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml
- --audit-webhook-batch-max-size=10
- --audit-webhook-batch-max-wait=5s
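
On the receiving side, the webhook backend POSTs batches as JSON `EventList` objects in the `audit.k8s.io/v1` format. The sketch below shows roughly what a collector could look like using only the Python standard library. It is an assumption-laden toy: the handler and function names are invented, it serves plain HTTP on a local port, and a real collector would need TLS to match the `https://` URL and CA file in the kubeconfig.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event_list(body: bytes) -> list:
    """Extract (user, verb, resource) tuples from an EventList payload."""
    payload = json.loads(body)
    return [
        (
            item.get("user", {}).get("username"),
            item.get("verb"),
            (item.get("objectRef") or {}).get("resource"),
        )
        for item in payload.get("items", [])
    ]


class AuditHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for user, verb, resource in parse_event_list(self.rfile.read(length)):
            print(f"{user} {verb} {resource}")
        self.send_response(200)  # acknowledge the batch
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 9443) -> None:
    """Run the toy collector; host and port are placeholders for local testing."""
    HTTPServer((host, port), AuditHandler).serve_forever()
```

Calling `serve()` starts the listener; `parse_event_list` can be exercised on its own with a captured request body.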

Production Audit Policy

Here is a battle-tested policy that balances security visibility with log volume:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Skip noisy read-only system requests
  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["nodes", "nodes/status"]

  - level: None
    users:
      - "system:kube-scheduler"
      - "system:kube-controller-manager"
    verbs: ["get", "list", "watch"]

  # Skip health checks and metrics
  - level: None
    nonResourceURLs:
      - "/healthz*"
      - "/readyz*"
      - "/livez*"
      - "/metrics"

  # CRITICAL: Log all secret access
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # Log RBAC requests with full request and response bodies
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterroles", "clusterrolebindings", "roles", "rolebindings"]

  # Log authentication and authorization API requests (e.g. token and access reviews)
  - level: Metadata
    resources:
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"

  # Log all deletions with request body
  - level: Request
    verbs: ["delete", "deletecollection"]

  # Log pod exec and port-forward (high-risk)
  - level: Request
    resources:
      - group: ""
        resources: ["pods/exec", "pods/portforward", "pods/attach"]

  # Log workload mutations
  - level: Request
    verbs: ["create", "update", "patch"]
    resources:
      - group: "apps"
        resources: ["deployments", "daemonsets", "statefulsets"]
      - group: "batch"
        resources: ["jobs", "cronjobs"]

  # Default: metadata only
  - level: Metadata
    omitStages:
      - RequestReceived

Analyzing Audit Logs

Common Queries with jq

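# Who accessed secrets in the last hour?
# (fractional seconds are stripped because fromdateiso8601 expects whole seconds)
cat /var/log/kubernetes/audit.log | \
  jq -r 'select(.objectRef.resource == "secrets" and
    (.requestReceivedTimestamp | sub("\\.[0-9]+Z$"; "Z") | fromdateiso8601) > (now - 3600)) |
    [.requestReceivedTimestamp, .user.username, .verb, .objectRef.namespace + "/" + .objectRef.name] |
    @tsv'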

# Failed authentication (401) and authorization (403) attempts
cat /var/log/kubernetes/audit.log | \
  jq -r 'select(.responseStatus.code == 401 or .responseStatus.code == 403) |
    [.requestReceivedTimestamp, .user.username, .verb, .responseStatus.code, .responseStatus.reason] |
    @tsv'

# All pod exec commands (potential breakglass)
cat /var/log/kubernetes/audit.log | \
  jq -r 'select(.objectRef.subresource == "exec") |
    [.requestReceivedTimestamp, .user.username, .objectRef.namespace + "/" + .objectRef.name] |
    @tsv'

# RBAC changes (privilege escalation detection)
cat /var/log/kubernetes/audit.log | \
  jq -r 'select(.objectRef.apiGroup == "rbac.authorization.k8s.io" and
    (.verb == "create" or .verb == "update" or .verb == "patch")) |
    [.requestReceivedTimestamp, .user.username, .verb, .objectRef.resource, .objectRef.name] |
    @tsv'
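
When a one-liner is not enough, for instance to aggregate across many events, the same filters can be expressed in a few lines of Python. The sketch below counts secret access per user from JSON-lines input; the function name and the sample lines are illustrative, and the field names mirror the jq queries above.

```python
import json
from collections import Counter
from typing import Iterable


def secret_access_by_user(lines: Iterable[str]) -> Counter:
    """Count audit events that touch secrets, grouped by username."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if (event.get("objectRef") or {}).get("resource") == "secrets":
            counts[event.get("user", {}).get("username", "<unknown>")] += 1
    return counts


sample = [
    '{"verb": "get", "user": {"username": "alice"}, "objectRef": {"resource": "secrets"}}',
    '{"verb": "list", "user": {"username": "alice"}, "objectRef": {"resource": "secrets"}}',
    '{"verb": "get", "user": {"username": "bob"}, "objectRef": {"resource": "pods"}}',
]
for user, count in secret_access_by_user(sample).most_common():
    print(user, count)
```

Because the function accepts any iterable of lines, an open file handle on `/var/log/kubernetes/audit.log` can be passed in directly.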

Shipping to EFK/Loki

# Fluent Bit config for audit logs
[INPUT]
    Name              tail
    Path              /var/log/kubernetes/audit.log
    Parser            json
    Tag               kube.audit.*
    Refresh_Interval  5

[FILTER]
    Name    modify
    Match   kube.audit.*
    Add     log_type kubernetes_audit

[OUTPUT]
    Name            es
    Match           kube.audit.*
    Host            elasticsearch.logging
    Port            9200
    Index           kube-audit
    Type            _doc

Security Alerting Rules

Prometheus AlertManager (via audit-exporter)

groups:
  - name: kubernetes-audit-security
    rules:
      - alert: SecretAccessByUnknownUser
        # increase() lets the alert clear once events stop; a raw counter stays above 0
        expr: |
          sum by (user) (
            increase(kube_audit_event_total{
              resource="secrets",
              verb=~"get|list|watch",
              user!~"system:.*|admin"
            }[5m])
          ) > 0
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "Unknown user {{ $labels.user }} accessed secrets"

      - alert: ClusterRoleBindingCreated
        # increase() lets the alert clear once events stop; a raw counter stays above 0
        expr: |
          sum(increase(kube_audit_event_total{
            resource="clusterrolebindings",
            verb="create"
          }[5m])) > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "New ClusterRoleBinding created: potential privilege escalation"

      - alert: PodExecDetected
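        # increase() lets the alert clear once events stop; a raw counter stays above 0
        expr: |
          sum by (user, namespace) (
            increase(kube_audit_event_total{
              subresource="exec"
            }[5m])
          ) > 0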
        for: 0m
        labels:
          severity: info
        annotations:
          summary: "Pod exec by {{ $labels.user }} in {{ $labels.namespace }}"
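
These rules assume an exporter that turns audit events into a `kube_audit_event_total` counter with `user`, `verb`, `resource`, `subresource`, and `namespace` labels. The article does not specify which exporter that is, so the following is only a sketch of the idea in plain Python, with a `Counter` standing in for a real Prometheus client and all names chosen for illustration.

```python
import json
from collections import Counter

LABELS = ("user", "verb", "resource", "subresource", "namespace")


def labels_for(event: dict) -> tuple:
    """Map one audit event to the label values the alert rules filter on."""
    ref = event.get("objectRef") or {}
    return (
        event.get("user", {}).get("username", ""),
        event.get("verb", ""),
        ref.get("resource", ""),
        ref.get("subresource", ""),
        ref.get("namespace", ""),
    )


def count_events(lines) -> Counter:
    """Stand-in for kube_audit_event_total: one count per label combination."""
    totals = Counter()
    for line in lines:
        event = json.loads(line)
        # Count each request once, at its final stage.
        if event.get("stage") == "ResponseComplete":
            totals[labels_for(event)] += 1
    return totals


line = json.dumps({
    "stage": "ResponseComplete", "verb": "create",
    "user": {"username": "alice"},
    "objectRef": {"resource": "pods", "subresource": "exec", "namespace": "prod"},
})
for labels, value in count_events([line]).items():
    print(dict(zip(LABELS, labels)), value)
```

Keeping `user` as a label makes the alert annotations possible, at the cost of one time series per user and resource combination.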

Managed Kubernetes Audit Logging

EKS

# Enable via eksctl
eksctl utils update-cluster-logging \
  --cluster my-cluster \
  --enable-types audit \
  --approve

# Logs go to CloudWatch Logs group: /aws/eks/my-cluster/cluster
# Query via CloudWatch Insights:
fields @timestamp, user.username, verb, objectRef.resource, objectRef.name
| filter objectRef.resource = "secrets"
| sort @timestamp desc
| limit 50

GKE

# Admin Activity logs are always on (free)
# Data Access logs (secret reads) must be enabled:
gcloud projects get-iam-policy PROJECT_ID --format=json > policy.json
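# Add an auditConfigs entry with logType "DATA_READ" for the "k8s.io" service to policy.json, then apply it:
gcloud projects set-iam-policy PROJECT_ID policy.json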

AKS

# Enable diagnostic settings
az monitor diagnostic-settings create \
  --resource /subscriptions/.../managedClusters/my-cluster \
  --name audit-logs \
  --logs '[{"category":"kube-audit","enabled":true}]' \
  --workspace /subscriptions/.../workspaces/my-workspace

Storage and Retention

Audit logs grow fast. Plan for:

| Cluster Size | Daily Volume | 30-Day Retention |
|---|---|---|
| Small (under 50 nodes) | 1-5 GB/day | 30-150 GB |
| Medium (50-200 nodes) | 5-20 GB/day | 150-600 GB |
| Large (200+ nodes) | 20-100+ GB/day | 600 GB - 3 TB |

Cost optimization:

  • Use None level aggressively for known-safe traffic
  • Set omitStages: ["RequestReceived"] to roughly halve the event count
  • Archive to cold storage after 7 days, keep hot for alerting
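
The retention figures are simply daily volume multiplied by the number of days kept, and the hot/cold split from the last bullet follows the same arithmetic. A small Python sketch, using hypothetical helper names and the 7-day hot window mentioned above:

```python
def retention_gb(daily_gb: float, days: int) -> float:
    """Total storage needed to keep `days` of logs at `daily_gb` per day."""
    return daily_gb * days


def hot_cold_split(daily_gb: float, retention_days: int = 30, hot_days: int = 7) -> tuple:
    """Return (hot_gb, cold_gb) when logs move to cold storage after `hot_days`."""
    hot = retention_gb(daily_gb, min(hot_days, retention_days))
    cold = retention_gb(daily_gb, max(retention_days - hot_days, 0))
    return hot, cold


# A medium cluster at the upper end of the range: 20 GB/day for 30 days.
print(retention_gb(20, 30))  # 600
print(hot_cold_split(20))    # (140, 460)
```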
