Tasks

Edit This Page

Auditing

Kubernetes auditing provides a security-relevant chronological set of records documenting the sequence of activities that have affected system by individual users, administrators or other components of the system. It allows cluster administrator to answer the following questions:

Kube-apiserver performs auditing. Each request on each stage of its execution generates an event, which is then pre-processed according to a certain policy and written to a backend. The policy determines what’s recorded and the backends persist the records. The current backend implementations include logs files and webhooks.

Each request can be recorded with an associated “stage”. The known stages are:

Note: The audit logging feature increases the memory consumption of the API server because some context required for auditing is stored for each request. Additionally, memory consumption depends on the audit logging configuration.

Audit Policy

Audit policy defines rules about what events should be recorded and what data they should include. The audit policy object structure is defined in the audit.k8s.io API group. When an event is processed, it’s compared against the list of rules in order. The first matching rule sets the “audit level” of the event. The known audit levels are:

You can pass a file with the policy to kube-apiserver using the --audit-policy-file flag. If the flag is omitted, no events are logged. Note that the rules field must be provided in the audit policy file. A policy with no (0) rules is treated as illegal.

Below is an example audit policy file:

audit/audit-policy.yaml
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

You can use a minimal audit policy file to log all requests at the Metadata level:

# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata

The audit profile used by GCE should be used as reference by admins constructing their own audit profiles.

Audit backends

Audit backends persist audit events to an external storage. Kube-apiserver out of the box provides three backends:

In all cases, audit events structure is defined by the API in the audit.k8s.io API group. The current version of the API is v1.

Note:

In case of patches, request body is a JSON array with patch operations, not a JSON object with an appropriate Kubernetes API object. For example, the following request body is a valid patch request to /apis/batch/v1/namespaces/some-namespace/jobs/some-job-name.

[
  {
    "op": "replace",
    "path": "/spec/parallelism",
    "value": 0
  },
  {
    "op": "remove",
    "path": "/spec/template/spec/containers/0/terminationMessagePolicy"
  }
]

Log backend

Log backend writes audit events to a file in JSON format. You can configure log audit backend using the following kube-apiserver flags:

Webhook backend

Webhook backend sends audit events to a remote API, which is assumed to be the same API as kube-apiserver exposes. You can configure webhook audit backend using the following kube-apiserver flags:

The webhook config file uses the kubeconfig format to specify the remote address of the service and credentials used to connect to it.

In v1.13 webhook backends can be configured dynamically.

Batching

Both log and webhook backends support batching. Using webhook as an example, here’s the list of available flags. To get the same flag for log backend, replace webhook with log in the flag name. By default, batching is enabled in webhook and disabled in log. Similarly, by default throttling is enabled in webhook and disabled in log.

The following flags are used only in the batch mode.

Parameter tuning

Parameters should be set to accommodate the load on the apiserver.

For example, if kube-apiserver receives 100 requests each second, and each request is audited only on ResponseStarted and ResponseComplete stages, you should account for ~200 audit events being generated each second. Assuming that there are up to 100 events in a batch, you should set throttling level at least 2 QPS. Assuming that the backend can take up to 5 seconds to write events, you should set the buffer size to hold up to 5 seconds of events, i.e. 10 batches, i.e. 1000 events.

In most cases however, the default parameters should be sufficient and you don’t have to worry about setting them manually. You can look at the following Prometheus metrics exposed by kube-apiserver and in the logs to monitor the state of the auditing subsystem.

Truncate

Both log and webhook backends support truncating. As an example, the following is the list of flags available for the log backend:

By default truncate is disabled in both webhook and log, a cluster administrator should set audit-log-truncate-enabled or audit-webhook-truncate-enabled to enable the feature.

Dynamic backend

FEATURE STATE: Kubernetes v1.13 alpha
This feature is currently in a alpha state, meaning:

  • The version names contain alpha (e.g. v1alpha1).
  • Might be buggy. Enabling the feature may expose bugs. Disabled by default.
  • Support for feature may be dropped at any time without notice.
  • The API may change in incompatible ways in a later software release without notice.
  • Recommended for use only in short-lived testing clusters, due to increased risk of bugs and lack of long-term support.

In Kubernetes version 1.13, you can configure dynamic audit webhook backends AuditSink API objects.

To enable dynamic auditing you must set the following apiserver flags:

When enabled, an AuditSink object can be provisioned:

apiVersion: auditregistration.k8s.io/v1alpha1
kind: AuditSink
metadata:
  name: mysink
spec:
  policy:
    level: Metadata
    stages:
    - ResponseComplete
  webhook:
    throttle:
      qps: 10
      burst: 15
    clientConfig:
      url: "https://audit.app"

For the complete API definition, see AuditSink. Multiple objects will exist as independent solutions.

Existing static backends that you configure with runtime flags are not affected by this feature. However, the dynamic backends share the truncate options of the static webhook. If webhook truncate options are set with runtime flags, they are applied to all dynamic backends.

Policy

The AuditSink policy differs from the legacy audit runtime policy. This is because the API object serves different use cases. The policy will continue to evolve to serve more use cases.

The level field applies the given audit level to all requests. The stages field is now a whitelist of stages to record.

Security

Administrators should be aware that allowing write access to this feature grants read access to all cluster data. Access should be treated as a cluster-admin level privilege.

Performance

Currently, this feature has performance implications for the apiserver in the form of increased cpu and memory usage. This should be nominal for a small number of sinks, and performance impact testing will be done to understand its scope before the API progresses to beta.

Multi-cluster setup

If you’re extending the Kubernetes API with the aggregation layer, you can also set up audit logging for the aggregated apiserver. To do this, pass the configuration options in the same format as described above to the aggregated apiserver and set up the log ingesting pipeline to pick up audit logs. Different apiservers can have different audit configurations and different audit policies.

Log Collector Examples

Use fluentd to collect and distribute audit events from log file

Fluentd is an open source data collector for unified logging layer. In this example, we will use fluentd to split audit events by different namespaces.

  1. install fluentd, fluent-plugin-forest and fluent-plugin-rewrite-tag-filter in the kube-apiserver node

    Note: Fluent-plugin-forest and fluent-plugin-rewrite-tag-filter are plugins for fluentd. You can get details about plugin installation from fluentd plugin-management.

  2. create a config file for fluentd

    $ cat <<'EOF' > /etc/fluentd/config
    # fluentd conf runs in the same host with kube-apiserver
    <source>
        @type tail
        # audit log path of kube-apiserver
        path /var/log/kube-audit
        pos_file /var/log/audit.pos
        format json
        time_key time
        time_format %Y-%m-%dT%H:%M:%S.%N%z
        tag audit
    </source>
    
    <filter audit>
        #https://github.com/fluent/fluent-plugin-rewrite-tag-filter/issues/13
        @type record_transformer
        enable_ruby
        <record>
         namespace ${record["objectRef"].nil? ? "none":(record["objectRef"]["namespace"].nil? ? "none":record["objectRef"]["namespace"])}
        </record>
    </filter>
    
    <match audit>
        # route audit according to namespace element in context
        @type rewrite_tag_filter
        <rule>
            key namespace
            pattern /^(.+)/
            tag ${tag}.$1
        </rule>
    </match>
    
    <filter audit.**>
       @type record_transformer
       remove_keys namespace
    </filter>
    
    <match audit.**>
        @type forest
        subtype file
        remove_prefix audit
        <template>
            time_slice_format %Y%m%d%H
            compress gz
            path /var/log/audit-${tag}.*.log
            format json
            include_time_key true
        </template>
    </match>
    EOF
  3. start fluentd

    $ fluentd -c /etc/fluentd/config  -vv
  4. start kube-apiserver with the following options:

    --audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-log-path=/var/log/kube-audit --audit-log-format=json
  5. check audits for different namespaces in /var/log/audit-*.log

Use logstash to collect and distribute audit events from webhook backend

Logstash is an open source, server-side data processing tool. In this example, we will use logstash to collect audit events from webhook backend, and save events of different users into different files.

  1. install logstash
  2. create config file for logstash

    $ cat <<EOF > /etc/logstash/config
    input{
        http{
            #TODO, figure out a way to use kubeconfig file to authenticate to logstash
            #https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http.html#plugins-inputs-http-ssl
            port=>8888
        }
    }
    filter{
        split{
            # Webhook audit backend sends several events together with EventList
            # split each event here.
            field=>[items]
            # We only need event subelement, remove others.
            remove_field=>[headers, metadata, apiVersion, "@timestamp", kind, "@version", host]
        }
        mutate{
            rename => {items=>event}
        }
    }
    output{
        file{
            # Audit events from different users will be saved into different files.
            path=>"/var/log/kube-audit-%{[event][user][username]}/audit"
        }
    }
    EOF
  3. start logstash

    $ bin/logstash -f /etc/logstash/config --path.settings /etc/logstash/
  4. create a kubeconfig file for kube-apiserver webhook audit backend

    $ cat <<EOF > /etc/kubernetes/audit-webhook-kubeconfig
    apiVersion: v1
    clusters:
    - cluster:
        server: http://<ip_of_logstash>:8888
      name: logstash
    contexts:
    - context:
        cluster: logstash
        user: ""
      name: default-context
    current-context: default-context
    kind: Config
    preferences: {}
    users: []
    EOF
    
  5. start kube-apiserver with the following options:

    --audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-webhook-config-file=/etc/kubernetes/audit-webhook-kubeconfig
  6. check audits in logstash node’s directories /var/log/kube-audit-*/audit

Note that in addition to file output plugin, logstash has a variety of outputs that let users route data where they want. For example, users can emit audit events to elasticsearch plugin which supports full-text search and analytics.

Feedback