Node Audit Logging

Gardener extension controller which configures the rsyslog and auditd services installed on shoot nodes.

Gardener Extension to configure rsyslog with relp module

1 - Configuration

Configuring the Rsyslog Relp Extension

Introduction

As a cluster owner, you might need audit logs at the Shoot node level. With these audit logs you can track actions on your nodes, such as privilege escalation, file integrity changes, and process executions, as well as which user performed them. Such information is essential for the security of your Shoot cluster. Linux operating systems collect such logs via the auditd and journald daemons. However, these logs can be lost if they are only kept locally on the operating system. You need a reliable way to send them to a remote server where they can be stored for longer time periods and retrieved when necessary.

Rsyslog offers a solution for that. It gathers and processes logs from auditd and journald and then forwards them to a remote server. Moreover, rsyslog can make use of the RELP protocol so that logs are sent reliably and no messages are lost.

The shoot-rsyslog-relp extension is used to configure rsyslog on each Shoot node so that the following can take place:

  1. Rsyslog reads logs from the auditd and journald sockets.
  2. The logs are filtered based on the program name and syslog severity of the message.
  3. The logs are enriched with metadata containing the name of the Project in which the Shoot is created, the name of the Shoot, the UID of the Shoot, and the hostname of the node on which the log event occurred.
  4. The enriched logs are sent to the target remote server via the RELP protocol.
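
The enrichment in step 3 can be pictured with a small Python sketch. It only illustrates the data flow and is not how rsyslog or the extension implements it; the `ShootInfo` type, the `enrich` helper, and the field names are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ShootInfo:
    """Shoot metadata attached to every forwarded log (hypothetical field names)."""
    project: str
    shoot: str
    shoot_uid: str


def enrich(message: dict, info: ShootInfo, hostname: str) -> dict:
    """Return a copy of the log message with Shoot and node metadata added."""
    return {**message, **asdict(info), "hostname": hostname}


info = ShootInfo(project="foo", shoot="bar", shoot_uid="<shoot-uid>")
log = {"program": "kubelet", "severity": 4, "msg": "example message"}
enriched = enrich(log, info, hostname="node-1")
print(enriched)
```

The original message is left untouched, so the same record could still be filtered or inspected before it is forwarded.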

The following diagram shows a rough outline of this setup in a Shoot cluster:

[Figure: rsyslog-logging-architecture]

Shoot Configuration

The extension is not globally enabled and must be configured per Shoot cluster. The Shoot specification has to be adapted to include the shoot-rsyslog-relp extension configuration, which specifies the target server to which logs are forwarded, its port, and some optional rsyslog settings described in the examples below.

Below is an example shoot-rsyslog-relp extension configuration as part of the Shoot spec:

kind: Shoot
metadata:
  name: bar
  namespace: garden-foo
...
spec:
  extensions:
  - type: shoot-rsyslog-relp
    providerConfig:
      apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1
      kind: RsyslogRelpConfig
      # Set the target server to which logs are sent. The server must support the RELP protocol.
      target: some.rsyslog-rlep.server
      # Set the port of the target server.
      port: 10250
      # Define rules to select logs from which programs and with what syslog severity
      # are forwarded to the target server.
      loggingRules:
      - severity: 4
        programNames: ["kubelet", "audisp-syslog"]
      - severity: 1
        programNames: ["audisp-syslog"]
      # Define an interval of 90 seconds at which the current connection is broken and re-established.
      # By default this value is 0 which means that the connection is never broken and re-established.
      rebindInterval: 90
      # Set the timeout for relp sessions to 90 seconds. If set too low, valid sessions may be considered
      # dead and rsyslog may try to recover them.
      timeout: 90
      # Set how often an action is retried before it is considered to have failed.
      # Failed actions discard log messages. Setting `-1` here means that messages are never discarded.
      resumeRetryCount: -1
      # Configures rsyslog to report continuation of action suspension, e.g. when the connection to the target
      # server is broken.
      reportSuspensionContinuation: true
      # Add tls settings if tls should be used to encrypt the connection to the target server.
      tls:
        enabled: true
        # Use `name` authentication mode for the tls connection.
        authMode: name
        # Only allow connections if the server's name is `some.rsyslog-rlep.server`
        permittedPeer:
        - "some.rsyslog-rlep.server"
        # Reference to the resource which contains certificates used for the tls connection.
        # It must be added to the `.spec.resources` field of the Shoot.
        secretReferenceName: rsyslog-relp-tls
        # Instruct librelp on the Shoot nodes to use the gnutls tls library.
        tlsLib: gnutls
  resources:
    # Add the rsyslog-relp-tls secret in the resources field of the Shoot spec.
    - name: rsyslog-relp-tls
      resourceRef:
        apiVersion: v1
        kind: Secret
        name: rsyslog-relp-tls-v1
...

Choosing Which Log Messages to Send to the Target Server

The .loggingRules field defines which logs are sent to the target server. When rsyslog processes a log message, it compares the message against the list of rules in order. If the program name matches a rule and the syslog severity of the message is numerically less than or equal to the severity of that rule, the message is forwarded to the target server. A rule without program names applies to all programs. The following table lists the syslog severities and their numerical codes:

Numerical Code    Severity

  0               Emergency: system is unusable
  1               Alert: action must be taken immediately
  2               Critical: critical conditions
  3               Error: error conditions
  4               Warning: warning conditions
  5               Notice: normal but significant condition
  6               Informational: informational messages
  7               Debug: debug-level messages

Below is an example with a .loggingRules section that will only forward logs from the kubelet program with syslog severity of 6 or lower and any other program with syslog severity of 2 or lower:

apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1
kind: RsyslogRelpConfig
target: localhost
port: 1520
loggingRules:
- severity: 6
  programNames: ["kubelet"]
- severity: 2
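
To make the matching semantics concrete, here is a minimal Python sketch of how such rules could be evaluated. It is an illustration only: the real filtering happens inside rsyslog, and the `LoggingRule` class and `should_forward` function are hypothetical names that mirror the two rules of the configuration above.

```python
from dataclasses import dataclass, field


@dataclass
class LoggingRule:
    severity: int
    # An empty list means the rule applies to every program.
    program_names: list = field(default_factory=list)


def should_forward(program: str, severity: int, rules: list) -> bool:
    """Return True if any rule, checked in order, matches the message."""
    for rule in rules:
        program_ok = not rule.program_names or program in rule.program_names
        if program_ok and severity <= rule.severity:
            return True
    return False


rules = [
    LoggingRule(severity=6, program_names=["kubelet"]),
    LoggingRule(severity=2),
]

print(should_forward("kubelet", 6, rules))  # True: kubelet at severity 6 or lower
print(should_forward("kubelet", 7, rules))  # False: debug messages are dropped
print(should_forward("sshd", 3, rules))     # False: other programs need severity 2 or lower
print(should_forward("sshd", 2, rules))     # True
```

Note that a lower numerical code means a more severe message, so "severity 2 or lower" covers Critical, Alert, and Emergency.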

You can use a minimal shoot-rsyslog-relp extension configuration to forward all logs to the target server:

apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1
kind: RsyslogRelpConfig
target: some.rsyslog-rlep.server
port: 10250
loggingRules:
- severity: 7

Securing the Communication to the Target Server with TLS

The communication to the target server is not encrypted by default. To enable encryption, set the .tls.enabled field in the shoot-rsyslog-relp extension configuration to true. In this case, a secret which contains the TLS certificates used to establish the TLS connection to the server must be created in the same project namespace as your Shoot.

An example Secret is given below:

kind: Secret
apiVersion: v1
metadata:
  name: rsyslog-relp-tls-v1
  namespace: garden-foo
stringData:
  ca: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  crt: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  key: |
    -----BEGIN RSA PRIVATE KEY-----
    ...
    -----END RSA PRIVATE KEY-----
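
Kubernetes expects the values under a Secret's `data` field to be base64-encoded (the `stringData` field accepts plain text instead). As a minimal sketch, the following Python snippet builds such a manifest with encoded values. The `secret_data` helper is hypothetical, and the placeholder strings stand in for the contents of your real PEM files.

```python
import base64
import json


def secret_data(ca_pem: str, crt_pem: str, key_pem: str) -> dict:
    """Base64-encode PEM contents for the `data` field of a Secret."""
    def encode(pem: str) -> str:
        return base64.b64encode(pem.encode("utf-8")).decode("ascii")

    return {"ca": encode(ca_pem), "crt": encode(crt_pem), "key": encode(key_pem)}


manifest = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "rsyslog-relp-tls-v1", "namespace": "garden-foo"},
    # Placeholders; read the real certificate and key files here instead.
    "data": secret_data("<ca pem>", "<crt pem>", "<key pem>"),
}
print(json.dumps(manifest, indent=2))
```

The printed JSON is a valid manifest that can be saved to a file and applied like its YAML equivalent.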

The Secret must be referenced in the Shoot’s .spec.resources field and the corresponding resource entry must be referenced in the .tls.secretReferenceName of the shoot-rsyslog-relp extension configuration:

kind: Shoot
metadata:
  name: bar
  namespace: garden-foo
...
spec:
  extensions:
  - type: shoot-rsyslog-relp
    providerConfig:
      apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1
      kind: RsyslogRelpConfig
      target: some.rsyslog-rlep.server
      port: 10250
      loggingRules:
      - severity: 7
      tls:
        enabled: true
        secretReferenceName: rsyslog-relp-tls
  resources:
    - name: rsyslog-relp-tls
      resourceRef:
        apiVersion: v1
        kind: Secret
        name: rsyslog-relp-tls-v1
...
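
Because the reference goes through two levels (extension configuration to resource entry to Secret), a mismatch in names is easy to introduce. The following Python sketch shows one way to check the chain locally before applying a Shoot. It is not the validation performed by the extension's admission component; the `resolve_tls_secret_name` function is hypothetical and operates on the Shoot's `spec` loaded as a plain dictionary.

```python
from typing import Optional


def resolve_tls_secret_name(shoot_spec: dict) -> Optional[str]:
    """Return the name of the Secret used for TLS, or None if TLS is disabled.

    Raises ValueError if TLS is enabled but the reference cannot be resolved.
    """
    for ext in shoot_spec.get("extensions", []):
        if ext.get("type") != "shoot-rsyslog-relp":
            continue
        tls = ext.get("providerConfig", {}).get("tls", {})
        if not tls.get("enabled"):
            return None
        ref_name = tls.get("secretReferenceName")
        if not ref_name:
            raise ValueError("tls.enabled is true but tls.secretReferenceName is empty")
        for res in shoot_spec.get("resources", []):
            ref = res.get("resourceRef", {})
            if res.get("name") == ref_name and ref.get("kind") == "Secret":
                return ref["name"]
        raise ValueError(f"no Secret resource named {ref_name!r} in .spec.resources")
    return None


spec = {
    "extensions": [{
        "type": "shoot-rsyslog-relp",
        "providerConfig": {
            "tls": {"enabled": True, "secretReferenceName": "rsyslog-relp-tls"},
        },
    }],
    "resources": [{
        "name": "rsyslog-relp-tls",
        "resourceRef": {"apiVersion": "v1", "kind": "Secret", "name": "rsyslog-relp-tls-v1"},
    }],
}
print(resolve_tls_secret_name(spec))  # rsyslog-relp-tls-v1
```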

You can set a few additional parameters for the TLS connection: .tls.authMode, .tls.permittedPeer, and .tls.tlsLib. Refer to the rsyslog documentation for more information on these parameters.

2 - Getting Started

Deploying Rsyslog Relp Extension Locally

This document will walk you through running the Rsyslog Relp extension and a fake rsyslog relp service on your local machine for development purposes. This guide uses Gardener’s local development setup and builds on top of it.

If you encounter difficulties, please open an issue so that we can make this process easier.

Prerequisites

  • Make sure that you have a running local Gardener setup. The steps to complete this can be found in the Gardener documentation on deploying Gardener locally.
  • Make sure you are running Gardener version >= 1.74.0 or the latest version of the master branch.

Setting up the Rsyslog Relp Extension

Important: Make sure that your KUBECONFIG env variable is targeting the local Gardener cluster!

make extension-up

This builds the shoot-rsyslog-relp, shoot-rsyslog-relp-admission, and shoot-rsyslog-relp-echo-server images and deploys the needed resources and configurations in the garden cluster. The shoot-rsyslog-relp-echo-server acts as a development replacement for a real rsyslog relp server.

Creating a Shoot Cluster

Once the above step is completed, we can deploy and configure a Shoot cluster with default rsyslog relp settings.

kubectl apply -f ./example/shoot.yaml

Once the Shoot’s namespace is created, we can create a NetworkPolicy that allows egress traffic from rsyslog on the Shoot’s nodes to the rsyslog-relp-echo-server, which serves as a fake rsyslog target server.

kubectl apply -f ./example/local/allow-machine-to-rsyslog-relp-echo-server-netpol.yaml

Currently, the Shoot’s nodes run Ubuntu, which does not have the rsyslog-relp and auditd packages installed, so the configuration done by the extension has no effect. Once the Shoot is created, we have to manually install the rsyslog-relp and auditd packages:

kubectl -n shoot--local--local exec -it $(kubectl -n shoot--local--local get po -l app=machine,machine-provider=local -o name) -- bash -c "
   apt-get update && \
   apt-get install -y rsyslog-relp auditd && \
   systemctl enable rsyslog.service && \
   systemctl start rsyslog.service"

Once that is done we can verify that log messages are forwarded to the rsyslog-relp-echo-server by checking its logs.

kubectl -n rsyslog-relp-echo-server logs deployment/rsyslog-relp-echo-server

Making Changes to the Rsyslog Relp Extension

Changes to the rsyslog relp extension can be applied to the local environment by running the make recipe again.

make extension-up

Tearing Down the Development Environment

To tear down the development environment, delete the Shoot cluster or disable the shoot-rsyslog-relp extension in the Shoot’s spec. When the extension is not used by the Shoot anymore, you can run:

make extension-down

This will delete the ControllerRegistration and ControllerDeployment of the extension, the shoot-rsyslog-relp-admission deployment, and the rsyslog-relp-echo-server deployment.

Maintaining the Publicly Available Image for the rsyslog-relp Echo Server

The testmachinery tests use an rsyslog-relp-echo-server image from a publicly available repository. The one which is currently used is eu.gcr.io/gardener-project/gardener/extensions/shoot-rsyslog-relp-echo-server:v0.1.0.

Sometimes it might be necessary to update the image and publish it, e.g., when updating the Alpine base image version specified in the repository’s Dockerfile.

To do that:

  1. Bump the version with which the image is built in the Makefile.

  2. Build the shoot-rsyslog-relp-echo-server image:

    make echo-server-docker-image
    
  3. Once the image is built, push it to gcr with:

    make push-echo-server-image
    
  4. Finally, bump the version of the image used by the testmachinery tests.

  5. Create a PR with the changes.

3 - Shoot Rsyslog Relp

Developer Docs for Gardener Shoot Rsyslog Relp Extension

This document outlines how Shoot reconciliation and deletion work for a Shoot with the shoot-rsyslog-relp extension enabled.

Shoot Reconciliation

This section outlines how the reconciliation works for a Shoot with the shoot-rsyslog-relp extension enabled.

Extension Enablement / Reconciliation

This section outlines how the extension enablement/reconciliation works, e.g., when the extension has been added to the Shoot spec.

  1. As part of the Shoot reconciliation flow, the gardenlet deploys the Extension resource.
  2. The shoot-rsyslog-relp extension reconciles the Extension resource. pkg/controller/lifecycle/actuator.go contains the implementation of the extension.Actuator interface. Reconciling an Extension of type shoot-rsyslog-relp only deploys the necessary monitoring configuration: the shoot-rsyslog-relp-prometheus ConfigMap, which contains the Prometheus scrape configuration, the alerting rules, and the Plutono dashboard for the rsyslog component.
  3. As part of the Shoot reconciliation flow, the gardenlet deploys the OperatingSystemConfig resource.
  4. The shoot-rsyslog-relp extension serves a webhook that mutates the OperatingSystemConfig resource for Shoots having the shoot-rsyslog-relp extension enabled (the corresponding namespace gets labeled by the gardenlet with extensions.gardener.cloud/shoot-rsyslog-relp=true). pkg/webhook/operatingsystemconfig/ensurer.go contains the implementation of the genericmutator.Ensurer interface.
    1. The webhook renders the 60-audit.conf.tpl template script and appends it to the OperatingSystemConfig files. When rendering the template, the configuration of the shoot-rsyslog-relp extension is used to fill in the required template values. The file is installed as /var/lib/rsyslog-relp-configurator/rsyslog.d/60-audit.conf on the host OS.
    2. The webhook appends the audit rules to the OperatingSystemConfig. The files are installed under /var/lib/rsyslog-relp-configurator/rules.d on the host OS.
    3. The webhook renders the configure-rsyslog.tpl.sh script and appends it to the OperatingSystemConfig files. This script is installed as /var/lib/rsyslog-relp-configurator/configure-rsyslog.sh on the host OS. It keeps the configuration of the rsyslog systemd service up-to-date by copying /var/lib/rsyslog-relp-configurator/rsyslog.d/60-audit.conf to /etc/rsyslog.d/60-audit.conf, if /etc/rsyslog.d/60-audit.conf does not exist or the files differ. The script also takes care of syncing the audit rules in /etc/audit/rules.d with the ones installed in /var/lib/rsyslog-relp-configurator/rules.d and restarts the auditd systemd service if necessary.
    4. The webhook renders the process-rsyslog-pstats.tpl.sh script and appends it to the OperatingSystemConfig files. This script receives metrics from the rsyslog process, transforms them, and writes them to /var/lib/node-exporter/textfile-collector/rsyslog_pstats.prom so that they can be collected by the node-exporter.
    5. As part of the Shoot reconciliation, before the shoot-rsyslog-relp extension is deployed, the gardenlet copies all Secret and ConfigMap resources referenced in .spec.resources[] to the Shoot’s control plane namespace on the Seed. When the .tls.enabled field is true in the shoot-rsyslog-relp extension configuration, a value for .tls.secretReferenceName must also be specified so that it references a named resource reference in the Shoot’s .spec.resources[] array. The webhook appends the data of the referenced Secret in the Shoot’s control plane namespace to the OperatingSystemConfig files.
    6. The webhook appends the rsyslog-configurator.service unit to the OperatingSystemConfig units. The unit invokes the configure-rsyslog.sh script every 15 seconds.
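
The "copy only if missing or different" behavior described for configure-rsyslog.sh can be sketched in Python as follows. This is an illustration of the idea under simplifying assumptions, not the script shipped by the extension: the `sync_file` helper is hypothetical, and temporary directories stand in for /var/lib/rsyslog-relp-configurator and /etc/rsyslog.d.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def sync_file(source: Path, target: Path) -> bool:
    """Copy source to target if target is missing or differs. Return True if copied."""
    if target.exists() and filecmp.cmp(source, target, shallow=False):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return True


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "configurator" / "60-audit.conf"
    dst = Path(tmp) / "etc" / "60-audit.conf"
    src.parent.mkdir(parents=True)
    src.write_text("# rendered rsyslog configuration\n")

    first = sync_file(src, dst)    # target is missing, so it is copied
    second = sync_file(src, dst)   # files are identical, nothing to do
    src.write_text("# updated\n")
    third = sync_file(src, dst)    # contents differ, so it is copied again
    final = dst.read_text()

print(first, second, third)
```

Returning whether a copy took place makes it easy for a caller to decide if a service needs to be restarted, which keeps a frequently repeated invocation cheap when nothing has changed.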

Extension Disablement

This section outlines how the extension disablement works, i.e., when the extension has been removed from the Shoot spec.

  1. As part of the Shoot reconciliation flow, the gardenlet destroys the Extension resource because it is no longer needed.
    1. As part of the deletion flow, the shoot-rsyslog-relp extension deploys the rsyslog-relp-configuration-cleaner DaemonSet to the Shoot cluster to clean up the existing rsyslog configuration and revert the audit rules.

Shoot Deletion

This section outlines how the deletion works for a Shoot with the shoot-rsyslog-relp extension enabled.

  1. As part of the Shoot deletion flow, the gardenlet destroys the Extension resource.
    1. In the Shoot deletion flow, the Extension resource is deleted after the Worker resource. Hence, there is no need to deploy the rsyslog-relp-configuration-cleaner DaemonSet to the Shoot cluster to clean up the existing rsyslog configuration and revert the audit rules.