Introduction to the Compliance Operator

The OpenShift Compliance Operator helps keep your OpenShift cluster secure and in line with governance policies. It does this by scanning both the OpenShift Container Platform 4 (ocp4) and Red Hat Enterprise Linux CoreOS 4 (rhcos4). Now, let’s break down the types of compliance checks it can handle:

  • Platform (Cluster) Related Checks: These checks use the OpenShift API to verify the configuration of cluster-level resources.
  • Node Related Checks: These scans take a closer look at the filesystem of each node.

So, why should you use the Compliance Operator? Well, it’s a great companion to the Red Hat Advanced Cluster Security (RHACS) Operator, helping you follow security best practices. If you are not familiar with RHACS, I recommend checking out my previous blog on the topic. Now, you might be wondering: if RHACS already covers compliance, why bother with the Compliance Operator? Let me explain the key differences:

  • Granular Control: Unlike some other tools, the Compliance Operator doesn’t just assess running workloads on the cluster (like deployments). It also checks the configuration of the platform itself and of each node.
  • Customization Options: It lets you tailor compliance checks to suit your organization’s specific needs. We will dive deeper into this when we talk about remediation.
  • Automated Remediation: The Compliance Operator can automatically fix certain issues that its scans find.

So, as you can see, the Compliance Operator provides a range of features that complement RHACS and give you more control and flexibility.

Understanding the Different Parts of the Compliance Operator

If you’re looking to wrap your head around the Compliance Operator, the official documentation is your go-to resource. It’s got a handy diagram that breaks down each component and makes things easier to grasp.

Architecture - Compliance Operator

Now, let’s take a quick peek into the main workflow involved in compliance management:

Step 1: Figure Out What You Need to Comply With

The Compliance Operator offers two profile bundles: one for Platform (ocp4) checks and another for Node (rhcos4) checks. Each bundle points to an XML file, ssg-ocp4-ds.xml or ssg-rhcos4-ds.xml, which acts as the source of truth: it defines the compliance profiles and rules used to scan our environment. All the checks and their values are packed into these XML files, which ship in the openshift-compliance-content-rhel8 container image in the Red Hat registry.

rbajaj@rhcos4$ oc get profilebundle
NAME     CONTENTIMAGE                                                                                                                               CONTENTFILE         STATUS
ocp4     registry.redhat.io/compliance/openshift-compliance-content-rhel8@sha256:7c1285f294f4630766bf158924ea7eff9b6549b2a9881e6dea6bad07887525a0   ssg-ocp4-ds.xml     VALID
rhcos4   registry.redhat.io/compliance/openshift-compliance-content-rhel8@sha256:7c1285f294f4630766bf158924ea7eff9b6549b2a9881e6dea6bad07887525a0   ssg-rhcos4-ds.xml   VALID
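Before relying on a bundle, it is worth confirming that its STATUS is VALID. The following is a minimal sketch of a filter for that; invalid_bundles is a hypothetical helper name (not part of oc), and it assumes the table layout shown above, with STATUS as the last column.

```shell
# Print the name of every profile bundle whose STATUS column is not VALID.
# Reads the table printed by `oc get profilebundle` on stdin.
invalid_bundles() {
  # NR > 1 skips the header row; NF > 0 skips blank lines.
  awk 'NR > 1 && NF > 0 && $NF != "VALID" { print $1 }'
}

# Usage against a live cluster (not run here):
#   oc get profilebundle | invalid_bundles
```

If the helper prints nothing, every bundle in the current namespace reported VALID.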

Next come the compliance profiles. These are sets of security guidelines defined by frameworks such as CIS (Center for Internet Security) and NIST (National Institute of Standards and Technology). Compliance profiles usually come in pairs. Take ocp4-cis and ocp4-cis-node, for example: for the CIS standard, the ocp4-cis profile scans Kubernetes API resources, while the ocp4-cis-node profile scans the nodes themselves.

rbajaj@rhcos4$ oc get profiles.compliance.openshift.io
NAME                       AGE   VERSION
ocp4-cis                   53d   1.4.0
ocp4-cis-1-4               53d   1.4.0
ocp4-cis-node              53d   1.4.0
ocp4-cis-node-1-4          53d   1.4.0
ocp4-e8                    53d
ocp4-high                  53d   Revision 4
ocp4-high-node             53d   Revision 4
ocp4-high-node-rev-4       53d   Revision 4
ocp4-high-rev-4            53d   Revision 4
ocp4-moderate              53d   Revision 4
ocp4-moderate-node         53d   Revision 4
ocp4-moderate-node-rev-4   53d   Revision 4
ocp4-moderate-rev-4        53d   Revision 4
ocp4-nerc-cip              53d
ocp4-nerc-cip-node         53d
ocp4-pci-dss               53d   3.2.1
ocp4-pci-dss-3-2           53d   3.2.1
ocp4-pci-dss-node          53d   3.2.1
ocp4-pci-dss-node-3-2      53d   3.2.1
ocp4-stig                  53d   V1R1
ocp4-stig-node             53d   V1R1
ocp4-stig-node-v1r1        53d   V1R1
ocp4-stig-v1r1             53d   V1R1
rhcos4-e8                  53d
rhcos4-high                53d   Revision 4
rhcos4-high-rev-4          53d   Revision 4
rhcos4-moderate            53d   Revision 4
rhcos4-moderate-rev-4      53d   Revision 4
rhcos4-nerc-cip            53d
rhcos4-stig                53d   V1R1
rhcos4-stig-v1r1           53d   V1R1
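To see the platform/node pairing at a glance, here is a small sketch that lists every profile that has a matching -node counterpart in the output above. The profile_pairs name is hypothetical, and the rule it applies (same name plus a -node suffix) is a simplifying assumption: versioned names such as ocp4-cis-1-4 and ocp4-cis-node-1-4 are not matched by it.

```shell
# List "<profile> <profile>-node" for every profile that has a -node twin.
# Reads the table printed by `oc get profiles.compliance.openshift.io` on stdin.
profile_pairs() {
  awk 'NR > 1 && NF > 0 { seen[$1] = 1; names[++n] = $1 }
       END {
         for (i = 1; i <= n; i++) {
           node = names[i] "-node"
           if (node in seen) print names[i], node
         }
       }'
}

# Usage against a live cluster (not run here):
#   oc get profiles.compliance.openshift.io | profile_pairs
```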

Pro tip - 1: If you run oc get profiles, you might see the following output:

rbajaj@rhcos4$ oc get profiles
No resources found in openshift-compliance namespace.

CRDs can have long names that are tedious to type, and several CRDs may share the same prefix, which can lead to selecting the wrong one. In this case, we want profiles.compliance.openshift.io, but oc might resolve profiles to profiles.tuned.openshift.io instead. To avoid this, run oc get crd | grep compliance first, and then use the fully qualified name.

rbajaj@rhcos4$ oc get crd | grep profile
profiles.compliance.openshift.io
profiles.tuned.openshift.io

rbajaj@rhcos4$ oc get crd | grep compliance
compliancecheckresults.compliance.openshift.io
complianceremediations.compliance.openshift.io
compliancescans.compliance.openshift.io
compliancesuites.compliance.openshift.io
profilebundles.compliance.openshift.io
profiles.compliance.openshift.io
rules.compliance.openshift.io
scansettingbindings.compliance.openshift.io
scansettings.compliance.openshift.io
tailoredprofiles.compliance.openshift.io
variables.compliance.openshift.io

# Once you have identified which CRD you want to use:
rbajaj@rhcos4$ oc get profiles.compliance.openshift.io
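A plain grep profile also matches profilebundles and tailoredprofiles. As a rough sketch, the filter below prints only the CRDs whose plural name is exactly the one you ask for, which makes the name clash obvious. crds_named is a hypothetical helper name, and it assumes the CRD name is the first column of oc get crd output.

```shell
# Print every CRD whose plural name (the part before the first dot)
# equals the name given as $1. Reads `oc get crd` output on stdin.
crds_named() {
  awk -v want="$1" '{ split($1, part, "."); if (part[1] == want) print $1 }'
}

# Usage against a live cluster (not run here):
#   oc get crd | crds_named profiles
```

If more than one line comes back, use the fully qualified name in your oc get commands.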

Pro tip - 2: Before deciding which profile to use for scanning the compliance of your infrastructure, review the rules that a profile checks and the rationale behind each rule.

rbajaj@rhcos4$ oc get profiles.compliance.openshift.io ocp4-cis -o yaml
apiVersion: compliance.openshift.io/v1alpha1
description: This profile defines a baseline that aligns to the Center for Internet
  Security® Red Hat OpenShift Container Platform 4 Benchmark™, V1.5. This profile
  includes Center for Internet Security® Red Hat OpenShift Container Platform 4 CIS
  Benchmarks™ content. Note that this part of the profile is meant to run on the Platform
  that Red Hat OpenShift Container Platform 4 runs on top of. This profile is applicable
  to OpenShift versions 4.12 and greater.
... skipped metadata ...
rules:
- ocp4-accounts-restrict-service-account-tokens
- ocp4-accounts-unique-service-account
- ocp4-api-server-admission-control-plugin-alwaysadmit
- ocp4-api-server-admission-control-plugin-alwayspullimages
- ocp4-api-server-admission-control-plugin-namespacelifecycle
- ocp4-api-server-admission-control-plugin-noderestriction
- ocp4-api-server-admission-control-plugin-scc
- ocp4-api-server-admission-control-plugin-service-account
- ocp4-api-server-anonymous-auth
- ocp4-api-server-api-priority-gate-enabled
- ocp4-api-server-audit-log-maxbackup
- ocp4-api-server-audit-log-maxsize
- ocp4-api-server-audit-log-path
- ocp4-api-server-auth-mode-no-aa
- ocp4-api-server-auth-mode-rbac
- ocp4-api-server-basic-auth
- ocp4-api-server-bind-address
- ocp4-api-server-client-ca
- ocp4-api-server-encryption-provider-cipher
- ocp4-api-server-etcd-ca
- ocp4-api-server-etcd-cert
... and so on

You can check the contents of each rule with the following command:

# Replace ocp4-accounts-restrict-service-account-tokens with the rule you want to check the contents of.
rbajaj@rhcos4$ oc get rule ocp4-accounts-restrict-service-account-tokens -o yaml

The output explains the rationale behind the check, along with instructions for verifying whether the rule is satisfied in your cluster. It gives you the commands to check whether your infrastructure complies and, if it does not, recommends the settings to change.
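When you only want one part of a rule, a jsonpath query saves scrolling through the full YAML. This is a minimal sketch: rule_field is a hypothetical wrapper, and the field names mentioned in the comment (title, description, rationale, severity, instructions) are my assumption about the top-level fields of a Rule object, so check them against the YAML output on your cluster.

```shell
# Print a single top-level field of a compliance Rule.
#   $1 = rule name
#   $2 = field name (assumed examples: title, description, rationale,
#        severity, instructions)
rule_field() {
  oc get rules.compliance.openshift.io "$1" -o "jsonpath={.$2}"
}

# Usage against a live cluster (not run here):
#   rule_field ocp4-accounts-restrict-service-account-tokens rationale
```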

Step 2: Configure How Your Scans Should Work

This is where you get to specify how you want your compliance scans to be set up. You will need to define scan settings that align with your compliance requirements. The scan settings define technical parameters for the scan. Let us talk about a few important sections of the scan settings:

rbajaj@rhcos4$ oc get ss <scan-setting-name> -o yaml
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSetting
... skipped metadata ...
rawResultStorage:
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  pvAccessModes:
  - ReadWriteMany
  rotation: 3
  size: 1Gi
  storageClassName: <dynamically-provisioned-fs>
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
roles:
- master
- worker
- infra
scanTolerations:
- operator: Exists
schedule: 0 1 * * *
strictNodeScan: true

The scan settings specify which nodes to scan through the roles field. The schedule field is a cron expression that determines when the scan runs; in this example, every day at 01:00. The rawResultStorage section defines where the raw scan results are stored: the storage class, size, and access mode of the volume, how many result sets to keep (rotation), and, through nodeSelector and tolerations, which node hosts the result storage.
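If cron syntax is not something you read every day, the sketch below labels the five fields of a schedule value in the standard cron order. explain_cron is a hypothetical helper; it only names the fields and does not validate them.

```shell
# Label the five fields of a cron expression such as "0 1 * * *".
explain_cron() {
  # read does not expand "*" into file names, so the fields stay intact.
  IFS=' ' read -r minute hour dom month dow <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
}

# Usage:
#   explain_cron '0 1 * * *'
```

For the schedule in the scan setting above, this shows minute 0 of hour 1, with the remaining three fields left as wildcards.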

Step 3: Connect Compliance Requirements with Scan Configurations

This step is all about linking the profiles that you need to comply with to the way (scan settings) you want to scan. By doing this, you ensure that the scans are performed according to the rules you have laid out. You can achieve this by defining a scan setting binding (SSB).

rbajaj@rhcos4$ oc get ssb <scan-setting-binding-name> -o yaml
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
... skipped metadata ...
profiles:
- apiGroup: compliance.openshift.io/v1alpha1
  kind: Profile
  name: ocp4-cis-node
settingsRef:
  apiGroup: compliance.openshift.io/v1alpha1
  kind: ScanSetting
  name: <scan-setting-name>
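If you are creating a binding rather than reading one, a sketch like the following can generate the manifest. render_ssb is a hypothetical helper, the three names in the usage line are placeholders, and the openshift-compliance namespace is assumed from the earlier command output. The remaining fields mirror the binding shown above.

```shell
# Print a ScanSettingBinding manifest.
#   $1 = binding name, $2 = profile name, $3 = scan setting name
render_ssb() {
  cat <<EOF
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
metadata:
  name: $1
  namespace: openshift-compliance
profiles:
- apiGroup: compliance.openshift.io/v1alpha1
  kind: Profile
  name: $2
settingsRef:
  apiGroup: compliance.openshift.io/v1alpha1
  kind: ScanSetting
  name: $3
EOF
}

# Usage against a live cluster (not run here; names are placeholders):
#   render_ssb my-binding ocp4-cis-node my-scan-setting | oc apply -f -
```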

Step 4: Keep an Eye on Compliance Scans

To stay on top of things, you will want to use compliance suites to monitor your compliance scans. These suites help you effectively manage and organize the scanning process.

rbajaj@rhcos4$ oc get compliancesuites
NAME                         PHASE   RESULT
<scan-setting-binding>-cis   DONE    ERROR

The compliance suite is named after your SSB, suffixed with -cis. The result of the compliance suite indicates the outcome of the scan. In our case, ERROR indicates that the cluster is not compliant with the rules defined by the profiles and that there were errors during the execution of the scan. Other possible result values include:

  • COMPLIANT: This indicates that the cluster is compliant with the rules defined by the profiles and the scans performed on the platform and the nodes.

To find out why the compliance suite resulted in an error, we can check the compliancescans resource. It shows the result for each role, so we can see exactly which scan produced the error.

rbajaj@rhcos4$ oc get compliancescans
NAME                   PHASE   RESULT
ocp4-cis-node-infra    DONE    ERROR
ocp4-cis-node-master   DONE    NON-COMPLIANT
ocp4-cis-node-worker   DONE    INCONSISTENT 

Wow! These are some serious compliance failures in my cluster. Don’t worry, we can fix all of these, but we will address them in the next blog. For now, let’s just understand what these results really mean:

  • ERROR: This means that the scan could not complete for some reason, such as the pod responsible for performing the scan not being scheduled.
  • NON-COMPLIANT: This simply means that a few checks are failing, and we can resolve them using the Compliance Check Results (Step 5).
  • INCONSISTENT: This means that two nodes with the same role yield different check results. In this case, we need to either adjust the list of roles or look for nodes with the same role that have different configurations. This should not normally happen, since machine config pools manage these nodes, and nodes with the same role are expected to have similar configurations.
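To keep these meanings next to the scan list, here is a small sketch that annotates each scan with a short hint. annotate_scans is a hypothetical helper, the hint texts are my own summaries of the three results described above, and any other result value gets a generic hint.

```shell
# Annotate each row of `oc get compliancescans` output with a short hint.
# Expects the columns NAME, PHASE, RESULT on stdin.
annotate_scans() {
  awk 'NR > 1 && NF >= 3 {
    if ($3 == "ERROR") hint = "scan could not complete"
    else if ($3 == "NON-COMPLIANT") hint = "some checks failed"
    else if ($3 == "INCONSISTENT") hint = "nodes with the same role disagree"
    else hint = "see the scan status for details"
    printf "%s: %s (%s)\n", $1, $3, hint
  }'
}

# Usage against a live cluster (not run here):
#   oc get compliancescans | annotate_scans
```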

Step 5: Review the Results

Once the scans are done, you can check out the Compliance Check Results (CCR). These give you insights into the compliance status of your environment.

rbajaj@rhcos4$ oc get ccr
NAME                                                                          STATUS           SEVERITY
ocp4-cis-node-infra-file-groupowner-cni-conf                                  PASS             medium
ocp4-cis-node-infra-file-groupowner-kubelet-conf                              PASS             medium
ocp4-cis-node-infra-file-groupowner-multus-conf                               PASS             medium
ocp4-cis-node-infra-file-groupowner-ovn-cni-server-sock                       PASS             medium
ocp4-cis-node-infra-file-groupowner-ovn-db-files                              PASS             medium
ocp4-cis-node-infra-file-groupowner-ovs-conf-db                               PASS             medium
ocp4-cis-node-infra-file-groupowner-ovs-conf-db-lock                          PASS             medium
ocp4-cis-node-infra-file-groupowner-ovs-pid                                   PASS             medium
ocp4-cis-node-infra-file-groupowner-ovs-sys-id-conf                           PASS             medium
ocp4-cis-node-infra-file-groupowner-ovs-vswitchd-pid                          PASS             medium
ocp4-cis-node-infra-file-groupowner-ovsdb-server-pid                          PASS             medium
ocp4-cis-node-infra-file-groupowner-worker-ca                                 PASS             medium
ocp4-cis-node-infra-file-groupowner-worker-kubeconfig                         PASS             medium
ocp4-cis-node-infra-file-groupowner-worker-service                            PASS             medium
ocp4-cis-node-infra-file-owner-cni-conf                                       PASS             medium
ocp4-cis-node-infra-file-owner-kubelet                                        PASS             medium
ocp4-cis-node-infra-file-owner-kubelet-conf                                   PASS             medium
ocp4-cis-node-infra-file-owner-multus-conf                                    PASS             medium
ocp4-cis-node-infra-file-owner-ovn-cni-server-sock                            PASS             medium
ocp4-cis-node-infra-file-owner-ovn-db-files                                   PASS             medium
ocp4-cis-node-infra-file-owner-ovs-conf-db                                    PASS             medium
ocp4-cis-node-infra-file-owner-ovs-conf-db-lock                               PASS             medium
ocp4-cis-node-infra-file-owner-ovs-pid                                        PASS             medium
ocp4-cis-node-infra-file-owner-ovs-sys-id-conf                                PASS             medium
ocp4-cis-node-infra-file-owner-ovs-vswitchd-pid                               PASS             medium
ocp4-cis-node-infra-file-owner-ovsdb-server-pid                               PASS             medium
ocp4-cis-node-infra-file-owner-worker-ca                                      PASS             medium
ocp4-cis-node-infra-file-owner-worker-kubeconfig                              PASS             medium
ocp4-cis-node-infra-file-owner-worker-service                                 PASS             medium
ocp4-cis-node-infra-file-permissions-cni-conf                                 FAIL             medium
ocp4-cis-node-infra-file-permissions-kubelet-conf                             PASS             medium
ocp4-cis-node-infra-file-permissions-multus-conf                              PASS             medium
... many more

The CCR is a long list of all the checks and their individual results. The most common statuses are PASS, FAIL, and MANUAL. MANUAL means the check is not automated: the rule describes the manual steps, and you have to verify your cluster’s compliance with the rule yourself. We will talk about remediating failed checks in future blogs.
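Because the list is long, a per-status count is often more useful than the raw table. The sketch below tallies the STATUS column; summarize_ccr is a hypothetical helper name, and it assumes the three-column layout shown above.

```shell
# Count compliance check results per status.
# Reads the table printed by `oc get ccr` on stdin.
summarize_ccr() {
  awk 'NR > 1 && NF >= 2 { count[$2]++ }
       END { for (s in count) print s, count[s] }' | LC_ALL=C sort
}

# Usage against a live cluster (not run here):
#   oc get ccr | summarize_ccr
```

To list only the failing checks, the operator also labels each result, so a selector such as oc get ccr -l compliance.openshift.io/check-status=FAIL should work; verify the label on your operator version.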

Quick Overview of the Scanning Process

At the beginning of this blog, we saw that the operator can perform two types of compliance checks: platform and node. Let’s conclude with a brief look at how the checks are performed in each case.

For node-related checks, the scan runs multiple pods, each scheduled on one cluster node. Each pod uses a hostPath volume to mount the node’s root filesystem and then runs OpenSCAP to scan it.

For platform-related checks, a separate API check pod is run. This pod scans the OpenShift API resources. It runs an API resource collector, saving these resources on the local filesystem in the pod. Then, the pod runs OpenSCAP, which scans the API resources.