Blog: A Closer Look at NSA/CISA Kubernetes Hardening Guidance

Authors: Jim Angel (Google), Pushkar Joglekar (VMware), and Savitha
Raghunathan (Red Hat)

Background

USA’s National Security Agency (NSA) and the Cybersecurity and Infrastructure
Security Agency (CISA) released "Kubernetes Hardening Guidance"
on August 3rd, 2021. The guidance details threats to Kubernetes environments
and provides secure configuration guidance to minimize risk.

The following sections of this blog correlate to the sections in the NSA/CISA guidance.
Any sections not covered here are skipped because we had little to add to the
existing content.

Note: This blog post is not a substitute for reading the guide. Reading the published
guidance is recommended before proceeding as the following content is
complementary.

Introduction and Threat Model

Note that the threats identified as important by the NSA/CISA, or by the intended audience of this guidance, may be different from the threats that other enterprise users of Kubernetes consider important. This section
is still useful for organizations that care about data theft, resource theft, and
service unavailability.

The guidance highlights the following three sources of compromises:

  • Supply chain risks
  • Malicious threat actors
  • Insider threats (administrators, users, or cloud service providers)

The threat model tries to take a step back and review threats that not only
exist within the boundary of a Kubernetes cluster but also include the underlying
infrastructure and surrounding workloads that Kubernetes does not manage.

For example, when a workload outside the cluster shares the same physical
network, it has access to the kubelet and to control plane components: etcd, controller manager, scheduler and API
server. Therefore, the guidance recommends having network level isolation
separating Kubernetes clusters from other workloads that do not need connectivity
to Kubernetes control plane nodes. Specifically, the scheduler, controller-manager, and
etcd only need to be accessible to the API server. Any interactions with Kubernetes
from outside the cluster can happen by providing access to the API server port.

The list of ports and protocols for each of these components is
defined in Ports and Protocols
within the Kubernetes documentation.

Special note: kube-scheduler and kube-controller-manager use different ports than the ones mentioned in the guidance.

The Threat modelling section
from the CNCF Cloud Native Security Whitepaper + Map
provides another perspective on threat modelling Kubernetes, through a
cloud native lens.

Kubernetes Pod security

Kubernetes by default does not guarantee strict workload isolation between pods
running on the same node in a cluster. However, the guidance provides several
techniques to enhance existing isolation and reduce the attack surface in case of a
compromise.

“Non-root” containers and “rootless” container engines

Several best practices related to the basic security principle of least privilege
(provide only the permissions that are needed; no more, no less) are worth a
second look.

The guide recommends setting a non-root user at build time instead of relying on
setting runAsUser at runtime in your Pod spec. This is a good practice and provides
some level of defense in depth. For example, consider a container image built with user 10001
whose Pod spec omits the runAsUser field in its Deployment object. In this
case there are certain edge cases that are worth exploring for awareness:

  1. Pods can fail to start if the user defined at build time is different from
    the one defined in the Pod spec and some files are, as a result, inaccessible.
  2. Pods can end up sharing User IDs unintentionally. This can be problematic
    even if the User IDs are non-zero in a situation where a container escape to
    the host file system is possible. Once the attacker has access to the host file
    system, they get access to all the file resources that are owned by other
    unrelated pods that share the same UID.
  3. Pods can end up sharing User IDs with other node-level processes not managed
    by Kubernetes, e.g. node-level daemons for auditing, vulnerability scanning, and
    telemetry. The threat is similar to the one above, where host file system
    access can give an attacker full access to these node-level daemons without
    needing to be root on the node.

However, none of these cases will have as severe an impact as a container
running as root being able to escape as a root user on the host, which can provide
an attacker with complete control of the worker node, further allowing lateral
movement to other worker or control plane nodes.
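
As a minimal sketch of this defense-in-depth pattern (the image name, UID, and group ID below are illustrative assumptions), the Pod spec can pin the same non-root identity at runtime that the image already declares with a build-time USER instruction:

    apiVersion: v1
    kind: Pod
    metadata:
      name: nonroot-example                      # hypothetical name
    spec:
      securityContext:
        runAsNonRoot: true                       # refuse to start the pod if it would run as UID 0
      containers:
      - name: app
        image: registry.example.com/app:1.0.0    # assumed to be built with `USER 10001`
        securityContext:
          runAsUser: 10001                       # matches the UID baked into the image
          runAsGroup: 10001

Keeping the build-time UID and the runAsUser value in sync avoids the first edge case above, and choosing a UID that is unique to the workload reduces the chance of unintentionally sharing UIDs with unrelated pods or node-level daemons.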

Kubernetes 1.22 introduced
an alpha feature
that specifically reduces the impact of such a compromise by running Kubernetes node
components as a non-root user through user namespaces.

That (alpha stage) support for user namespaces / rootless mode is available with
certain container runtimes, and some Kubernetes distributions also support running
in rootless mode.

Immutable container filesystems

The NSA/CISA Kubernetes Hardening Guidance highlights an often overlooked feature, readOnlyRootFilesystem, with a
working example in Appendix B. This example limits execution and tampering of
containers at runtime. Any read/write activity can then be limited to a few
directories by using tmpfs volume mounts.

However, some applications that modify the container filesystem at runtime (for example, by exploding a WAR or JAR file at container startup)
could face issues when this feature is enabled. To avoid such issues, consider making minimal changes to the filesystem at runtime
when possible.
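
The Appendix B example is not reproduced here; the following is a minimal sketch of the same idea (the image and the writable path are assumptions): a read-only root filesystem combined with a tmpfs-backed emptyDir for the directories the application genuinely needs to write to.

    apiVersion: v1
    kind: Pod
    metadata:
      name: readonly-rootfs-example              # hypothetical name
    spec:
      containers:
      - name: app
        image: registry.example.com/app:1.0.0
        securityContext:
          readOnlyRootFilesystem: true           # block writes to the container's root filesystem
        volumeMounts:
        - name: tmp
          mountPath: /tmp                        # only this path remains writable
      volumes:
      - name: tmp
        emptyDir:
          medium: Memory                         # tmpfs-backed scratch space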

Building secure container images

The Kubernetes Hardening Guidance also recommends running a scanner at deploy time as an admission controller,
to prevent vulnerable or misconfigured pods from running in the cluster.
Theoretically, this sounds like a good approach, but there are several caveats to
consider before this can be implemented in practice:

  • Depending on network bandwidth, available resources and scanner of choice,
    scanning an image for vulnerabilities can take an indeterminate amount of
    time. This could lead to slower or unpredictable pod startup times, which
    could result in spikes of unavailability when apps are serving peak load.
  • If the policy that allows or denies pod startup is made using incorrect or
    incomplete data, it could result in several false positive or false negative
    outcomes like the following:

    • Inside a container image, the openssl package is detected as vulnerable. However,
      the application is written in Go and uses the Go crypto package for TLS. Therefore, this vulnerability
      is not in the code execution path and as such has minimal impact if it
      remains unfixed.
    • A vulnerability is detected in the openssl package for a Debian base image.
      However, the upstream Debian community considers this a minor-impact
      vulnerability and as a result does not release a patch for it. The owner
      of this image is now stuck with a vulnerability that
      cannot be fixed and a cluster that does not allow the image to run because
      of a predefined policy that does not take into account whether a fix for the
      vulnerability is available or not.
    • A Go app is built on top of a distroless
      image, but it is compiled with a Go version that uses a vulnerable standard library.
      The scanner has
      no visibility into the Go version, only into OS-level packages. So it
      allows the pod to run in the cluster in spite of the image containing an
      app binary built with a vulnerable Go version.

To be clear, relying on vulnerability scanners is absolutely a good idea, but
policy definitions should be flexible enough to allow:

  • Creation of exception lists for images or vulnerabilities through labelling
  • Overriding the severity with a risk score based on the impact of a vulnerability
  • Applying the same policies at build time to catch vulnerable images with
    fixable vulnerabilities before they can be deployed into Kubernetes clusters

Special considerations, like offline vulnerability database fetches, may also be
needed if the clusters run in an air-gapped environment and the scanners
require internet access to update the vulnerability database.

Pod Security Policies

Since Kubernetes v1.21, the PodSecurityPolicy
API and related features are deprecated,
but some of the guidance in this section will still apply for the next few years, until cluster operators
upgrade their clusters to newer Kubernetes versions.

The Kubernetes project is working on a replacement for PodSecurityPolicy.
Kubernetes v1.22 includes an alpha feature called Pod Security Admission
that is intended to allow enforcing a minimum level of isolation between pods.

The built-in isolation levels for Pod Security Admission are derived
from the Pod Security Standards, which are a superset of all the components mentioned in Table I, page 10 of
the guidance.

Information about migrating from PodSecurityPolicy to the Pod Security
Admission feature is available
in
Migrate from PodSecurityPolicy to the Built-In PodSecurity Admission Controller.

One important behavior mentioned in the guidance that remains the same between
Pod Security Policy and its replacement is that enforcing either of them does
not affect pods that are already running. With both PodSecurityPolicy and Pod Security Admission,
the enforcement happens during the pod creation
stage.
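
For illustration, a namespace opts in to Pod Security Admission by labelling itself with the desired Pod Security Standards level (the namespace name and the chosen level below are assumptions, and the alpha PodSecurity feature gate must be enabled in v1.22):

    apiVersion: v1
    kind: Namespace
    metadata:
      name: payments                                     # hypothetical namespace
      labels:
        pod-security.kubernetes.io/enforce: restricted   # reject pods that violate the "restricted" standard
        pod-security.kubernetes.io/warn: restricted      # also return warnings on violations
        pod-security.kubernetes.io/audit: restricted     # and annotate audit log entries for violations

As noted above, applying the enforce label only affects pods created afterwards; pods already running in the namespace are not evicted.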

Hardening container engines

Some container workloads are less trusted than others but may need to run in the
same cluster. In those cases, running them on dedicated nodes that include
hardened container runtimes that provide stricter pod isolation boundaries can
act as a useful security control.

Kubernetes supports
an API called RuntimeClass that
reached the stable / GA stage (and is, therefore, enabled by default) as of Kubernetes v1.20.
RuntimeClass allows you to ensure that Pods requiring strong isolation are scheduled onto
nodes that can offer it.
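
As a sketch (the handler name, node label, and image are assumptions that depend on how your nodes and container runtime are configured), a RuntimeClass maps a named handler to the nodes that provide it, and Pods opt in via runtimeClassName:

    apiVersion: node.k8s.io/v1
    kind: RuntimeClass
    metadata:
      name: sandboxed                      # hypothetical name
    handler: my-sandboxed-handler          # must match a handler configured in the node's container runtime
    scheduling:
      nodeSelector:
        sandboxed-runtime: "true"          # schedule only onto nodes labelled as providing this runtime
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: untrusted-workload
    spec:
      runtimeClassName: sandboxed          # request the stricter isolation boundary
      containers:
      - name: app
        image: registry.example.com/app:1.0.0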

Several third-party projects provide hardened runtimes that you can use in conjunction with RuntimeClass.

As discussed here and in the guidance, many features and tools exist in and around
Kubernetes that can enhance the isolation boundaries between
pods. Based on relevant threats and risk posture, you should pick and choose
between them, instead of trying to apply all the recommendations. Having said that, cluster-level
isolation, i.e. running workloads in dedicated clusters, remains the strictest workload
isolation mechanism, in spite of the improvements mentioned earlier here and in the guide.

Network Separation and Hardening

Kubernetes Networking can be tricky and this section focuses on how to secure
and harden the relevant configurations. The guide identifies the following as key
takeaways:

  • Using NetworkPolicies to create isolation between resources
  • Securing the control plane
  • Encrypting traffic and sensitive data

Network Policies

Network policies can be created with the help of network plugins. In order to
make the creation and visualization easier for users, Cilium supports
a web GUI tool. That web GUI lets you create Kubernetes
NetworkPolicies (a generic API that nevertheless requires a compatible CNI plugin),
and / or Cilium network policies (CiliumClusterwideNetworkPolicy and CiliumNetworkPolicy,
which only work in clusters that use the Cilium CNI plugin).
You can use these APIs to restrict network traffic between pods, and therefore minimize the
attack vector.
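
For example, a minimal NetworkPolicy (the namespace, labels, and port below are illustrative assumptions) that denies all ingress to a set of pods except traffic from pods labelled as the frontend might look like this:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-frontend-only            # hypothetical name
      namespace: payments
    spec:
      podSelector:
        matchLabels:
          app: payments-api                # the pods being protected
      policyTypes:
      - Ingress
      ingress:
      - from:
        - podSelector:
            matchLabels:
              app: frontend                # only pods with this label may connect
        ports:
        - protocol: TCP
          port: 8443

Keep in mind that a NetworkPolicy only takes effect when the cluster's CNI plugin implements it.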

Another scenario that is worth exploring is the usage of external IPs. Some
services, when misconfigured, can create random external IPs. An attacker can take
advantage of this misconfiguration and easily intercept traffic. This vulnerability
has been reported
in CVE-2020-8554.
Using externalip-webhook
can mitigate this vulnerability by preventing the services from using random
external IPs. externalip-webhook
only allows creation of services that don’t require external IPs or whose
external IPs are within the range specified by the administrator.

CVE-2020-8554 – Kubernetes API server in all versions allow an attacker
who is able to create a ClusterIP service and set the spec.externalIPs field,
to intercept traffic to that IP address. Additionally, an attacker who is able to
patch the status (which is considered a privileged operation and should not
typically be granted to users) of a LoadBalancer service can set the
status.loadBalancer.ingress.ip to similar effect.

Resource Policies

In addition to configuring ResourceQuotas and limits, consider restricting how many process
IDs (PIDs) a given Pod can use, and also reserving some PIDs for node-level use to avoid
resource exhaustion. More details on applying these limits can be
found in Process ID Limits And Reservations.
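
As a rough sketch of such limits (the numbers are illustrative assumptions and should be sized for your nodes), the kubelet's configuration file can cap PIDs per pod and reserve PIDs for the operating system and for Kubernetes node components:

    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    podPidsLimit: 1024          # maximum number of PIDs any single pod may use
    systemReserved:
      pid: "1000"               # PIDs reserved for OS-level daemons
    kubeReserved:
      pid: "1000"               # PIDs reserved for Kubernetes node components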

Control Plane Hardening

In the next section, the guide covers control plane hardening. It is worth
noting that,
from Kubernetes 1.20,
the insecure port of the API server has been removed.

Etcd

As a general rule, the etcd server should be configured to only trust
certificates assigned to the API server. This limits the attack surface and prevents a
malicious attacker from gaining access to the cluster. It might be beneficial to
use a separate CA for etcd, as etcd by default trusts all the certificates issued
by its root CA.
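
To illustrate this trust relationship (the file paths below are assumptions and vary by distribution), etcd can require client certificate authentication against a dedicated CA, while the API server presents a client certificate signed by that CA:

    # Fragment of an etcd static Pod manifest (assumed paths)
    command:
    - etcd
    - --client-cert-auth=true                              # require client certificates
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt    # dedicated etcd CA
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key

    # Fragment of a kube-apiserver static Pod manifest (assumed paths)
    command:
    - kube-apiserver
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt        # trust only the etcd CA
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key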

Kubeconfig Files

In addition to specifying the token and certificates directly, .kubeconfig
supports dynamic retrieval of temporary tokens using auth provider plugins.
Beware of the possibility of malicious
shell code execution in a
kubeconfig file. Once attackers gain access to the cluster, they can steal SSH
keys, secrets, and more.
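
For reference, this is roughly what the exec-based credential plugin mechanism looks like in a kubeconfig (the plugin command here is a placeholder). Because the listed command is executed locally with the user's privileges whenever kubectl needs credentials, a kubeconfig file from an untrusted source can run arbitrary code:

    # Fragment of a kubeconfig "users" entry (illustrative)
    users:
    - name: example-user
      user:
        exec:
          apiVersion: client.authentication.k8s.io/v1beta1
          command: some-credential-helper    # this binary runs on the local machine
          args:
          - get-token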

Secrets

Kubernetes Secrets is the native way of managing secrets as a Kubernetes
API object. However, in some scenarios, such as a desire to have a single source of truth for all app secrets irrespective of whether they run on Kubernetes or not, secrets can be managed loosely coupled with
Kubernetes and consumed by pods through sidecars or init containers, with minimal usage of the Kubernetes Secrets API.

External secrets providers
and csi-secrets-store
are some of these alternatives to Kubernetes Secrets.
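
As a loose sketch of the CSI-based approach (the SecretProviderClass name is a placeholder and the provider-specific configuration is omitted), a pod mounts secrets from the external provider as a volume instead of reading them from the Kubernetes Secrets API:

    apiVersion: v1
    kind: Pod
    metadata:
      name: app-with-external-secrets
    spec:
      containers:
      - name: app
        image: registry.example.com/app:1.0.0
        volumeMounts:
        - name: secrets-store
          mountPath: /mnt/secrets                        # secrets appear here as files
          readOnly: true
      volumes:
      - name: secrets-store
        csi:
          driver: secrets-store.csi.k8s.io
          readOnly: true
          volumeAttributes:
            secretProviderClass: my-provider-class       # placeholder; defined separately per provider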

Log Auditing

The NSA/CISA guidance stresses monitoring and alerting based on logs. The key points
include logging at the host level, application level, and on the cloud. When
running Kubernetes in production, it’s important to understand who’s
responsible, and who’s accountable, for each layer of logging.

Kubernetes API auditing

One area that deserves more focus is what exactly should trigger alerts or be logged. The
document outlines a sample policy in Appendix L: Audit Policy that logs all
requests at the RequestResponse level, including metadata and request / response bodies. While helpful for a demo, it may not be practical for production.

Each organization needs to evaluate its
own threat model and build an audit policy that complements or helps troubleshoot incident response. Think
about how someone would attack your organization and what audit trail could identify it. Review more advanced options for tuning audit logs in the official audit logging documentation.
It's crucial to tune your audit logs to only include events that meet your threat model. A minimal audit policy that logs everything at the Metadata level can also be a good starting point.
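
For instance, a minimal starting-point policy of that kind (not the Appendix L example) logs every request at the Metadata level while dropping the verbose request and response bodies:

    apiVersion: audit.k8s.io/v1
    kind: Policy
    omitStages:
    - "RequestReceived"       # skip events for the initial request-received stage
    rules:
    - level: Metadata         # record who did what and when, but not request/response bodies

The policy file is passed to the API server through the --audit-policy-file flag.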

Audit logging configurations can also be tested with
kind following these instructions.

Streaming logs and auditing

Logging is important for threat and anomaly detection. As the document outlines,
it’s a best practice to scan and alert on logs as close to real time as possible
and to protect logs from tampering if a compromise occurs. It’s important to
reflect on the various levels of logging and identify the critical areas such as
API endpoints.

Kubernetes API audit logging can stream to a webhook, and there's an example in Appendix N: Webhook configuration. Using a webhook is one way to
store logs off cluster and/or centralize all audit logs. Once logs are
centrally managed, look to enable alerting based on critical events. Also ensure
you understand what the baseline is for normal activities.
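
The webhook backend is configured with a kubeconfig-format file passed to the API server via the --audit-webhook-config-file flag; the endpoint and credential paths below are placeholders:

    # Kubeconfig-format file describing the external log collector (illustrative)
    apiVersion: v1
    kind: Config
    clusters:
    - name: audit-collector
      cluster:
        server: https://audit.example.com/events          # off-cluster endpoint that receives audit events
        certificate-authority: /etc/kubernetes/pki/audit-ca.crt
    users:
    - name: audit-collector
      user:
        client-certificate: /etc/kubernetes/pki/audit-client.crt
        client-key: /etc/kubernetes/pki/audit-client.key
    contexts:
    - name: default
      context:
        cluster: audit-collector
        user: audit-collector
    current-context: default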

Alert identification

While the guide stresses the importance of notifications, there is not a blanket
event list to alert from. Alerting requirements vary based on your own
environment and threat model. Examples include the following events:

  • Changes to the securityContext of a Pod
  • Updates to admission controller configs
  • Accessing certain files / URLs

Upgrading and Application Security practices

Kubernetes releases three times per year, so upgrade-related toil is a common problem for
people running production clusters. In addition to this, operators must
regularly upgrade the underlying node’s operating system and running
applications. This is a best practice to ensure continued support and to reduce
the likelihood of bugs or vulnerabilities.

Kubernetes supports the three most recent stable releases. While each Kubernetes
release goes through a large number of tests before being published, some
teams aren’t comfortable running the latest stable release until some time has
passed. No matter what version you’re running, ensure that patch upgrades
happen frequently or automatically. More information can be found in
the version skew policy
pages.

When thinking about how you’ll manage node OS upgrades, consider ephemeral
nodes. Having the ability to destroy and add nodes allows your team to respond
quicker to node issues. In addition, having deployments that tolerate node
instability (and a culture that encourages frequent deployments) allows for
easier cluster upgrades.

Additionally, it’s worth reiterating from the guidance that periodic
vulnerability scans and penetration tests can be performed on the various system
components to proactively look for insecure configurations and vulnerabilities.

Finding release & security information

To find the most recent supported Kubernetes versions, refer to
https://k8s.io/releases, which includes minor versions. It's good to stay up to date with
the patch releases for your minor version.

If you’re running a managed Kubernetes offering, look for their release
documentation and find their various security channels.

Subscribe to
the Kubernetes Announce mailing list.
The mailing list is searchable for terms such
as "Security Advisories".
You can set up alerts and email notifications as long as you know what
keywords to alert on.

Conclusion

In summary, it is fantastic to see security practitioners sharing this
level of detailed guidance in public. This guidance further highlights
Kubernetes going mainstream and how securing Kubernetes clusters and the
application containers running on Kubernetes continues to need the attention and focus of
practitioners. Only a few weeks after the guidance was published, an open source
tool, kubescape, to validate clusters
against this guidance became available.

This tool can be a great starting point to check the current state of your
clusters, after which you can use the information in this blog post and in the guidance to assess
where improvements can be made.

Finally, it is worth reiterating that not all controls in this guidance will
make sense for all practitioners. The best way to know which controls matter is
to rely on the threat model of your own Kubernetes environment.

A special shout out and thanks to Rory McCune (@raesene) for his inputs to this blog post.


Source: Kubernetes Blog

HashiConf Global Preview: Sessions for Cloud Platform Teams

As enterprise cloud strategies mature, “platform teams” have become a best practice. Platform teams build, run, and support infrastructure and backing services that are exposed to development teams as self-service offerings.

HashiConf Global (livestream Tuesday – Wednesday, October 19 – 20, and rebroadcast for the Asia/Pacific time zones on Wednesday – Thursday, October 20 – 21) is packed with sessions designed for platform teams. Here’s a preview of the relevant HashiConf talks grouped into popular cloud architecture pillars: Operational Excellence, Security, and Reliability.

»Operational Excellence

»Tide’s Self-Service Service Mesh With Consul

Wednesday, October 20, 12:30 p.m. ET

Tide Business Bank — a leading UK FinTech firm — tells its HashiCorp Consul adoption story. This talk is especially relevant for platform owners using Amazon Web Services (AWS). Jez Halford, Tide’s Head of Cloud Engineering, explains how Tide uses HCP Consul to wire up Amazon ECS and EC2, as well as ECS and AWS Fargate. Interestingly, the move to Consul came without downtime or a painful “big bang” migration. If you want greater networking automation across different AWS runtimes — and want to upgrade from your status quo — here’s your playbook.

»A Journey to Improving SLOs With HashiCorp Vault

Wednesday, October 20, 2:00 p.m. ET

Experienced cloud engineers tend to have a story or two about the expired certificate everyone forgot about. Good secrets management hygiene is essential to application — and platform — uptime and reliability. In this session, George Hantzaras, a cloud engineering leader at Citrix, explains how HashiCorp Vault improved service level objectives (SLOs) in the company’s observability infrastructure.

»Redeploying Stateless Systems in Lieu of Patching

Tuesday, October 19, 1:00 p.m. ET

Seasoned operators know that patching is a way of life. But does it have to be? Chris Manfre, a Senior DevOps Engineer at Petco, says “no.” In this talk, he describes a better approach to vulnerability mitigation: replace unpatched instances with new instances that feature updated templates. He explains how HashiCorp Packer and HashiCorp Terraform Enterprise can help you adopt this immutable infrastructure best practice.

»Security

»Vault for Secrets Management in Consul K8s

Tuesday, October 19, 12:30 p.m. ET

We’re all hearing a lot about zero trust security these days, and for good reason. It’s the modern approach to protecting critical systems and customer data. But what does implementing zero trust security really entail?

Here’s a starting point: modernize your infrastructure around the new control point for security: identity. This is what the most secure organizations have done in recent years. From there, platform teams can authenticate and authorize access for services and users alike. That sounds great, but how do you actually do that in the real world? This talk will give you a big part of the answer, especially if you’re a Kubernetes shop.

Kyle Schochenmaier, HashiCorp Senior Engineer on the Consul Ecosystem, and HashiCorp Senior Product Manager David Yu explain how to use Vault as the secrets management backend for Consul atop Kubernetes. They also explain how to rotate secrets in Consul on Kubernetes. Attend this talk, and you’ll be in a much stronger position to combine the protections from Vault (machine authN and authZ) with those from Consul (machine-to-machine access).

»Managing Target’s Secrets Platform

Tuesday, October 19, 1:30 p.m. ET

Every vertical industry has its own unique security challenges. Retailers around the world use HashiCorp’s tech to improve their security posture. This is a big job, and it requires constant vigilance from platform teams in this sector. Target — one of the largest retailers in the US with more than 1,900 locations — has an extraordinarily large attack surface to protect. Shane Petrich, a Target Lead Engineer, details how Target keeps its HashiCorp Vault deployment humming.

»Vault Roadmap

Tuesday, October 19, 2:00 p.m. PT

There’s a reason why Vault is the dominant secrets management solution for platform teams: it’s incredibly powerful and it continues to get even better. So what innovations do we have planned for Vault in the near future? Attend this session and hear the specifics from Darshant Bhagat, Product Head for Vault, and Naaman Newbold, Vault Director of Engineering.

»Reliability

»Consul Use Cases At Stripe: Service Mesh and More

Tuesday, October 19, 1:30 p.m. PT

Interest in the service mesh pattern is surging. According to the HashiCorp State of Cloud Strategy survey, service mesh adoption is expected to grow 250% in the year ahead. If this is on your roadmap, who better to learn from than Stripe? After all, even a few seconds of downtime could cost the fintech giant millions. This company is on the cutting edge of modern networking, and there’s a lot to learn from its experience with Consul and Kubernetes.

Mark Guan and Ruoran Wang, Software Engineers at Stripe, reveal the details of their multi-region service networking tech stack. If this sounds like an impressive feat of engineering, it is. This duo gives you an inside look at their overall topology across various AWS accounts and regions, and how they federated multi-region clusters together.

»The Future of HCP Packer

Wednesday, October 20, 12:30 p.m. ET

Platform teams use Packer to create identical machine images for multiple clouds from a single source configuration. Meanwhile, these same teams use Terraform to deploy images. What if there was a way to bring these two technologies closer together? That’s the vision behind HCP Packer: bridge the image-management workflows between Packer and Terraform. This service was first announced at HashiConf Europe in June.

Megan Marsh, Packer’s Engineering Lead, will demonstrate the product and unveil exciting roadmap details. And don’t miss the hands-on lab for HCP Packer at 1:30 p.m. ET on Wednesday, October 20.

»Network Automation on Terraform Cloud With CTS

Wednesday, October 20, 1:30 p.m. PT

Ticketing systems are the enemy of the modern platform team. They served their purpose in years past; now we’re in the era of automation and self-service. Yet even the most determined enterprise likely has a few workflows that still depend on tickets. One stubborn scenario: requests for network configuration changes. Here, dev teams are ready to release new code to production, but the new code requires firewall policy updates or changes to the load balancer member pool.

This session focuses on Consul-Terraform-Sync (CTS), a new capability that automates this gap in your workflow. HashiCorp Senior Engineers Melissa Kam and Kim Ngo show you how CTS introduces network infrastructure automation to Consul and integrates directly with Terraform Cloud. Attend this session and learn how CTS monitors changes to the L7 network layer, and subsequently uses Terraform to dynamically update infrastructure.

»Workday’s Multi-Cloud Network Fabric With Consul & Vault

Wednesday, October 20, 1:00 p.m. ET

The hallmark of a reliable distributed system is that it continues to behave as expected even as it changes rapidly. Workday’s platform team has supported rapid growth and innovation over the last few years. To handle this growth, it uses Consul and Vault as part of its critical infrastructure. Workday Principal Engineer Daniele Vazzola explains how his company uses HashiCorp’s tools to support deployments across multiple cloud providers and on-premises datacenters. He even digs into how this multi-cloud fabric empowers service teams to autonomously set up secure connections across datacenters between workloads running on heterogeneous platforms. Don’t miss it!

»Join Us for the Livestreams

These fantastic talks are only a small part of what you’ll experience at HashiConf Global, happening online Tuesday – Wednesday, October 19 – 20 (and rebroadcast for the Asia/Pacific time zones on Wednesday – Thursday, October 20 – 21). This year, in addition to the visionary keynote sessions and dozens of useful practitioner talks, we’ve added free hands-on labs. For platform teams, we recommend the labs: Vault as a Certificate Authority (CA) for Consul Connect and Create a Custom Provider With the Terraform Plugin Framework.

Register for HashiConf Global today — it’s fast and free.


Source: HashiCorp Blog