Beyond the Install: Architecting a Hardened RKE2 + Rancher Platform on Rocky Linux

Kubernetes is notoriously easy to install but significantly harder to wire up correctly, especially once you factor in High Availability (HA) networking, certificate management, and a production-grade backup strategy from day one.

This week, I stood up a new Kubernetes platform built on Rocky Linux using RKE2 and Rancher, with Kasten K10 integrated for data protection. Here is a look at the architecture and the hardening steps taken to get it production-ready.

Architecture Highlights

  • RKE2: An upstream-aligned Kubernetes distribution with embedded etcd.
  • HA Networking: HAProxy provides a Virtual IP (VIP) for both the API server (6443) and ingress traffic (80/443).
  • Ingress Path: RKE2 ingress-nginx sits behind a NodePort, which is fronted by the HAProxy layer.
  • Centralized Management: Rancher handles the cluster lifecycle and RBAC.
  • Internal PKI: A custom Root CA was used for all externally exposed services to maintain a private, trusted chain.
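To make the load-balancer layout concrete, here is a minimal sketch that renders HAProxy frontend/backend pairs for the three listeners described above. It is not the configuration used on this platform: the node names, IP addresses, and the NodePort numbers (30080/30443) are hypothetical placeholders, and it assumes the same three nodes serve both the API and ingress traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listener:
    name: str          # label used for the frontend/backend pair
    bind_port: int     # port exposed on the VIP
    target_port: int   # port on each node that receives the traffic


def render_haproxy(nodes: dict[str, str], listeners: list[Listener]) -> str:
    """Render one TCP-mode frontend/backend pair per listener."""
    lines: list[str] = []
    for lst in listeners:
        lines += [
            f"frontend {lst.name}_in",
            f"    bind *:{lst.bind_port}",
            "    mode tcp",
            f"    default_backend {lst.name}_nodes",
            "",
            f"backend {lst.name}_nodes",
            "    mode tcp",
            "    balance roundrobin",
        ]
        for host, ip in nodes.items():
            # "check" turns on health checks so a failed node leaves rotation.
            lines.append(f"    server {host} {ip}:{lst.target_port} check")
        lines.append("")
    return "\n".join(lines)


# Hypothetical values: node names, addresses, and NodePorts are placeholders.
NODES = {"rke2-a": "10.0.0.11", "rke2-b": "10.0.0.12", "rke2-c": "10.0.0.13"}
LISTENERS = [
    Listener("kube_api", 6443, 6443),         # API server, passed straight through
    Listener("ingress_http", 80, 30080),      # assumed HTTP NodePort
    Listener("ingress_https", 443, 30443),    # assumed HTTPS NodePort
]

if __name__ == "__main__":
    print(render_haproxy(NODES, LISTENERS))
```

Keeping every listener in TCP mode means HAProxy passes TLS through untouched, so the certificates presented to clients are the ones installed on the cluster side.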

The Hardening "Gotchas"

Setting up a cluster is one thing; hardening it requires navigating some specific friction points:

  • TLS Identity: I replaced the default dynamiclistener certificates with TLS certificates signed by the internal Root CA.
  • Ingress Clean-up: I removed cert-manager ingress shims to prevent secret regeneration conflicts that often occur when mixing internal CA logic with automated controllers.
  • Access Control: I implemented explicit RBAC-backed access to Kasten using Kubernetes bearer tokens rather than broad permissions.
  • Scoped Protection: Backup policies are scoped at the namespace level, moving away from broad cluster-wide defaults to ensure granular recovery points.
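The access-control point can be sketched as a pair of manifests: a ServiceAccount whose token is used to log in, and a RoleBinding that grants a role in one namespace only. This is an illustration rather than the exact objects from this cluster. The namespace, account name, and role name below are hypothetical, and which ClusterRole to bind depends on the Kasten release, so check its RBAC documentation before reusing this.

```python
import json


def scoped_access_manifests(namespace: str, account: str, cluster_role: str) -> list[dict]:
    """Return a ServiceAccount plus a RoleBinding limited to one namespace."""
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": account, "namespace": namespace},
    }
    # A RoleBinding (not a ClusterRoleBinding) that references a ClusterRole
    # grants that role's permissions only inside the binding's namespace.
    role_binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{account}-binding", "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": account, "namespace": namespace}
        ],
    }
    return [service_account, role_binding]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    items = scoped_access_manifests("team-a", "backup-operator", "example-backup-role")
    # kubectl accepts JSON, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

On recent kubectl versions, a short-lived bearer token for such an account can be requested with `kubectl create token backup-operator -n team-a`, which avoids storing a long-lived secret.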

The Storage & Backup Layer

For this phase, the storage layer uses an NFS external provisioner as the default StorageClass. Because this provisioner is not a CSI driver and cannot take volume snapshots, Kasten is currently configured for file-level backups.
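A rough way to reason about which backup mode applies is to compare the default StorageClass's provisioner against the drivers of any VolumeSnapshotClass objects in the cluster. The sketch below does that on data shaped like `kubectl get ... -o json` output. It is a simplified heuristic, not how Kasten itself decides, and the class name and provisioner string are hypothetical.

```python
from typing import Optional

DEFAULT_ANNOTATION = "storageclass.kubernetes.io/is-default-class"


def default_storage_class(storage_classes: list[dict]) -> Optional[dict]:
    """Return the first StorageClass annotated as the cluster default."""
    for sc in storage_classes:
        annotations = sc.get("metadata", {}).get("annotations") or {}
        if annotations.get(DEFAULT_ANNOTATION) == "true":
            return sc
    return None


def backup_mode(storage_class: dict, snapshot_classes: list[dict]) -> str:
    """Heuristic: snapshots are possible only if some VolumeSnapshotClass
    names the same driver as the StorageClass provisioner."""
    drivers = {vsc.get("driver") for vsc in snapshot_classes}
    if storage_class.get("provisioner") in drivers:
        return "snapshot"
    return "file-level"


# Hypothetical inventory: one NFS-backed default class, no snapshot classes.
STORAGE_CLASSES = [
    {
        "metadata": {
            "name": "nfs-client",
            "annotations": {DEFAULT_ANNOTATION: "true"},
        },
        "provisioner": "example.com/nfs",
    }
]
SNAPSHOT_CLASSES: list[dict] = []

if __name__ == "__main__":
    sc = default_storage_class(STORAGE_CLASSES)
    if sc is not None:
        print(sc["metadata"]["name"], "->", backup_mode(sc, SNAPSHOT_CLASSES))
```

With no matching snapshot class the function falls back to "file-level", which mirrors the situation described above; a CSI-backed class with a matching VolumeSnapshotClass would flip it to "snapshot".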

The Takeaway

The real work of platform engineering isn't in the yum install. It’s in the integration: making sure the certificates don't conflict, the load balancer correctly tracks the ingress nodes, and the backup policy knows exactly what it’s looking at.

Next Phase: Validating restore workflows, evaluating CSI-backed storage to enable snapshot-based protection, and testing S3 bucket integration.
