Tech Tree · Engineering
CNCF Adoption
Progress your platform from raw containers to a production-grade, policy-governed CNCF stack. Each node maps to a real project from the CNCF landscape (containerd, Kubernetes, Cilium, Argo CD, OpenTelemetry, Falco, OPA, Crossplane, …) with concrete adoption steps and realistic effort.
Maturity tiers
Containerise
You ship in containers. Builds are reproducible, images live in a registry, and a single host can run the workload — even if there's no orchestration yet.
Orchestrate
Kubernetes is the substrate. Workloads are declarative, ingress + service discovery are solved, and you ship via Helm / manifests with health checks in place.
Operate
The platform is observable, policy-governed, and self-service. Service mesh, OpenTelemetry, OPA / Kyverno, and GitOps make it possible to run dozens of services without dozens of platform engineers.
Optimise
The platform scales itself, surfaces its own state, and presents a sane developer interface. Crossplane / Backstage / KEDA / progressive delivery turn the cluster into an internal product.
Tracks
Runtime & Packaging
How code becomes an image and an image becomes a running process. Builds, registries, runtimes, OCI standards.
Orchestration
Kubernetes core, workload APIs, ingress, service discovery, packaging (Helm), and storage interfaces.
Networking & Mesh
CNI, service mesh, ingress controllers, and the L4/L7 networking that makes Kubernetes services discoverable, secure, and observable.
Observability
Metrics (Prometheus), traces + logs (OpenTelemetry, Fluent Bit, Loki), and the dashboards / alerting that make the platform legible.
Security & Policy
Runtime security (Falco), supply-chain signing (Cosign / Sigstore), and policy-as-code (OPA, Kyverno) that govern what runs and how.
CI/CD & GitOps
Pipelines, GitOps controllers (Argo CD, Flux), progressive delivery, and the platform-as-product layer (Backstage, Crossplane, KEDA).
All capabilities (23)
Containerise
containerd as Default Runtime
Hosts and clusters use containerd (or CRI-O) directly, not Docker. Both are CRI-compliant and have a smaller surface area than the full Docker Engine, and they are what the major managed Kubernetes services ship by default.
containerd · cri-o · runtime
OCI-Compliant Images
All services build to OCI-format images via Docker or Buildpacks. Multi-stage builds keep runtime images lean; images are tagged with the commit SHA and stored in a registry.
oci · docker · containers · buildpacks
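A minimal sketch of the "tagged with the commit SHA" rule, assuming the SHA is handed in by the CI system. The helper name `image_ref` and the registry host in the example are hypothetical, not part of any tool named above.

```python
import re

# Accept abbreviated (7+) or full (40) hexadecimal Git commit SHAs.
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def image_ref(registry: str, service: str, commit_sha: str) -> str:
    """Return an image reference whose tag is the Git commit SHA."""
    sha = commit_sha.strip().lower()
    if not _SHA_RE.fullmatch(sha):
        # Rejects mutable tags such as "latest" or "main".
        raise ValueError(f"not a Git commit SHA: {commit_sha!r}")
    return f"{registry}/{service}:{sha}"


if __name__ == "__main__":
    # Hypothetical registry and service names.
    print(image_ref("registry.example.com/team", "checkout", "3f2a9c1d"))
```

Refusing anything that is not a SHA keeps mutable tags out of the registry path from the start, which makes the later "no `latest` tags" policy easier to enforce.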
Registry Auth + Image Signing Foundation
Image pulls require authentication. Public images come only from trusted sources (curated mirror). Sets the stage for full Cosign / Sigstore enforcement at the Operate tier.
registry · image-signing · security
Orchestrate
Argo CD / Flux Bootstrap
Cluster state is reconciled from a Git repo, not applied by hand with `kubectl apply`. Argo CD or Flux watches the repo and applies changes; engineers ship by opening a PR.
gitops · argocd · flux
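The reconcile idea can be sketched as a diff between the desired state in Git and the live state in the cluster. This is a simplified illustration, not how Argo CD or Flux are implemented: the function name `plan` and the flat name-to-manifest dictionaries are assumptions, and real controllers also handle ordering, health checks, and drift.

```python
from typing import Dict, List, Tuple

Manifest = Dict[str, object]


def plan(desired: Dict[str, Manifest], live: Dict[str, Manifest]) -> List[Tuple[str, str]]:
    """Return the actions needed to make the live state match the desired state."""
    actions: List[Tuple[str, str]] = []
    for name, manifest in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))   # in Git, missing from the cluster
        elif live[name] != manifest:
            actions.append(("update", name))   # cluster has drifted from Git
    for name in sorted(live):
        if name not in desired:
            actions.append(("prune", name))    # in the cluster, removed from Git
    return actions


if __name__ == "__main__":
    desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
    live = {"web": {"replicas": 2}, "old": {"replicas": 1}}
    for action, name in plan(desired, live):
        print(action, name)
```

Because the plan is computed from Git alone, merging a PR is the only way to change what the controller will apply.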
CNI with NetworkPolicy Enforcement
Cilium, Calico, or another CNI handles pod networking AND enforces NetworkPolicies. Default-deny posture means a compromised pod can't talk to anything it shouldn't.
cni · cilium · calico · networkpolicy
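A default-deny posture usually starts with one NetworkPolicy per namespace that selects every pod and allows nothing. The sketch below builds that manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts. The policy name and the namespace `payments` are illustrative assumptions.

```python
import json


def default_deny(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches every pod in the namespace.
            "podSelector": {},
            # Listing both types with no allow rules denies traffic in both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    print(json.dumps(default_deny("payments"), indent=2))
```

Denying egress also blocks DNS lookups, so a policy like this is normally paired with explicit allow rules for DNS and for each legitimate dependency.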
Helm Charts for Releasable Units
Each service ships as a versioned Helm chart. Environment overrides via values files; chart repository stores immutable versions.
helm · packaging · k8s
Production Ingress Controller
NGINX Ingress, Traefik, or Contour exposes services to the outside world. TLS termination, rate limiting, and basic routing live at the ingress layer.
ingress · cert-manager · tls · nginx · traefik
Production Kubernetes Cluster
A managed or self-hosted Kubernetes cluster runs the workload. Workloads are declarative (Deployment / StatefulSet / Job), with resource requests, limits, and readiness/liveness probes.
kubernetes · k8s · orchestration
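One way to keep the "requests, limits, and probes" rule honest is a small check in CI. The sketch below is a hypothetical linter, assuming the Deployment manifest has already been parsed into a dictionary; `lint_deployment` is not part of any Kubernetes tooling.

```python
from typing import List


def lint_deployment(manifest: dict) -> List[str]:
    """Report containers that lack resource settings or health probes."""
    problems: List[str] = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources", {})
        for key in ("requests", "limits"):
            if not resources.get(key):
                problems.append(f"{name}: missing resources.{key}")
        for probe in ("readinessProbe", "livenessProbe"):
            if probe not in container:
                problems.append(f"{name}: missing {probe}")
    return problems
```

An empty list means the manifest passes; a CI job could fail the build whenever the list is non-empty.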
Prometheus + Grafana
kube-prometheus-stack collects cluster metrics, every workload exposes /metrics, and Grafana dashboards cover the platform + each service's RED metrics (rate, errors, duration).
prometheus · grafana · metrics
Operate
Cluster-Wide Log Aggregation
Fluent Bit ships every pod's stdout and stderr to Loki (or another aggregator). Logs across the cluster are queried from a single search bar and joined to traces by request ID.
logging · fluent-bit · loki
OpenTelemetry End-to-End
OpenTelemetry Collector receives traces + metrics + logs from every workload. Auto-instrumentation for HTTP/DB; manual spans for business operations. Export to your backend of choice.
opentelemetry · tracing · observability
Policy-as-Code (OPA Gatekeeper / Kyverno)
Admission policies block insecure manifests at apply time: no privileged pods, no `latest` tags, required labels, signed-image enforcement. Policies live in Git.
opa · gatekeeper · kyverno · policy-as-code
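The checks listed above can be illustrated in plain Python. Real policies would be written in Rego (Gatekeeper) or as Kyverno YAML; this sketch only mirrors the logic. The required label set is a hypothetical choice, and signed-image enforcement is left out because it needs a signature verifier.

```python
from typing import List

REQUIRED_LABELS = ("app", "team")  # hypothetical; pick your own required labels


def uses_latest_tag(image: str) -> bool:
    """True when the image resolves to the mutable `latest` tag."""
    last = image.rsplit("/", 1)[-1]  # drop the registry host (and port) and path
    if "@" in last:                  # pinned by digest, so the tag is irrelevant
        return False
    if ":" not in last:              # no tag given: the runtime pulls `latest`
        return True
    return last.split(":", 1)[1] == "latest"


def violations(pod: dict) -> List[str]:
    """Return the reasons an admission policy would reject this pod."""
    found: List[str] = []
    labels = pod.get("metadata", {}).get("labels", {})
    for label in REQUIRED_LABELS:
        if label not in labels:
            found.append(f"missing label: {label}")
    for container in pod.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        if uses_latest_tag(container.get("image", "")):
            found.append(f"{name}: image uses the latest tag")
        if container.get("securityContext", {}).get("privileged") is True:
            found.append(f"{name}: privileged container")
    return found
```

Running the same checks in CI as well as at admission gives engineers feedback on the PR rather than at deploy time.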
Progressive Delivery (Argo Rollouts / Flagger)
Releases roll out as canary or blue/green, with automated rollback on SLO breach. Replaces "yolo `kubectl rollout`" with a deliberate, observable process.
progressive-delivery · canary · argo-rollouts · flagger
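A canary with automated rollback amounts to a loop over traffic weights with an SLO check at each step. The sketch below is a simplified model, not the Argo Rollouts or Flagger implementation: `run_canary`, the weight list, and the error-rate callback are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def run_canary(
    steps: List[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> Tuple[str, List[int]]:
    """Walk the canary through traffic weights; roll back on the first SLO breach."""
    applied: List[int] = []
    for weight in steps:
        applied.append(weight)  # shift this percentage of traffic to the canary
        if error_rate_at(weight) > max_error_rate:
            applied.append(0)   # send all traffic back to the stable version
            return "rolled_back", applied
    return "promoted", applied


if __name__ == "__main__":
    # Hypothetical measurement: errors appear once the canary takes 50% of traffic.
    outcome = run_canary([10, 50, 100], lambda w: 0.05 if w >= 50 else 0.001, 0.01)
    print(outcome)
```

In a real rollout the callback would query the metrics backend after a soak period at each weight.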
Runtime Security (Falco)
Falco watches syscalls + Kubernetes audit events and alerts on suspicious behaviour — shells inside containers, unexpected outbound connections, sensitive file reads.
falco · runtime-security · security
Service Mesh (Linkerd or Istio)
mTLS between every meshed pod, retries / timeouts / circuit breakers as policy, and the ability to inject latency or faults for testing. Linkerd is the lighter default; Istio adds richer policy and more multi-cluster deployment options.
service-mesh · linkerd · istio · mtls
Supply-Chain Signing (Cosign / Sigstore)
Every image is signed at build time via Cosign; the admission controller (Gatekeeper / Kyverno) rejects unsigned images. SLSA level 2+ on the build pipeline.
cosign · sigstore · slsa · supply-chain
Optimise
AIOps + Anomaly Correlation
Anomaly detection on the OTel + Prometheus telemetry surfaces issues before they cross human-set thresholds. Correlation across signals (metrics, logs, traces, deploys) can shorten mean time to detect (MTTD).
aiops · anomaly-detection · observability
Developer Portal (Backstage)
Backstage is the front door to the platform: service catalogue, scaffolder templates, and documentation via TechDocs. Engineers self-serve "create a new service" without filing tickets.
backstage · developer-portal · platform-engineering
Event-Driven Autoscaling (KEDA)
KEDA scales workloads on external signals: queue depth, Kafka lag, cron schedules, HTTP RPS. Pods can scale to zero when idle and burst on demand.
keda · autoscaling · event-driven
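The scaling behaviour described above can be approximated as "one replica per N queued items, clamped to a range". This is a deliberately simplified sketch and not KEDA's exact algorithm: KEDA handles activation from zero itself and delegates scaling between one and the maximum to the Horizontal Pod Autoscaler. The function name and default bounds are assumptions.

```python
import math


def desired_replicas(
    queue_depth: int,
    target_per_replica: int,
    min_replicas: int = 0,
    max_replicas: int = 10,
) -> int:
    """Replicas needed so each one handles at most `target_per_replica` items."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(queue_depth / target_per_replica)
    # Clamp to the configured bounds; min_replicas=0 permits scale-to-zero.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    for depth in (0, 1, 12, 1000):
        print(depth, desired_replicas(depth, target_per_replica=5))
```

Setting `min_replicas` above zero trades idle cost for the absence of cold starts, which is the main tuning decision for queue-driven workloads.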
FinOps Visibility (OpenCost)
OpenCost / Kubecost attributes cluster spend to namespaces, workloads, and teams. Engineers see what their service costs; chargeback or showback becomes possible.
finops · opencost · kubecost · cost
Function-Style Workloads (Knative)
Knative Serving runs request-driven HTTP services with scale-to-zero, traffic splitting, and revisions out of the box. The fastest path from "I have a handler" to "it's live with autoscaling".
knative · serverless · scale-to-zero
Infrastructure-as-Kubernetes (Crossplane)
Crossplane reconciles cloud resources (S3 buckets, RDS instances, IAM roles, DNS records) from Kubernetes manifests. The same GitOps loop that deploys services now provisions the infrastructure they depend on.
crossplane · platform-engineering · iac
Zero-Trust Workload Identity (SPIFFE/SPIRE)
Every workload gets a cryptographic identity (SVID) from SPIRE. Service-to-service auth no longer relies on shared secrets or network position — identity is bound to the workload.
spiffe · spire · zero-trust · workload-identity