Tech Tree · Infrastructure
Cloud Infrastructure Maturity
Evolve your infrastructure from on-premises hardware through public cloud and cloud-native platforms to a multi-cloud footprint. Each node represents a concrete capability with effort estimates and cross-track dependencies.
Maturity tiers
On-prem
Physical or virtualised on-premises infrastructure. Manual provisioning and high operational overhead.
Cloud
Workloads running in a public cloud on IaaS offerings, typically through lift-and-shift or basic cloud adoption.
Cloud-native
Services built for the cloud using managed platforms, containers, and infrastructure-as-code.
Multi-cloud
Workloads span multiple cloud providers with full portability, global distribution, and advanced resiliency.
Tracks
Compute
How workloads are packaged, scheduled, and executed.
Networking
How traffic is routed, secured, and accelerated between services and end-users.
Data
How data is stored, replicated, queried, and governed.
Observability
How system health is measured, monitored, and diagnosed.
All capabilities (16)
On-prem
Basic Network Segmentation
The network is divided into logical segments (VLANs or subnets) separating production, staging, and development environments with firewall rules between them.
networking · security · segmentation
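As a rough illustration of the segmentation described above, the sketch below carves a block into one /24 subnet per environment using Python's standard ipaddress module and checks which segment an address falls in. The 10.0.0.0/16 range, the environment-to-subnet mapping, and the helper names are assumptions made for the example, not values from this tech tree.

```python
import ipaddress
from typing import Optional

# Hypothetical supernet; substitute the range your organisation actually owns.
SUPERNET = ipaddress.ip_network("10.0.0.0/16")
ENVIRONMENTS = ["production", "staging", "development"]

# Assign the first three /24 subnets of the supernet, one per environment.
SEGMENTS = dict(zip(ENVIRONMENTS, SUPERNET.subnets(new_prefix=24)))


def segment_of(address: str) -> Optional[str]:
    """Return the environment whose subnet contains the address, if any."""
    ip = ipaddress.ip_address(address)
    for name, network in SEGMENTS.items():
        if ip in network:
            return name
    return None


def same_segment(src: str, dst: str) -> bool:
    """True when both addresses sit in the same known segment."""
    segment = segment_of(src)
    return segment is not None and segment == segment_of(dst)
```

Traffic for which same_segment returns False is the traffic that would have to cross a firewall rule between environments.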
Centralised Log Aggregation
Application and system logs are shipped to a central store (e.g., ELK stack, Splunk) enabling cross-service correlation and persistent log retention.
logging · observability · operations
Managed Relational Database
Databases are provisioned on dedicated servers with scheduled backups, defined recovery objectives, and documented failover procedures.
database · backup · reliability
Virtual Machines
Workloads run on hypervisor-managed VMs rather than bare metal, enabling better resource utilisation and faster provisioning than physical servers.
compute · virtualisation · infrastructure
Cloud
Cloud Load Balancer
A managed load balancer distributes traffic across compute instances, performs health checks, and terminates TLS — eliminating self-managed HAProxy or Nginx reverse proxies.
networking · load-balancer · tls
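To make the health-check behaviour concrete, here is a minimal sketch of round-robin selection that skips unhealthy backends. It is a toy model rather than how any provider's load balancer is implemented; the Backend and RoundRobinBalancer names are hypothetical, and TLS termination is left out entirely.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    name: str
    healthy: bool = True  # in a real system a health check would update this


@dataclass
class RoundRobinBalancer:
    backends: List[Backend]
    _next: int = 0  # running request counter used to rotate through backends

    def pick(self) -> Backend:
        """Return the next healthy backend in rotation."""
        healthy = [b for b in self.backends if b.healthy]
        if not healthy:
            raise RuntimeError("no healthy backends available")
        backend = healthy[self._next % len(healthy)]
        self._next += 1
        return backend
```

Marking a backend unhealthy removes it from rotation on the next call to pick, which is the behaviour a managed load balancer provides without any proxy configuration of your own.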
Cloud Metrics & Alerting
Infrastructure and application metrics flow into a managed monitoring platform (CloudWatch, Google Cloud Monitoring (formerly Stackdriver), Azure Monitor) with alert rules triggering on-call notifications.
metrics · alerting · observability
Cloud-managed Database
Databases migrate to fully managed cloud services (RDS, Cloud SQL, Azure SQL Database) with automated backups, read replicas, and provider-handled patching.
database · cloud · managed-services
Managed Cloud Compute
Workloads run on cloud provider VMs (EC2, Compute Engine, Azure VMs) with auto-scaling groups, pay-as-you-go billing, and provider-managed hardware.
cloud · compute · aws · gcp · azure
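One way to picture what an auto-scaling group does is a proportional rule: scale the instance count by the ratio of the observed metric to its target, then clamp to the group's bounds. The sketch below assumes that rule; it is not any provider's exact algorithm, and real policies add cooldowns and warm-up periods. The function and parameter names are hypothetical.

```python
import math


def desired_instances(current: int, observed: float, target: float,
                      min_size: int, max_size: int) -> int:
    """Proportional scaling: grow or shrink so the metric moves toward target.

    observed and target share one unit, e.g. average CPU utilisation in percent.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(current * observed / target)
    # Never leave the bounds configured on the group.
    return max(min_size, min(max_size, wanted))
```

For example, four instances averaging 90% CPU against a 60% target would be scaled to six, provided the group's maximum allows it.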
Cloud-native
CDN & Edge Caching
Static assets and cacheable API responses are served from a CDN edge network, reducing latency for global users and offloading origin traffic.
cdn · edge · performance · networking
Container Orchestration
Services run as OCI containers scheduled by Kubernetes (or ECS/Cloud Run) with declarative configuration, health probes, and rolling deployments.
kubernetes · containers · docker · cloud-native
Distributed Tracing
Requests are instrumented with trace context propagated across service boundaries, enabling end-to-end latency analysis and root-cause identification in distributed systems.
tracing · opentelemetry · observability · distributed-systems
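Trace context propagation can be sketched with the W3C Trace Context traceparent header, which carries a version, a 32-hex-digit trace ID, a 16-hex-digit parent span ID, and 2 hex digits of flags. The code below is a simplified illustration: it skips the specification's all-zero ID checks and the tracestate header, and the SpanContext, start_trace, and continue_trace names are hypothetical. In practice an OpenTelemetry SDK would handle this for you.

```python
import re
import secrets
from dataclasses import dataclass

# version "00" - trace-id - parent-id - flags, all lowercase hex
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass(frozen=True)
class SpanContext:
    trace_id: str       # shared by every span in one request
    span_id: str        # unique to this span
    flags: str = "01"   # "01" marks the trace as sampled

    def to_header(self) -> str:
        """Serialise for the outgoing 'traceparent' HTTP header."""
        return f"00-{self.trace_id}-{self.span_id}-{self.flags}"


def start_trace() -> SpanContext:
    """Open a root span with fresh random identifiers."""
    return SpanContext(secrets.token_hex(16), secrets.token_hex(8))


def continue_trace(header: str) -> SpanContext:
    """Open a child span from an incoming traceparent header."""
    match = _TRACEPARENT.match(header)
    if match is None:
        return start_trace()  # malformed header: begin a new trace
    trace_id, _parent_span_id, flags = match.groups()
    return SpanContext(trace_id, secrets.token_hex(8), flags)
```

Each service calls continue_trace on the incoming header and sends to_header() downstream, so every span in the request shares one trace ID.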
Infrastructure as Code
All cloud resources are provisioned via Terraform or Pulumi, stored in version control, and applied through CI/CD pipelines — no manual console changes.
iac · terraform · devops · automation
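The core idea behind "no manual console changes" is reconciliation: compare the desired state held in version control with what actually exists, and derive the actions needed to close the gap. The sketch below illustrates that idea only; it is not how Terraform or Pulumi is implemented, and the plan function and the resource names in the usage note are hypothetical.

```python
from typing import Dict, List


def plan(desired: Dict[str, dict], actual: Dict[str, dict]) -> Dict[str, List[str]]:
    """Diff desired state against actual state, keyed by resource name."""
    create = sorted(desired.keys() - actual.keys())
    delete = sorted(actual.keys() - desired.keys())
    update = sorted(
        name for name in desired.keys() & actual.keys()
        if desired[name] != actual[name]
    )
    return {"create": create, "update": update, "delete": delete}
```

Calling plan({"web": {"size": "large"}}, {"web": {"size": "small"}, "tmp": {}}) reports "web" under update and "tmp" under delete; a resource created by hand in the console shows up the same way "tmp" does.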
Multi-cloud
Cloud Data Lake
Raw event data, logs, and operational data flow into a cloud object store (S3, GCS) forming a queryable data lake, enabling analytics without burdening transactional databases.
data-lake · analytics · streaming · big-data
Multi-region Active-active
The application runs across two or more geographic regions simultaneously, serving traffic from the nearest region and surviving a full regional outage without manual intervention.
multi-region · resilience · ha · disaster-recovery
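Routing to the nearest region while surviving a regional outage can be pictured as choosing the lowest-latency region among those currently healthy. The sketch below assumes per-region latency measurements and a health set are already available; the function name and the region names in the usage note are hypothetical, and real deployments usually delegate this to DNS or a global load balancer.

```python
from typing import Dict, Set


def route(latencies_ms: Dict[str, float], healthy: Set[str]) -> str:
    """Return the healthy region with the lowest measured latency."""
    candidates = {
        region: ms for region, ms in latencies_ms.items() if region in healthy
    }
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)
```

With latencies {"eu-west": 20.0, "us-east": 95.0}, traffic goes to "eu-west" while it is healthy and shifts to "us-east" as soon as "eu-west" drops out of the healthy set, with no operator action.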
Service Mesh
A service mesh (Istio, Linkerd, Consul Connect) provides mutual TLS, traffic splitting, circuit breaking, and fine-grained observability across all inter-service communication.
service-mesh · istio · mtls · networking
SLOs & Error Budgets
Service Level Objectives are defined, measured continuously, and used to gate feature deployments via error budgets — balancing velocity with reliability.
slo · sre · error-budgets · reliability
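The error-budget arithmetic behind deployment gating is short enough to sketch. An SLO target of 99.9% over a window allows 0.1% of requests to fail; the budget remaining is the share of that allowance not yet spent. The function names and the gating threshold below are assumptions for illustration, not a policy prescribed by this tech tree.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    slo_target is a ratio strictly between 0 and 1, e.g. 0.999 for 99.9%.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 1.0  # no traffic in the window, so nothing has been spent
    allowed_failures = (1 - slo_target) * total
    return 1 - failed / allowed_failures


def deploy_allowed(slo_target: float, total: int, failed: int,
                   min_remaining: float = 0.0) -> bool:
    """Gate feature deployments: allow only while budget remains."""
    return error_budget_remaining(slo_target, total, failed) > min_remaining
```

With a 99.9% target and 1,000,000 requests in the window, 1,000 failures are allowed; 250 failures leave roughly 75% of the budget, while 1,500 failures overspend it and would block feature deployments under this rule.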