Tech Tree · Data
Data & Analytics Maturity
Advance your data capability from manual spreadsheets to a self-learning intelligence platform. Each node represents a concrete data engineering or analytics practice with steps, effort estimates, and the dependencies that mirror real-world data stack evolution.
Maturity tiers
Spreadsheets
Data lives in spreadsheets and people's heads. Reporting is manual, slow, and inconsistently defined.
Warehouse
A central data warehouse consolidates sources. Dashboards replace spreadsheets. Analysts self-serve structured reports.
Real-time
Streaming pipelines deliver fresh data within seconds, so operational dashboards reflect business events almost as they happen.
ML / AI
Machine learning models augment decisions, personalise experiences, and surface insights that human analysts could not find at this scale.
Tracks
Collection
How data is captured, ingested, and made available for downstream use.
Storage
Where data lives, how it is organised, and how it is governed.
Processing
How raw data is cleaned, joined, aggregated, and modelled for analysis.
Intelligence
How data drives decisions — from human dashboards to automated ML systems.
All capabilities (15)
Spreadsheets
Executive Dashboards
A small set of consistent dashboards gives leadership a daily view of business health — replacing ad-hoc spreadsheet pulls.
dashboards · bi · reporting · executive
Operational Database Exports
Production database tables are exported on a schedule so analysts can query business data without hitting the live transactional system.
export · etl · database · foundation
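A scheduled export can start as a small script that dumps one table to CSV. The sketch below is illustrative only: it assumes a SQLite database and a hypothetical `orders` table, where a real setup would more likely read from a replica and be triggered by a scheduler.

```python
import csv
import io
import sqlite3


def export_table(conn: sqlite3.Connection, table: str, out) -> int:
    """Write every row of `table` to `out` as CSV and return the row count."""
    # Table names cannot be bound as SQL parameters, so reject anything
    # that is not a plain identifier before interpolating it.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table}")
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])  # header row
    rows = 0
    for row in cursor:
        writer.writerow(row)
        rows += 1
    return rows


# Hypothetical stand-in for a production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

buffer = io.StringIO()
exported = export_table(conn, "orders", buffer)
```

Writing to any file-like object keeps the function easy to test; in practice `out` would be a file opened with `newline=""`.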
Product Event Tracking
User interactions are captured as structured events with a consistent schema. This is the foundation every downstream analytics capability depends on.
tracking · events · analytics · foundation
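One way to keep the event schema consistent is to validate each event against a tracking plan before it is sent. This is a minimal sketch; the `Event` class, the `TRACKING_PLAN` contents, and the event names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

# Hypothetical tracking plan: event name -> required property names.
TRACKING_PLAN = {
    "signup_completed": {"plan"},
    "item_purchased": {"item_id", "price"},
}


@dataclass(frozen=True)
class Event:
    name: str
    user_id: str
    properties: dict[str, Any] = field(default_factory=dict)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def validate(event: Event) -> list[str]:
    """Return a list of schema violations; an empty list means the event is valid."""
    if event.name not in TRACKING_PLAN:
        return [f"unknown event: {event.name}"]
    missing = TRACKING_PLAN[event.name] - set(event.properties)
    return [f"missing property: {p}" for p in sorted(missing)]


ok = Event("item_purchased", "u1", {"item_id": "sku-9", "price": 12.0})
bad = Event("item_purchased", "u1", {"item_id": "sku-9"})
```

Here `validate(ok)` returns an empty list, while `validate(bad)` reports the missing `price` property.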
Shared Metric Definitions
Key business metrics (revenue, DAU, conversion) are defined once in a central document and agreed across teams. This eliminates the "which number is right?" debate.
metrics · governance · alignment
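The central definitions can also be mirrored in code so that every report computes a metric the same way. The sketch below assumes a hypothetical `METRICS` registry and a made-up definition of conversion (purchasing sessions divided by all sessions); the agreed definition in your own document takes precedence.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict  # one record per session in this illustrative dataset


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[list[Row]], float]


def conversion_rate(sessions: list[Row]) -> float:
    """Share of sessions that ended in a purchase; 0.0 when there are no sessions."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if s["purchased"]) / len(sessions)


# Single registry that dashboards and notebooks would both import.
METRICS = {
    "conversion": Metric(
        "conversion", "Purchasing sessions / all sessions", conversion_rate
    ),
}

sessions = [
    {"purchased": True},
    {"purchased": False},
    {"purchased": False},
    {"purchased": True},
]
value = METRICS["conversion"].compute(sessions)
```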
Warehouse
Cloud Data Warehouse
A columnar, cloud-native warehouse (Snowflake, BigQuery, Redshift) centralises all analytical data with scalable compute separate from storage.
warehouse · snowflake · bigquery · redshift
dbt Data Models
SQL transformations are version-controlled, tested, and documented using dbt. Analysts own the transformation layer with software engineering discipline.
dbt · sql · transformation · data-modelling
ETL / ELT Pipelines
Automated pipelines extract data from all sources, load it into the warehouse, and transform it into analytics-ready models on a reliable schedule.
etl · elt · airflow · fivetran · pipeline
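The extract, load, and transform steps can be sketched end to end with an in-memory SQLite database standing in for the warehouse. The table names and sample rows are illustrative assumptions; a production pipeline would use an orchestrator and a managed connector.

```python
import sqlite3

# Extract: rows as they might arrive from a source system (illustrative data).
raw_orders = [
    ("o1", "alice", "2024-01-01", 30.0),
    ("o2", "bob", "2024-01-01", 12.5),
    ("o3", "alice", "2024-01-02", 7.5),
]

warehouse = sqlite3.connect(":memory:")

# Load: land the data unchanged in a raw table.
warehouse.execute(
    "CREATE TABLE raw_orders "
    "(order_id TEXT, customer TEXT, order_date TEXT, amount REAL)"
)
warehouse.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", raw_orders)

# Transform: build an analytics-ready model with SQL inside the warehouse.
warehouse.execute(
    """
    CREATE TABLE daily_revenue AS
    SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM raw_orders
    GROUP BY order_date
    """
)

daily = warehouse.execute(
    "SELECT order_date, orders, revenue FROM daily_revenue ORDER BY order_date"
).fetchall()
```

Loading raw data first and transforming it afterwards is the ELT ordering; an ETL pipeline would reshape the rows in Python before the insert.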
Self-Service Analytics
Non-technical stakeholders can answer their own data questions using a governed BI layer — without waiting for an analyst or writing SQL.
self-serve · bi · semantic-layer · looker
Real-time
A/B Testing Platform
Product changes are validated through randomised controlled experiments with statistical rigour. Every significant feature launch is an experiment with a clear success metric.
ab-testing · experimentation · statsig · product
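For a conversion-rate experiment, the statistical check is often a two-proportion z-test. The sketch below is one common approach, not necessarily what any particular experimentation platform uses; the function name and sample counts are hypothetical.

```python
import math


def two_proportion_z_test(
    conv_a: int, n_a: int, conv_b: int, n_b: int
) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates.

    Assumes both groups are non-empty and the pooled rate is strictly
    between 0 and 1; otherwise the standard error is zero.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Control: 200 of 2000 users converted. Variant: 250 of 2000.
z, p = two_proportion_z_test(200, 2000, 250, 2000)
significant = p < 0.05
```

The significance threshold and the sample size should be fixed before the experiment starts; checking the p-value repeatedly while data arrives inflates the false-positive rate.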
Data Lake
Raw, unprocessed data from all sources lands in object storage in open file and table formats (Parquet, Iceberg). The lake is the foundation for ML training datasets and ad-hoc exploration.
data-lake · parquet · iceberg · s3
Real-Time Analytics
Business-critical dashboards refresh in seconds using a real-time OLAP engine, enabling operational decisions based on the current state of the business.
real-time · olap · clickhouse · druid
Streaming Ingestion
Events flow from producers to the analytics layer in seconds via a durable message stream. Batch ETL is complemented or replaced for high-velocity sources.
streaming · kafka · kinesis · real-time
ML / AI
Feature Store
A centralised store serves the same pre-computed features to both offline training pipelines and online inference endpoints, which reduces training-serving skew.
feature-store · mlops · feast · tecton
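The core idea behind a feature store, one source of feature values with a point-in-time read for training and a latest-value read for inference, can be sketched in memory. This toy `FeatureStore` class is hypothetical and does not reflect the APIs of Feast or Tecton.

```python
import bisect
from collections import defaultdict


class FeatureStore:
    """Toy in-memory store keyed by (entity_id, feature)."""

    def __init__(self):
        # Parallel lists of timestamps and values, kept sorted by timestamp.
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, entity_id, feature, timestamp, value):
        key = (entity_id, feature)
        idx = bisect.bisect_right(self._times[key], timestamp)
        self._times[key].insert(idx, timestamp)
        self._values[key].insert(idx, value)

    def get_as_of(self, entity_id, feature, timestamp):
        """Offline path: latest value written at or before `timestamp`, else None."""
        key = (entity_id, feature)
        idx = bisect.bisect_right(self._times.get(key, []), timestamp)
        return self._values[key][idx - 1] if idx else None

    def get_latest(self, entity_id, feature):
        """Online path: most recent value, read from the same data as training."""
        values = self._values.get((entity_id, feature))
        return values[-1] if values else None


store = FeatureStore()
store.write("user_1", "orders_30d", timestamp=100, value=2)
store.write("user_1", "orders_30d", timestamp=200, value=5)

training_value = store.get_as_of("user_1", "orders_30d", timestamp=150)
serving_value = store.get_latest("user_1", "orders_30d")
```

The point-in-time read matters for training: a label observed at time 150 must only see the feature value known at that moment, not the later one.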
ML Training Pipeline
Model training is automated, reproducible, and tracked. Every experiment logs hyperparameters, metrics, and artefacts so results are comparable and models are promotable.
ml · mlops · mlflow · training
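Reproducible, comparable runs come down to recording every input alongside every output. The sketch below uses a placeholder `train` function with a made-up metric formula in place of real training, and a plain list in place of a tracking server such as MLflow; all names are hypothetical.

```python
import hashlib
import json
import random


def train(params: dict, seed: int) -> dict:
    """Placeholder for real training: the fake metric depends only on the inputs."""
    rng = random.Random(seed)  # seeded generator makes the run repeatable
    noise = rng.uniform(-0.01, 0.01)
    return {"accuracy": round(0.8 + params["learning_rate"] + noise, 4)}


def run_experiment(params: dict, seed: int, log: list) -> dict:
    """Train once and append a record of inputs and outputs to `log`."""
    metrics = train(params, seed)
    payload = json.dumps([params, seed], sort_keys=True).encode()
    record = {
        # ID derived from the inputs, so identical runs are easy to spot.
        "run_id": hashlib.sha256(payload).hexdigest()[:12],
        "params": params,
        "seed": seed,
        "metrics": metrics,
    }
    log.append(record)
    return record


experiment_log: list = []
first = run_experiment({"learning_rate": 0.05}, seed=7, log=experiment_log)
second = run_experiment({"learning_rate": 0.05}, seed=7, log=experiment_log)
third = run_experiment({"learning_rate": 0.10}, seed=7, log=experiment_log)
```

Because the seed and parameters are logged, `first` and `second` are identical records, while `third` gets a different `run_id`.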
Recommendation Engine
A production recommendation system personalises content, products, or actions for each user, increasing engagement and conversion at scale.
recommendations · personalisation · ml · ranking
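A simple baseline for personalisation is item-to-item collaborative filtering: score unseen items by how strongly their audiences overlap with items the user already interacted with. This sketch uses cosine similarity over user sets; the interaction data and the `recommend` function are illustrative, and production systems typically add ranking models on top.

```python
import math
from collections import defaultdict


def recommend(interactions: dict[str, set[str]], user: str, k: int = 3) -> list[str]:
    """Rank items the user has not seen by cosine similarity to items they have."""
    # Invert user -> items into item -> users.
    item_users = defaultdict(set)
    for u, items in interactions.items():
        for item in items:
            item_users[item].add(u)

    seen = interactions.get(user, set())
    scores = defaultdict(float)
    for candidate, cand_users in item_users.items():
        if candidate in seen:
            continue
        for item in seen:
            overlap = len(cand_users & item_users[item])
            if overlap:
                norm = math.sqrt(len(cand_users) * len(item_users[item]))
                scores[candidate] += overlap / norm
    # Sort by score descending, then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked][:k]


interactions = {
    "ana": {"book", "lamp"},
    "ben": {"book", "lamp", "desk"},
    "cat": {"book", "desk"},
    "dan": {"lamp", "rug"},
}
suggestions = recommend(interactions, "ana")
```

For `ana`, `desk` ranks above `rug` because it shares users with both of her items, while `rug` overlaps only with `lamp`.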