Work

Foundations 01

Data Platform

Designing and evolving data warehouse and data lake architectures using Data Vault and dimensional modelling. Working on incremental loading strategies and improving the scalability of data storage.
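A minimal sketch of the high-water-mark pattern behind incremental loading: only rows updated after the last recorded watermark are picked up on each run. The function and field names (`incremental_batch`, `updated_at`) are illustrative, not taken from any specific pipeline.

```python
from datetime import datetime

def incremental_batch(rows, watermark):
    """Return rows newer than `watermark`, plus the advanced watermark.

    `rows` is any iterable of dicts carrying an `updated_at` timestamp
    (an illustrative schema); the caller persists the returned watermark
    between runs so each load only touches new or changed rows.
    """
    fresh = [r for r in rows if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in fresh), default=watermark)
    return fresh, new_watermark
```

In a real warehouse the same comparison runs in SQL (`WHERE updated_at > :watermark`), with the watermark stored in a control table.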

Execution 02

Orchestration and automation

Developing and maintaining data pipelines with Apache Airflow, Spark, dbt, and CI/CD workflows. Automating ingestion and transformation processes while improving pipeline reliability.
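The core idea an orchestrator like Airflow handles is dependency-ordered execution of a DAG of tasks. This is a toy pure-Python sketch of that ordering, not Airflow's own API:

```python
def run_order(deps):
    """Return an execution order for a task DAG.

    `deps` maps each task name to the set of its upstream tasks.
    A task becomes ready only once all its upstreams are done;
    a cycle means no valid order exists.
    """
    order, done = [], set()
    while len(done) < len(deps):
        ready = [t for t, up in deps.items() if t not in done and up <= done]
        if not ready:
            raise ValueError("cycle detected in task dependencies")
        for t in sorted(ready):  # sort for a deterministic order
            order.append(t)
            done.add(t)
    return order
```

In Airflow the same structure is declared with operators and `>>` dependencies; the scheduler performs this topological ordering for you.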

Trust 03

Data modelling and quality

Implementing data modelling approaches such as Data Vault alongside ETL/ELT processes, with data quality checks including validation and anomaly detection. Continuously improving trust in downstream analytics.
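One common anomaly-detection check is flagging values that sit far from the mean in standard-deviation terms. A minimal sketch, assuming a simple z-score rule (the threshold and function name are illustrative):

```python
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean.

    Returns the anomalous values; a constant series has no spread,
    so nothing is flagged.
    """
    m, s = mean(values), stdev(values)
    if s == 0:
        return []
    return [v for v in values if abs(v - m) / s > threshold]
```

In practice such checks run as pipeline steps (e.g. dbt tests or Airflow tasks) that fail or alert before bad data reaches downstream analytics.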

Delivery 04

BI-ready delivery

Building and optimizing data marts and BI-ready datasets. Improving query performance and enabling faster reporting for business users.
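A data mart is essentially a pre-aggregated, query-shaped view of raw facts. This toy sketch rolls order rows up to a (region, month) revenue table; the schema and names are illustrative only:

```python
from collections import defaultdict

def build_mart(orders):
    """Aggregate order rows into a (region, month) -> revenue mart.

    Pre-computing this rollup is what lets BI dashboards scan a few
    summary rows instead of the full fact table on every query.
    """
    mart = defaultdict(float)
    for o in orders:
        mart[(o["region"], o["month"])] += o["amount"]
    return dict(mart)
```

In a warehouse the equivalent is a `GROUP BY` materialized into a mart table (e.g. via a dbt model) and refreshed on a schedule.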

Technologies I Use

Core Technologies

  • Python
  • SQL
  • Jupyter
  • Git
  • Linux

Warehouses & Data Lakes

  • Greenplum
  • PostgreSQL
  • ClickHouse
  • Amazon S3
  • Parquet, ORC & Avro

Pipelines & Analytics

  • Apache Airflow
  • dbt
  • Apache Spark
  • Apache Superset
  • Data Vault

Platform & Delivery

  • Docker
  • Terraform
  • Google Cloud
  • CI/CD