Personal · production-grade

Python · Pydantic · managed Postgres

7 pipelines · daily on hosted CI

A monitoring layer so I stop finding out about broken pipelines from the BI user.

Daily ETL that pulls government CPI data and MCOICOP classifications from the Malaysian open-data API: async extract with retries, Pydantic-validated transform, upsert into managed Postgres. A scheduled CI workflow on a hosted runner orchestrates the runs and auto-publishes the regenerated dashboard JSON files. Five SLOs are monitored continuously; breaches page Slack, email, and Discord. Each pipeline runs in 4.7s on average.
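The "async extract with retries" step could look roughly like the sketch below. It is not the project's code: `fetch_with_retries`, its parameters, and the choice of exponential backoff with jitter are all assumptions made for illustration, and the actual HTTP call is left to whatever coroutine is passed in.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def fetch_with_retries(
    fetch: Callable[[], Awaitable[T]],
    *,
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Await fetch(), retrying transient failures with exponential backoff.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await fetch()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: re-raise so the run is recorded as failed.
                raise
            # Delay doubles on each attempt; a little jitter spreads out retries.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable: the loop always returns or raises")
```

Only errors listed in `retry_on` are retried; anything else (for example a validation error in the transform) propagates immediately, which keeps a bad payload from being fetched three times before failing.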

24h success rate

100%

vs ≥ 95% SLO

SLO health score

80

4 / 5 meeting

Avg latency

4.7s

vs ≤ 5m SLO

// LIVE · last 7 days

Pipeline health grid.

OK · Warn · Fail

// INSPECTOR

CPI Core

ok

LATENCY · LAST 7 RUNS

seconds, p50

SLA: 300s · p50: 3s · runs (window): 7 · fails: 0 · records: 3,172

LAST RUN · JUST NOW

SUCCESS · 455 records

duration 3.0s · within SLA

downstream refresh queued

RUN HISTORY · BY DAY

// SLO SCORECARD

5 SLOs, last 7 days.

Defined in the alerting config module. When any SLO breaches its target, the alert fans out to Slack, email, and Discord: same alert, three pipes for redundancy.

Success rate · ≥ 95%

100.0%

Error rate · ≤ 5%

0.0%

Avg duration · ≤ 300s

4.3s

Data quality · ≥ 98%

85.7%

Data freshness · ≤ 24h

15s
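One way to read the scorecard above is as five threshold checks plus a fan-out on breach. The sketch below is an assumption about how that could be wired, not the alerting config module itself: the `SLO` class, the metric names, `evaluate`, and `fan_out` are hypothetical, and the health score is assumed to be the percentage of SLOs currently met (which matches 4 / 5 = 80).

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class SLO:
    name: str
    target: float
    # operator.ge for "at least" targets, operator.le for "at most" targets.
    compare: Callable[[float, float], bool]

    def is_met(self, observed: float) -> bool:
        return self.compare(observed, self.target)


# Targets copied from the scorecard; the metric names are made up.
SLOS = [
    SLO("success_rate_pct", 95.0, operator.ge),
    SLO("error_rate_pct", 5.0, operator.le),
    SLO("avg_duration_s", 300.0, operator.le),
    SLO("data_quality_pct", 98.0, operator.ge),
    SLO("data_freshness_s", 24 * 3600.0, operator.le),
]


def evaluate(observed: Dict[str, float]) -> Tuple[float, List[str]]:
    """Return (health score, names of breached SLOs)."""
    breached = [s.name for s in SLOS if not s.is_met(observed[s.name])]
    score = 100.0 * (len(SLOS) - len(breached)) / len(SLOS)
    return score, breached


def fan_out(message: str, channels: Dict[str, Callable[[str], None]]) -> Dict[str, bool]:
    """Send the same alert to every channel; one failing channel never blocks the rest."""
    delivered = {}
    for name, send in channels.items():
        try:
            send(message)
            delivered[name] = True
        except Exception:
            delivered[name] = False
    return delivered


# The values shown on the scorecard: only data quality misses its target.
score, breached = evaluate({
    "success_rate_pct": 100.0,
    "error_rate_pct": 0.0,
    "avg_duration_s": 4.3,
    "data_quality_pct": 85.7,
    "data_freshness_s": 15.0,
})
```

Catching exceptions per channel is the point of having three pipes: if the Slack webhook is down, email and Discord still receive the alert, and the returned dict records which deliveries failed.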

// PER-PIPELINE

Adherence by pipeline.

Each pipeline is tracked against the same ≥ 95% success-rate target. Anything below that triggers a deeper investigation before the next scheduled run.

CPI Core · ≥ 95%

100%

CPI Headline · ≥ 95%

100%

CPI State · ≥ 95%

100%

Data Validation · ≥ 95%

100%

MCOICOP Class · ≥ 95%

100%

MCOICOP Division · ≥ 95%

100%

MCOICOP Group · ≥ 95%

100%
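The per-pipeline adherence figures above could be derived from run records along these lines. This is a minimal sketch under assumptions: the `(pipeline, status)` row shape, the `"success"` status string, and both function names are hypothetical, and the sample rows in the usage note are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def success_rates(runs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Percentage of successful runs per pipeline, from (pipeline, status) rows."""
    totals: Dict[str, int] = defaultdict(int)
    succeeded: Dict[str, int] = defaultdict(int)
    for pipeline, status in runs:
        totals[pipeline] += 1
        if status == "success":
            succeeded[pipeline] += 1
    return {p: 100.0 * succeeded[p] / totals[p] for p in totals}


def needs_investigation(rates: Dict[str, float], target_pct: float = 95.0) -> List[str]:
    """Pipelines whose success rate is below the shared target, sorted by name."""
    return sorted(p for p, rate in rates.items() if rate < target_pct)
```

With seven invented runs per pipeline, a single failure is enough to drop a pipeline to roughly 85.7% and flag it, so at a one-run-per-day cadence the ≥ 95% target effectively means "no failures in the 7-day window".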

// THE STACK

Cheap, boring, monitored.

Hosted CI scheduler

Daily cron · manual dispatch · run logs

Python · Pandas

Async extract · Pydantic validation

Managed Postgres

Upsert · etl_runs metastore · 600+ runs

Slack + Email + Discord

Alerts on SLO breach · on-call is just me
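For the "Upsert · etl_runs metastore" part of the stack, here is a rough sketch of the pattern. To stay runnable without a database server it uses the standard-library `sqlite3` module as a stand-in for managed Postgres; the `INSERT ... ON CONFLICT ... DO UPDATE` form shown is also valid Postgres syntax, though a Postgres driver would use different placeholders. The `cpi_core` columns, the `etl_runs` columns, and `load` are assumptions, not the project's actual schema.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical schema: one data table keyed by period, one run-log table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cpi_core (
    period      TEXT PRIMARY KEY,
    index_value REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS etl_runs (
    id         INTEGER PRIMARY KEY,
    pipeline   TEXT NOT NULL,
    status     TEXT NOT NULL,
    records    INTEGER NOT NULL,
    duration_s REAL NOT NULL
);
"""

# Insert new periods; overwrite the value when the period already exists.
UPSERT = """
INSERT INTO cpi_core (period, index_value) VALUES (?, ?)
ON CONFLICT (period) DO UPDATE SET index_value = excluded.index_value
"""


def load(conn: sqlite3.Connection, pipeline: str,
         rows: List[Tuple[str, float]], duration_s: float) -> int:
    """Upsert rows and log the run in etl_runs inside a single transaction."""
    with conn:  # commits on success, rolls back if anything raises
        conn.executemany(UPSERT, rows)
        conn.execute(
            "INSERT INTO etl_runs (pipeline, status, records, duration_s) "
            "VALUES (?, 'success', ?, ?)",
            (pipeline, len(rows), duration_s),
        )
    return len(rows)
```

Because the upsert is idempotent, a manual re-dispatch of the same day's run rewrites the same rows instead of duplicating them, while `etl_runs` still gains one row per run, which is what the success-rate and latency figures can be computed from.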