calvin.goh
/ projects / rfm-engine
← All projects
2024Coca-Cola · MY / SGMS Fabric · OneLake · MoEngageSolo analytics lead

The RFM engine that taught Coca-Cola's eB2B who to talk to.

Trade marketing was blasting the same promo to every outlet on CokeBuddy. No one knew which stores were champions, which were slipping, or which were worth a discount. This is how I built — and shipped — the segmentation engine that changed the conversation.

Order freq · champions
+18%
90d post-launch
Cohorts, daily
11
~26.7k outlets scored
Zero to live
90d
Solo build
01 · PROBLEM
One-size-fits-all pushes

Trade marketing was blasting the same promo to every outlet on CokeBuddy. No signal for which stores were champions, which were slipping, which were worth a discount.

02 · APPROACH
Blend, score, pipe

Built an RFM scoring engine on OneLake that merges offline DMS sales with in-app eB2B orders. Pipes segment membership to MoEngage every night at 2am.

03 · RESULT
Campaign decisions in days

Win-back offers stopped landing in the inboxes of outlets that ordered yesterday. Champions got breadth plays. Measurable uplift per cohort — especially on frequency.

// THE PLUMBING

How the data flows.

SOURCES
DMS · GA4 CokeBuddy · CRM
4 systems
BRONZE
Raw landing OneLake
Delta format
SILVER
Cleaned, joined Notebooks
SCD type 2
GOLD · RFM
Scored cohorts Semantic model
Refreshed 2am
ACTIVATION
MoEngage · BI Daily API push
11-cohort split
// TRY IT · CLICK A COHORT

The 11 cohorts, visualized.

~26.7k active outlets · any order in 12mo window · Mar 2024
COHORT · CHAMPIONS
12% of active outlets
3,209 outlets
Recencyhow recently ordered5 / 5
Frequencyhow often they buy5 / 5
Monetarybasket value per order5 / 5
PLAYBOOK
Breadth plays, new SKU pilots
// gold.rfm_scoresSQL
WITH base AS (
  SELECT outlet_id,
    DATEDIFF(day, MAX(order_date), CURRENT_DATE) AS recency_days,
    COUNT(*)         AS frequency,
    SUM(net_value)   AS monetary
  FROM silver.unified_orders
  WHERE order_date >= DATEADD(month, -12, CURRENT_DATE)
  GROUP BY outlet_id
), scored AS (
  SELECT outlet_id,
    NTILE(5) OVER (ORDER BY recency_days ASC) AS r_score,  -- 5 = most recent
    NTILE(5) OVER (ORDER BY frequency)         AS f_score,
    NTILE(5) OVER (ORDER BY monetary)          AS m_score
  FROM base
)
SELECT outlet_id, r_score, f_score, m_score,
  CASE
    WHEN r_score=5  AND f_score=5  AND m_score=5  THEN 'champions'
    WHEN r_score=1  AND f_score>=4                THEN 'cant_lose_them'
    WHEN r_score>=4 AND f_score>=4 AND m_score>=4 THEN 'loyal_customers'
    WHEN r_score<=2 AND f_score>=3 AND m_score>=3 THEN 'at_risk'
    WHEN r_score=5  AND f_score=1                 THEN 'recent_customers'
    WHEN r_score=4  AND f_score=1                 THEN 'promising'
    WHEN r_score>=4 AND f_score>=2                THEN 'potential_loyalist'
    WHEN r_score=3  AND f_score>=3                THEN 'needing_attention'
    WHEN r_score=3  AND f_score<=2                THEN 'about_to_sleep'
    WHEN r_score<=2 AND f_score=1                 THEN 'lost'
    ELSE                                               'hibernating'
  END AS cohort
FROM scored;
// IMPACT · 90 DAYS POST-LAUNCH

Order frequency moves measurably.

ChampionsLoyal CustomersPotential LoyalistAt RiskHoldoutindex: 100 = pre-launch avg
+18%
Champions
+12%
Loyal Customers
+9%
Potential Loyalist
+6%
At Risk
Finally — we stopped sending win-back pushes to outlets that ordered yesterday.
— Digital Trade Marketing Lead
Coca-Cola MY/SG
// WHAT'S NEXT

Where I'd take it in Q4.

01
Predictive migration

Model which Potential Loyalist outlets are likely to tip into Champions given a nudge.

02
Cost-sensitive routing

Factor promo cost per cohort into activation decisioning.

03
Seasonality layer

Decouple Ramadan, CNY, and Merdeka from the baseline trend.

04
Outlet-level CLV

Blend RFM with 3-year gross-profit forecast per outlet.