Weld logo
Blog image
February 28, 2023Tools & Tips

How do you calculate MRR with Stripe using SQL?

Author image
by Jonas Thordal

Monthly Recurring Revenue more commonly known as MRR is a business metric most widely used in subscription businesses where a portion of your revenue stream is predictable and recurring such as Software as a Service (SaaS). It's one of the most if not the most important metrics in any subscription business.

In this article, we will elaborate on the previous article on how to calculate MRR with a concrete example using SQL and data from Stripe. Even though this article is centered around Stripe you can also get a lot of inspiration if you are using other payment processors software like GoCardless, Adyen, or Square or billing software like Xero, Zoho, or Freshbooks.

This guide assumes that you already have your Stripe ready to analyze in your data warehouse. If not check out this article to learn more about how to extract data from Stripe. 

Now follow this step-by-step guide using Weld to create an MRR model using Stripe data with SQL. 

Step 1: Get all subscriptions 

To begin, it is essential to define MRR from a Stripe perspective with a clear formula. MRR is calculated using this equation:

MRR = sum of recurring revenue from active subscriptions

In other words, we need to find all the active subscriptions and then calculate the total revenue on a monthly basis. Since subscriptions can change over time, let's use this query to get all subscriptions from Stripe. 

Tip: Replace or adjust the table names based on your setup.

1select
2    id
3  , customer_id
4  , start_date
5  , canceled_at
6  , id subscription_id
7  , status
8from
9    {{raw.stripe.subscription}}

Step 2: Get subscription plans

The next step is to join the subscription plans which hold information about what plan each subscriber is on. To do this you also need to left join the subscription_item table which holds the price_id. Then you can use the price_id to join the plan as shown below. We need to divide by 100 because the Stripe API returns two additional zeros for all amounts. 

Tip: It is important to set up a new subscription whenever changes occur, in order to capture historical data. Otherwise, your model won't be functional. Additionally, you can ensure that your data warehouse's tables incorporate Slowly Changing Dimensions (SCD), so as to store past data. Make certain to use the right filtering methods to prevent any double counting of subscriptions.

Moreover, we join in the coupon and subscription_discount tables in order to calculate any discounts applied to the subscription. You will also note that we do a case when statement to assess whether the start_date equals cancelled_at to avoid duplicates from new subscriptions created but canceled in the same month. Ideally, any subscription is set up correctly from the beginning but mistakes happen.

1select
2    s.id
3  , s.customer_id
4  , s.start_date
5  , s.canceled_at
6  , s.id subscription_id
7  , s.status
8  , d.coupon_id
9  , c.amount_off / 100 amount_off
10  , si.price_id
11  , si.quantity
12  , p.amount / 100 amount
13  , p.interval_count
14  , p.interval interval_period
15  , p.currency
16  , case
17        when date_trunc(start_date, month) <> date_trunc(canceled_at, month) then 0
18        when canceled_at is null then 0
19        else 1
20    end deleted
21from
22    {{stripe_demo.stripe_subscriptions}} s
23    left join {{raw.stripe.subscription_item}} si on s.id = si.subscription_id
24    left join {{raw.stripe.plan}} p on si.price_id = p.id
25    left join {{raw.stripe.subscription_discount}} d on s.id = d.subscription_id
26    left join {{raw.stripe.coupon}} c on d.coupon_id = c.id

Step 3: Normalize to monthly revenue

Now let's normalize revenue to MRR and apply currency conversion. To get the monthly amount we need to divide it by the interval_count and ad interval_period. After that, we use a currency conversion table available in the Weld app as a Forex connector. You can use your own here if needed. We average the rate and group by month to join it on our main table which holds the currency for the amount. Finally, we divide by the rate.

1with
2    subscription_item as (
3        select
4            customer_id
5          , subscription_id
6          , amount_off
7          , start_date
8          , canceled_at
9          , quantity
10          , amount
11          , currency
12          , status
13          , case
14                when interval_period = 'year' then ((amount / interval_count) / 12) * quantity
15                else (amount / interval_count) * quantity
16            end monthly_amount
17        from
18            {{stripe_demo.stripe_subsription_item}}
19        where
20            deleted = 0
21    )
22  , forex as (
23        select
24            avg(rate) rate
25          , currency
26          , date_trunc(timestamp, month) month
27        from
28            {{raw.forex_eur.incremental}}
29        group by
30            month
31          , currency
32        order by
33            month
34    )
35select
36    customer_id
37  , subscription_id
38  , start_date
39  , canceled_at
40  , amount_off
41  , status
42  , case
43        when monthly_amount > 0 then monthly_amount / rate
44    end as monthly_amount_eur
45from
46    subscription_item
47    left join forex on date_trunc(start_date, month) = forex.month
48    and upper(subscription_item.currency) = forex.currency

Step 4: Identify active subscriptions and sum revenue 

Now we will sum the total revenue per customer per subscription and subtract any discount applied. Additionally, we select active and canceled subscriptions separately in conjunction with a coalesce function to assess whether there is an active subscription on a given customer. We also use this to get the latest cancellation date to calculate churn in case no active subscriptions are present anymore. 

Tip: It's important to note that discounts are applied on a subscription level which means that you need to make sure that you only subtract it once per subscription. We use the max function to avoid this as we joined the discount on a subscription_item level.

1with
2    revenue_per_subscription as (
3        select
4            customer_id
5          , cast(date_trunc(start_date, month) as date) start_date_month
6          , cast(date_trunc(canceled_at, month) as date) canceled_at
7          , max(amount_off) amount_off
8          , status
9          , sum(monthly_amount_eur) monthly_amount_eur
10          , sum(monthly_amount_eur) - ifnull(max(amount_off), 0) monthly_amount_eur_after_discount
11        from
12            {{stripe_demo.stripe_mrr_normalization}}
13        group by
14            customer_id
15          , start_date_month
16          , canceled_at
17          , status
18    )
19  , active_subscriptions as (
20        select
21            customer_id
22          , max(status) status
23        from
24            revenue_per_subscription
25        where
26            status <> 'canceled'
27        group by
28            customer_id
29    )
30  , cancelled_subscriptions as (
31        select
32            customer_id
33          , max(canceled_at) canceled_at
34          , max(status) status
35        from
36            revenue_per_subscription
37        where
38            status = 'canceled'
39        group by
40            customer_id
41    )
42select
43    r.customer_id
44  , coalesce(s.status, c.status) status
45  , start_date_month
46  , case
47        when coalesce(s.status, c.status) = 'canceled' then c.canceled_at
48    end as canceled_at
49  , cast(
50        round(sum(monthly_amount_eur_after_discount), 0) as int
51    ) monthly_amount_eur_after_discount
52from
53    revenue_per_subscription r
54    left join active_subscriptions s on r.customer_id = s.customer_id
55    left join cancelled_subscriptions c on r.customer_id = c.customer_idgroup by customer_id
56  , status
57  , r.status
58  , start_date_month
59  , canceled_atorder by customer_id

Step 5: Create a timeseries for each customer 

To make sure we have data for all months of a given customer's lifetime we create an array with dates starting from 2021-01-01. Apply your own timeframe here as needed. 

The idea here is to calculate the correct start and end dates for each customer and then calculate any changes between months in the subscription. If it's the first month we know that it's a new MRR and if it's the last month with an active subscription we know it's a churn and any changes in between will be contraction and expansion relatively. We use the row_number function to rank rows to validate the first and last month. We also remove any months not relevant before any active subscription was present or after the last churn date. 

1with
2    time_series as (
3        select
4            month
5        from
6            unnest (
7                generate_date_array(
8                    date_trunc('2021-01-01', month)
9                  , current_date
10                  , interval 1 month
11                )
12            ) as month
13    )
14  , time_series_customer as (
15        select
16            *
17        from
18            (
19                select distinct
20                    (customer_id) customer_id
21                  , month
22                from
23                    {{stripe_demo.stripe_mrr_active_subscriptions}}
24                    cross join time_series
25                order by
26                    customer_id
27                  , month
28            )
29    )
30  , time_series_revenue as (
31        select
32            c.customer_id
33          , month
34          , start_date_month
35          , first_value(canceled_at ignore nulls) over (
36                partition by
37                    c.customer_id
38                order by
39                    month asc
40            ) canceled_at
41          , monthly_amount_eur_after_discount
42          , first_value(status ignore nulls) over (
43                partition by
44                    c.customer_id
45                order by
46                    month asc
47            ) status
48          , last_value(monthly_amount_eur_after_discount ignore nulls) over (
49                partition by
50                    c.customer_id
51                order by
52                    month asc
53            ) mrr
54        from
55            time_series_customer c
56            left join {{stripe_demo.stripe_mrr_active_subscriptions}} r on c.customer_id = r.customer_id
57            and c.month = r.start_date_month
58        order by
59            customer_id
60          , month asc
61    )
62select
63    *
64  , case
65        when mrr - lag(mrr) over (
66            partition by
67                customer_id
68            order by
69                month asc
70        ) is null then mrr
71        else mrr - lag(mrr) over (
72            partition by
73                customer_id
74            order by
75                month asc
76        )
77    end mrr_change
78  , row_number() over (
79        partition by
80            customer_id
81        order by
82            month asc
83    ) mrr_rank_asc
84  , row_number() over (
85        partition by
86            customer_id
87        order by
88            month desc
89    ) mrr_rank_desc
90  , from
91    time_series_revenuewhere mrr is not null
92    and (
93        canceled_at is null
94        or month <= canceled_at
95    )
96order by
97    customer_id
98  , month asc

Step 6: Summarize MRR

Finally you are ready to sum it all up and divide MRR into the different types and calculate net new NRR.  Additionally, we are adding a total to also show the total MRR over time.

1with
2    mrr_calc as (
3        select
4            *
5          , case
6                when mrr_rank_asc = 1 then mrr_change
7                else 0
8            end as new_mrr
9          , case
10                when mrr_change < 0 then mrr_change
11                else 0
12            end as contraction_mrr
13          , case
14                when mrr_change > 0
15                and mrr_rank_asc > 1 then mrr_change
16                else 0
17            end as expansion_mrr
18          , case
19                when mrr_rank_desc = 1
20                and status = 'canceled' then mrr * -1
21                else 0
22            end as churn_mrr
23        from
24            {{stripe_demo.stripe_mrr_time_series}}
25    )
26  , net_mrr_calc as (
27        select
28            month
29          , sum(new_mrr) new_mrr
30          , sum(contraction_mrr) contraction_mrr
31          , sum(churn_mrr) churn_mrr
32          , sum(expansion_mrr) expansion_mrr
33          , sum(
34                new_mrr + contraction_mrr + churn_mrr + expansion_mrr
35            ) as net_new_mrr
36      , from
37            mrr_calc
38        group by
39            month
40    )
41select
42    month
43  , new_mrr
44  , contraction_mrr
45  , churn_mrr
46  , expansion_mrr
47  , net_new_mrr
48  , sum(net_new_mrr) over (
49        order by
50            month asc
51    ) as mrr_totalfrom net_mrr_calc

Thats it. Your model is ready and you can see the full lineage graph of the model below:

Lineage graph from the Weld app

Additionally, you can visualize your MRR breakdown like the chart below:

MRR breakdown from the Weld app

Tip: If you want to slice and dice your data based on country, industry, company size or specific companies add those columns to your model from Step 2.

Conclusion:

Depending on your Stripe and data infrastructure set up the model that is needed can change but this should give you an idea of how it can be done. 

Continue reading

Weld logo

Tired of scattered data? Sync data to your data warehouse with Weld in minutes with our powerful ETL, SQL Transformations, Reverse ETL and AI Assistant, Ed - connected to 100+ apps, files and databases.

Backed by leading investors
Frontline logoCherry logoInnnovation Fund logo
Twitter LogoLinkedIn Logo
GDPR logoSOC2
© 2024 Weld. All rights reserved.