Intent
Build a dbt project from scratch that takes raw subscription data and transforms it into a BI-ready, daily subscription metrics model, as instructed in this Notion doc.
The goal is to match the output at the bottom of the Notion doc, but getting close is sufficient for this first-pass PR. In this case, some of the columns are exactly aligned, and several others are very close.
Project/DAG structure
The project is made up of three main stages of models:
DAG

The majority of metrics can be calculated as somewhat minor transforms (filtering + aggregating to date). I chose to break these each out into their own intermediate model for now for easier logical readability + debugging while developing. The most complex set of models is the cascading logic captured in the date_spined_fanouts and date_spine_derivatives folders:
- subscriptions_days - full fanout of subscriptions by day from created_date to cancelled_date
- subscribers_days - an intermediate-step aggregation from subscriptions by day to subscribers by day (customer_id * date)
- date_spine_derivatives then follow a similar process of pre-aggregating information to a date level as the minor transforms that directly reference the base model.
Validation of models
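One way to sanity-check the cascading fanout logic is to run it on a handful of hand-made rows. The following is a minimal sketch in Python with an in-memory SQLite database, not the project's actual dbt SQL. The toy rows and the subscription_id column are made up for illustration, and it assumes that the fanout includes the cancelled_date itself and that every subscription has a cancelled_date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy rows, made up for illustration (not the project's raw data).
    create table subscriptions (
        subscription_id integer,  -- hypothetical column name
        customer_id integer,
        created_date text,
        cancelled_date text
    );
    insert into subscriptions values
        (1, 10, '2021-01-01', '2021-01-03'),
        (2, 10, '2021-01-02', '2021-01-04'),
        (3, 20, '2021-01-03', '2021-01-03');
""")

FANOUT_SQL = """
with recursive date_spine(date_day) as (
    -- one row per day between the earliest start and the latest end
    select min(created_date) from subscriptions
    union all
    select date(date_day, '+1 day')
    from date_spine
    where date_day < (select max(cancelled_date) from subscriptions)
),
subscriptions_days as (
    -- full fanout: one row per subscription per day it was live
    -- (assumption: cancelled_date is treated as inclusive)
    select s.subscription_id, s.customer_id, d.date_day
    from subscriptions s
    join date_spine d
      on d.date_day between s.created_date and s.cancelled_date
),
subscribers_days as (
    -- intermediate aggregation: one row per customer_id * date
    select customer_id, date_day, count(*) as active_subscriptions
    from subscriptions_days
    group by customer_id, date_day
)
-- final pre-aggregation to the date level
select
    date_day,
    count(*) as active_subscribers,
    sum(active_subscriptions) as active_subscriptions
from subscribers_days
group by date_day
order by date_day
"""

daily = conn.execute(FANOUT_SQL).fetchall()
for row in daily:
    print(row)
```

Customer 10 holds two overlapping subscriptions in this toy data, so the subscriber count and the subscription count diverge on the overlapping days, which is the case the intermediate subscribers_days step exists to handle.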
My final output is close to the Notion doc output in most cases. The biggest sources of discrepancy are
I have some theories for where my model could be improved to get closer to the original output, but in the interest of time I will open this PR and save that for discussion.
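To make "exactly aligned" versus "very close" concrete per column, a small comparison helper could be used during review. This is only a sketch: the function name, the date_day key, the metric column names, and the sample rows are all hypothetical and are not taken from the project or the Notion doc.

```python
def count_column_mismatches(actual, expected, key="date_day", tolerance=0.0):
    """Count, per column, the rows where actual differs from expected by more than tolerance."""
    expected_by_key = {row[key]: row for row in expected}
    mismatches = {}
    for row in actual:
        reference = expected_by_key.get(row[key])
        if reference is None:
            continue  # dates missing from the expected output are skipped in this sketch
        for column, value in row.items():
            if column == key or column not in reference:
                continue
            mismatches.setdefault(column, 0)
            if abs(value - reference[column]) > tolerance:
                mismatches[column] += 1
    return mismatches


# Made-up rows for illustration only.
actual_rows = [
    {"date_day": "2021-01-01", "active_subscriptions": 10, "new_subscriptions": 2},
    {"date_day": "2021-01-02", "active_subscriptions": 12, "new_subscriptions": 3},
]
expected_rows = [
    {"date_day": "2021-01-01", "active_subscriptions": 10, "new_subscriptions": 2},
    {"date_day": "2021-01-02", "active_subscriptions": 11, "new_subscriptions": 3},
]

report = count_column_mismatches(actual_rows, expected_rows)
print(report)
```

A column with a count of zero is exactly aligned on the compared dates. Raising the tolerance gives a rough way to separate columns that are merely close from those that are clearly off.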
Checklist
- dbt run runs successfully
- dbt test passes on all tests