This workshop will help you get data from csv into MotherDuck, and lay out basic patterns for using dbt + MotherDuck in a performant way.
- Create a MotherDuck account.
- Create a database called "stocks_dev" inside of MotherDuck. This can be done with the command
create database stocks_dev;from the MotherDuck UI. - Fork the
matsonj/stocksrepo in GitHub. - Generate an access token inside of MotherDuck and add it as a codespace secret inside of GitHub.
- Open a codespace on the repo.
- After it loads completely, reload the window in order to make sure the dbt power user extension has access to your md environment.
- Build the data warehouse with
dbt buildin the CLI. The default dbt target is the MotherDuck-backeddevprofile, and the project ships with example CSV snapshots as dbt seeds underseeds/. - Plot the results using
python3 viz/line_chart.py. The webpage will be available at127.0.0.1:8050, and the app reads fromstocks_dev.mainin MotherDuck. - Alternatively, you can invoke these 2 steps with
make run. - The Yahoo Finance helper scripts in
scripts/append normalized rows directly into the stable seedsseeds/ticker_info.csv,seeds/option_history.csv, andseeds/ticker_history.csv. If a fetched row already exists, it is not duplicated.
- Example data is stored as dbt seeds in
seeds/, withsnapshot_tsrecording when each snapshot was fetched.symbols.txtcontains the list of symbols for which to fetch data.get_info.pygets the company information for each company.get_options.pygets the currently open options. note: this data is temporal, and thus needs to be snapshotted. This is left as an exercise to the reader.get_stock_history.pygets the stock price history for the last 360 days.- Those scripts merge normalized snapshots into the stable seed files, so you can run the scripts and then run
dbt build.
- The raw dbt models
company_info.sql,options.sql, andstock_history.sqlreference those seed relations directly. - For the raw models in step 2, tests make sure that the primary key is unique.
- The marts create a dataset of closing stock price X outstanding shares over time to estimate Market Cap.
- Plotting is defined in the
viz/line_chart.pyfile. It is a set of simple charts usingplotlyanddash. - You can serve the plots with
python3 viz/line_chart.py.
- In order to take advantage of the
dbt power userplugin, you will need to put yourMOTHERDUCK_TOKENin your bash profile. Otherwise, all interactions withdbt power userwill hit the login page for MotherDuck.