This workshop will help you get data from csv into MotherDuck, and lay out basic patterns for using dbt + MotherDuck in a performant way.
- Create a MotherDuck account.
- Create a database called "stocks_dev" inside of MotherDuck. This can be done with the command
create database stocks_dev;from the MotherDuck UI. - Fork the
matsonj/stocksrepo in GitHub. - Generate an access token inside of MotherDuck and add it as a codespace secret inside of GitHub.
- Open a codespace on the repo.
- After it loads completely, reload the window in order to make sure the dbt power user extension has access to your md environment.
- Get data from the yahoo finance api by running the following 3 commands:
python3 scripts/get_info.pypython3 scripts/get_options.pypython3 scripts/get_stock_history.py
- Build the data warehouse with
dbt buildin the CLI. - Lastly, plot the results using
python3 viz/line_chart.py. The webpage will be available at127.0.0.1:8050. - Alternatively, you can invoke these 3 steps with
make run.
- Data is extracted from yahoo finance API using python. The scripts run and write out a file to
datafolder with the timestamp in the name for each file.symbols.txtcontains the list of symbols for which to fetch data.get_info.pygets the company information for each company.get_options.pygets the currently open options. note: this data is temporal, and thus needs to be snapshotted. This is left as an exercise to the reader.get_stock_history.pygets the stock price history for the last 30 days.
- dbt creates a list of these files in
files.sqlwith the Duckdbglobfunction. - for each model -
company_info.sqloptions.sqlstock_history.sql- de-duplicate and load any new files. - for the models in step 3, test to make sure that the primary key is unique.
- create a dataset of closing stock price X outstanding shares over time to estimate Market Cap.
- Plotting is defined in the
viz/line_chart.pyfile. It is a set of simple charts usingplotlyanddash. - You can serve the plots with
python3 viz/line_chart.py.
- In order to take advantage of the
dbt power userplugin, you will need to put yourMOTHERDUCK_TOKENin your bash profile. Otherwise, all interactions withdbt power userwill hit the login page for MotherDuck.