Built with Python, SQL, BLS API, and Streamlit
This project computes a Personalized Consumer Price Index (CPI) for each user based on their unique spending behavior. Instead of relying on the national “CPI-U,” this system builds a user-specific inflation index using:
- Cleaned transaction data
- Fixed-weight category baskets (Laspeyres method)
- Normalized CPI data from the BLS
- SARIMAX forecasting models
- An interactive Streamlit dashboard
This behaves like a miniature quant research pipeline: data engineering → economic modeling → time-series forecasting → visualization.
- Load multi-user transaction CSVs
- Clean timestamps, categories, and amounts
- Store clean data in `transactions_raw`
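The cleaning step above can be sketched as follows. This is a minimal illustration, not the project's actual `etl_transactions.py`; the column names (`txn_date`, `category`, `amount`) are assumptions about the raw CSV layout.

```python
import pandas as pd

def clean_transactions(path) -> pd.DataFrame:
    """Load a raw transaction CSV and normalize its core columns
    (hypothetical column names: txn_date, category, amount)."""
    df = pd.read_csv(path)
    # Parse timestamps; unparseable rows become NaT and are dropped.
    df["txn_date"] = pd.to_datetime(df["txn_date"], errors="coerce")
    df = df.dropna(subset=["txn_date"])
    # Standardize category strings for later mapping to CPI-like buckets.
    df["category"] = df["category"].str.strip().str.lower()
    # Amounts: strip currency symbols, coerce to float, drop non-positive rows.
    df["amount"] = pd.to_numeric(
        df["amount"].astype(str).str.replace(r"[$,]", "", regex=True),
        errors="coerce",
    )
    df = df[df["amount"] > 0]
    return df
```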
- Map raw categories → CPI-like buckets
- Compute monthly spending weights
- Identify each user’s base month
- Build fixed Laspeyres weights, the same method used for the official CPI
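The Laspeyres weighting step amounts to taking each category's share of spending in the user's base month and holding those shares fixed. A minimal sketch (assuming a long-format DataFrame with `month`, `category`, and `spend` columns, which are illustrative names):

```python
import pandas as pd

def laspeyres_weights(monthly: pd.DataFrame) -> pd.Series:
    """Fixed Laspeyres weights: each category's share of spending
    in the user's base (earliest observed) month."""
    base_month = monthly["month"].min()
    base = monthly[monthly["month"] == base_month]
    weights = base.set_index("category")["spend"]
    return weights / weights.sum()  # shares sum to 1.0
```

Because the weights are frozen at the base month, later index movements reflect price changes only, not changes in the user's spending mix.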
- Pull CPI time-series from the BLS API
- Normalize each category’s index to base = 100
- Store results in `cpi_series` and `cpi_norm`
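Rebasing each category's CPI series to 100 at the base period is a one-liner; a sketch:

```python
import pandas as pd

def normalize_to_base(series: pd.Series, base_period: str) -> pd.Series:
    """Rebase a CPI series so its value at base_period equals 100."""
    return 100.0 * series / series.loc[base_period]
```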
- Combine user weights with normalized CPI
- Generate multi-user time series stored in `personal_index`
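Combining the fixed weights with the normalized category indexes is a weighted sum per month. A sketch, assuming a wide DataFrame with months as rows and categories as columns (base month = 100):

```python
import pandas as pd

def personal_index(norm_cpi: pd.DataFrame, weights: pd.Series) -> pd.Series:
    """Personal CPI = weighted sum of normalized category indexes.
    norm_cpi: rows = months, columns = categories (base month == 100)."""
    cats = weights.index
    # Multiply each category column by its fixed weight, then sum across columns.
    return norm_cpi[cats].mul(weights, axis=1).sum(axis=1)
```

By construction the personal index also equals 100 in the base month, so it is directly comparable to the rebased official CPI-U.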
- Fit SARIMAX(1,1,1)x(0,1,1,12) per user
- Fall back to naive forecasts for short histories
- Save 12-month forecasts with confidence bands
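The forecasting logic can be sketched as below. The 24-point cutoff for the naive fallback is an illustrative assumption (two full seasonal cycles), not necessarily the project's threshold.

```python
import pandas as pd

def forecast_user(series: pd.Series, horizon: int = 12) -> pd.Series:
    """Forecast a user's personal CPI.

    Fits SARIMAX(1,1,1)x(0,1,1,12) when there is enough history
    (assumed here: >= 24 monthly points); otherwise falls back to a
    naive forecast that repeats the last observed value.
    """
    if len(series) >= 24:
        from statsmodels.tsa.statespace.sarimax import SARIMAX
        model = SARIMAX(series, order=(1, 1, 1),
                        seasonal_order=(0, 1, 1, 12))
        fit = model.fit(disp=False)
        return fit.forecast(steps=horizon)
    # Naive fallback for short histories.
    return pd.Series([series.iloc[-1]] * horizon)
```

In the real pipeline, `fit.get_forecast(steps=horizon).conf_int()` supplies the confidence bands saved alongside the point forecasts.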
Includes:
- Personal CPI vs official CPI-U
- Category weight evolution
- CPI forecast
- “What-if” scenario tool (e.g., gas +20%)
- Random User Selector
- Preview of recent transactions
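The "what-if" tool above can be sketched as re-pricing the latest personal index after shocking selected categories. This is an illustrative implementation, not the dashboard's actual code:

```python
import pandas as pd

def what_if(latest_norm_cpi: pd.Series, weights: pd.Series,
            shocks: dict) -> float:
    """Re-price the personal index under hypothetical category shocks.
    shocks maps category -> fractional change, e.g. {"gas": 0.20}."""
    shocked = latest_norm_cpi.copy()
    for cat, pct in shocks.items():
        shocked[cat] = shocked[cat] * (1.0 + pct)
    return float((shocked * weights).sum())
```

A user heavily weighted toward gas sees a much larger index jump from a +20% gas shock than one whose basket is grocery-dominated.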
Shows whether your inflation moves differently from the national CPI-U.
Highlights how grocery, gas, dining, shopping, and other categories shift over time.
Predicts personal inflation for the next 6–12 months with confidence intervals.
Official CPI uses fixed category weights (Shelter ~34%, Energy ~7%, Food ~13%). But your own spending mix is different, so your inflation is different.
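A quick worked example of why weights matter: apply the same category price changes under official-style weights and under a hypothetical personal basket (all numbers below are illustrative, not real CPI data).

```python
# Same category price changes, different weights -> different inflation.
price_change = {"shelter": 0.05, "energy": 0.10, "food": 0.04}

official_w = {"shelter": 0.34, "energy": 0.07, "food": 0.13}  # approx CPI-U shares
personal_w = {"shelter": 0.15, "energy": 0.20, "food": 0.30}  # hypothetical heavy driver

def weighted_inflation(weights, changes):
    # Sum of each covered category's contribution to headline inflation.
    return sum(weights[c] * changes[c] for c in weights)

print(weighted_inflation(official_w, price_change))  # contribution under official weights
print(weighted_inflation(personal_w, price_change))  # contribution under personal weights
```

With identical price changes, the personal basket here experiences roughly a third more inflation from these categories than the official weighting, which is exactly the gap this project measures.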
This project captures that by computing:
- Your personal inflation exposure
- The categories driving your cost of living
- Forward-looking CPI projections
It mirrors how analysts build:
- Custom inflation baskets
- Household CPI indexes
- Real return calculations
- Regional inflation models
This project focuses on inflation measurement and scenario analysis rather than trading or alpha generation.
- Python: pandas, SQLAlchemy, statsmodels (SARIMAX), Plotly, Streamlit
- SQL: SQLite, window functions, category mapping, weight calculations
- APIs: BLS CPI API
- Visualization: Matplotlib static charts and Plotly/Streamlit interactive dashboards
This project uses a public, multi-user credit card transaction dataset from Kaggle to simulate household spending behavior. The dataset includes transaction timestamps, amounts, and merchant metadata suitable for CPI-style aggregation.
.
├── data/raw/ # Raw transaction CSVs
├── src/
│ ├── etl_transactions.py # Load & clean raw data
│ ├── categorize_sql.sql # Map to CPI-like categories
│ ├── index_sql.sql # Compute category weights + base weights
│ ├── Personal_cpi_sql.sql # Normalize CPI + compute personal CPI
│ ├── bls_api.py # Fetch official CPI from BLS API
│ ├── forecast.py # SARIMAX forecasts per user
│ ├── make_charts.py # Static visualization
│ └── app.py # Streamlit dashboard
├── charts/ # PNG visualizations
├── DB/personal_cpi.db # SQLite database
├── .env # DB_URL and (optional) BLS_API_KEY
├── requirements.txt
└── README.md

1. Clone the repository
git clone https://github.com/YOUR-USERNAME/Personal-CPI-Tracker.git
cd Personal-CPI-Tracker
2. Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate
3. Install dependencies
pip install -r requirements.txt
4. Create your .env file
Inside the root directory:
DB_URL=sqlite:///personal_cpi.db
BLS_API_KEY=your_key_here
5. Load raw transactions
python src/etl_transactions.py
6. Run SQL preprocessing
Open your DB and run:
categorize_sql.sql
index_sql.sql
Personal_cpi_sql.sql
7. Pull CPI data
python src/bls_api.py
8. Generate Forecasts
python src/forecast.py
9. Run the dashboard
streamlit run src/app.py

