Fully automated, verifier-first digital product engine for Raspberry Pi.
Now with systemd automation + real-time dashboard (hourly scheduling, cost tracking, activity monitoring).
✅ Multi-Source Data Ingestion - HackerNews (no API key!), Reddit, RSS, or file imports
✅ Works Out-of-Box - Default HackerNews source requires zero API credentials
✅ Automated Scheduling - Systemd timer runs pipeline hourly (configurable)
✅ Real-time Dashboard - Web UI for cost, activity, and status monitoring
✅ Cost Controls - Hard limits prevent runaway API bills
✅ Daily Backups - Automated database backups with retention policies
✅ Audit Trail - Complete operation history for debugging
✅ Input Sanitization - XSS prevention for all external content
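Sanitization of external content can be sketched with the standard library; the real sanitizer.py may apply additional rules, so treat this as an illustration only:

```python
import html

def sanitize(text: str) -> str:
    # Escape HTML metacharacters so external content (post titles,
    # bodies) cannot inject markup into the dashboard. Minimal sketch;
    # the project's sanitizer.py may do more than this.
    return html.escape(text, quote=True)

clean = sanitize('<script>alert("x")</script>')
# "clean" now contains only escaped entities, no live tags
```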
Multiple Data Sources:

- HackerNews (no API key!)
- Reddit (optional)
- RSS Feeds (optional)
- File Import (optional)

All enabled sources feed one pipeline: Problem Extraction → Spec Generation → Content Generation → Verification → Gumroad Upload. A systemd timer triggers the pipeline hourly, and the real-time dashboard monitors every run.
Graceful Degradation: If one source fails, others continue working.
Cost governor enforces hard limits at every LLM call.
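The graceful-degradation behavior can be sketched as a loop that isolates each source's failures (the `ingest_all` helper and the source callables here are hypothetical stand-ins, not the project's actual API):

```python
def ingest_all(sources):
    # Each source is tried independently; one failure is recorded
    # and the loop moves on, so a single outage never stops the
    # remaining sources.
    posts, errors = [], {}
    for name, fetch in sources.items():
        try:
            posts.extend(fetch())
        except Exception as exc:
            errors[name] = str(exc)
    return posts, errors

def broken_reddit():
    raise RuntimeError("Reddit API down")

posts, errors = ingest_all({
    "hackernews": lambda: [{"id": 1, "title": "Ask HN: ..."}],
    "reddit": broken_reddit,
})
```

Here the Reddit failure is logged in `errors` while the HackerNews posts still flow into the pipeline.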
- Python 3.8+ - Required for running the pipeline
- Git - For cloning the repository
- SSH keys configured with GitHub - Required for the automated Raspberry Pi installation
IMPORTANT: Before running the automated installer on Raspberry Pi, you MUST configure SSH keys for GitHub authentication.
ls -la ~/.ssh/id_*.pub

If you see files like id_rsa.pub or id_ed25519.pub, you already have SSH keys. Skip to Step 3.
# Generate a new SSH key (use your GitHub email)
ssh-keygen -t ed25519 -C "your_email@example.com"
# When prompted:
# - Press Enter to accept the default file location
# - Enter a passphrase (optional but recommended)

# Display your public key
cat ~/.ssh/id_ed25519.pub
# (or cat ~/.ssh/id_rsa.pub if you have an RSA key)
# Copy the entire output

Then:
- Go to GitHub → Settings → SSH and GPG keys: https://github.com/settings/keys
- Click "New SSH key"
- Give it a title (e.g., "Raspberry Pi")
- Paste your public key
- Click "Add SSH key"
ssh -T git@github.com

You should see:
Hi YourUsername! You've successfully authenticated, but GitHub does not provide shell access.
If you see this message, you're ready to run the installer!
Best for: Production deployment on Raspberry Pi with SSH keys already configured
Prerequisites: SSH keys must be configured (see above)
# Clone the repository
git clone git@github.com:SaltProphet/Pi-autopilot.git
cd Pi-autopilot
# Run the automated installer (requires sudo)
sudo bash installer/setup_pi.sh

Best for: Users who prefer a Personal Access Token (PAT) over SSH keys
Prerequisites: GitHub Personal Access Token with 'repo' scope
To create a PAT:
- Go to: https://github.com/settings/tokens
- Click "Generate new token (classic)"
- Give it a name (e.g., "Pi-Autopilot")
- Select scope: repo (Full control of private repositories)
- Click "Generate token" and copy it (you won't see it again!)
# Clone the repository (you can use HTTPS here too with your PAT)
git clone https://github.com/SaltProphet/Pi-autopilot.git
cd Pi-autopilot
# Run the HTTPS installer (will prompt for PAT)
sudo bash installer/setup_with_https.sh

Security Note: The installer uses a git credential helper and clears the token from memory immediately after use. Your PAT is never written to disk or exposed in process listings.
What the automated installers do:
- Install system dependencies (Python 3, pip, venv, git)
- Create installation at /opt/pi-autopilot
- Set up Python virtual environment
- Install all Python dependencies
- Create data directories with proper permissions
- Configure systemd services (pipeline + dashboard)
- Set up timer for hourly pipeline runs
- Configure daily database backups
After installation:
- Edit /opt/pi-autopilot/.env with your API keys
- Test the pipeline: sudo systemctl start pi-autopilot.service
- Access dashboard: http://<pi-ip>:8000
- Check timer status: systemctl list-timers pi-autopilot.timer
Recommended for development, testing, or non-Pi systems
# 1. Clone the repository
git clone https://github.com/SaltProphet/Pi-autopilot.git
cd Pi-autopilot
# 2. Create a Python virtual environment
python3 -m venv venv
# 3. Activate the virtual environment
# On Linux/Mac:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate
# 4. Upgrade pip
pip install --upgrade pip
# 5. Install dependencies
pip install -r requirements.txt
# 6. Create data directories
mkdir -p data/artifacts
# 7. Configure environment
cp .env.example .env
# Edit .env with your API keys (see Configuration section)
# 8. Run the pipeline manually
python main.py
# 9. (Optional) Start the dashboard
python dashboard.py
# Access at http://localhost:8000

# Run pipeline once (manual)
python main.py
# Run with virtual environment (if not activated)
./venv/bin/python main.py
# Run in dry-run mode (no real Gumroad uploads)
# Set DRY_RUN=true in .env first
python main.py

# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
# Deactivate virtual environment
deactivate
# Install/update dependencies
pip install -r requirements.txt
# Upgrade specific package
pip install --upgrade openai

# Start dashboard (development)
python dashboard.py
# Start with specific host/port
python dashboard.py --host 0.0.0.0 --port 8000
# Access dashboard
# Local: http://localhost:8000
# Remote: http://<your-ip>:8000

# View service status
systemctl status pi-autopilot.service
systemctl status pi-autopilot-dashboard.service
# Start/stop services
sudo systemctl start pi-autopilot.service
sudo systemctl stop pi-autopilot.service
sudo systemctl restart pi-autopilot-dashboard.service
# Enable/disable automatic startup
sudo systemctl enable pi-autopilot.timer
sudo systemctl disable pi-autopilot.timer
# View logs (follow mode)
journalctl -fu pi-autopilot.service
journalctl -fu pi-autopilot-dashboard.service
# View recent logs
journalctl -u pi-autopilot.service -n 100
# Check timer schedule
systemctl list-timers pi-autopilot.timer
# Manually trigger pipeline
sudo systemctl start pi-autopilot.service
# Reload systemd after config changes
sudo systemctl daemon-reload

# View database
sqlite3 data/pipeline.db
# Check lifetime cost
sqlite3 data/pipeline.db "SELECT SUM(usd_cost) FROM cost_tracking;"
# List recent pipeline runs
sqlite3 data/pipeline.db "SELECT * FROM pipeline_runs ORDER BY created_at DESC LIMIT 10;"
# Count posts by status
sqlite3 data/pipeline.db "SELECT status, COUNT(*) FROM pipeline_runs GROUP BY status;"
# Reset cost tracking (use with caution)
sqlite3 data/pipeline.db "DELETE FROM cost_tracking;"

# Run all tests
SKIP_CONFIG_VALIDATION=1 pytest tests/
# Run specific test file
SKIP_CONFIG_VALIDATION=1 pytest tests/test_storage.py -v
# Run with coverage
SKIP_CONFIG_VALIDATION=1 pytest tests/ --cov=services --cov=agents
# Run only unit tests
SKIP_CONFIG_VALIDATION=1 pytest tests/ -m unit

# Check Python version
python3 --version
# Verify virtual environment is activated
which python # Should show path to venv/bin/python
# List installed packages
pip list
# Check for missing dependencies
pip check
# View environment variables
cat .env
# Check file permissions
ls -la .env data/
# Test API connectivity
# HackerNews (no auth required)
python -c "from agents.hackernews_ingest import HackerNewsIngestAgent; print('HackerNews OK')"
# Reddit (if enabled)
python -c "from services.reddit_client import RedditClient; print('Reddit OK')"
# OpenAI
python -c "from openai import OpenAI; print('OpenAI OK')"

After installation, the dashboard is live at:
http://<your-pi-ip>:8000
Real-time monitoring of:
- 💰 Cost tracking (lifetime + last 24h)
- ✅ Pipeline stats (completed/discarded/rejected)
- 📍 Active posts being processed
- 📋 Recent activity feed
Pi-Autopilot includes a web-based configuration interface for managing API keys and settings.
# Start the dashboard
python dashboard.py
# Navigate to:
http://localhost:8000/config

- HTTPS or Localhost Only: Only access the config UI over HTTPS or from localhost
- Password Protection (optional): Set DASHBOARD_PASSWORD in .env to require authentication
- IP Whitelisting (optional): Set DASHBOARD_ALLOWED_IPS to restrict access
- File Permissions: The .env file is automatically secured with 0o600 permissions
- ✅ Secure API key input with masked display
- ✅ Test API keys before saving
- ✅ Toggle services on/off without deleting keys
- ✅ Automatic backups before every change
- ✅ Restore from previous backups
- ✅ Input validation and error handling
- ✅ Audit logging of all changes
Backups are stored in ./config_backups/ with timestamps. The system keeps the last 7 backups automatically.
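The timestamped-backup-with-retention behavior might look like the following sketch (the function name and filename pattern are assumptions, not the project's actual code):

```python
import glob
import os
import shutil
import tempfile
import time

def backup_env(env_path, backup_dir, keep=7):
    # Copy .env to a timestamped file, then prune so only the newest
    # `keep` backups remain. Illustrative only; real names may differ.
    os.makedirs(backup_dir, exist_ok=True)
    dest = os.path.join(backup_dir, "env_%d.bak" % time.time_ns())
    shutil.copy2(env_path, dest)
    for old in sorted(glob.glob(os.path.join(backup_dir, "env_*.bak")))[:-keep]:
        os.remove(old)
    return dest

# Demo in a throwaway directory: after 10 backups, only 7 survive.
tmp = tempfile.mkdtemp()
env = os.path.join(tmp, ".env")
with open(env, "w") as f:
    f.write("OPENAI_API_KEY=test\n")
for _ in range(10):
    backup_env(env, os.path.join(tmp, "config_backups"))
remaining = glob.glob(os.path.join(tmp, "config_backups", "env_*.bak"))
```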
Pi-Autopilot supports multiple pluggable data sources, allowing you to ingest problems from various platforms:
- HackerNews (Default - No API Key Required! ✅)
  - Uses public Algolia API
  - Fetches Ask HN and Show HN posts
  - Works out-of-the-box without any credentials
- Reddit (Optional)
  - Requires Reddit API credentials
  - Fetches from multiple subreddits
  - High-quality community discussions
- RSS Feeds (Optional)
  - Parse any RSS/Atom feed
  - Blogs, forums, news sites
  - No authentication needed
- File Import (Optional)
  - Load posts from JSON/CSV files
  - Perfect for testing or manual curation
  - No external API needed
The default configuration uses HackerNews, which requires no API credentials:
# Default - works immediately!
DATA_SOURCES=hackernews

Just add your OpenAI and Gumroad credentials and you're ready to go!
You can enable multiple sources simultaneously:
# Use multiple sources together
DATA_SOURCES=hackernews,reddit,rss
# Or just one
DATA_SOURCES=reddit

Example 1: HackerNews Only (Simplest Setup)
DATA_SOURCES=hackernews
HN_MIN_SCORE=50
HN_POST_LIMIT=20
HN_STORY_TYPES=ask_hn,show_hn
OPENAI_API_KEY=your_key_here
GUMROAD_ACCESS_TOKEN=your_token_here

Example 2: Reddit + HackerNews
DATA_SOURCES=reddit,hackernews
# Reddit config
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_CLIENT_SECRET=your_reddit_client_secret
REDDIT_SUBREDDITS=SideProject,Entrepreneur
REDDIT_MIN_SCORE=10
# HackerNews config
HN_MIN_SCORE=50
HN_POST_LIMIT=20

Example 3: All Sources
DATA_SOURCES=reddit,hackernews,rss,file
# Reddit
REDDIT_CLIENT_ID=xxx
REDDIT_CLIENT_SECRET=xxx
REDDIT_SUBREDDITS=SideProject,Entrepreneur
# HackerNews (no credentials needed)
HN_MIN_SCORE=50
HN_POST_LIMIT=20
HN_STORY_TYPES=ask_hn,show_hn
# RSS Feeds
RSS_FEED_URLS=https://blog1.com/feed.xml,https://blog2.com/rss
RSS_POST_LIMIT=20
# File Import
FILE_INGEST_PATHS=/path/to/posts.json,/path/to/data.csv
FILE_POST_LIMIT=20

Edit .env or use the web interface at http://localhost:8000/config:
# ==== DATA SOURCES ====
# Comma-separated list: reddit,hackernews,rss,file
DATA_SOURCES=hackernews
# ==== REDDIT (Optional - only needed if using reddit source) ====
REDDIT_CLIENT_ID=
REDDIT_CLIENT_SECRET=
REDDIT_USER_AGENT=Pi-Autopilot/2.0
REDDIT_SUBREDDITS=SideProject,Entrepreneur,startups
REDDIT_MIN_SCORE=10
REDDIT_POST_LIMIT=20
# ==== HACKERNEWS (No API key required!) ====
HN_MIN_SCORE=50
HN_POST_LIMIT=20
HN_STORY_TYPES=ask_hn,show_hn
# ==== RSS FEEDS (Optional) ====
RSS_FEED_URLS=
RSS_POST_LIMIT=20
# ==== FILE INGEST (Optional) ====
FILE_INGEST_PATHS=
FILE_POST_LIMIT=20
# ==== OPENAI (Required) ====
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4
OPENAI_INPUT_TOKEN_PRICE=0.00003
OPENAI_OUTPUT_TOKEN_PRICE=0.00006
# ==== GUMROAD (Required) ====
GUMROAD_ACCESS_TOKEN=your_gumroad_access_token
# ==== PIPELINE SETTINGS ====
DATABASE_PATH=./data/pipeline.db
ARTIFACTS_PATH=./data/artifacts
MAX_REGENERATION_ATTEMPTS=1
MAX_TOKENS_PER_RUN=50000
MAX_USD_PER_RUN=5.0
MAX_USD_LIFETIME=100.0
KILL_SWITCH=false
DRY_RUN=true
# ==== SALES FEEDBACK ====
ZERO_SALES_SUPPRESSION_COUNT=5
REFUND_RATE_MAX=0.3
SALES_LOOKBACK_DAYS=30

✅ No Single Point of Failure - If the Reddit API is down, HackerNews keeps working
✅ No API Approval Wait - Start with HackerNews immediately, add Reddit later
✅ Broader Coverage - Capture problems from multiple communities
✅ Free Testing - HackerNews requires zero credentials
✅ Graceful Degradation - Individual source failures don't stop the pipeline
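Parsing the DATA_SOURCES value reduces to splitting on commas and validating each name; this sketch is illustrative, not the project's actual config.py:

```python
VALID_SOURCES = {"hackernews", "reddit", "rss", "file"}

def parse_data_sources(value):
    # Split the comma-separated setting, drop blanks, and reject
    # names no ingest agent handles. Hypothetical helper.
    names = [s.strip().lower() for s in value.split(",") if s.strip()]
    unknown = set(names) - VALID_SOURCES
    if unknown:
        raise ValueError("Unknown data sources: %s" % sorted(unknown))
    return names

sources = parse_data_sources("hackernews, reddit")
```

Failing fast on an unknown name catches typos like `DATA_SOURCES=hackrnews` at startup rather than silently ingesting nothing.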
Edit timer to change frequency:
sudo systemctl edit pi-autopilot.timer

Options:
# Every 30 minutes
OnUnitActiveSec=30min
# Every 6 hours
OnUnitActiveSec=6h
# Daily at 2 AM
OnCalendar=*-*-* 02:00:00

Then reload:
sudo systemctl daemon-reload

Three levels of protection:
- Per-Run Token Limit (MAX_TOKENS_PER_RUN)
  - Default: 50,000 tokens
  - Includes input + output tokens
  - Pipeline aborts when exceeded
- Per-Run USD Limit (MAX_USD_PER_RUN)
  - Default: $5.00
  - Estimated cost based on token usage
  - Pipeline aborts when exceeded
- Lifetime USD Limit (MAX_USD_LIFETIME)
  - Default: $100.00
  - Cumulative across all runs
  - Persists in the SQLite database
  - Pipeline aborts when exceeded
All LLM calls are tracked in the cost_tracking table:
- Input/output tokens
- USD cost
- Timestamp
- Model used
- Abort reasons
View lifetime cost:
sqlite3 data/pipeline.db "SELECT SUM(usd_cost) FROM cost_tracking;"

When any limit is exceeded:
- Pipeline stops immediately
- No regeneration attempts
- Failure written to data/artifacts/abort_{run_id}.json
- Database records the abort reason
- Current post marked as cost_limit_exceeded
Abort file format:
{
"run_id": 1234567890,
"abort_reason": "MAX_USD_PER_RUN exceeded: 5.12 > 5.0",
"run_tokens_sent": 25000,
"run_tokens_received": 18000,
"run_cost": 5.12,
"timestamp": 1234567890
}

Emergency stop without deleting data:
# In .env
KILL_SWITCH=true

When enabled:
- Pipeline exits immediately
- No Reddit ingestion
- No LLM calls
- No Gumroad uploads
- Database and artifacts preserved
Test the system safely without making real Gumroad uploads:
# In .env (enabled by default on installation)
DRY_RUN=true

When enabled:
- Pipeline runs normally through all stages
- Reddit posts are processed
- LLM calls generate content
- Gumroad uploads are simulated (no real products created)
- All artifacts and logs are created
- Console shows "[DRY RUN]" prefix for uploads
To enable real uploads: Set DRY_RUN=false in .env
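The upload gate described above can be sketched as a simple branch (the function and message format are illustrative assumptions, not the project's exact code):

```python
def upload_to_gumroad(listing, dry_run):
    # In dry-run mode the pipeline logs what it *would* upload and
    # returns without touching the Gumroad API. Sketch only.
    if dry_run:
        return "[DRY RUN] would upload: %s" % listing["title"]
    # a real Gumroad API call would go here
    return "uploaded: %s" % listing["title"]

print(upload_to_gumroad({"title": "Example Product"}, dry_run=True))
```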
Configure per-token costs:
OPENAI_INPUT_TOKEN_PRICE=0.00003
OPENAI_OUTPUT_TOKEN_PRICE=0.00006

Defaults are for GPT-4. Adjust for other models.
Before each LLM call:
- Estimate input tokens (text length / 3.5)
- Use max_tokens for output estimate
- Calculate estimated USD cost
- Check against all limits
- Refuse call if any limit would be exceeded
After each LLM call:
- Record actual token usage from response
- Calculate actual USD cost
- Update run totals
- Write to cost_tracking table
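Putting the pre-call rules together, a sketch of the estimate-then-check logic (default prices and limits are taken from the configuration above; the function names are hypothetical, not the real cost_governor.py API):

```python
def estimate_cost(prompt, max_tokens, in_price=0.00003, out_price=0.00006):
    # Input tokens approximated as len(prompt)/3.5 characters per token;
    # output pessimistically assumed to use the full max_tokens budget.
    tokens_in = int(len(prompt) / 3.5)
    return tokens_in, tokens_in * in_price + max_tokens * out_price

def check_limits(est_usd, run_usd, lifetime_usd, max_run=5.0, max_lifetime=100.0):
    # Refuse the call *before* it happens if it would breach a limit.
    if run_usd + est_usd > max_run:
        return False, "MAX_USD_PER_RUN exceeded: %.2f > %s" % (run_usd + est_usd, max_run)
    if lifetime_usd + est_usd > max_lifetime:
        return False, "MAX_USD_LIFETIME exceeded: %.2f > %s" % (lifetime_usd + est_usd, max_lifetime)
    return True, ""

tokens_in, est = estimate_cost("x" * 3500, max_tokens=1000)  # ~1000 input tokens
ok, reason = check_limits(est, run_usd=4.99, lifetime_usd=50.0)
```

With $4.99 already spent this run, the estimated $0.09 call would cross the $5.00 per-run limit, so the call is refused rather than made.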
To reset cumulative cost tracking:
sqlite3 data/pipeline.db "DELETE FROM cost_tracking;"

Or delete specific runs:
sqlite3 data/pipeline.db "DELETE FROM cost_tracking WHERE run_id = 1234567890;"

Run the complete pipeline:
python main.py

The pipeline executes sequentially:
- Ingest posts from the configured data sources (HackerNews by default)
- Extract problems from posts (discard if not monetizable)
- Generate product specifications (reject if confidence < 70)
- Generate product content
- Verify content quality (max 1 regeneration attempt)
- Generate Gumroad listing
- Upload to Gumroad
- Verifier-first: Content must pass verification or get discarded
- One regeneration: If content fails verification, regenerate once
- Hard discard: If second attempt fails, permanently discard
- Cost limits: Any LLM call that would exceed limits is refused
- Sequential execution: No parallel processing
- Disk-based state: All artifacts saved to disk
- JSON between modules: All inter-agent communication uses JSON
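A minimal sketch of that sequential, JSON-between-stages design (the stage functions are toy stand-ins for the real agents, not the project's code):

```python
import json

def run_pipeline(post, stages):
    # Serialize between stages so every hand-off is plain JSON,
    # mirroring the "JSON between modules" rule above.
    payload = json.dumps(post)
    for name, stage in stages:
        result = stage(json.loads(payload))
        if result is None:  # the stage discarded the post
            return name, None
        payload = json.dumps(result)
    return "completed", json.loads(payload)

stages = [
    ("problem", lambda p: {**p, "problem": "manual invoicing"}),
    ("spec", lambda p: {**p, "confidence": 85}),
    ("verify", lambda p: p if p["confidence"] >= 70 else None),
]
status, result = run_pipeline({"id": 42, "title": "Ask HN"}, stages)
```

Any stage returning None stops the run and reports which stage discarded the post; there is no parallelism and no shared in-memory state between stages.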
/
├─ main.py # Pipeline orchestrator
├─ config.py # Configuration management
├─ requirements.txt # Python dependencies
├─ .env.example # Environment template
├─ installer/
│ ├─ setup_pi.sh # Raspberry Pi setup script
│ └─ run.sh # Manual run script
├─ prompts/
│ ├─ problem_extraction.txt
│ ├─ product_spec.txt
│ ├─ verifier.txt
│ ├─ product_content.txt
│ └─ gumroad_listing.txt
├─ agents/
│ ├─ reddit_ingest.py # Reddit ingestion
│ ├─ problem_agent.py # Problem extraction
│ ├─ spec_agent.py # Spec generation
│ ├─ verifier_agent.py # Quality verification
│ ├─ content_agent.py # Content generation
│ └─ gumroad_agent.py # Gumroad upload
├─ services/
│ ├─ llm_client.py # OpenAI API client
│ ├─ reddit_client.py # Reddit API client
│ ├─ gumroad_client.py # Gumroad API client
│ ├─ storage.py # SQLite storage
│ ├─ cost_governor.py # Cost control & limits
│ ├─ config_validator.py # Config validation (NEW)
│ ├─ backup_manager.py # Database backups (NEW)
│ ├─ error_handler.py # Error logging (NEW)
│ ├─ sanitizer.py # Input sanitization (NEW)
│ ├─ retry_handler.py # API retry logic (NEW)
│ └─ audit_logger.py # Audit trail (NEW)
└─ models/
├─ problem.py # Problem model
├─ product_spec.py # Product spec model
└─ verdict.py # Verification verdict model
- docs/SYSTEM_PIPELINE_OVERVIEW.md - Complete system pipeline, functions, and outcomes (START HERE for comprehensive understanding)
- SECURITY.md - Security features and hardening
- docs/CHANGELOG.md - Version history
- docs/ROADMAP.md - Feature roadmap (Q1-Q4 2026)
- docs/IMPLEMENTATION_OUTLINE.md - Technical architecture
- docs/IMPLEMENTATION_SUMMARY.md - Implementation details
All pipeline artifacts are saved to ./data/artifacts/{post_id}/:
- problem_*.json - Extracted problems
- spec_*.json - Product specifications
- content_*.md - Generated product content
- verdict_attempt_*.json - Verification results
- gumroad_upload_*.json - Upload results
- error_*.json - Exception logs with full context
Backups and cost tracking:
- backups/pipeline_db_*.sqlite.gz - Daily database backups
- abort_{run_id}.json - Cost limit failures
After running setup_pi.sh, the pipeline runs hourly automatically (configurable via the systemd timer).
Check status:
systemctl status pi-autopilot.timer

View logs:

journalctl -u pi-autopilot.service -f

Manual trigger:

sudo systemctl start pi-autopilot.service

Reddit:
- Go to https://www.reddit.com/prefs/apps
- Create app (script type)
- Copy client ID and secret
OpenAI:
- Visit https://platform.openai.com/api-keys
- Create new key
- Copy key
Gumroad:
- Go to Settings → Advanced → Applications
- Create application
- Generate access token
- Copy token
Content must pass all checks:
- example_quality_score >= 7
- generic_language_detected == false
- missing_elements == []
If any check fails, content is regenerated once. If second attempt fails, the post is permanently discarded.
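The pass/regenerate/discard logic can be sketched as follows (the verdict keys match the checklist above; the helper names are assumptions, not the real verifier_agent.py API):

```python
def passes(verdict):
    # All three checks from the list above must hold.
    return (verdict["example_quality_score"] >= 7
            and not verdict["generic_language_detected"]
            and verdict["missing_elements"] == [])

def verify_with_retry(generate, verify, max_regens=1):
    # Generate once, regenerate at most `max_regens` times;
    # if no attempt passes, the post is permanently discarded.
    for _attempt in range(max_regens + 1):
        content = generate()
        if passes(verify(content)):
            return "accepted", content
    return "discarded", None

# First verdict fails, the single allowed regeneration passes.
verdicts = iter([
    {"example_quality_score": 4, "generic_language_detected": True, "missing_elements": []},
    {"example_quality_score": 8, "generic_language_detected": False, "missing_elements": []},
])
status, content = verify_with_retry(lambda: "draft", lambda c: next(verdicts))
```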
Posts are rejected at various stages:
Problem Extraction:
- discard == true (not monetizable)
- No economic consequence
- Generic complaints
Spec Generation:
- build == false
- confidence < 70
- deliverables.length < 3
- Generic target audience
Verification:
- Poor example quality
- Generic language detected
- Missing required sections
- Two failed attempts
Cost Limits:
- Token limit exceeded
- Per-run USD limit exceeded
- Lifetime USD limit exceeded
reddit_posts:
- id, title, body, timestamp, subreddit, author, score, url, raw_json
pipeline_runs:
- id, post_id, stage, status, artifact_path, error_message, created_at
cost_tracking:
- id, run_id, tokens_sent, tokens_received, usd_cost, timestamp, model, abort_reason
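As a sketch, the three tables could be created like this (column names follow the field lists above; the column types are assumptions, and the real storage.py may differ):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS reddit_posts (
    id TEXT PRIMARY KEY, title TEXT, body TEXT, timestamp INTEGER,
    subreddit TEXT, author TEXT, score INTEGER, url TEXT, raw_json TEXT
);
CREATE TABLE IF NOT EXISTS pipeline_runs (
    id INTEGER PRIMARY KEY AUTOINCREMENT, post_id TEXT, stage TEXT,
    status TEXT, artifact_path TEXT, error_message TEXT, created_at INTEGER
);
CREATE TABLE IF NOT EXISTS cost_tracking (
    id INTEGER PRIMARY KEY AUTOINCREMENT, run_id INTEGER,
    tokens_sent INTEGER, tokens_received INTEGER, usd_cost REAL,
    timestamp INTEGER, model TEXT, abort_reason TEXT
);
"""

# In-memory demo: create the schema, record one LLM call, sum costs —
# the same query the "Check lifetime cost" command runs against the file.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO cost_tracking (run_id, tokens_sent, tokens_received,"
    " usd_cost, timestamp, model) VALUES (1, 100, 50, 0.006, 0, 'gpt-4')"
)
total = conn.execute("SELECT SUM(usd_cost) FROM cost_tracking").fetchone()[0]
```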
SSH Authentication Failed During Installation:
If you see errors like:
remote: Invalid username or token. Password authentication is not supported for Git operations.
fatal: Authentication failed for 'https://github.com/SaltProphet/Pi-autopilot.git/'
Or:
ERROR: SSH authentication to GitHub failed
Solution 1: Set up SSH keys properly

1. Check if you have SSH keys:

   ls -la ~/.ssh/id_*.pub

2. If no keys exist, generate one:

   ssh-keygen -t ed25519 -C "your_email@example.com"

3. Add your public key to GitHub:

   cat ~/.ssh/id_ed25519.pub

   Copy the output and add it at: https://github.com/settings/keys

4. Test the connection:

   ssh -T git@github.com

   You should see: "Hi YourUsername! You've successfully authenticated..."

5. Re-run the installer:

   sudo bash installer/setup_pi.sh
Solution 2: Use HTTPS installer with Personal Access Token
If SSH is not working or you prefer using PAT:
- Create a Personal Access Token at: https://github.com/settings/tokens
- Select scope: repo
- Run the HTTPS installer:
sudo bash installer/setup_with_https.sh
"Could not detect the actual user" error:
- Don't run the script directly as root
- Always use sudo bash installer/setup_pi.sh (not sudo su followed by running the script)
SSH keys exist but authentication still fails:
- Verify the key is added to your GitHub account: https://github.com/settings/keys
- Check the SSH agent is running and your key is loaded:
  eval "$(ssh-agent -s)"
  ssh-add ~/.ssh/id_ed25519
- Test manually:
ssh -Tv git@github.com
No posts ingested:
- Check Reddit credentials in .env
- Lower the REDDIT_MIN_SCORE threshold
- Verify subreddit names are correct
All posts discarded:
- Check OpenAI API key
- Review problem_*.json artifacts
- Adjust subreddit targets to more entrepreneurial communities
Verification always fails:
- Check verdict_*.json for specific reasons
- Review content_*.md for quality issues
- Consider adjusting prompts in /prompts
Gumroad upload fails:
- Verify GUMROAD_ACCESS_TOKEN is correct
- Check Gumroad account status
- Review gumroad_upload_*.json for error details
Pipeline aborts with cost limit:
- Check abort_{run_id}.json for the exact reason
- Review the cost_tracking table in the database
- Adjust limits in .env if needed
- Consider reducing REDDIT_POST_LIMIT
Kill switch not working:
- Ensure KILL_SWITCH=true (not "True" or "1")
- Restart the service after changing .env
- Check logs for the "KILL SWITCH ACTIVE" message
- Run on Raspberry Pi 3B+ or newer
- Requires stable internet connection
- Runs headless and unattended
- All decisions logged to SQLite
- All artifacts preserved on disk
- No manual intervention required
- Cost limits prevent runaway spending
- Kill switch allows emergency stop
See LICENSE file.