provezano/ecommerce-data-analyst-agent

Ecommerce Analytics Platform

An AI-powered analytics platform for ecommerce data, built with CrewAI. It provides a Streamlit web dashboard and command-line interfaces for analyzing ecommerce datasets and answering business questions about sales performance, customer engagement, and churn risk, with interactive visualizations and markdown reports.

🌟 Key Features

🖥️ Multiple User Interfaces

  • Streamlit Web Dashboard - Modern, interactive web interface with real-time chat and visualizations
  • Interactive CLI Chatbot - Rich terminal interface with markdown rendering
  • Command-Line Tools - Direct query execution and batch processing

⚑ High-Performance Architecture

  • Async Processing - Non-blocking operations with concurrent query execution
  • Dual Tool Support - Both synchronous and asynchronous tool implementations
  • Intelligent Caching - Optimized performance with smart caching mechanisms
  • Concurrent Execution - Run multiple queries simultaneously for faster results

📊 Advanced Analytics & Visualization

  • AI-Powered Data Analysis - CrewAI agents for intelligent data processing
  • Automatic Chart Generation - Bar charts, line plots, scatter plots, histograms, heatmaps, and box plots
  • Professional Reports - Structured markdown reports with executive summaries and recommendations
  • Interactive Visualizations - Dynamic charts with proper integration and file management

🛠️ Enterprise-Ready Deployment

  • Docker Containerization - Multi-purpose container with Streamlit and CLI support
  • Docker Compose - Orchestrated services for different use cases
  • Volume Mounting - Persistent storage for reports and visualizations
  • Environment Configuration - Flexible configuration for different deployment scenarios

📈 Comprehensive Ecommerce Analytics

  • Sales Performance Analysis - Product category and regional performance metrics
  • Customer Engagement Tracking - Regional engagement patterns and historical comparisons
  • Churn Risk Assessment - Customer retention analysis and risk predictions
  • Market Intelligence - Competitiveness analysis and forecasting
  • Daily Trends Analysis - Promotional impact and sales pattern recognition

🚀 Quick Start

Option 1: Streamlit Web Dashboard (Recommended)

Local Development:

# Install dependencies
uv sync

# Start the web dashboard
make ui
# Or directly: uv run streamlit run src/ecommerce_analytics_crew/streamlit_app.py

Docker:

# Build and start web dashboard
make ui-docker

# Or with docker-compose
make docker-compose-ui

The dashboard will be available at http://localhost:8501 with:

  • 💬 Interactive Chat Interface - Ask questions in natural language
  • 📊 Real-time Visualizations - Charts generated automatically
  • 📁 Report Downloads - Professional markdown reports
  • ⚡ Async Processing - Fast, non-blocking operations

Option 2: Command Line Interface

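Prerequisites: install dependencies with uv sync and set the OPENAI_API_KEY and MODEL environment variables (see Configuration).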

Single Query (Async - Fastest):

# Local
make query Q="What product category has the highest sales performance?"

# Docker
make docker-query Q="What product category has the highest sales performance?"

Interactive Chatbot:

# Local with Rich markdown rendering
uv run src/ecommerce_analytics_crew/main.py chatbot

# Docker
make chatbot

Batch Processing (All Questions):

# Concurrent processing (fastest)
make all-concurrent

# Docker with volume mounting for outputs
make docker-compose-batch

🖥️ Interface Options

1. Streamlit Web Dashboard (Primary)

Modern web interface with interactive features:

# Local
make ui

# Docker
make ui-docker

Features:

  • 🌐 Web-based Interface - Access via browser at localhost:8501
  • 💬 Real-time Chat - Interactive conversation with AI agents
  • 📊 Live Visualizations - Charts appear automatically with analysis
  • 📁 Session Management - Chat history and downloadable reports
  • ⚡ Async Processing - Non-blocking operations for better UX
  • 🎨 Modern UI - Clean, responsive design with sidebar navigation

2. Interactive CLI Chatbot

Rich terminal interface with markdown rendering:

# Local (with Rich formatting)
uv run src/ecommerce_analytics_crew/main.py chatbot

# Docker
make chatbot

Features:

  • 🎨 Rich Markdown Rendering - Beautiful terminal formatting
  • 💬 Multi-turn Conversations - Ask multiple questions per session
  • 📋 Built-in Help - Type help for example questions
  • 🚪 Easy Exit - Type quit, exit, or bye to end
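
The help and exit keywords listed above suggest a small input-dispatch step in the chatbot loop. The following is a minimal sketch of that idea, not the project's actual code: the function name `classify_input`, the returned labels, and the case-insensitive matching are all assumptions made for illustration.

```python
# Hypothetical sketch: classify one line of chatbot input.
# Only the keywords (help, quit, exit, bye) come from the README;
# everything else here is assumed.
EXIT_WORDS = {"quit", "exit", "bye"}


def classify_input(text: str) -> str:
    """Return 'exit', 'help', 'empty', or 'question' for a line of user input."""
    word = text.strip().lower()  # assumption: matching ignores case and padding
    if not word:
        return "empty"
    if word in EXIT_WORDS:
        return "exit"
    if word == "help":
        return "help"
    # Anything else is treated as a question for the analysis agents.
    return "question"
```

A chatbot loop could call this on every line and stop when it returns `"exit"`.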

3. Direct Query Execution

Single command execution for specific questions:

# Local (async - fastest)
make query Q="What product category has the highest sales performance?"

# Docker with volume mounting
make docker-query Q="Your question here"

# Manual execution
uv run src/ecommerce_analytics_crew/main.py run_query_async "Your question"

4. Batch Processing

Process all predefined analytics questions:

# Concurrent processing (fastest)
make all-concurrent

# Docker Compose batch processing
make docker-compose-batch

# Sequential async processing
uv run src/ecommerce_analytics_crew/main.py run_all_async

All modes save reports to outputs/results/ and charts to outputs/charts/.
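
To illustrate why the concurrent batch mode is described as the fastest, here is a minimal sketch comparing sequential awaiting with `asyncio.gather`. The `run_query` coroutine is a hypothetical stand-in that sleeps instead of calling an LLM; it is not the project's `run_query_async` command, and the other function names are assumptions as well.

```python
import asyncio
import time


async def run_query(question: str) -> str:
    """Hypothetical stand-in for one analysis run (sleeps instead of calling an LLM)."""
    await asyncio.sleep(0.05)
    return f"report for: {question}"


async def run_sequential(questions: list[str]) -> list[str]:
    # One query at a time: total time is roughly the sum of all queries.
    return [await run_query(q) for q in questions]


async def run_concurrent(questions: list[str]) -> list[str]:
    # All queries in flight at once: total time is roughly the slowest query.
    return list(await asyncio.gather(*(run_query(q) for q in questions)))


def measure(batch_fn, questions: list[str]) -> tuple[list[str], float]:
    """Run a batch coroutine function and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(batch_fn(questions))
    return results, time.perf_counter() - start


QUESTIONS = [
    "What product category has the highest current sales performance?",
    "What regions have the highest customer churn risk?",
    "How do promotional days impact daily sales performance?",
    "What is the relationship between customer satisfaction and conversion rates?",
]
```

Both functions return the reports in the same order as the questions; only the wall-clock time differs.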

📝 Example Questions

  • "What product category has the highest current sales performance?"
  • "How does North America's 2024 customer engagement compare to its historical average?"
  • "What regions have the highest customer churn risk?"
  • "How do promotional days impact daily sales performance?"
  • "What is the relationship between customer satisfaction and conversion rates?"

🛠️ Development

Project Structure

ecommerce-analytics-platform/
├── src/ecommerce_analytics_crew/
│   ├── config/                    # Agent and task YAML configurations
│   ├── tools/                     # Data analysis and visualization tools
│   │   ├── ecommerce_data_analysis.py      # Sync data tools
│   │   ├── async_ecommerce_data_analysis.py # Async data tools
│   │   ├── data_visualization_tool.py       # Sync visualization
│   │   └── async_data_visualization_tool.py # Async visualization
│   ├── ui/                        # UI components
│   ├── crew.py                    # CrewAI setup with async support
│   ├── main.py                    # CLI entry point
│   └── streamlit_app.py           # Web dashboard
├── src/config/                    # Global configuration
│   ├── data_config.py            # Data sources and schemas
│   └── logging_config.py         # Logging configuration
├── deployment/docker/             # Docker and compose files
├── data/                         # Ecommerce datasets (JSON)
├── outputs/                      # Generated reports and charts
│   ├── results/                  # Markdown reports
│   └── charts/                   # PNG visualizations
├── tests/                        # Test suite
├── run_streamlit.py              # Streamlit launcher
├── Makefile                      # Development commands
└── pyproject.toml                # Dependencies and configuration

Available Commands

Web Dashboard:

make ui                           # Start Streamlit dashboard locally
make ui-docker                    # Start dashboard in Docker
make docker-compose-ui            # Start with docker-compose

CLI Operations:

# Query execution (async recommended)
make query Q="your question"      # Local async query
make docker-query Q="question"    # Docker async query

# Interactive modes
uv run src/ecommerce_analytics_crew/main.py chatbot  # CLI chatbot
make chatbot                      # Docker chatbot

# Batch processing
make all-concurrent               # Concurrent processing (fastest)
make docker-compose-batch         # Docker batch processing

Development & Testing:

make test                         # Run test suite
make test-cov                     # Run tests with coverage
make help                         # Show all available commands

Docker Operations:

make docker-build                 # Build Docker image
make docker-compose-down          # Stop all services

📊 Output & Results

🌐 Streamlit Web Dashboard

  • Interactive Chat Interface - Real-time conversation with AI agents
  • Live Visualizations - Charts appear automatically during analysis
  • Session Management - Chat history and conversation persistence
  • Downloadable Reports - Professional markdown reports with embedded charts
  • Responsive Design - Works on desktop and mobile devices

🖥️ CLI & Terminal Output

  • Rich Markdown Rendering - Beautiful formatting with syntax highlighting
  • Professional Reports - Structured business intelligence reports
  • Real-time Processing - Live updates during async operations
  • Session Summaries - Execution time and performance metrics

📁 File System Outputs

Organized Directory Structure:

outputs/
├── results/                    # Professional markdown reports
│   └── ecommerce_report.md    # Latest comprehensive report
├── charts/                     # Data visualizations
│   └── [session-uuid]/        # Session-specific charts
│       ├── bar_chart_001.png  # Sales performance charts
│       ├── line_plot_002.png  # Trend analysis charts
│       └── scatter_003.png    # Correlation analysis
└── [batch-results]/           # Batch processing outputs
    ├── comprehensive_results.json
    ├── DATA_ANALYST_SUBMISSION.md
    └── question_folders/

Report Features:

  • 📋 Executive Summary - Key business findings and insights
  • 📊 Data Analysis - Summary tables with metrics and statistics
  • 📈 Visual Evidence - Automatically generated charts (bar, line, scatter, heatmap, box plots)
  • ⚠️ Risk Assessment - Customer churn and performance risk analysis
  • 💡 Actionable Recommendations - Business strategy suggestions
  • 🔗 References - Data sources and chart file paths
  • ⏱️ Performance Metrics - Query execution times and caching info
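
As an illustration of how a report with these sections might be assembled, the sketch below joins section bodies into one markdown string. The section names come from the list above; the function `build_report`, the heading levels, and the rule of skipping empty sections are assumptions, not the layout the platform actually emits.

```python
# Hypothetical sketch: assemble a markdown report from named sections.
# Section names follow the README; ordering and heading levels are assumed.
SECTION_ORDER = (
    "Executive Summary",
    "Data Analysis",
    "Visual Evidence",
    "Risk Assessment",
    "Actionable Recommendations",
    "References",
    "Performance Metrics",
)


def build_report(title: str, sections: dict[str, str]) -> str:
    """Return a markdown document with one '##' heading per non-empty section."""
    lines = [f"# {title}", ""]
    for name in SECTION_ORDER:
        body = sections.get(name)
        if body:  # sections without content are skipped
            lines += [f"## {name}", "", body.strip(), ""]
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"
```

Keeping the order in one tuple means every report lists its sections the same way, whichever ones are present.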

🔧 Configuration

Environment Variables

The chatbot requires the following environment variables to function properly:

Required:

  • OPENAI_API_KEY - Your OpenAI API key for LLM integration
  • MODEL - The OpenAI model to use (e.g., gpt-4, gpt-3.5-turbo)

Optional:

  • CREWAI_DISABLE_TELEMETRY - Set to true to disable CrewAI telemetry (default: true)
  • CREWAI_TELEMETRY_OPT_OUT - Set to true to opt out of telemetry (default: true)

Setting Up Environment Variables

For Docker: Create a .env.docker file in the deployment/docker/ directory:

OPENAI_API_KEY=your_openai_api_key_here
MODEL=gpt-4
CREWAI_DISABLE_TELEMETRY=true
CREWAI_TELEMETRY_OPT_OUT=true

For Local Development (uv): Create a .env file in the project root or set environment variables:

# Option 1: Create .env file
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
echo "MODEL=gpt-4" >> .env

# Option 2: Export in terminal
export OPENAI_API_KEY=your_openai_api_key_here
export MODEL=gpt-4

🔬 Technical Implementation

Architecture & Framework

  • CrewAI Multi-Agent System - Coordinated AI agents for data analysis, visualization, and reporting
  • Async-First Design - Non-blocking operations with concurrent processing capabilities
  • Dual Tool Architecture - Both synchronous and asynchronous tool implementations
  • Intelligent Caching - Smart caching mechanisms for improved performance
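
One common way to offer both a synchronous and an asynchronous version of a tool is to keep the logic in a plain function and wrap it with `asyncio.to_thread`. The sketch below shows that pattern only; the function names and the sample data are hypothetical, and the project's own tools in `tools/` may be structured differently.

```python
import asyncio


def top_category(sales: dict[str, float]) -> str:
    """Synchronous tool: return the category with the highest sales figure."""
    return max(sales, key=sales.get)


async def top_category_async(sales: dict[str, float]) -> str:
    """Asynchronous variant: run the same logic in a worker thread.

    asyncio.to_thread keeps the event loop free while the sync code runs.
    """
    return await asyncio.to_thread(top_category, sales)


if __name__ == "__main__":
    sample = {"Electronics": 120.0, "Toys": 80.0, "Books": 45.5}  # made-up data
    print(top_category(sample))
    print(asyncio.run(top_category_async(sample)))
```

With this layout the two variants cannot drift apart, because the async one delegates to the sync one.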

Data Processing Stack

  • DuckDB - High-performance in-memory SQL analytics engine
  • Pandas - Advanced data manipulation and analysis
  • JSON Data Sources - Structured ecommerce datasets with comprehensive schemas
  • Data Type Optimization - Automatic dtype conversion and memory optimization

Visualization & Reporting

  • Matplotlib & Seaborn - Professional chart generation (bar, line, scatter, heatmap, box plots)
  • Async File I/O - Non-blocking chart generation and file operations
  • Structured Reports - Markdown-based business intelligence reports
  • Session Management - UUID-based organization of outputs and visualizations
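
The UUID-based organization of outputs can be pictured with a short sketch that creates one chart directory per session. The directory layout and the `bar_chart_001.png` naming follow the output tree shown earlier; the helper names and the use of `uuid.uuid4` are assumptions.

```python
import tempfile
import uuid
from pathlib import Path


def new_session_dir(outputs_root: Path) -> Path:
    """Create outputs_root/charts/<random uuid>/ and return its path."""
    session_dir = outputs_root / "charts" / str(uuid.uuid4())
    session_dir.mkdir(parents=True)  # raises if the directory already exists
    return session_dir


def chart_path(session_dir: Path, kind: str, index: int) -> Path:
    """Build a zero-padded chart filename such as bar_chart_001.png."""
    return session_dir / f"{kind}_{index:03d}.png"


if __name__ == "__main__":
    # Use a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        session = new_session_dir(Path(tmp))
        print(chart_path(session, "bar_chart", 1))
```

Because each session gets its own directory, charts from concurrent queries cannot overwrite each other.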

User Interfaces

  • Streamlit - Modern web dashboard with real-time chat and visualizations
  • Rich Terminal - Enhanced CLI with markdown rendering and beautiful formatting
  • Command-Line Tools - Direct execution capabilities for automation

Deployment & DevOps

  • Docker Containerization - Multi-purpose container with Streamlit and CLI support
  • Docker Compose - Orchestrated services for different deployment scenarios
  • uv Package Manager - Fast Python package management and virtual environments
  • Volume Mounting - Persistent storage for reports and visualizations

Performance Features

  • Concurrent Processing - Multiple queries executed simultaneously
  • Thread Pool Execution - CPU-intensive operations in separate threads
  • Async Tools - Non-blocking data operations and file I/O
  • Smart Caching - Function-level caching for repeated operations
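
Function-level caching of repeated operations can be sketched with the standard library's `functools.lru_cache`. Whether the platform uses this decorator is not stated in the README, so treat it as one possible approach; the `load_dataset` function, its data, and the call counter are hypothetical.

```python
from functools import lru_cache

# Counts real (uncached) loads so the effect of the cache is visible.
LOAD_COUNT = {"calls": 0}

# Made-up data standing in for the JSON datasets.
_FAKE_DATA = {
    "sales": (("Electronics", 120.0), ("Toys", 80.0)),
    "churn": (("North America", 0.12), ("Europe", 0.09)),
}


@lru_cache(maxsize=32)
def load_dataset(name: str) -> tuple[tuple[str, float], ...]:
    """Return a dataset by name; repeated calls with the same name hit the cache."""
    LOAD_COUNT["calls"] += 1
    return _FAKE_DATA[name]
```

The function returns tuples rather than lists so that cached results cannot be mutated by one caller and seen by another.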

Data Sources

The platform analyzes comprehensive ecommerce datasets including:

  • Global Sales Performance - Product category and regional performance metrics
  • Customer Engagement - Regional engagement patterns and historical comparisons
  • Daily Sales Trends - Promotional impact and daily performance data
  • Sales Forecasting - Market analysis and analog forecasting data
  • Churn Risk Analysis - Customer retention and risk assessment data

Quality Assurance

  • Comprehensive Test Suite - Unit tests with pytest and coverage reporting
  • Type Safety - Pydantic models for data validation and type checking
  • Error Handling - Robust error handling with detailed logging
  • Performance Monitoring - Execution time tracking and performance metrics
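
Execution time tracking is often done with a small decorator. The following sketch records the duration of each call in an in-memory list; the names `track_time`, `TIMINGS`, and `summarize` are hypothetical, and the platform's own monitoring may log elsewhere.

```python
import functools
import time

# (function name, seconds) for every tracked call; hypothetical in-memory log.
TIMINGS: list[tuple[str, float]] = []


def track_time(func):
    """Decorator that records how long each call to func takes."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            TIMINGS.append((func.__name__, time.perf_counter() - start))

    return wrapper


@track_time
def summarize(values: list[float]) -> float:
    """Hypothetical analysis step: arithmetic mean of a list."""
    return sum(values) / len(values)
```

The decorated function keeps its original return value, so timing can be added without touching callers.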
