This project is a proof-of-concept (PoC) that applies an agentic AI system to sales analytics. It interprets natural language queries, routes each one to a specialized analytical tool, and returns insights from four frameworks: Descriptive, Diagnostic, Predictive, and Prescriptive analytics.
The system uses LangChain to orchestrate Generative AI (Google Gemini) for reasoning and intent classification, Python for the core analytics, Flask to expose the agent as an API, n8n for workflow orchestration, and Streamlit for the chatbot interface.
The system's AI agent autonomously classifies user intent and extracts relevant parameters (e.g., region, product category, date range, revenue goals) to execute the precise analysis required:
Descriptive analytics summarizes historical sales data to provide a clear picture of past performance.
- Capabilities: Calculates total revenue, units sold, and average order value.
- Flexible Filtering: Supports granular analysis by region, product category, sales channel, and specific date ranges.
- Example Query: "What were the total sales for Electronics in the North region last month?"
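The descriptive capabilities above can be sketched with Pandas as follows. This is a minimal illustration, not the project's actual implementation: the function name `summarize_sales`, the column names (`Date`, `Region`, `Category`, `Channel`, `Units`, `Revenue`), and the assumption that each row is one order are all hypothetical.

```python
import pandas as pd

def summarize_sales(df, region=None, category=None, channel=None, start=None, end=None):
    """Return total revenue, units sold, and average order value for the filtered rows.

    Assumes one row per order and hypothetical column names.
    """
    mask = pd.Series(True, index=df.index)
    if region is not None:
        mask &= df["Region"] == region
    if category is not None:
        mask &= df["Category"] == category
    if channel is not None:
        mask &= df["Channel"] == channel
    if start is not None:
        mask &= df["Date"] >= pd.Timestamp(start)
    if end is not None:
        mask &= df["Date"] <= pd.Timestamp(end)

    subset = df[mask]
    orders = len(subset)
    total_revenue = float(subset["Revenue"].sum())
    return {
        "total_revenue": total_revenue,
        "units_sold": int(subset["Units"].sum()),
        # Guard against division by zero when no rows match the filters.
        "average_order_value": total_revenue / orders if orders else 0.0,
    }

# Tiny illustrative dataset (made up for this example).
sales = pd.DataFrame({
    "Date": pd.to_datetime(["2024-05-01", "2024-05-15", "2024-05-20", "2024-06-02"]),
    "Region": ["North", "North", "South", "North"],
    "Category": ["Electronics", "Electronics", "Electronics", "Apparel"],
    "Channel": ["Online", "Retail", "Online", "Online"],
    "Units": [2, 1, 3, 4],
    "Revenue": [400.0, 250.0, 600.0, 120.0],
})

summary = summarize_sales(
    sales, region="North", category="Electronics", start="2024-05-01", end="2024-05-31"
)
print(summary)
```

Every filter is optional, so the same function covers both a broad query ("total sales") and a narrow one (a single region, category, and month).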
Diagnostic analytics investigates the data to identify the root causes of performance anomalies or trends.
- Capabilities: Automatically compares performance across different time periods (e.g., Q3 vs. Q2) to pinpoint significant changes.
- Root Cause Analysis: Drills down into contributing factors, such as identifying that a sales drop in a specific category was driven by underperformance in a particular sales channel.
- Example Query: "Why did sales for Apparel drop in the North region?"
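A period comparison with a channel-level drill-down might look like the sketch below. The function name `diagnose_change`, the `Period` column, and the other column names are assumptions made for illustration; the project's own logic may differ.

```python
import pandas as pd

def diagnose_change(df, category, region, prev_period, curr_period):
    """Compare revenue between two periods and break the change down by sales channel."""
    subset = df[(df["Category"] == category) & (df["Region"] == region)]
    # One row per channel, one column per period, summed revenue in each cell.
    by_channel = subset.pivot_table(
        index="Channel", columns="Period", values="Revenue", aggfunc="sum", fill_value=0
    ).reindex(columns=[prev_period, curr_period], fill_value=0)
    by_channel["change"] = by_channel[curr_period] - by_channel[prev_period]

    total_change = float(by_channel["change"].sum())
    # The channel with the most negative change is the largest contributor to a drop.
    worst_channel = by_channel["change"].idxmin()
    return total_change, worst_channel, by_channel

# Tiny illustrative dataset (made up for this example).
sales = pd.DataFrame({
    "Period": ["Q2", "Q2", "Q3", "Q3"],
    "Region": ["North"] * 4,
    "Category": ["Apparel"] * 4,
    "Channel": ["Online", "Retail", "Online", "Retail"],
    "Revenue": [1000.0, 800.0, 950.0, 500.0],
})

total_change, worst_channel, breakdown = diagnose_change(sales, "Apparel", "North", "Q2", "Q3")
print(f"Revenue change: {total_change}, largest drop in channel: {worst_channel}")
```

In this made-up data the Retail channel accounts for most of the decline, which is the kind of contributing factor the root cause analysis is meant to surface.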
Predictive analytics forecasts future trends to support proactive decision-making and resource allocation.
- Capabilities: Utilizes time-series analysis (Linear Regression with seasonality features) to predict future revenue.
- Customizable Forecasts: Allows users to specify the forecast horizon (e.g., next 3 months, next 6 months).
- Example Query: "Forecast Furniture sales for the next 3 months."
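The forecasting approach can be illustrated with a linear trend plus month-of-year indicator features. Note the swap: the project uses Scikit-Learn, while this sketch fits the same ordinary least squares model with NumPy to keep dependencies small. The function names and the choice of one-hot month features are assumptions, not the project's confirmed design.

```python
import numpy as np

def build_features(t, month):
    """Design matrix: intercept, linear trend, and one-hot month-of-year (January is the baseline)."""
    t = np.asarray(t, dtype=float)
    month = np.asarray(month, dtype=int)
    season = np.zeros((len(t), 11))
    for m in range(2, 13):
        season[:, m - 2] = (month == m)
    return np.column_stack([np.ones(len(t)), t, season])

def forecast_revenue(monthly_revenue, start_month, horizon):
    """Fit trend + seasonality on monthly revenue and predict the next `horizon` months.

    `start_month` is the calendar month (1-12) of the first observation.
    """
    y = np.asarray(monthly_revenue, dtype=float)
    n = len(y)
    t = np.arange(n)
    months = (start_month - 1 + t) % 12 + 1
    # Ordinary least squares fit of the regression coefficients.
    coef, *_ = np.linalg.lstsq(build_features(t, months), y, rcond=None)

    future_t = np.arange(n, n + horizon)
    future_months = (start_month - 1 + future_t) % 12 + 1
    return build_features(future_t, future_months) @ coef

# Synthetic history: 24 months starting in January, upward trend, December bump.
history_t = np.arange(24)
history_months = history_t % 12 + 1
history = 1000 + 10 * history_t + 200 * (history_months == 12)

forecast = forecast_revenue(history, start_month=1, horizon=3)
print(np.round(forecast, 2))
```

At least two full years of history are needed for this feature set, since a single year cannot separate the trend from the monthly effects.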
Prescriptive analytics provides data-driven recommendations to achieve desired business outcomes.
- Capabilities: Models the relationship between key variables (e.g., AdSpend and Revenue) to suggest optimal actions.
- Actionable Insights: Calculates the precise resource allocation needed to hit specific targets, such as the necessary increase in AdSpend to achieve a $50,000 revenue boost.
- Example Query: "How much should we increase AdSpend to boost Online revenue by $50,000?"
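One simple way to answer a question like the one above is to fit a straight line of Revenue against AdSpend and divide the revenue target by the slope. The sketch below assumes a linear relationship and uses a hypothetical function name and made-up numbers. It is an illustration of the idea, not the project's actual model.

```python
import numpy as np

def required_adspend_increase(ad_spend, revenue, target_revenue_increase):
    """Estimate the extra AdSpend needed for a given revenue increase.

    Fits revenue = slope * ad_spend + intercept by least squares, then
    inverts the slope. Only meaningful if the relationship is roughly linear.
    """
    slope, _intercept = np.polyfit(ad_spend, revenue, deg=1)
    if slope <= 0:
        raise ValueError("AdSpend shows no positive relationship with revenue in this data.")
    return target_revenue_increase / slope

# Made-up history in which each extra dollar of AdSpend adds five dollars of revenue.
ad_spend = [1000, 2000, 3000, 4000]
revenue = [12000, 17000, 22000, 27000]

increase = required_adspend_increase(ad_spend, revenue, target_revenue_increase=50_000)
print(f"Suggested AdSpend increase: ${increase:,.0f}")
```

The estimate extrapolates the fitted line, so it becomes less reliable the further the target lies beyond the spending levels seen in the data.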
The solution is composed of loosely coupled, microservice-like components, ensuring modularity and scalability:
- Chatbot Interface (Streamlit): A clean, interactive web-based chat application that serves as the front-end for end-users.
- Orchestration Layer (n8n): A powerful low-code automation platform acting as the middleware. It receives messages from the chatbot, forwards them to the AI backend, and relays the responses back to the user.
- AI Agent API (Flask): A RESTful API that exposes the core intelligent agent, allowing it to be accessed by other services.
- Agentic Core (Python + LangChain Core + Gemini):
  - Intent Classification: Utilizes Google Gemini-2.5-Flash via custom prompt engineering to accurately classify user queries into one of the four analytical categories and extract structured parameters (JSON).
  - Analytical Engine: A pure Python module containing specialized functions for data processing (using Pandas) and machine learning (using Scikit-Learn).
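The hand-off from intent classification to the analytical engine can be sketched as a small dispatcher. The JSON shape (`intent` and `parameters` keys), the tool names, and the stub functions below are assumptions for illustration. The actual prompt, schema, and function signatures are not shown here, and no Gemini call is made in this sketch.

```python
import json

# Stub tools standing in for the four analytical functions (hypothetical signatures).
def descriptive(params):
    return ("descriptive", params)

def diagnostic(params):
    return ("diagnostic", params)

def predictive(params):
    return ("predictive", params)

def prescriptive(params):
    return ("prescriptive", params)

TOOLS = {
    "descriptive": descriptive,
    "diagnostic": diagnostic,
    "predictive": predictive,
    "prescriptive": prescriptive,
}

def route(llm_output):
    """Parse the model's JSON reply and call the matching analytical tool."""
    try:
        parsed = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        raise ValueError("Model reply was not valid JSON.") from exc
    if not isinstance(parsed, dict):
        raise ValueError("Model reply must be a JSON object.")

    intent = str(parsed.get("intent", "")).lower()
    if intent not in TOOLS:
        raise ValueError(f"Unknown intent: {intent!r}")
    return TOOLS[intent](parsed.get("parameters", {}))

# Example of a reply the classifier might produce for a forecasting query (assumed format).
reply = '{"intent": "predictive", "parameters": {"category": "Furniture", "horizon_months": 3}}'
result = route(reply)
print(result)
```

Validating the reply before dispatching keeps a malformed or unexpected model response from reaching the analytical functions.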
Follow these steps to run the entire system locally. Prerequisites:
- Python 3.9 or higher
- n8n (Desktop app or a self-hosted instance)
- A Google AI Studio API Key (free tier available)
Clone the project and create a virtual environment to manage dependencies.
git clone [YOUR_GITHUB_REPO_LINK_HERE]
cd sales_ai_agent
python -m venv venv
# Activate the virtual environment:
# Windows: venv\Scripts\activate
# Mac/Linux: source venv/bin/activate
pip install -r requirements.txt
Securely store your Google API key in a .env file in the project's root directory.
GOOGLE_API_KEY=your_api_key_here
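A small check like the one below can fail early with a clear message when the key is missing. It assumes the .env file has already been loaded into the process environment (for example by a package such as python-dotenv, which is an assumption about this project). The function name is hypothetical.

```python
import os

def get_google_api_key(env=None):
    """Return GOOGLE_API_KEY from the given mapping (defaults to os.environ).

    Raises RuntimeError if the key is missing or still set to the placeholder.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; add it to the .env file in the project root."
        )
    return key

# Usage with an explicit mapping, so the example runs without a real key.
key = get_google_api_key({"GOOGLE_API_KEY": "abc123"})
print("API key loaded:", bool(key))
```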
Run the included data generator script to create a realistic, hierarchical sales dataset (sales_data.csv) with embedded patterns designed for demonstration purposes.
python data_generator.py
Launch the Flask API server. It will run locally on: http://127.0.0.1:5000
python app.py
- Open n8n.
- Create a new workflow.
- Set the HTTP Method to POST and copy the Test URL.
- Method: POST
- URL: http://127.0.0.1:5000/ask (or your local machine's IP if running in Docker, e.g., http://host.docker.internal:5000/ask)
- Body Content Type: JSON
- JSON Body: {"query": "={{ $json.body.query }}"}
- Respond With: All Incoming Item Data
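To check the Flask endpoint without n8n in the loop, the same POST request can be sent directly from Python. This sketch uses only the standard library. The function names are hypothetical, and because the response format is not documented here, `ask` simply returns the raw response text.

```python
import json
import urllib.request

def build_ask_request(query, url="http://127.0.0.1:5000/ask"):
    """Build a POST request with the JSON body {"query": ...} for the /ask endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(query, url="http://127.0.0.1:5000/ask", timeout=30):
    """Send the query to the running Flask server and return the raw response text."""
    with urllib.request.urlopen(build_ask_request(query, url), timeout=timeout) as response:
        return response.read().decode("utf-8")

# Building the request does not need a running server; calling ask() does.
request = build_ask_request("Forecast Furniture sales for the next 3 months.")
print(request.get_method(), request.full_url)
```

If `ask(...)` works here but the chatbot does not get a reply, the problem is more likely in the n8n workflow or the webhook URL than in the Flask API.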
- Open chatbot.py in a text editor.
- Replace the placeholder URL in the n8n_webhook_url variable with your actual n8n Test Webhook URL.
- Run the Streamlit app and start chatting while the n8n workflow is active/executing:
streamlit run chatbot.py
- requirements.txt: List of all Python dependencies.
- .env: Configuration file for storing sensitive API keys.
- data_generator.py: Script to generate the synthetic sales_data.csv file.
- sales_data.csv: The generated dataset used for analysis.
- analytics_engine.py: Python module containing the four core analytical functions (tools).
- agent_system.py: The core "brain" that uses Gemini to classify intents and route queries.
- app.py: Flask server that acts as the API entry point.
- chatbot.py: Streamlit-based user interface.
Once the system is fully operational, you can test its capabilities with the following types of queries in the chatbot:
"How much revenue did the South region generate from Groceries?"
"Why was there a sales drop for Apparel in the North?"
"What is the 6-month revenue forecast for Electronics?"
"How much should we increase AdSpend to boost Online revenue by $50,000?"