Aspect022/EVA-Exploratory_Visual_Analyzer

EVA - Exploratory Visual Analyzer

🤖 AI-Powered Data Science Assistant

Features • Installation • Quick Start • CLI Usage • API Reference • Configuration • Contributing

Python 3.8+ • License: MIT • Code style: black • PRs Welcome


🌟 Overview

EVA (Exploratory Visual Analyzer) is an intelligent data science assistant that automates the tedious parts of data analysis. Simply point EVA at your CSV file, and it will:

  • 📊 Analyze your data structure and quality
  • 📈 Generate comprehensive statistics and visualizations
  • 🧠 Suggest insights and data cleaning strategies using AI
  • 🤖 Recommend machine learning models suited for your data
  • 📓 Export everything to a Jupyter notebook for further exploration

EVA uses an agent-based architecture where specialized agents collaborate to provide a complete data analysis pipeline.


✨ Features

🔍 Intelligent Data Ingestion

  • Smart encoding detection - Automatically handles UTF-8, Latin-1, and other encodings
  • Type inference - Detects numeric, datetime, categorical, and boolean columns
  • Validation - Comprehensive file validation with detailed error reporting
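
The encoding and type-inference ideas above can be illustrated with a minimal standard-library sketch. This is not EVA's actual implementation: the function names `decode_with_fallback` and `infer_column_type` are hypothetical, and the rules (try UTF-8 first, then Latin-1; treat a column as numeric only if every non-empty value parses) are simplifying assumptions.

```python
from datetime import datetime
from typing import Iterable, Sequence, Tuple


def decode_with_fallback(
    raw: bytes, encodings: Iterable[str] = ("utf-8", "latin-1")
) -> Tuple[str, str]:
    """Return (text, encoding) for the first encoding that decodes cleanly."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def infer_column_type(values: Sequence[str]) -> str:
    """Classify a column of raw strings (hypothetical helper, illustrative rules)."""
    cleaned = [v.strip() for v in values if v is not None and v.strip()]
    if not cleaned:
        return "categorical"
    if all(v.lower() in {"true", "false"} for v in cleaned):
        return "boolean"

    def all_parse(parser) -> bool:
        # True only if every value is accepted by the parser.
        try:
            for v in cleaned:
                parser(v)
        except ValueError:
            return False
        return True

    if all_parse(float):
        return "numeric"
    if all_parse(datetime.fromisoformat):
        return "datetime"
    return "categorical"
```

Because Latin-1 maps every byte to a character, placing it last guarantees that decoding succeeds; a production ingestor would likely add stricter checks before accepting that result.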

📊 Exploratory Data Analysis

  • Descriptive statistics - Mean, median, std, quartiles, and more
  • Missing value analysis - Patterns and recommendations for handling
  • Correlation analysis - Pearson, Spearman, and categorical correlations
  • Outlier detection - IQR and Z-score based identification
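
For readers unfamiliar with the two outlier rules named above, here is a small sketch using only the standard library. The function names and the default thresholds (1.5 × IQR, |z| > 3) are common conventions assumed for illustration; EVA's own thresholds are not documented here.

```python
import statistics
from typing import List, Sequence


def iqr_outliers(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values: Sequence[float], threshold: float = 3.0) -> List[float]:
    """Values whose absolute z-score (population std) exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mean) / std > threshold]


data = [10, 12, 11, 13, 12, 11, 10, 12, 100]
print(iqr_outliers(data))                    # [100]
print(zscore_outliers(data))                 # []
print(zscore_outliers(data, threshold=2.5))  # [100]
```

The example shows why both rules are useful: on a sample this small the extreme value inflates the standard deviation enough that its z-score stays below 3, while the IQR rule still flags it.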

📈 Automatic Visualization

  • Distribution plots - Histograms and density plots
  • Relationship plots - Scatter plots and pair plots
  • Correlation heatmaps - Beautiful visual correlation matrices
  • Interactive plots - Plotly-powered interactive visualizations

🧠 AI-Powered Insights

  • OpenAI Integration - GPT-powered analysis suggestions
  • Google Gemini Support - Alternative AI provider
  • Smart suggestions - Data cleaning and feature engineering recommendations
  • Fallback mode - Works offline with rule-based suggestions
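
The fallback behaviour can be pictured roughly as follows. This is a sketch under stated assumptions, not EVA's code: `suggest`, `rule_based_suggestions`, the `ai_provider` callable, and the 50% missing-value threshold are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

Provider = Callable[[Dict[str, float]], List[str]]


def rule_based_suggestions(missing_fraction: Dict[str, float]) -> List[str]:
    """Offline suggestions derived from per-column missing-value fractions."""
    suggestions = []
    for column, fraction in missing_fraction.items():
        if fraction > 0.5:  # illustrative threshold
            suggestions.append(f"Consider dropping '{column}' ({fraction:.0%} missing)")
        elif fraction > 0:
            suggestions.append(f"Consider imputing '{column}' ({fraction:.0%} missing)")
    return suggestions


def suggest(
    missing_fraction: Dict[str, float], ai_provider: Optional[Provider] = None
) -> List[str]:
    """Use the AI provider when one is given; otherwise, or on failure, use rules."""
    if ai_provider is not None:
        try:
            return ai_provider(missing_fraction)
        except Exception:
            pass  # network error, timeout, bad key: fall through to the rules
    return rule_based_suggestions(missing_fraction)
```

Calling `suggest(stats)` with no provider exercises the offline path directly, which is also a convenient way to test the rules without network access.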

🤖 Model Recommendations

  • Problem type detection - Classification, regression, clustering
  • Algorithm suggestions - Ranked list of suitable models
  • Baseline pipelines - Ready-to-use sklearn pipeline code
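
One simple way to approach problem type detection is a heuristic on the target column. The sketch below is an assumption made for illustration: `detect_problem_type` and the `max_classes` cutoff are hypothetical and may differ from what ModelRecommenderAgent actually does.

```python
from typing import Optional, Sequence


def detect_problem_type(
    target: Optional[Sequence[object]], max_classes: int = 20
) -> str:
    """Guess the ML problem type from the target column (illustrative heuristic)."""
    if target is None:
        return "clustering"  # no target column: unsupervised
    distinct = set(target)
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    if numeric and len(distinct) > max_classes:
        return "regression"  # many distinct numeric values
    return "classification"  # few distinct values, or non-numeric labels
```

For example, a target of `[0, 1, 1, 0]` is treated as classification, while a numeric target with more than `max_classes` distinct values is treated as regression.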

📓 Notebook Export

  • Jupyter notebooks - Complete analysis as executable notebooks
  • Python scripts - Standalone .py file generation
  • Documentation - Well-commented, reproducible code

🚀 Installation

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Install from Source

# Clone the repository
git clone https://github.com/yourusername/EVA.git
cd EVA

# Create a virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Install as Package

pip install -e .

πŸƒ Quick Start

Command Line

# Basic analysis
python -m eva.cli analyze data.csv

# With AI suggestions enabled
python -m eva.cli analyze data.csv --enable-ai

# Export to notebook
python -m eva.cli analyze data.csv --export-notebook --output ./results

Python API

from eva.orchestrator import AnalysisOrchestrator
from eva.models.core import AnalysisContext, AnalysisConfig
from eva.agents.csv_ingestor import CSVIngestorAgent
from eva.agents.eda_generator import EDAGeneratorAgent
from eva.agents.visualizer import VisualizerAgent

# Create orchestrator
orchestrator = AnalysisOrchestrator(max_workers=3)

# Configure analysis
config = AnalysisConfig(
    processing_timeout_minutes=5,
    enable_ai_suggestions=True
)

# Create context
context = AnalysisContext(
    session_id="my_analysis",
    config=config
)

# Set file path
context.metadata = {'file_path': 'data.csv'}

# Create and run agents
agents = [
    CSVIngestorAgent(),
    EDAGeneratorAgent(),
    VisualizerAgent()
]

results = orchestrator.execute_pipeline(agents, context)

# Access results
print(f"Dataset shape: {context.dataset.shape}")
print(f"EDA completed: {results['EDAGeneratorAgent'].success}")

💻 CLI Usage

Analyze Command

python -m eva.cli analyze <file_path> [OPTIONS]

Option              Description
--output, -o        Output directory for results
--config, -c        Path to configuration file
--enable-ai         Enable AI-powered suggestions
--export-notebook   Generate Jupyter notebook
--export-script     Generate Python script
--format            Visualization format (png, html, both)
--verbose, -v       Verbose output
--quiet, -q         Suppress output
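
To make the option table concrete, the sketch below declares the same flags with `argparse`. It is only an illustration: whether `eva.cli` uses `argparse` at all, the default for `--format`, and the mutual exclusion of `--verbose` and `--quiet` are assumptions, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented `analyze` options (illustrative)."""
    parser = argparse.ArgumentParser(prog="eva")
    commands = parser.add_subparsers(dest="command", required=True)

    analyze = commands.add_parser("analyze", help="Analyze a CSV file")
    analyze.add_argument("file_path", help="Path to the CSV file")
    analyze.add_argument("--output", "-o", default=None, help="Output directory")
    analyze.add_argument("--config", "-c", default=None, help="Configuration file")
    analyze.add_argument("--enable-ai", action="store_true")
    analyze.add_argument("--export-notebook", action="store_true")
    analyze.add_argument("--export-script", action="store_true")
    # Default of "png" is an assumption; the README does not state one.
    analyze.add_argument("--format", choices=["png", "html", "both"], default="png")

    # Assumed here: verbose and quiet cannot be combined.
    verbosity = analyze.add_mutually_exclusive_group()
    verbosity.add_argument("--verbose", "-v", action="store_true")
    verbosity.add_argument("--quiet", "-q", action="store_true")
    return parser


args = build_parser().parse_args(
    ["analyze", "data.csv", "--enable-ai", "--format", "both", "-o", "out"]
)
print(args.file_path, args.enable_ai, args.format, args.output)
```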

Examples

# Full analysis with all exports
python -m eva.cli analyze sales_data.csv \
    --output ./analysis_results \
    --enable-ai \
    --export-notebook \
    --export-script \
    --format both \
    --verbose

# Quick analysis without AI
python -m eva.cli analyze data.csv --quiet

# Using custom configuration
python -m eva.cli analyze data.csv --config my_config.yaml

📚 API Reference

Core Classes

AnalysisOrchestrator

Manages the execution of analysis agents with dependency resolution and parallel processing.

from eva.orchestrator import AnalysisOrchestrator

# `limits`, `agents`, and `context` must be defined before this snippet runs
orchestrator = AnalysisOrchestrator(
    max_workers=4,           # Parallel worker count
    system_limits=limits     # Resource limits
)

results = orchestrator.execute_pipeline(agents, context)

AnalysisContext

Shared context object passed between agents.

from eva.models.core import AnalysisContext, AnalysisConfig

context = AnalysisContext(
    dataset=None,            # Populated by CSVIngestorAgent
    metadata={},             # File and analysis metadata
    results={},              # Agent results storage
    config=AnalysisConfig(), # Configuration
    session_id="unique_id"   # Session identifier
)

Agents

Agent                   Description                     Dependencies
CSVIngestorAgent        Loads and validates CSV files   None
EDAGeneratorAgent       Statistical analysis            CSVIngestorAgent
VisualizerAgent         Creates visualizations          EDAGeneratorAgent
InsightSuggesterAgent   AI-powered insights             EDAGeneratorAgent
ModelRecommenderAgent   ML model suggestions            EDAGeneratorAgent
NotebookExporterAgent   Notebook generation             All others
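
The dependency column determines which agents can run in parallel. As a rough sketch of dependency resolution (not the orchestrator's actual algorithm), the agents can be grouped into levels where every agent in a level depends only on earlier levels. The `execution_levels` function is hypothetical; the dependency map is transcribed from the agent table in this section.

```python
from typing import Dict, List, Set

DEPENDENCIES: Dict[str, Set[str]] = {
    "CSVIngestorAgent": set(),
    "EDAGeneratorAgent": {"CSVIngestorAgent"},
    "VisualizerAgent": {"EDAGeneratorAgent"},
    "InsightSuggesterAgent": {"EDAGeneratorAgent"},
    "ModelRecommenderAgent": {"EDAGeneratorAgent"},
    "NotebookExporterAgent": {
        "CSVIngestorAgent",
        "EDAGeneratorAgent",
        "VisualizerAgent",
        "InsightSuggesterAgent",
        "ModelRecommenderAgent",
    },
}


def execution_levels(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group agents into levels; agents within one level can run concurrently."""
    remaining = {name: set(needs) for name, needs in deps.items()}
    levels: List[List[str]] = []
    while remaining:
        ready = sorted(name for name, needs in remaining.items() if not needs)
        if not ready:
            raise ValueError("dependency cycle detected")
        levels.append(ready)
        for name in ready:
            del remaining[name]
        for needs in remaining.values():
            needs.difference_update(ready)
    return levels


for level in execution_levels(DEPENDENCIES):
    print(level)
```

With the table's dependencies, the three agents that depend only on EDAGeneratorAgent land in the same level, which is where a `max_workers` setting greater than 1 would pay off.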

For detailed API documentation, see docs/api/README.md.


βš™οΈ Configuration

Configuration File

Create a config.yaml file:

# Analysis settings
analysis:
  max_file_size_mb: 100
  processing_timeout_minutes: 5
  memory_limit_gb: 2
  enable_ai_suggestions: true
  export_formats:
    - ipynb
    - py
  visualization_formats:
    - png
    - html

# Logging
log_level: INFO
log_file: null

# Storage
temp_dir: temp/eva
cache_dir: temp/eva/cache

# AI service
ai_service_provider: openai  # openai, gemini, mock
ai_api_key: null             # Use EVA_AI_API_KEY env var
ai_model: gpt-4
ai_timeout_seconds: 30

# Performance
max_workers: 4
chunk_size: 10000

Environment Variables

Variable          Description
EVA_AI_API_KEY    API key for AI service
EVA_CONFIG_PATH   Custom config file path
EVA_LOG_LEVEL     Logging level override
EVA_OUTPUT_DIR    Default output directory
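
One plausible way for environment variables to take precedence over file settings is sketched below. The precedence rule, the `apply_env_overrides` function, and the `output_dir` setting name are assumptions made for illustration; only the variable names and the `ai_api_key` and `log_level` keys come from this README.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Environment variable -> settings key (the `output_dir` key is hypothetical).
ENV_TO_SETTING = {
    "EVA_AI_API_KEY": "ai_api_key",
    "EVA_LOG_LEVEL": "log_level",
    "EVA_OUTPUT_DIR": "output_dir",
}


def apply_env_overrides(
    settings: Mapping[str, Any], environ: Optional[Mapping[str, str]] = None
) -> Dict[str, Any]:
    """Return a copy of settings with non-empty environment values applied."""
    environ = os.environ if environ is None else environ
    merged = dict(settings)
    for variable, key in ENV_TO_SETTING.items():
        value = environ.get(variable)
        if value:  # unset or empty variables leave the file value untouched
            merged[key] = value
    return merged


file_settings = {"log_level": "INFO", "ai_api_key": None}
print(apply_env_overrides(file_settings, {"EVA_LOG_LEVEL": "DEBUG"}))
```

Passing `environ` explicitly keeps the function easy to test; in normal use it would read from `os.environ`.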

πŸ—οΈ Architecture

eva/
β”œβ”€β”€ examples/            # Usage examples
β”œβ”€β”€ scripts/             # Verification and utility scripts
β”œβ”€β”€ tests/               # Test suite
β”‚   β”œβ”€β”€ unit/           # Unit tests
β”‚   └── integration/    # Integration tests
β”œβ”€β”€ eva/                 # Source code
β”‚   β”œβ”€β”€ agents/         # Analysis agents
β”‚   β”œβ”€β”€ models/         # Data models
β”‚   β”œβ”€β”€ services/       # Business logic
β”‚   └── utils/          # Utilities
└── docs/                # Documentation

🧪 Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=eva --cov-report=html

# Run specific test file
pytest tests/test_orchestrator.py -v

# Run integration tests
python tests/run_integration_tests.py

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

# Clone and setup
git clone https://github.com/yourusername/EVA.git
cd EVA
python -m venv .venv
source .venv/bin/activate

# Install dev dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Install pre-commit hooks
pre-commit install

Code Style

  • Formatter: Black
  • Linter: Flake8
  • Type Checker: mypy
  • Import Sorter: isort

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ™ Acknowledgments


Made with ❤️ by the EVA Development Team
