JuniorTorresMTJ/EDAAgent

🚀 EDA Agent - Complete Setup Guide

📋 Prerequisites

  • Python 3.9 or higher (the pinned pandas 2.1.3 does not support Python 3.8)
  • Internet connection
  • A Google account (for Gemini API)

🔧 Step 1: Environment Setup

1.1 Create Project Directory

mkdir eda-agent
cd eda-agent

1.2 Create Virtual Environment (Recommended)

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

1.3 Verify Python Installation

python --version
pip --version
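
The same check can be run from inside Python, which helps when several interpreters are installed and it is unclear which one the virtual environment uses. This is a minimal sketch: `MIN_VERSION` and `meets_minimum` are illustrative names, and the tuple is an assumed value that should be set to whatever the Prerequisites section requires.

```python
import sys

# Assumed minimum; adjust to match the Prerequisites section.
MIN_VERSION = (3, 9)


def meets_minimum(version, minimum=MIN_VERSION):
    """Return True if a (major, minor, ...) version tuple is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = sys.version_info
    status = "OK" if meets_minimum(current) else "too old"
    print(f"Python {current.major}.{current.minor}: {status}")
```

Run it with the same `python` command used to create the virtual environment so that the interpreter being checked is the one that will run the app.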

📦 Step 2: Install Dependencies

2.1 Create requirements.txt

Create a file named requirements.txt with the following content:

streamlit==1.29.0
pandas==2.1.3
plotly==5.17.0
ydata-profiling==4.6.4
python-dotenv==1.0.0
google-generativeai==0.3.2

2.2 Install All Dependencies

pip install -r requirements.txt

Alternative: install the packages directly (this pulls the latest releases rather than the pinned versions above):

pip install streamlit pandas plotly ydata-profiling python-dotenv google-generativeai
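
To confirm that the installed packages match the pins in requirements.txt, the versions can be compared from Python. The sketch below uses the standard-library `importlib.metadata` module; `parse_pins` and `check_installed` are hypothetical helper names, and the parser only understands plain `name==version` lines, not the full requirements-file syntax.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_installed(pins):
    """Return {package: (pinned, installed or None)} for every mismatch."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

A typical call would read the file and pass it through both helpers, for example `check_installed(parse_pins(open("requirements.txt", encoding="utf-8").read()))`. An empty result means every pinned package is present at the expected version.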

🔑 Step 3: Get Google Gemini API Key

3.1 Visit Google AI Studio

  1. Go to Google AI Studio
  2. Sign in with your Google account

3.2 Create API Key

  1. Click "Create API Key"
  2. Choose "Create API key in new project" or select existing project
  3. Copy the generated API key (save it securely!)

⚙️ Step 4: Configure Environment

4.1 Create .env File

In your project directory, create a file named .env:

# On Windows (Command Prompt)
echo GOOGLE_API_KEY=your_actual_api_key_here > .env

# On macOS/Linux
echo "GOOGLE_API_KEY=your_actual_api_key_here" > .env

Or manually create the file:

GOOGLE_API_KEY=your_actual_api_key_here

⚠️ Important: Replace your_actual_api_key_here with your real API key!
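
Before starting the app, it can be worth checking that the .env file actually contains a real key. The following is a simplified sketch, not a reimplementation of python-dotenv: `read_env_value` and `key_problem` are hypothetical helpers that handle only plain `KEY=value` lines with optional surrounding quotes.

```python
PLACEHOLDER = "your_actual_api_key_here"  # placeholder text used in this guide


def read_env_value(text, key):
    """Return the value for `key` from .env-style text, or None if absent.

    Simplified: plain KEY=value lines, optional surrounding quotes.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip().strip("\"'")
    return None


def key_problem(text, key="GOOGLE_API_KEY"):
    """Describe what is wrong with the key entry, or return None if it looks set."""
    value = read_env_value(text, key)
    if value is None:
        return f"{key} is missing"
    if not value:
        return f"{key} is empty"
    if value == PLACEHOLDER:
        return f"{key} still contains the placeholder"
    return None
```

For example, `key_problem(open(".env", encoding="utf-8").read())` returns `None` when the entry looks usable. It does not verify that the key is accepted by the API.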

📄 Step 5: Create the Application

5.1 Create Main Application File

Create a file named app.py and copy the complete EDA Agent code provided earlier.

5.2 Directory Structure

Your project should look like:

eda-agent/
├── app.py
├── requirements.txt
├── .env
├── venv/ (if using virtual environment)
└── reports/ (will be created automatically)
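
The layout above can be checked with a few lines of Python before the first run. This is a sketch under the assumption that only app.py, requirements.txt, and .env are required up front; `EXPECTED` and `missing_files` are illustrative names.

```python
from pathlib import Path

# Files the guide expects before the first run; venv/ and reports/ are optional.
EXPECTED = ("app.py", "requirements.txt", ".env")


def missing_files(root, expected=EXPECTED):
    """Return the expected file names that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    print("Missing: " + ", ".join(missing) if missing else "Layout looks complete.")
```

Run it from inside the eda-agent directory so that `"."` points at the project root.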

🚀 Step 6: Run the Application

6.1 Start Streamlit Server

streamlit run app.py

6.2 Access the Application

  • The application will automatically open in your browser
  • If not, go to: http://localhost:8501
  • You should see the EDA Agent interface

📊 Step 7: Test the Application

7.1 Prepare Test Data

Create a simple CSV file for testing:

name,age,salary,department
John,25,50000,IT
Jane,30,65000,Finance
Bob,35,55000,IT
Alice,28,60000,Marketing
Charlie,32,70000,Finance

Save as test_data.csv
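
The test file can also be written and sanity-checked from Python, which avoids copy-paste problems such as stray whitespace. The sketch below uses only the standard-library `csv` module; `write_test_csv` and `mean_salary_by_department` are hypothetical helpers, and the summary is just one example of a value to compare against what the app displays.

```python
import csv
from collections import defaultdict

# Same rows as the sample above.
ROWS = [
    ("name", "age", "salary", "department"),
    ("John", 25, 50000, "IT"),
    ("Jane", 30, 65000, "Finance"),
    ("Bob", 35, 55000, "IT"),
    ("Alice", 28, 60000, "Marketing"),
    ("Charlie", 32, 70000, "Finance"),
]


def write_test_csv(path):
    """Write the sample rows to `path` as a UTF-8 CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(ROWS)


def mean_salary_by_department(path):
    """Read the CSV back and return {department: mean salary}."""
    salaries = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            salaries[row["department"]].append(float(row["salary"]))
    return {dept: sum(vals) / len(vals) for dept, vals in salaries.items()}
```

Calling `write_test_csv("test_data.csv")` produces the file used in the next step. For this data the per-department means are 52500 for IT, 67500 for Finance, and 60000 for Marketing.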

7.2 Test Features

  1. Upload the CSV file
  2. Explore visualizations
  3. Generate AI insights
  4. Create profiling report

🔍 Step 8: Troubleshooting

8.1 Common Issues

Issue: Module not found

# Solution: Ensure virtual environment is activated and dependencies are installed
pip install -r requirements.txt

Issue: API Key not working

# Check .env file format (no spaces around =)
GOOGLE_API_KEY=your_key_here

# Verify API key is valid at Google AI Studio

Issue: Streamlit command not found

# Ensure streamlit is installed
pip install streamlit

# Or run Streamlit as a Python module (works when the streamlit command is not on PATH)
python -m streamlit run app.py

Issue: Port already in use

# Use different port
streamlit run app.py --server.port 8502
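
To find out whether a port is taken before choosing another one, a bind test with the standard-library `socket` module is usually enough. This is a sketch: `port_is_free` and `first_free_port` are hypothetical helpers, and the check is only a snapshot, since another process could claim the port between the check and the Streamlit launch.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=8501, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None
```

The value returned by `first_free_port()` can then be passed to `--server.port`.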

8.2 Debug Mode

Add debug information to check API connection:

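# Add this to your app.py for debugging.
# Assumes `st` is the imported streamlit module and `api_key` holds the
# value read from GOOGLE_API_KEY; neither the key nor any part of it is shown.
if st.checkbox("Debug API Connection"):
    st.write(f"API Key exists: {bool(api_key)}")
    st.write(f"API Key length: {len(api_key) if api_key else 0}")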
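
If more detail than existence and length is needed, for instance to tell two keys apart, a masked form is safer than printing the key itself. A minimal sketch, with `mask_key` as a hypothetical helper:

```python
def mask_key(key, visible=4):
    """Return a display-safe form of an API key, showing only the last characters."""
    if not key:
        return "(not set)"
    if len(key) <= visible:
        return "*" * len(key)  # too short to reveal anything safely
    return "*" * (len(key) - visible) + key[-visible:]
```

Inside the debug checkbox this could be used as `st.write(f"API Key: {mask_key(api_key)}")`, again assuming `api_key` is the variable holding the loaded key.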

🎯 Step 9: Usage Tips

9.1 Supported File Formats

  • CSV files only
  • UTF-8 encoding recommended
  • Maximum file size: 200MB (Streamlit default)
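
The three constraints above can be checked locally before uploading. The sketch below is an assumption about how one might pre-validate a file, not part of the app: `check_csv` is a hypothetical helper, and the 200 MB figure is taken from the list above.

```python
import os

MAX_UPLOAD_MB = 200  # Streamlit's default upload limit, per the list above


def check_csv(path, max_mb=MAX_UPLOAD_MB):
    """Return a list of problems found; an empty list means the file looks uploadable."""
    path = os.fspath(path)
    problems = []
    if not path.lower().endswith(".csv"):
        problems.append("not a .csv file")
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > max_mb:
        problems.append(f"file is {size_mb:.1f} MB, over the {max_mb} MB limit")
    try:
        with open(path, encoding="utf-8") as fh:
            for _ in fh:  # stream line by line instead of loading the whole file
                pass
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
    return problems
```

For example, `check_csv("test_data.csv")` returns `[]` for a small UTF-8 file. A file saved in another encoding, such as Latin-1 with accented characters, is reported as not valid UTF-8.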

9.2 Best Practices

  • Use clean, well-formatted CSV files
  • Ensure column names are descriptive
  • Remove or handle special characters in data
  • Start with smaller datasets for testing
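
One simple way to start with a smaller dataset is to copy the header and the first rows of a large file into a new CSV. This is a sketch using only the standard library; `write_sample` is a hypothetical helper, and taking the first rows is not a random sample, so it may not be representative if the file is sorted.

```python
import csv
from itertools import islice


def write_sample(src, dst, n_rows=1000):
    """Copy the header and the first `n_rows` data rows from src to dst.

    Returns the number of data rows copied.
    """
    with open(src, newline="", encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:  # empty source file
            return 0
        writer.writerow(header)
        copied = 0
        for row in islice(reader, n_rows):
            writer.writerow(row)
            copied += 1
    return copied
```

For example, `write_sample("big.csv", "big_sample.csv", n_rows=500)` writes a file with at most 500 data rows that can be uploaded first to confirm the columns are read as expected. The file names here are placeholders.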

9.3 Performance Tips

  • Use "Simplified report" for large datasets
  • Generate insights in sections rather than all at once
  • Monitor memory usage for very large files

🔄 Step 10: Updates and Maintenance

10.1 Update Dependencies

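pip install --upgrade streamlit pandas plotly ydata-profiling python-dotenv google-generativeai

After upgrading, update the pinned versions in requirements.txt to match what is now installed (pip freeze lists the installed versions).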

10.2 Back Up Important Files

  • Always back up your .env file, and keep it out of version control because it contains your API key
  • Save your requirements.txt
  • Keep a copy of your customized app.py

🆘 Getting Help

If you encounter issues:

  1. Check the Streamlit logs in the terminal
  2. Verify API key at Google AI Studio
  3. Test with simple CSV files first
  4. Check Python version compatibility
  5. Ensure all dependencies are installed correctly

🎉 Success Indicators

You'll know everything is working when:

  • ✅ Streamlit starts without errors
  • ✅ "Google Gemini API configured successfully!" appears
  • ✅ You can upload and preview CSV files
  • ✅ Visualizations render correctly
  • ✅ AI insights generate successfully

🎯 You're now ready to explore your data with AI-powered insights!
