- Python 3.9 or higher (the pinned pandas 2.1.3 does not support Python 3.8)
- Internet connection
- A Google account (for Gemini API)
mkdir eda-agent
cd eda-agent
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
python --version
pip --version
Create a file named requirements.txt with the following content:
streamlit==1.29.0
pandas==2.1.3
plotly==5.17.0
ydata-profiling==4.6.4
python-dotenv==1.0.0
google-generativeai==0.3.2
pip install -r requirements.txt
Alternative - Install individually:
pip install streamlit pandas plotly ydata-profiling python-dotenv google-generativeai
- Go to Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Choose "Create API key in new project" or select an existing project
- Copy the generated API key (save it securely!)
In your project directory, create a file named .env:
# On Windows
echo GOOGLE_API_KEY=your_actual_api_key_here > .env
# On macOS/Linux
echo "GOOGLE_API_KEY=your_actual_api_key_here" > .env
Or manually create the file:
GOOGLE_API_KEY=your_actual_api_key_here
Replace your_actual_api_key_here with your real API key!
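A malformed .env line is a common reason the key fails to load. The sketch below is a small standard-library check and is not part of the EDA Agent code; the function name `find_env_problems` is hypothetical, and it only looks for the problems this guide mentions (a missing file or key, spaces around `=`, the unreplaced placeholder) plus a non-UTF-8 file.

```python
from pathlib import Path

PLACEHOLDER = "your_actual_api_key_here"  # placeholder text used in this guide


def find_env_problems(path=".env", key="GOOGLE_API_KEY"):
    """Return a list of human-readable problems found in a .env file."""
    env_path = Path(path)
    if not env_path.is_file():
        return [f"{path} does not exist"]
    try:
        # utf-8-sig tolerates a UTF-8 byte order mark at the start of the file
        text = env_path.read_text(encoding="utf-8-sig")
    except UnicodeDecodeError:
        return [f"{path} is not valid UTF-8 text"]

    problems = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = stripped.partition("=")
        if name.strip() != key:
            continue  # some other variable
        found = True
        if not sep:
            problems.append(f"{key} has no '=' sign")
            continue
        if name != name.strip() or value != value.lstrip():
            problems.append(f"spaces around '=' in the {key} line")
        cleaned = value.strip().strip("\"'")
        if not cleaned:
            problems.append(f"{key} is empty")
        elif cleaned == PLACEHOLDER:
            problems.append(f"{key} still contains the placeholder value")
    if not found:
        problems.append(f"{key} is not defined in {path}")
    return problems
```

Calling `find_env_problems()` from the project directory returns an empty list when none of these problems is detected. It does not confirm that the key itself is valid; that still has to be verified at Google AI Studio.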
Create a file named app.py and paste the complete EDA Agent code provided earlier into it.
Your project should look like:
eda-agent/
├── app.py
├── requirements.txt
├── .env
├── venv/ (if using virtual environment)
└── reports/ (will be created automatically)
streamlit run app.py
- The application will automatically open in your browser
- If not, go to: http://localhost:8501
- You should see the EDA Agent interface
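If the browser shows nothing at that address, it can help to check whether anything is listening on the port at all. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper name, and the check only tells you that some process accepts TCP connections on the port, not that it is Streamlit.

```python
import socket


def is_port_open(host="localhost", port=8501, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run before starting the app, `is_port_open()` returning `True` suggests port 8501 is already taken by another process; run after starting it, `False` suggests the server did not come up.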
Create a simple CSV file for testing:
name,age,salary,department
John,25,50000,IT
Jane,30,65000,Finance
Bob,35,55000,IT
Alice,28,60000,Marketing
Charlie,32,70000,Finance
Save as test_data.csv
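As an alternative to typing the file by hand, the same rows can be written with Python's `csv` module. This is only a convenience sketch; `write_test_csv` is a hypothetical name, and the data is exactly the sample shown above.

```python
import csv

# The sample rows from this guide, header first
ROWS = [
    ("name", "age", "salary", "department"),
    ("John", 25, 50000, "IT"),
    ("Jane", 30, 65000, "Finance"),
    ("Bob", 35, 55000, "IT"),
    ("Alice", 28, 60000, "Marketing"),
    ("Charlie", 32, 70000, "Finance"),
]


def write_test_csv(path="test_data.csv"):
    """Write the sample rows to path as UTF-8 CSV and return the path."""
    # newline="" lets the csv module control line endings on every platform
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(ROWS)
    return path
```

Calling `write_test_csv()` in the project directory creates test_data.csv there.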
- Upload the CSV file
- Explore visualizations
- Generate AI insights
- Create profiling report
Issue: Module not found
# Solution: Ensure virtual environment is activated and dependencies are installed
pip install -r requirements.txt
Issue: API Key not working
# Check .env file format (no spaces around =)
GOOGLE_API_KEY=your_key_here
# Verify API key is valid at Google AI Studio
Issue: Streamlit command not found
# Ensure streamlit is installed
pip install streamlit
# Or run Streamlit as a Python module
python -m streamlit run app.py
Issue: Port already in use
# Use a different port
streamlit run app.py --server.port 8502
Add debug information to check the API connection:
# Add this to your app.py for debugging
# (assumes `st` and `api_key` are already defined there)
if st.checkbox("Debug API Connection"):
    st.write(f"API Key exists: {bool(api_key)}")
    st.write(f"API Key length: {len(api_key) if api_key else 0}")
- CSV files only
- UTF-8 encoding recommended
- Maximum file size: 200MB (Streamlit default)
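These three requirements can be checked before uploading. The sketch below uses only the standard library; `check_upload_candidate` is a hypothetical helper, and treating the 200MB limit as 200 × 1024 × 1024 bytes is an assumption about how the limit is counted.

```python
from pathlib import Path

# Assumption: the 200MB limit is counted in units of 1024 * 1024 bytes
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def check_upload_candidate(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return a list of warnings for a file you intend to upload."""
    p = Path(path)
    if not p.is_file():
        return ["file not found"]

    warnings = []
    if p.suffix.lower() != ".csv":
        warnings.append("not a .csv file")
    if p.stat().st_size > max_bytes:
        warnings.append("larger than the upload limit")
    try:
        # Stream line by line so large files are not loaded into memory
        with p.open("r", encoding="utf-8") as f:
            for _ in f:
                pass
    except UnicodeDecodeError:
        warnings.append("not valid UTF-8")
    return warnings
```

An empty list means the file passed all three checks; it says nothing about whether the CSV content itself is well formed.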
- Use clean, well-formatted CSV files
- Ensure column names are descriptive
- Remove or handle special characters in data
- Start with smaller datasets for testing
- Use "Simplified report" for large datasets
- Generate insights in sections rather than all at once
- Monitor memory usage for very large files
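One way to start with a smaller dataset is to cut a sample from the large file before uploading it. The following sketch copies the header and the first rows with the standard `csv` module; `write_sample` and the default of 1000 rows are illustrative choices, not something the EDA Agent requires.

```python
import csv
from itertools import islice


def write_sample(src, dst, n_rows=1000):
    """Copy the header and the first n_rows data rows from src to dst.

    Returns the number of data rows written.
    """
    with open(src, newline="", encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:
            return 0  # empty source file
        writer.writerow(header)
        count = 0
        # islice stops after n_rows, so the rest of the file is never read
        for row in islice(reader, n_rows):
            writer.writerow(row)
            count += 1
    return count
```

Note that taking the first rows is not a random sample; if the file is sorted, the sample may not be representative of the full dataset.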
pip install --upgrade streamlit pandas plotly ydata-profiling google-generativeai
- Always back up your .env file
- Save your requirements.txt
- Keep a copy of your customized app.py
If you encounter issues:
- Check the Streamlit logs in the terminal
- Verify API key at Google AI Studio
- Test with simple CSV files first
- Check Python version compatibility
- Ensure all dependencies are installed correctly
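The last two checks can be scripted. The sketch below reports the running Python version and the installed version of each package named in requirements.txt, using `importlib.metadata` from the standard library; the helper names `installed_versions` and `report` are hypothetical.

```python
import sys
from importlib import metadata

# Distribution names as listed in this guide's requirements.txt
REQUIRED = [
    "streamlit",
    "pandas",
    "plotly",
    "ydata-profiling",
    "python-dotenv",
    "google-generativeai",
]


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report():
    """Print the Python version and the status of each required package."""
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run `report()` with the virtual environment activated; a package shown as NOT INSTALLED there is the likely cause of a "Module not found" error.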
You'll know everything is working when:
- ✅ Streamlit starts without errors
- ✅ "Google Gemini API configured successfully!" appears
- ✅ You can upload and preview CSV files
- ✅ Visualizations render correctly
- ✅ AI insights generate successfully
🎯 You're now ready to explore your data with AI-powered insights!