An AI-powered chat application that helps users search and explore the Ecoinvent database (only publicly available metadata, so no license is needed).
This tool combines semantic search (to find relevant data even when keywords don't match exactly) with a large language model (Llama 3.3 via Groq) to provide natural-language answers.
Alternatively, use Ecoagent.tech for the refined version.
- Natural Language Search: Ask questions like "What are the impacts of steel production?" instead of searching for exact codes.
- Semantic Matching: Uses SentenceTransformers (BERT) to find relevant database entries based on meaning, not just keywords (see the sketch after this list).
- AI Synthesis: Uses the Groq API (Llama 3.3 model) to read the data and explain it to you in plain English.
- Interactive Chat: Built with Streamlit for a clean, chat-like interface.
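To illustrate the semantic-matching idea, here is a minimal, self-contained sketch using SentenceTransformers. The model name all-MiniLM-L6-v2 and the sample entries are assumptions for demonstration, not the actual choices in app.py:

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical model; app.py may load a different BERT variant.
model = SentenceTransformer("all-MiniLM-L6-v2")

entries = [
    "battery cell production, Li-ion",          # sample metadata entries
    "steel production, converter, low-alloyed",
]
entry_embeddings = model.encode(entries, convert_to_tensor=True)

# Cosine similarity compares meaning rather than exact keyword overlap,
# so "electric car battery" still lands on the Li-ion entry.
query = model.encode("electric car battery", convert_to_tensor=True)
scores = util.cos_sim(query, entry_embeddings)[0]
print(entries[int(scores.argmax())])
```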
Before running this project, ensure you have the following:
- Python installed (Version 3.8 or higher recommended).
- Groq API Key: You need an API key to run the LLM. Get one for free at console.groq.com.
Follow these steps to get the app running on your computer.
It is best practice to keep dependencies isolated, so create and activate a virtual environment:
Windows:
python -m venv venv
.\venv\Scripts\activate
Mac/Linux:
python3 -m venv venv
source venv/bin/activate
Install the required Python packages listed in requirements.txt:
pip install -r requirements.txt
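For reference, a plausible requirements.txt for the libraries this README mentions might look like the following (the repository's actual file and version pins may differ):

```text
streamlit
sentence-transformers
groq
pandas  # assumed here for loading the dataset in data/
```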
Streamlit handles secrets (like API keys) using a specific file.
1. Create a folder named .streamlit in your project root.
2. Inside that folder, create a file named secrets.toml.
3. Add your Groq API key to the file:
# .streamlit/secrets.toml
GROQ_API_KEY = "gsk_..."
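In app.py, the key is then available through Streamlit's st.secrets API. A minimal sketch of how the Groq client could be initialized with it (the variable name is illustrative):

```python
import streamlit as st
from groq import Groq

# Streamlit exposes entries from .streamlit/secrets.toml via st.secrets.
client = Groq(api_key=st.secrets["GROQ_API_KEY"])
```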
Once everything is installed and configured, run the app with:
streamlit run app.py
A browser window should open automatically, pointing to http://localhost:8501.
- app.py: The main application code containing the UI, the search logic, and the LLM integration.
- data/: Directory to store the dataset.
- .streamlit/secrets.toml: Configuration file for API keys (ignored by Git).
- requirements.txt: List of Python libraries needed.
1. Embedding: When the app starts, it loads the dataset and the BERT model.
2. User Query: When you ask a question (e.g., "electric car battery"), the app converts your text into a vector embedding (a list of numbers).
3. Vector Search: It compares your query embedding with the embeddings of the database records to find the most mathematically similar entries.
4. LLM Response: The app sends the retrieved data plus your question to the Llama 3.3 model, which writes a human-readable answer (see the sketch below).
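Put together, a condensed version of this pipeline might look like the sketch below. The model ids, sample records, and prompt wording are illustrative assumptions, not the exact code in app.py:

```python
import streamlit as st
from groq import Groq
from sentence_transformers import SentenceTransformer, util

# Assumed model ids; app.py may use different ones.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = Groq(api_key=st.secrets["GROQ_API_KEY"])

# Hypothetical stand-in for the Ecoinvent metadata loaded from data/.
records = [
    "steel production, converter, low-alloyed",
    "battery cell production, Li-ion",
    "electricity production, wind, onshore",
]
record_embeddings = embedder.encode(records, convert_to_tensor=True)

def answer(question: str, top_k: int = 3) -> str:
    # 1. Embed the user question.
    query_embedding = embedder.encode(question, convert_to_tensor=True)
    # 2. Rank records by cosine similarity and keep the best matches.
    hits = util.semantic_search(query_embedding, record_embeddings, top_k=top_k)[0]
    context = "\n".join(records[hit["corpus_id"]] for hit in hits)
    # 3. Ask the LLM to synthesize a plain-English answer from the matches.
    completion = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # assumed Groq model id
        messages=[
            {"role": "system", "content": "Answer using only the provided Ecoinvent records."},
            {"role": "user", "content": f"Records:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content
```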
Pull requests are welcome!