Shivanggaryaa/cold-email-generator

📧 Cold Email Generator using GenAI (LLM + RAG)

An end-to-end Generative AI project that builds a B2B cold email generator for software and AI services companies using LLMs, LangChain, ChromaDB, and Streamlit.

This tool helps services companies automatically generate personalized cold emails by analyzing job openings from a company’s careers page and matching them with relevant portfolio items using Retrieval-Augmented Generation (RAG).


🚀 Project Overview

Hiring and scaling engineering teams is expensive and time-consuming for enterprises. Instead of hiring internally, many companies prefer to outsource development to specialized software service providers.

This project simulates that real-world scenario:

A services company (AtliQ) identifies job openings on a client’s careers page (e.g., Software Engineer, RPA Engineer) and sends a targeted cold email offering dedicated engineering support.


🧠 What This Project Demonstrates

  • End-to-end LLM application development
  • Real-world RAG (Retrieval-Augmented Generation) architecture
  • Web scraping and text extraction from careers pages
  • Structured information extraction using LLMs
  • Controlled prompt engineering for deterministic outputs
  • Integration of Groq-hosted open-source LLaMA models
  • Clean Streamlit-based UI for interaction
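As an illustration of what "controlled prompt engineering" can look like, here is a minimal sketch that assembles an email prompt with explicit, numbered rules. The rule wording, the `EMAIL_RULES` list, and the `build_email_prompt` helper are hypothetical; the prompt actually used in `chains.py` is not shown in this README.

```python
from string import Template

# Hypothetical constraints; the repository's real prompt text may differ.
EMAIL_RULES = [
    "Write as Mohan, Business Development Executive at AtliQ.",
    "Use only the portfolio links listed below; do not invent links.",
    "Return only the email text, with no preamble.",
]

PROMPT = Template(
    "### JOB DESCRIPTION\n$job\n\n"
    "### PORTFOLIO LINKS\n$links\n\n"
    "### RULES\n$rules\n\n"
    "### EMAIL\n"
)


def build_email_prompt(job_description: str, links: list[str]) -> str:
    """Fill the fixed template so every request has the same structure."""
    rules = "\n".join(f"{i}. {rule}" for i, rule in enumerate(EMAIL_RULES, start=1))
    link_block = "\n".join(f"- {url}" for url in links) or "- (none)"
    return PROMPT.substitute(job=job_description.strip(), links=link_block, rules=rules)
```

Keeping the rules in one list makes it easy to tighten or relax a single constraint and compare the resulting emails.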

🏗️ System Architecture

Careers Page URL
       ↓
Web Scraping & Text Cleaning
       ↓
LLM-based Job Extraction (JSON)
       ↓
Vector Search (ChromaDB)
       ↓
LLM-based Cold Email Generation
       ↓
Streamlit UI Output
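The "LLM-based Job Extraction (JSON)" stage has to turn a free-form model reply into structured data before the vector search can use it. The sketch below shows one way to do that with the standard library only. The `parse_jobs` helper is hypothetical, and the required keys (`role`, `skills`, `description`) are assumed from the fields this README lists; the real chain may rely on a LangChain output parser instead.

```python
import json
import re

REQUIRED_KEYS = ("role", "skills", "description")  # assumed from the README's field list
FENCE = "`" * 3  # a Markdown code fence, built here to avoid writing it literally
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_jobs(llm_reply: str) -> list[dict]:
    """Pull JSON out of a model reply and normalize it to a list of job dicts."""
    # Models sometimes wrap JSON in a Markdown fence; unwrap it if present.
    match = FENCED_JSON.search(llm_reply)
    payload = match.group(1) if match else llm_reply
    data = json.loads(payload)  # raises json.JSONDecodeError on malformed output
    jobs = data if isinstance(data, list) else [data]

    cleaned = []
    for job in jobs:
        if not isinstance(job, dict):
            raise ValueError(f"expected a JSON object per job, got {type(job).__name__}")
        missing = [key for key in REQUIRED_KEYS if key not in job]
        if missing:
            raise ValueError(f"job is missing keys: {missing}")
        skills = job["skills"]
        if isinstance(skills, str):  # accept "a, b, c" as well as ["a", "b", "c"]
            skills = [s.strip() for s in skills.split(",") if s.strip()]
        cleaned.append(
            {"role": job["role"], "skills": list(skills), "description": job["description"]}
        )
    return cleaned
```

Failing loudly on missing keys keeps a malformed extraction from silently producing an email with no matching portfolio items.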

🛠️ Tech Stack

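| Layer                  | Technology                            |
|------------------------|---------------------------------------|
| Programming Language   | Python                                |
| LLM                    | LLaMA 3.1 (via Groq)                  |
| LLM Framework          | LangChain                             |
| Vector Database        | ChromaDB                              |
| Web Scraping           | BeautifulSoup / Playwright (optional) |
| UI                     | Streamlit                             |
| Environment Management | Python venv                           |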

📂 Project Structure

cold-email-generator/
│
├── app/
│   ├── main.py              # Streamlit app entry point
│   ├── chains.py            # LLM chains (job extraction + email generation)
│   ├── portfolio.py         # Vector DB ingestion & retrieval
│   ├── utils.py             # Text cleaning utilities
│   └── __init__.py
│
├── resource/
│   └── my_portfolio.csv     # Dummy portfolio data
│
├── vectorstore/             # ChromaDB persistent storage
├── .env                     # API keys (not committed)
├── requirements.txt
└── README.md
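The tree above lists `utils.py` as the home of the text cleaning utilities. A minimal sketch of what such a utility might do is shown below; the `clean_text` name and the specific cleaning steps (dropping scripts, tags, and URLs, then collapsing whitespace) are assumptions, not a copy of the repository's code.

```python
import re
from html import unescape


def clean_text(raw: str) -> str:
    """Reduce scraped HTML to plain text suitable for an LLM prompt."""
    # Drop <script> and <style> blocks together with their contents.
    text = re.sub(r"<(script|style)\b.*?</\1>", " ", raw, flags=re.DOTALL | re.IGNORECASE)
    # Remove any remaining tags but keep the text between them.
    text = re.sub(r"<[^>]+>", " ", text)
    # Decode entities such as &amp; into their characters.
    text = unescape(text)
    # Remove bare URLs (an assumed choice; they rarely help job extraction).
    text = re.sub(r"https?://\S+", " ", text)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()
```

Shorter, cleaner input generally leaves more of the model's context window for the job description itself.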

⚙️ How It Works (Step-by-Step)

  1. User inputs a careers/job page URL
  2. The system scrapes and cleans the page content
  3. An LLM extracts structured job details (role, skills, description)
  4. Relevant portfolio items are retrieved using vector similarity search
  5. A controlled prompt generates a professional B2B cold email
  6. The email is displayed in the Streamlit UI
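Step 4 is the retrieval half of RAG. To make the idea concrete without requiring ChromaDB, the sketch below ranks portfolio rows by cosine similarity over simple word counts. This is a deliberately simplified stand-in: ChromaDB uses learned embeddings rather than word counts, and the `techstack` and `link` field names are hypothetical, since the columns of `my_portfolio.csv` are not described here.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9+#]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_links(skills: list[str], portfolio: list[dict], k: int = 2) -> list[str]:
    """Return the links of the k portfolio rows most similar to the job's skills."""
    query = embed(" ".join(skills))
    ranked = sorted(
        portfolio,
        key=lambda row: cosine(query, embed(row["techstack"])),
        reverse=True,
    )
    return [row["link"] for row in ranked[:k]]
```

Calling `top_links(job["skills"], rows)` with the skills extracted in step 3 yields the links that step 5 can place in the email.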

✉️ Sample Output

Subject: Dedicated Engineering Support for Automation Initiatives

Dear Hiring Manager,

We understand the challenges involved in hiring and scaling engineering teams for
roles focused on automation and RPA technologies.

AtliQ can provide dedicated engineers experienced in building automation workflows
using Microsoft Power Automate Desktop and Cloud, helping organizations streamline
business processes and improve efficiency.

Relevant work:
- https://example.com/rpa-portfolio
- https://example.com/automation-portfolio

I would be happy to discuss how AtliQ can support your automation initiatives.

Best regards,
Mohan
Business Development Executive
AtliQ

Note: Portfolio data and links are intentionally dummy data. The focus of this project is system design, GenAI flow, and RAG implementation.


🧪 Dummy Data Disclaimer

This project intentionally uses:

  • Dummy portfolio links
  • Mock company names (AtliQ)
  • Sample business scenarios

The goal is to demonstrate GenAI architecture and reasoning, not real sales content.


▶️ How to Run the Project Locally

1️⃣ Clone the Repository

git clone https://github.com/your-username/cold-email-generator.git
cd cold-email-generator

2️⃣ Create Virtual Environment

python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Add Environment Variables

Create a .env file:

GROQ_API_KEY=your_groq_api_key_here
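A missing or placeholder key is an easy mistake at this step. The sketch below shows a small startup check, assuming the app has already loaded `.env` into the process environment (for example with the `python-dotenv` package). The `get_groq_api_key` helper is hypothetical and not part of the repository.

```python
import os


def get_groq_api_key(env=None) -> str:
    """Return the Groq API key, or fail early with a clear message."""
    env = os.environ if env is None else env  # allow a plain dict for testing
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_groq_api_key_here":
        raise RuntimeError(
            "GROQ_API_KEY is not set. Add it to .env or export it before running the app."
        )
    return key
```

Checking once at startup surfaces the problem before the first LLM call rather than as an authentication error in the middle of a request.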

5️⃣ Run the App

streamlit run app/main.py

📌 Supported Job Pages for Testing

  • Company career pages (e.g., Synapse India, Nike)
  • Public job description URLs
  • JavaScript-rendered pages (recommended with Playwright)

🧠 Key Learning Outcomes

  • Building production-style LLM pipelines
  • Why prompt constraints matter more than model choice
  • How RAG quality depends on data quality
  • Handling real-world GenAI debugging issues
  • Designing deterministic, controllable LLM outputs

🔮 Future Enhancements

  • Multi-job extraction → multiple emails
  • Tone selector (formal / casual / aggressive)
  • Resume-based portfolio ingestion (PDF)
  • Skill-weighted portfolio matching
  • Deployment on Streamlit Cloud or HuggingFace Spaces
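Of the enhancements above, skill-weighted portfolio matching is the easiest to sketch in a few lines. The snippet below shows one possible reading of it: each job skill carries a weight, and a portfolio item scores the share of total weight it covers. The `weighted_score` function and the weighting scheme are assumptions for illustration, not a planned design.

```python
def weighted_score(job_skills: dict[str, float], item_skills: set[str]) -> float:
    """Share of the job's total skill weight covered by a portfolio item.

    job_skills maps skill name to importance weight.
    item_skills is the set of skills a portfolio item demonstrates.
    """
    total = sum(job_skills.values())
    if total == 0:
        return 0.0
    have = {skill.lower() for skill in item_skills}
    matched = sum(weight for skill, weight in job_skills.items() if skill.lower() in have)
    return matched / total
```

A score like this could be combined with the vector similarity score so that must-have skills influence the ranking more than nice-to-have ones.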

👤 Author

Shivang Arya
B.Tech Engineering Student | GenAI & ML Enthusiast
