
# Mercor Application Pipeline

An automated system for managing contractor applications with Airtable integration, intelligent shortlisting, and LLM-powered candidate evaluation.


> ⚠️ **Note:** Airtable's free plan supports neither webhook automations nor custom scripts in automations, so this automation runs through an external Python webhook server that can be triggered manually or via API calls. Full automation requires an Airtable Team or Business plan.


πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Airtable      │────▢│  Webhook Server  │────▢│   OpenAI API    β”‚
β”‚   (5 Tables)    │◀────│  (Flask/Python)  │◀────│   (GPT-4o)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

### Pipeline Flow

1. **New Application** → Airtable triggers the webhook
2. **Compress** → aggregates data from linked tables into a single JSON blob
3. **Shortlist** → evaluates the candidate against criteria (experience, rate, location)
4. **LLM Eval** → GPT-4o scores and summarizes the candidate
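
The flow above can be sketched as a small orchestrator. This is an illustrative sketch, not the actual implementation; `compress_record` and `evaluate_shortlist` are hypothetical stand-ins for the real functions in `src/`:

```python
# Hypothetical sketch of the pipeline flow; the real logic lives in
# src/compress.py, src/shortlist.py, and src/llm_eval.py.

def compress_record(record_id: str) -> dict:
    # Placeholder: would aggregate linked-table data into one JSON blob.
    return {"record_id": record_id, "experience_years": 5, "rate_usd": 100}

def evaluate_shortlist(blob: dict) -> str:
    # Placeholder criteria check (see "Shortlist Criteria" below).
    ok = blob["experience_years"] >= 3 and blob["rate_usd"] <= 150
    return "Shortlisted" if ok else "Rejected"

def run_pipeline(record_id: str) -> dict:
    blob = compress_record(record_id)      # step 2: compress
    status = evaluate_shortlist(blob)      # step 3: shortlist
    result = {"record_id": record_id, "status": status}
    if status == "Shortlisted":
        result["llm"] = "pending"          # step 4 would call GPT-4o here
    return result
```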

## 📊 Airtable Schema

### Tables

| Table | Purpose |
|---|---|
| Applications | Main applicant records with compressed JSON and LLM results |
| Personal Details | Name, email, location, LinkedIn (linked to Applications) |
| Work Experience | Job history with technologies (linked to Applications) |
| Salary Preferences | Hourly rates, currency, availability (linked to Applications) |
| Shortlisted Leads | Approved candidates with reasons (linked to Applications) |

### Key Fields in Applications Table

- **Application ID** - unique identifier
- **Compressed JSON** - all applicant data as a single JSON blob
- **Shortlist Status** - "Shortlisted" or "Rejected"
- **LLM Score** - 1-10 rating from GPT-4o
- **LLM Summary** - AI-generated candidate summary
- **LLM Follow-Ups** - suggested interview questions
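
For illustration, a `Compressed JSON` blob might look like the following; the field names and nesting here are assumptions, not the project's exact schema:

```json
{
  "personal": {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "location": "Canada",
    "linkedin": "https://linkedin.com/in/janedoe"
  },
  "experience": [
    {"company": "Google", "role": "Engineer", "years": 4, "technologies": ["Python"]}
  ],
  "salary": {"hourly_rate": 95, "currency": "USD", "availability": "Full-time"}
}
```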

## 🚀 Webhook API

Base URL: `http://YOUR-SERVER-IP`

### Endpoints

#### POST /webhook/new-application

Runs the full pipeline: compress, shortlist, and LLM evaluation.

```bash
curl -X POST http://YOUR-SERVER-IP/webhook/new-application \
  -H "Content-Type: application/json" \
  -d '{"record_id": "recXXXXXXXXX"}'
```

Response:

```json
{
  "status": "processing",
  "record_id": "recXXXXXXXXX",
  "message": "Pipeline started in background"
}
```

#### POST /webhook/compress

Compress applicant data only.

```bash
curl -X POST http://YOUR-SERVER-IP/webhook/compress \
  -H "Content-Type: application/json" \
  -d '{"record_id": "recXXXXXXXXX"}'
```

Response:

```json
{"status": "compressed", "record_id": "recXXXXXXXXX"}
```

#### POST /webhook/shortlist

Run shortlist evaluation only.

```bash
curl -X POST http://YOUR-SERVER-IP/webhook/shortlist \
  -H "Content-Type: application/json" \
  -d '{"record_id": "recXXXXXXXXX"}'
```

Response:

```json
{"status": "shortlisted", "record_id": "recXXXXXXXXX"}
```

#### POST /webhook/llm-eval

Run LLM evaluation only.

```bash
curl -X POST http://YOUR-SERVER-IP/webhook/llm-eval \
  -H "Content-Type: application/json" \
  -d '{"record_id": "recXXXXXXXXX"}'
```

Response:

```json
{"status": "evaluated", "record_id": "recXXXXXXXXX"}
```

#### GET /health

Health check endpoint.

```bash
curl http://YOUR-SERVER-IP/health
```

Response:

```json
{"status": "ok", "service": "mercor-pipeline"}
```

#### GET /

API documentation.

```bash
curl http://YOUR-SERVER-IP/
```

βš™οΈ Shortlist Criteria

Candidates are automatically shortlisted if they meet ALL criteria:

Criteria Requirement
Experience β‰₯ 3 years total
Hourly Rate ≀ $150 USD equivalent
Location USA, Canada, UK, Germany, or India

Bonus: Candidates from Tier-1 companies (Google, Apple, Microsoft, Amazon, Meta, Netflix, Tesla, SpaceX, IBM, Intel) get highlighted.
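
In code, the criteria above might translate into something like this sketch; the real rules live in `src/shortlist.py`, and the input field names here are assumptions:

```python
# Illustrative shortlist check; field names are hypothetical.
APPROVED_LOCATIONS = {"USA", "Canada", "UK", "Germany", "India"}
TIER_1_COMPANIES = {"Google", "Apple", "Microsoft", "Amazon", "Meta",
                    "Netflix", "Tesla", "SpaceX", "IBM", "Intel"}
MIN_EXPERIENCE_YEARS = 3
MAX_RATE_USD = 150

def evaluate(applicant: dict) -> dict:
    """Check an applicant dict against the shortlist criteria."""
    shortlisted = (
        applicant["experience_years"] >= MIN_EXPERIENCE_YEARS
        and applicant["hourly_rate_usd"] <= MAX_RATE_USD
        and applicant["location"] in APPROVED_LOCATIONS
    )
    # Tier-1 bonus: flag, don't gate, candidates from well-known companies.
    tier_1 = bool(TIER_1_COMPANIES & set(applicant.get("companies", [])))
    return {"shortlisted": shortlisted, "tier_1_bonus": tier_1}
```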


## 🤖 LLM Evaluation

GPT-4o evaluates shortlisted candidates and returns:

- **Score (1-10):** overall fit rating
- **Summary:** a 2-3 sentence assessment
- **Follow-up Questions:** 3 suggested interview questions
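
A hedged sketch of how this step could work: build a prompt from the compressed JSON, then parse the model's plain-text reply into the three Airtable fields. The prompt wording and reply format here are assumptions; the actual logic lives in `src/llm_eval.py`:

```python
import json

def build_prompt(blob: dict) -> str:
    """Assemble an evaluation prompt from the compressed applicant JSON."""
    return (
        "Evaluate this contractor applicant. Reply with lines:\n"
        "Score: <1-10>\nSummary: <2-3 sentences>\nFollow-Ups: <3 questions>\n\n"
        + json.dumps(blob, indent=2)
    )

def parse_reply(text: str) -> dict:
    """Split a 'Key: value' reply into the three Airtable fields."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return {
        "LLM Score": int(fields.get("Score", 0)),
        "LLM Summary": fields.get("Summary", ""),
        "LLM Follow-Ups": fields.get("Follow-Ups", ""),
    }
```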

πŸ“ Project Structure

mercor-tooling/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ config.py           # API keys, table names, criteria
β”‚   β”œβ”€β”€ utils.py            # Logging, date parsing, validation
β”‚   β”œβ”€β”€ airtable_client.py  # Airtable API wrapper
β”‚   β”œβ”€β”€ compress.py         # Data compression logic
β”‚   β”œβ”€β”€ decompress.py       # JSON to tables restoration
β”‚   β”œβ”€β”€ shortlist.py        # Shortlisting criteria evaluation
β”‚   └── llm_eval.py         # OpenAI integration
β”œβ”€β”€ airtable_scripts/       # Scripts for Airtable Scripting Extension
β”œβ”€β”€ tests/                  # Unit tests
β”œβ”€β”€ webhook_server.py       # Flask webhook server
β”œβ”€β”€ reset_data.py           # Test data population script
β”œβ”€β”€ requirements.txt        # Python dependencies
β”œβ”€β”€ .env                    # Environment variables (not in git)
β”œβ”€β”€ mercor-pipeline.service # Systemd service file
β”œβ”€β”€ nginx.conf              # Nginx reverse proxy config
└── README.md

## 🛠️ Local Development

### Setup

```bash
# Clone the repository
git clone https://github.com/Suhaib3100/mercor-tooling.git
cd mercor-tooling

# Create a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create the .env file
cp .env.example .env
# Edit .env with your API keys
```

### Environment Variables

```bash
AIRTABLE_API_KEY=pat...
AIRTABLE_BASE_ID=app...
LLM_API_KEY=sk-proj-...
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o
```
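
These variables are typically read once at startup (the project does this in `src/config.py`). A minimal sketch using only `os.environ`; the fail-fast validation is an assumption of this example, not necessarily how the real module behaves:

```python
import os

# Variables the pipeline cannot run without.
REQUIRED_VARS = ["AIRTABLE_API_KEY", "AIRTABLE_BASE_ID", "LLM_API_KEY"]

def load_config() -> dict:
    """Read pipeline settings from the environment, failing fast on gaps."""
    missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "airtable_api_key": os.environ["AIRTABLE_API_KEY"],
        "airtable_base_id": os.environ["AIRTABLE_BASE_ID"],
        "llm_api_key": os.environ["LLM_API_KEY"],
        # Optional settings fall back to the defaults shown in .env above.
        "llm_provider": os.environ.get("LLM_PROVIDER", "openai"),
        "llm_model": os.environ.get("LLM_MODEL", "gpt-4o"),
    }
```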

### Run Pipeline Manually

```bash
# Compress all applicants
python3 -m src.compress

# Shortlist candidates
python3 -m src.shortlist

# LLM evaluation
python3 -m src.llm_eval
```

### Run Webhook Server Locally

```bash
python3 webhook_server.py
# Server runs on http://localhost:8080
```
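
For reference, a stripped-down sketch of what a server like `webhook_server.py` could contain; the routes and response bodies follow the API section above, while the actual background pipeline call is elided:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/health")
def health():
    return jsonify({"status": "ok", "service": "mercor-pipeline"})

@app.route("/webhook/new-application", methods=["POST"])
def new_application():
    payload = request.get_json(silent=True) or {}
    record_id = payload.get("record_id")
    if not record_id:
        return jsonify({"error": "missing record_id"}), 400
    # The real server would kick off compress -> shortlist -> LLM eval here,
    # in a background thread, and return immediately with 202.
    return jsonify({
        "status": "processing",
        "record_id": record_id,
        "message": "Pipeline started in background",
    }), 202

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```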

## 🌐 Production Deployment (GCP VM)

### Quick Setup

```bash
# SSH into the VM
ssh user@your-vm-ip

# Clone and set up
git clone https://github.com/Suhaib3100/mercor-tooling.git
cd mercor-tooling
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Create .env with your keys
nano .env

# Set up the systemd service
sudo cp mercor-pipeline.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable mercor-pipeline
sudo systemctl start mercor-pipeline

# Set up Nginx
sudo cp nginx.conf /etc/nginx/sites-available/mercor-pipeline
sudo ln -sf /etc/nginx/sites-available/mercor-pipeline /etc/nginx/sites-enabled/
sudo rm -f /etc/nginx/sites-enabled/default
sudo systemctl restart nginx
```

### Useful Commands

```bash
# Check service status
sudo systemctl status mercor-pipeline

# Follow logs
sudo journalctl -u mercor-pipeline -f

# Restart the service
sudo systemctl restart mercor-pipeline
```

## 🔗 Airtable Automation Setup

1. Go to Airtable → Automations tab
2. Create a new automation:
   - Trigger: When record created in "Applications"
   - Action: Send webhook
3. Configure the webhook:
   - URL: `http://YOUR-SERVER-IP/webhook/new-application`
   - Method: POST
   - Body: `{"record_id": "{{Record ID}}"}`
4. Turn on the automation

πŸ“ API Response Codes

Code Meaning
200 Success
202 Accepted (processing in background)
400 Bad request (missing record_id)
404 Record not found
500 Server error

## 🧪 Testing

### Test Webhook Endpoints

```bash
# Health check
curl http://YOUR-SERVER-IP/health

# Trigger the full pipeline with a real record ID from Airtable
curl -X POST http://YOUR-SERVER-IP/webhook/new-application \
  -H "Content-Type: application/json" \
  -d '{"record_id": "recANTrJO2vkL5tol"}'
```

### Reset Test Data

```bash
python3 reset_data.py
```

## 📄 License

MIT License


## 👤 Author

Suhaib SZ ([GitHub](https://github.com/Suhaib3100))
