# 🤖 Intelligent Resume Ranking System Powered by Machine Learning & Big Data, with Test Evaluation using Apache Spark ✨
Features • Installation • Quick Start • Usage • Results
## 🤝 Team Members
- 🎓 Manikesh Kumar, 23BDS032
- 🎓 Amarjeet Raj, 23BDS006
- 🎓 Ojas Jogdand, 23BDS039
## 📑 Table of Contents

- ✨ Features
- 📦 Prerequisites
- 🔧 Installation
- 🚀 Quick Start
- 📖 Usage Guide
- 📊 Project Structure
- 🎯 Expected Outcomes
- 🔍 Cheat Detection
- 📝 Examples
- 🤝 Contributing
- 📄 License
## ✨ Features

- 🎯 **Resume Ranking**: Automatically rank resumes based on job description match
- 🧠 **ML-Powered Matching**: Uses machine learning to extract and compare features
- ⚡ **Distributed Processing**: Leverages Apache Spark for large-scale data processing
- 🔎 **Cheat Detection**: Identify suspicious quiz attempts with statistical analysis
- 📊 **Data Analytics**: Comprehensive analytics on website logs and user behavior
- 🎓 **Interactive Labs**: Jupyter Notebook labs for learning Apache Spark fundamentals
- 💾 **CSV Export**: Export ranked results and analytics to CSV format
## 📦 Prerequisites

Before you begin, ensure you have the following installed:

- 🐍 Python >= 3.8
- ☕ Java Development Kit (JDK) >= 8
- 📦 pip (Python package manager)
- 💻 Git
## 🔧 Installation

**1. Clone the repository:**

```bash
git clone https://github.com/DataScience-ArtificialIntelligence/Resume_Screening_Test_Evaluation.git
cd AI-Resume-Ranker
```

**2. Create and activate a virtual environment:**

Windows:

```bash
python -m venv venv
venv\Scripts\activate
```

macOS/Linux:

```bash
python3 -m venv venv
source venv/bin/activate
```

**3. Install dependencies and verify the setup:**

```bash
pip install -r requirements.txt
pip install pyspark findspark
python -c "import pyspark; print(f'PySpark {pyspark.__version__} installed successfully! ✅')"
```
## 🚀 Quick Start

**Rank resumes:**

```bash
python main.py \
  --resumes ./resumes \
  --jd job_description.txt \
  --top 5 \
  --output output/ranked_resumes.csv
```

**Evaluate tests:**

```bash
python evaluate_tests.py \
  --responses ./responses \
  --shortlist output/ranked_resumes.csv \
  --out output/test_results \
  --min_mcq 8 \
  --min_code 300
```
## 📖 Usage Guide

Create the following folder structure:

```
AI-Resume-Ranker/
├── resumes/                # Place resume files here (.pdf, .docx, .txt)
├── responses/              # JSON files of candidate test responses
├── output/                 # Generated output files
├── job_description.txt     # Job description file
└── ws-logs_filtered.csv    # Website logs for analysis
```
Create a `job_description.txt` file with the role details:

```
Role: Machine Learning Engineer

Responsibilities:
- Develop and deploy ML models
- Build data pipelines

Required: Python, Machine Learning, SQL
Preferred: TensorFlow, Docker, Kubernetes
```
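A shortlisting script might pull the `Required:` and `Preferred:` skill lists out of this file. A minimal sketch (the helper name `parse_requirements` is illustrative, not part of the project's API):

```python
def parse_requirements(text: str) -> dict:
    """Extract comma-separated skill lists from a job description.

    Looks for lines starting with 'Required:' or 'Preferred:',
    mirroring the job_description.txt template above.
    """
    skills = {}
    for line in text.splitlines():
        for key in ("Required", "Preferred"):
            prefix = key + ":"
            if line.startswith(prefix):
                skills[key.lower()] = [s.strip() for s in line[len(prefix):].split(",")]
    return skills

jd = """Role: Machine Learning Engineer
Required: Python, Machine Learning, SQL
Preferred: TensorFlow, Docker, Kubernetes
"""
print(parse_requirements(jd))
# {'required': ['Python', 'Machine Learning', 'SQL'],
#  'preferred': ['TensorFlow', 'Docker', 'Kubernetes']}
```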
Rank resumes:

```bash
python main.py --resumes ./resumes --jd job_description.txt --top 10
```
Evaluate tests:

```bash
python evaluate_tests.py \
  --responses ./responses \
  --shortlist output/ranked_resumes.csv \
  --out output/test_results
```
Open `resume.ipynb` in Jupyter Notebook for interactive Spark analytics:

```bash
jupyter notebook resume.ipynb
```
## 📊 Project Structure

```
AI-Resume-Ranker/
│
├── 📄 main.py                  # Main resume ranking script
├── 📄 evaluate_tests.py        # Test evaluation script
├── 📓 resume.ipynb             # Interactive Spark lab notebook
│
├── 📁 utils/
│   ├── extract_text.py         # Extract text from PDFs/DOCX
│   ├── extract_features.py     # Feature extraction from resumes
│   ├── ranker.py               # Ranking algorithm
│   └── test_evaluator.py       # Test evaluation logic
│
├── 📁 resumes/                 # Resume files (input)
├── 📁 responses/               # Test response files (input)
├── 📁 output/                  # Output results
│
├── 📄 requirements.txt         # Python dependencies
├── 📄 job_description.txt      # Job description template
├── 📄 ws-logs_filtered.csv     # Website logs data
│
├── 📄 README.md                # This file
└── 📄 LICENSE                  # License file
```
## 🎯 Expected Outcomes

Resume ranking produces a CSV file (`ranked_resumes.csv`) with:
- ✅ Candidate name and file path
- ✅ Overall ranking score (0-100)
- ✅ Skill match percentage
- ✅ Experience level match
- ✅ Ranking position
Example output:

```
name,file_path,score,rank
john_doe,./resumes/john_doe.pdf,95.5,1
jane_smith,./resumes/jane_smith.docx,87.3,2
bob_wilson,./resumes/bob_wilson.pdf,76.2,3
```
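The real scoring logic lives in `utils/ranker.py` and combines several ML-derived features. As a rough illustration of the idea only (a toy stand-in, not the project's actual algorithm), a skill-overlap score on a 0-100 scale could look like this:

```python
def skill_match_score(resume_skills, jd_skills):
    """Percentage of job-description skills found in the resume.

    Toy illustration: the project's ranker uses richer features
    (experience level, ML-extracted text features), not just overlap.
    """
    resume = {s.lower() for s in resume_skills}
    required = {s.lower() for s in jd_skills}
    if not required:
        return 0.0
    return round(100 * len(resume & required) / len(required), 1)

candidates = {
    "john_doe": ["Python", "SQL", "Machine Learning", "Docker"],
    "jane_smith": ["Python", "SQL"],
}
jd = ["Python", "Machine Learning", "SQL"]
ranking = sorted(candidates, key=lambda c: skill_match_score(candidates[c], jd),
                 reverse=True)
print(ranking)  # ['john_doe', 'jane_smith']
```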
Test evaluation produces three CSV files:
- selected_candidates.csv - 🎉 Candidates who passed
- rejected_candidates.csv - ❌ Candidates who were rejected
- all_ranked_candidates.csv - 📋 Complete ranking with scores
## 🔍 Cheat Detection

The system identifies suspicious patterns:
- ⏱ Unusually fast completion times
- 📊 Statistical anomalies (< 1/5 of average time)
- 👥 User behavior analysis
- 📈 Time spent on each problem
The `resume.ipynb` notebook includes advanced cheat-detection analytics:

```python
cheaters = identify_cheaters(quiz_logs, threshold=0.2)
```
Detects:
- 🏃 Suspiciously fast quiz completions
- 📝 Inadequate problem-solving time
- 🎯 Statistically improbable answer patterns
- 🔗 Collaborative behavior indicators
Output includes:

- List of flagged users
- Detailed timeline analysis
- Early-bird detection
- Fastest solvers per problem
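The statistical rule described above (flag completions under 1/5 of the average time) can be sketched in plain Python. This is a hedged simplification: the input shape is assumed for illustration, while the notebook works on Spark RDDs built from `ws-logs_filtered.csv`:

```python
def identify_cheaters(quiz_logs, threshold=0.2):
    """Flag users whose completion time is below threshold * average time.

    quiz_logs: list of (user, seconds_to_complete) pairs -- an assumed
    layout for this sketch, not the notebook's actual RDD schema.
    """
    times = [t for _, t in quiz_logs]
    avg = sum(times) / len(times)
    cutoff = threshold * avg          # threshold=0.2 => 1/5 of the average
    return [user for user, t in quiz_logs if t < cutoff]

logs = [("alice", 600), ("bob", 550), ("carol", 90), ("dave", 620)]
print(identify_cheaters(logs, threshold=0.2))  # ['carol']
```

With an average of 465 seconds, the cutoff is 93 seconds, so only carol (90 s) is flagged.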
## 📝 Examples

Basic ranking:

```bash
python main.py --resumes ./resumes --jd job_description.txt
```

Custom ranking:

```bash
python main.py \
  --resumes /path/to/resumes \
  --jd /path/to/job_desc.txt \
  --top 15 \
  --output results/my_rankings.csv
```
Test evaluation with custom thresholds:

```bash
python evaluate_tests.py \
  --responses ./responses \
  --shortlist output/ranked_resumes.csv \
  --out output/final_results \
  --min_mcq 5 \
  --min_code 200
```
Launch Jupyter and execute cells in `resume.ipynb`:

```python
from pyspark import SparkContext

sc = SparkContext("local", "Analytics")
logs_rdd = sc.textFile("ws-logs_filtered.csv")
```
The included Jupyter notebook (`resume.ipynb`) teaches:

1. **RDD Operations** 🎯
   - Creating RDDs from lists and files
   - Map, filter, and flatMap transformations
   - Reduce and aggregation operations
2. **Optimizations** ⚡
   - Lazy evaluation
   - Caching and persistence
   - Checkpointing
   - Lineage tracking
3. **Spark UI** 🖥
   - Job monitoring
   - Stage analysis
   - Storage management
4. **Real-World Analytics** 📊
   - Quiz log analysis
   - Cheat detection algorithms
   - Performance metrics
Python packages (see `requirements.txt`):

```
scikit-learn    # Machine learning
pandas          # Data manipulation
PyPDF2          # PDF processing
python-docx     # DOCX processing
numpy           # Numerical computing
pyspark         # Distributed computing
findspark       # Spark initialization
```
## 🤝 Contributing

We welcome contributions! 🎉

1. Fork the repository 🍴
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request 🔔
**PySpark installation issues:**

```bash
pip install pyspark --upgrade
python -m pip install findspark
```

**Java not found:**

- Download JDK from oracle.com
- Set the JAVA_HOME environment variable

**Permission errors (macOS/Linux):**

```bash
chmod +x main.py evaluate_tests.py
```

**Dependency conflicts:**

```bash
pip install -r requirements.txt --force-reinstall
```
- 📧 Email: 23bds032@iiitdwd.ac.in
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details. 📜
⭐ If you found this helpful, please give it a star! ⭐