AI-Powered Career Intelligence Platform
Skillence is a full-stack web application that combines resume parsing, AI career recommendations, job market analytics, campus placement management, and interview coaching into a single unified platform. Built as a final-year B.Tech project.
- AI-Powered Resume Parsing: Upload PDF/DOCX resumes. Azure AI Document Intelligence extracts text, then Google Gemini structures it into categorized data (skills, experience, education, projects, certifications).
- Fallback Parser: A spaCy + NLTK regex-based parser activates automatically if Azure or Gemini is unavailable.
- SHA-256 Deduplication: Prevents re-processing of identical uploads.
- Profile Editor: Review and manually edit parsed resume data before saving.
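The deduplication step can be illustrated with a minimal sketch. It assumes the hash is taken over the raw uploaded bytes; the `UploadRegistry` class, its in-memory dictionary, and the placeholder `_parse` method are hypothetical stand-ins for the real storage and parsing pipeline.

```python
import hashlib
from typing import Dict, Tuple


def file_sha256(data: bytes) -> str:
    """Return the hex SHA-256 digest of an uploaded file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class UploadRegistry:
    """In-memory stand-in (hypothetical) for a store keyed by file hash."""

    def __init__(self) -> None:
        self._parsed_by_hash: Dict[str, dict] = {}
        self.parse_calls = 0

    def _parse(self, data: bytes) -> dict:
        # Placeholder for the real parsing pipeline (assumption).
        self.parse_calls += 1
        return {"size": len(data)}

    def process(self, data: bytes) -> Tuple[dict, bool]:
        """Return (parsed_result, was_duplicate)."""
        digest = file_sha256(data)
        if digest in self._parsed_by_hash:
            # Identical bytes were seen before: reuse the stored result.
            return self._parsed_by_hash[digest], True
        parsed = self._parse(data)
        self._parsed_by_hash[digest] = parsed
        return parsed, False


registry = UploadRegistry()
_, first_is_dup = registry.process(b"resume-bytes")
_, second_is_dup = registry.process(b"resume-bytes")
```

Hashing the bytes rather than the filename means a renamed copy of the same file is still recognized as a duplicate.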
- 3-Stage AI Pipeline: (1) Map user skills to a 692-skill vocabulary via exact match, aliases, and fuzzy regex. (2) A dot-product pre-filter narrows 894 O*NET occupations to the top 30 candidates in ~1 ms. (3) Google Gemini ranks and scores the top 10 with explanations.
- 3-Phase Learning Roadmaps: Personalized skill gap analysis using the O*NET Skills, Technology Skills, and Knowledge databases. Synonym-aware matching with a 70% coverage threshold. ML-enhanced skill recommendations.
- Progress Tracking: Save career paths, track learning milestones, and mark skills as completed with checkpoint persistence.
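The first two pipeline stages can be sketched in a few lines. This is an illustration only: the five-entry `VOCAB`, the `ALIASES` table, and the occupation vectors are made-up toy data, the fuzzy-regex step is omitted, and plain Python lists stand in for whatever vector representation the project actually uses.

```python
from typing import Dict, List, Set, Tuple

# Toy vocabulary and alias table (hypothetical; the real vocabulary has 692 skills).
VOCAB: List[str] = ["python", "sql", "machine learning", "javascript", "react"]
ALIASES: Dict[str, str] = {
    "py": "python",
    "ml": "machine learning",
    "js": "javascript",
    "reactjs": "react",
}


def map_skills(raw_skills: List[str]) -> Set[str]:
    """Stage 1 (simplified): exact match first, then alias lookup."""
    mapped: Set[str] = set()
    for raw in raw_skills:
        key = raw.strip().lower()
        if key in VOCAB:
            mapped.add(key)
        elif key in ALIASES:
            mapped.add(ALIASES[key])
    return mapped


def to_vector(skills: Set[str]) -> List[int]:
    """Binary vector over the vocabulary: 1 if the skill is present."""
    return [1 if name in skills else 0 for name in VOCAB]


def prefilter(
    user_vec: List[int], occupations: Dict[str, List[int]], top_k: int
) -> List[Tuple[str, int]]:
    """Stage 2: score each occupation by dot product and keep the top_k."""
    scored = [
        (title, sum(u * o for u, o in zip(user_vec, occ_vec)))
        for title, occ_vec in occupations.items()
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


occupations = {
    "Data Scientist": [1, 1, 1, 0, 0],
    "Web Developer": [0, 0, 0, 1, 1],
    "Database Admin": [0, 1, 0, 0, 0],
}
user_skills = map_skills(["Py", " SQL ", "ML", "cobol"])
shortlist = prefilter(to_vector(user_skills), occupations, top_k=2)
```

Only the shortlist would then be passed to the LLM ranking stage, which keeps the expensive call small.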
- Rich Dashboard: Analyze ~30,000 job postings with advanced filtering (location, industry, experience level, salary range, company size, employment type).
- Trendiness Scoring: Composite score: recency (40%) + salary (30%) + benefits (20%) + growth (10%).
- AI Insights: Gemini-powered market analysis and career advice.
- ML Salary Prediction: A neural network predicts salaries based on skills, experience, location, industry, and more.
- Data Export: Download filtered results as CSV or JSON.
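As a worked example of the trendiness formula, the sketch below combines the four components with the stated weights. It assumes each component has already been normalized to the range 0 to 1; how the project derives those component values is not described here, so the function name and input shape are hypothetical.

```python
from typing import Dict

# Weights taken from the composite score described above.
WEIGHTS: Dict[str, float] = {
    "recency": 0.40,
    "salary": 0.30,
    "benefits": 0.20,
    "growth": 0.10,
}


def trendiness(components: Dict[str, float]) -> float:
    """Weighted sum of components, each assumed to lie in [0, 1]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1], got {value}")
        total += weight * value
    return total


# A posting that is moderately recent and top of its salary band:
score = trendiness({"recency": 0.5, "salary": 1.0, "benefits": 0.0, "growth": 0.0})
```

Here `score` works out to 0.40 × 0.5 + 0.30 × 1.0 = 0.5.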
- Rich Learning Content: Detailed skill profiles including prerequisites, use cases, learning difficulty, and top courses from platforms like Udemy and Coursera.
- YouTube API Integration: Fetches and displays top educational videos for each skill using the YouTube Data API v3.
- Interactive Roadmap: Visual, trackable learning roadmap with expandable phases, topics, and completion checkboxes.
- Dynamic Resources: Fetches relevant external resources (MDN, GeeksforGeeks, W3Schools, freeCodeCamp) for specific roadmap topics using DuckDuckGo search, with static fallbacks.
- Bookmark System: Users can save skills with status tracking (Interested, Learning, Completed) and course progress percentage.
- Practice Platforms: Direct links to relevant coding, AI, and cloud practice sites.
- Side-by-Side Comparison: Compare two job offers with salary charts (Chart.js), cost-of-living analysis via Gemini, and interactive Leaflet maps with Nominatim geocoding.
- Market Context: Adzuna API integration for salary benchmarks across 25+ countries.
- Placement Cell Dashboard (admin role): Create and manage company drives, upload/parse job descriptions (Gemini + keyword fallback), manage course catalogs, shortlist students algorithmically.
- Student Portal: Browse drives (upcoming/expired/not-eligible tabs), upload grade history PDFs, view eligibility and match scores, apply and track applications.
- Matching Engine: Pure algorithmic scoring (zero LLM): `total = required_skills×0.70 + preferred_skills×0.15 + resume_bonus×0.10 + cgpa_bonus×0.05`. An eligibility gate checks 10th and 12th percentages, CGPA, and active backlogs.
- Grade History Parsing: A VIT-style PDF parser extracts courses, grades (S=10 to F=0), credits, and CGPA. Handles retakes by keeping the best grade.
- Course-to-Skill Mapping: Curriculum PDFs are parsed and mapped to canonical skills via Gemini AI.
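The matching formula and eligibility gate can be sketched as follows. Only the four weights and the four gate criteria come from the description above; how each component is computed (skill coverage as a fraction, `resume_bonus` as 0 or 1, `cgpa_bonus` as CGPA divided by 10) is an assumption made for illustration, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Criteria:
    min_tenth: float
    min_twelfth: float
    min_cgpa: float
    max_active_backlogs: int


@dataclass
class Student:
    tenth: float
    twelfth: float
    cgpa: float
    active_backlogs: int
    skills: Set[str]
    has_resume: bool


def is_eligible(student: Student, criteria: Criteria) -> bool:
    """Eligibility gate: every threshold must pass before scoring."""
    return (
        student.tenth >= criteria.min_tenth
        and student.twelfth >= criteria.min_twelfth
        and student.cgpa >= criteria.min_cgpa
        and student.active_backlogs <= criteria.max_active_backlogs
    )


def coverage(have: Set[str], wanted: Set[str]) -> float:
    """Fraction of wanted skills the student has (assumed definition)."""
    return len(have & wanted) / len(wanted) if wanted else 0.0


def score_student(
    student: Student, criteria: Criteria, required: Set[str], preferred: Set[str]
) -> Optional[float]:
    """Return the weighted total, or None if the gate rejects the student."""
    if not is_eligible(student, criteria):
        return None
    resume_bonus = 1.0 if student.has_resume else 0.0  # assumption
    cgpa_bonus = min(student.cgpa / 10.0, 1.0)  # assumption: 10-point scale
    return (
        0.70 * coverage(student.skills, required)
        + 0.15 * coverage(student.skills, preferred)
        + 0.10 * resume_bonus
        + 0.05 * cgpa_bonus
    )


criteria = Criteria(min_tenth=60, min_twelfth=60, min_cgpa=7.0, max_active_backlogs=0)
student = Student(85, 80, 8.0, 0, {"python", "sql"}, True)
total = score_student(student, criteria, {"python", "sql"}, {"docker", "aws"})
```

With these inputs the total is 0.70 × 1.0 + 0.15 × 0.0 + 0.10 × 1.0 + 0.05 × 0.8 = 0.84. Because no model call is involved, the same inputs always give the same score.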
- Career Guidance: Multi-turn conversational AI powered by Gemini. Context-aware with user profile data. Includes platform navigation guidance.
- Speech Support: Web Speech API for speech-to-text input and text-to-speech output.
- Session Persistence: Chat history is stored in localStorage across page refreshes.
- Interview Diagnostic: AI-powered interview coaching using a structured "MistakeLoop" methodology.
- Contextual Coaching: Incorporates user profile and career goals for personalized feedback.
- Skill Recommender: Denoising Autoencoder trained on O*NET occupation-skill data. Recommends skills based on reconstruction confidence, suppressing 40+ generic/non-actionable terms.
- Salary Predictor: MLP with skip connections trained on ~30K job postings. Encodes skills (binary vector), experience, location, industry, company size, education level, and employment type.
- NumPy-Only Inference: Both models are trained with PyTorch offline, exported to `.npz`, and run in production via pure NumPy to avoid PyTorch DLL crashes on Windows.
| Layer | Technology |
|---|---|
| Backend | FastAPI 0.104.1, Python 3.8+, Uvicorn |
| Frontend | React 19.1, Vite 7, pnpm |
| Database | MongoDB Atlas (async Motor 3.3 driver) |
| AI Services | Google Gemini 2.5 Flash, Azure AI Document Intelligence |
| ML Training | PyTorch 2.x (offline only) |
| ML Inference | Pure NumPy (no PyTorch at runtime) |
| Auth | JWT (HS256) + bcrypt + Google OAuth 2.0 |
| Career Data | O*NET (894 occupations, 692-skill vocabulary) |
| Charts | Recharts (job trends), Chart.js (offer evaluator) |
| Maps | Leaflet + react-leaflet + Nominatim geocoding |
| Icons | Lucide React |
| Animations | Motion library |
| WebGL | OGL (neural network background) |
| Speech | Web Speech API (STT/TTS) |
| HTTP | Fetch API (general), Axios (chatbot) |
┌───────────────────────────────────────────────────────────────┐
│                      Frontend (React 19)                      │
│        Port 3000 · Vite 7 · pnpm · react-router-dom v7        │
│                                                               │
│  ┌───────────┐ ┌────────────┐ ┌──────────┐ ┌───────────────┐  │
│  │ Resume    │ │ Career     │ │ Job      │ │ Placement     │  │
│  │ Dashboard │ │ Recommend  │ │ Trends   │ │ (Student/     │  │
│  │           │ │            │ │          │ │  Admin)       │  │
│  └───────────┘ └────────────┘ └──────────┘ └───────────────┘  │
│  ┌───────────┐ ┌────────────┐ ┌──────────┐ ┌───────────────┐  │
│  │ Chatbot   │ │ Reflection │ │ Job      │ │ Profile       │  │
│  │           │ │ Engine     │ │ Offer    │ │ Page          │  │
│  └───────────┘ └────────────┘ └──────────┘ └───────────────┘  │
└───────────────────────────────┬───────────────────────────────┘
                                │ REST API (fetch / axios)
┌───────────────────────────────▼───────────────────────────────┐
│                       Backend (FastAPI)                       │
│        Port 8000 · 10 Routers · Async Motor · JWT Auth        │
│                                                               │
│  Routers:  auth · resume · profile · career_path · chatbot    │
│            job_trends · ml_predictions · placement_cell       │
│            student_placement · skills                         │
│                                                               │
│  Services: 14 service modules (auth, resume parsing,          │
│            career matching, learning plans, job trends,       │
│            placement matching, JD parsing, academics, etc.)   │
│                                                               │
│  ML:       NumPy-only inference (skill recommender +          │
│            salary predictor) loaded lazily on first request   │
└───────────────────────────────┬───────────────────────────────┘
                                │
              ┌─────────────────┼───────────────────┐
              ▼                 ▼                   ▼
        ┌────────────┐     ┌──────────┐    ┌────────────────┐
        │ MongoDB    │     │ Gemini   │    │ Azure AI Doc   │
        │ Atlas      │     │ 2.5      │    │ Intelligence   │
        │            │     │ Flash    │    │                │
        └────────────┘     └──────────┘    └────────────────┘
- Python 3.8+
- Node.js 16+ with pnpm
- MongoDB Atlas instance (or local MongoDB)
- API Keys: Google Gemini, Azure AI Document Intelligence
- Optional API Keys: O*NET Web Services, JSearch (RapidAPI), Adzuna, Google OAuth, YouTube Data API v3
Create a .env file in the project root (not inside backend/ or frontend/):
# Database
MONGODB_URI=mongodb+srv://...
# JWT Authentication
SECRET_KEY=your-secret-key
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30
# Azure AI Document Intelligence
AZURE_ENDPOINT=https://your-resource.cognitiveservices.azure.com/
AZURE_API_KEY=your-azure-key
# Google Gemini AI
GEMINI_API=your-gemini-api-key
GEMINI_ENDPOINT=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash
# Email (Password Reset)
EMAIL_HOST=smtp.gmail.com
EMAIL_PORT=587
EMAIL_HOST_USER=your-email@gmail.com
EMAIL_HOST_PASSWORD=your-app-password
FRONTEND_URL=http://localhost:3000
# CORS (Backend)
# Comma-separated values
CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
# Regex for preview deployments (Vercel)
CORS_ORIGIN_REGEX=https://.*\\.vercel\\.app
# Data Enrichment (Optional - for ML training only)
ONET_API_KEY=your-onet-key
JSEARCH_API_KEY=your-jsearch-key
# Frontend (VITE_ prefix required)
VITE_API_URL=http://localhost:8000
VITE_GOOGLE_CLIENT_ID=your-google-oauth-client-id
VITE_ADZUNA_API_KEY=your-adzuna-key
VITE_ADZUNA_APP_ID=your-adzuna-app-id
VITE_GEMINI_API_KEY=your-gemini-key-for-frontend
Backend:
cd backend
pip install -r requirements.txt
uvicorn main:app --reload --port 8000
Frontend:
cd frontend
pnpm install
pnpm run dev
The frontend runs at http://localhost:3000 and the backend API at http://localhost:8000.
This repository is set up for split deployment:
- Frontend: `frontend/` on Vercel
- Backend: `backend/` on Render

Backend on Render:
- Import this repo into Render.
- Use the existing render.yaml blueprint, or create a Web Service manually with:
  - Root Directory: `backend`
  - Build Command: `pip install -r requirements.txt`
  - Start Command: `uvicorn main:app --host 0.0.0.0 --port $PORT`
- Set environment variables in Render (prompted by `sync: false` in the blueprint), especially:
  - `MONGODB_URI`, `SECRET_KEY`, `GEMINI_API`, `AZURE_ENDPOINT`, `AZURE_API_KEY`
  - `FRONTEND_URL` = your Vercel production URL (e.g. `https://your-app.vercel.app`)
  - `CORS_ORIGINS` = comma-separated Vercel + local origins
- After deploy, copy your Render backend URL (e.g. `https://your-backend.onrender.com`).

Frontend on Vercel:
- Import this repo into Vercel.
- Set Root Directory to `frontend`.
- Framework preset: Vite (auto-detected).
- Ensure environment variable: `VITE_API_URL` = your Render backend URL (no trailing slash)
- SPA routing fallback is configured in frontend/vercel.json.

Notes:
- Local dev defaults still work (`VITE_API_URL` defaults to `http://localhost:8000`).
- Backend CORS now supports both local origins and configured deployment origins.
- Password reset email links now use `FRONTEND_URL` instead of hardcoded localhost.
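To show how the two CORS settings are meant to combine, here is a minimal sketch of the origin check. It assumes `CORS_ORIGINS` is split on commas and `CORS_ORIGIN_REGEX` must match the whole origin; the helper names `parse_origins` and `origin_allowed` are hypothetical, and in a real FastAPI app this decision is normally delegated to `CORSMiddleware` through its `allow_origins` and `allow_origin_regex` parameters.

```python
import re
from typing import List, Optional


def parse_origins(raw: str) -> List[str]:
    """Split a comma-separated CORS_ORIGINS value, ignoring blanks."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


def origin_allowed(
    origin: str, allowed: List[str], origin_regex: Optional[str] = None
) -> bool:
    """Allow exact matches from the list, or full matches of the regex."""
    if origin in allowed:
        return True
    if origin_regex and re.fullmatch(origin_regex, origin):
        return True
    return False


allowed = parse_origins("http://localhost:3000, http://127.0.0.1:3000")
preview_regex = r"https://.*\.vercel\.app"

local_ok = origin_allowed("http://localhost:3000", allowed, preview_regex)
preview_ok = origin_allowed("https://my-app-git-feature.vercel.app", allowed, preview_regex)
spoof_ok = origin_allowed("https://x.vercel.app.evil.com", allowed, preview_regex)
```

Matching the full string matters: a prefix-only match would accept the spoofed origin in the last line.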
/
├── .env                              # All environment variables (root level)
├── .github/copilot-instructions.md   # AI agent context (comprehensive)
├── package.json                      # Root package.json
│
├── backend/
│   ├── main.py                       # FastAPI entrypoint, 10 routers
│   ├── requirements.txt              # Python dependencies
│   ├── run_salary_pipeline.py        # CLI: salary model training pipeline
│   ├── seed_course_catalog.py        # CLI: curriculum PDF → MongoDB seed
│   └── app/
│       ├── database.py               # Async Motor client
│       ├── models/                   # Pydantic models (user, resume, placement, roadmap)
│       ├── routers/                  # 10 API routers
│       ├── services/                 # 14 business logic services
│       ├── utils/                    # JWT, role auth, email
│       ├── data/                     # skill_taxonomy.json
│       ├── career_data/              # O*NET JSON + Excel + job CSVs
│       └── ml/                       # ML pipeline (models, training, inference, data)
│
├── frontend/
│   ├── package.json                  # React 19, Vite 7
│   ├── vite.config.js                # Port 3000, envDir: root
│   └── src/
│       ├── App.jsx                   # Routes, ProtectedRoute, RoleRoute
│       ├── main.jsx                  # GoogleOAuthProvider, theme init
│       └── components/               # 23+ components organized by feature
│
└── docs/                             # Documentation files

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/register` | None | Create account (student or placement_cell) |
| POST | `/login` | None | Email/password login, returns a JWT |
| POST | `/google-login` | None | Google OAuth login (Gmail only) |
| GET | `/verify` | Bearer | Verify JWT validity |
| GET | `/me` | Bearer | Get current user info |
| POST | `/forgot-password` | None | Send reset email |
| POST | `/reset-password` | None | Reset with token |
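For readers unfamiliar with what `/login` issues and `/verify` checks, the sketch below shows HS256 token signing and verification using only the standard library. It is an illustration of the mechanism, not the project's code: the function names, the `sub`/`exp` claim layout, and the `now` parameter (added for testability) are assumptions, and a production backend would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(subject: str, secret: str, expire_minutes: int = 30,
                 now: Optional[float] = None) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "exp": issued + expire_minutes * 60}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: str,
                 now: Optional[float] = None) -> Optional[Dict[str, Any]]:
    """Return the payload if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        expected = _sign(header_b64 + "." + payload_b64, secret)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        header = json.loads(_b64url_decode(header_b64))
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(payload, dict):
        return None
    current = time.time() if now is None else now
    if payload.get("exp", 0) < current:
        return None
    return payload


token = create_token("student@example.com", "demo-secret", expire_minutes=30, now=1000)
claims = verify_token(token, "demo-secret", now=1000)
```

The 30-minute default mirrors `ACCESS_TOKEN_EXPIRE_MINUTES=30` from the environment file; the secret corresponds to `SECRET_KEY`.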

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/upload` | Bearer | Upload resume file |
| POST | `/parse` | Bearer | Parse uploaded resume (Azure AI + Gemini) |
| GET | `/{user_id}` | Bearer | Get parsed resume data |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/save` | Bearer | Save profile data |
| GET | `/` | Bearer | Get user profile |
| PUT | `/update` | Bearer | Update profile |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/recommendations` | Bearer | Get AI career recommendations (top 10) |
| POST | `/save-career` | Bearer | Save selected career path |
| GET | `/learning-plan` | Bearer | Generate 3-phase learning roadmap |
| PUT | `/learning-plan` | Bearer | Update learning progress |
| POST | `/update-learned-skill` | Bearer | Mark skill as learned |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/chat` | Optional | Career guidance conversation |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/chat` | Optional | Interview coaching conversation |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | `/` | None | List minimal skill info for browsing |
| GET | `/search` | None | Search skills by name or category |
| GET | `/{skill_id}` | None | Get full details of a specific skill |
| GET | `/youtube/{skill_id}` | None | Fetch top YouTube educational videos |
| GET | `/{skill_id}/resources` | None | Fetch external resources for a roadmap topic |
| GET | `/user/activity` | Bearer | Get user's saved skills and progress |
| POST | `/user/activity` | Bearer | Update saved skills, progress, and status |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | `/jobs` | None | List jobs with filters |
| GET | `/analysis/{job_title}` | None | Job analysis |
| GET | `/skills/{job_title}` | None | Skill demand for job |
| GET | `/overview` | None | Market overview |
| GET | `/trends/{job_title}` | None | Time-series trends |
| GET | `/experience-distribution/{job_title}` | None | Experience breakdown |
| GET | `/filter-options` | None | Available filter values |
| GET | `/ai-insights` | None | AI market insights |
| GET | `/ai-insights-gemini` | None | Gemini-specific insights |
| GET | `/detailed-analysis/{job_title}` | None | Comprehensive analysis |
| GET | `/cache-info` | None | Cache status |
| POST | `/clear-cache` | None | Clear data cache |
| GET | `/export/csv` | None | Export as CSV |
| GET | `/export/json` | None | Export as JSON |
| POST | `/predict-salary` | None | ML salary prediction |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/recommend-skills` | None | Skill recommendations from autoencoder |
| GET | `/health` | None | Model health check |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/drives` | Role | Create company drive |
| GET | `/drives` | Role | List all drives |
| GET | `/drives/{id}` | Role | Get drive details |
| PUT | `/drives/{id}` | Role | Update drive |
| DELETE | `/drives/{id}` | Role | Delete drive |
| POST | `/drives/{id}/jd` | Role | Upload/parse JD |
| GET | `/courses` | Role | List course catalog |
| POST | `/courses` | Role | Add course |
| PUT | `/courses/{id}` | Role | Edit course |
| DELETE | `/courses/{id}` | Role | Delete course |
| POST | `/curriculum/upload` | Role | Upload curriculum PDF |
| POST | `/drives/{id}/shortlist` | Role | Shortlist students |
| PUT | `/applications/{id}/status` | Role | Update application status |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | `/academics` | Role | Get academic profile |
| POST | `/academics/grade-history` | Role | Upload grade history PDF |
| PUT | `/academics` | Role | Update 10th/12th percentages |
| GET | `/academics/skills` | Role | Get skill profile |
| GET | `/drives` | Role | Browse drives (with tabs) |
| GET | `/drives/{id}` | Role | Drive detail + match score |
| POST | `/apply/{drive_id}` | Role | Apply to drive |
| GET | `/applications` | Role | View applications |
| DELETE | `/applications/{id}` | Role | Withdraw application |

Both models follow the same workflow: Train (PyTorch) → Export (.npz) → Infer (NumPy).

Skill Recommender:
- Architecture: Denoising Autoencoder. Encoder: V→256→128→64→32 (latent), where V is the skill-vocabulary size; the decoder mirrors the encoder. BatchNorm + LeakyReLU(0.2) + Dropout(0.3). Sigmoid output.
- Data: Binary skill-occupation vectors from O*NET Excel files, occupation JSON, job CSVs, and cached API data.
- Training: BCE loss, Adam optimizer, 30% input corruption, early stopping (patience=20).

Salary Predictor:
- Architecture: MLP with skip connections. Input→512→256→128→64→1. BatchNorm + ReLU + Dropout(0.3). He initialization.
- Data: ~30K job postings. Features: binary skill vector, scaled experience, label-encoded categoricals, one-hot experience level.
- Training: MSE loss, Adam, gradient clipping, early stopping (patience=25).
cd backend
# Skill recommender
python -m app.ml.data.fetch_api_data              # Fetch O*NET + JSearch data (optional)
python -m app.ml.data.skill_data_processor        # Build vocabulary
python -m app.ml.training.train_skill_recommender # Train
python -m app.ml.export_to_numpy                  # Export to .npz
# Salary predictor (all-in-one)
python run_salary_pipeline.py
# Or with flags: --skip-processing --skip-training --skip-export --model lite|full
cd backend
python seed_course_catalog.py --pdf ../curr.pdf

| Role | Access | Key Pages |
|---|---|---|
| student (default) | Resume upload, career recommendations, job trends, placement portal, chatbot | /dashboard/resume, /career-path-recommendation, /campus-placement |
| placement_cell | All student features + drive management, JD parsing, shortlisting, curriculum management | /placement-dashboard |
| Unauthenticated | Landing page, job trends, job offer evaluator, reflection engine, static pages | /, /job-trends, /job-offer-evaluator, /reflection-engine |
| Service | Purpose | Used In |
|---|---|---|
| Azure AI Document Intelligence | Resume text extraction (PDF/DOCX) | Backend: azure_resume_parser.py |
| Google Gemini 2.5 Flash | Career ranking, learning plans, chatbot, interview coaching, JD parsing, resume structuring, course mapping, cost-of-living | Backend (6+ services) + Frontend (Job Offer Evaluator) |
| O*NET Web Services | Occupation data enrichment for ML training | Backend: fetch_api_data.py |
| JSearch (RapidAPI) | Job posting data for ML training | Backend: fetch_api_data.py |
| Adzuna API | Salary benchmarks, job market data | Frontend: JobOfferEvaluator |
| Nominatim (OSM) | Geocoding for map location picker | Frontend: MapLocationPicker |
| Google OAuth 2.0 | Social login | Frontend β Backend |
| YouTube Data API v3 | Fetching educational videos for skills | Backend: youtube_service.py |
| DuckDuckGo Search | Dynamic scraping of learning resources | Backend: resource_fetcher.py |

- NumPy-Only Inference: PyTorch triggers `c10.dll` / `WinError 1114` crashes when loaded inside async FastAPI on Windows. Models are trained offline with PyTorch, exported to `.npz` weight files, and the forward pass uses 4 pure-NumPy functions (`_linear`, `_batchnorm`, `_leaky_relu`/`_relu`, `_sigmoid`).
- Single Root `.env`: All environment variables live in one `.env` at the project root. Vite uses `envDir: '..'` to access it. Backend services load it via `dotenv` with relative paths.
- No Global State Management: The frontend uses React `useState` + `localStorage` only. No Redux, Context API, or Zustand. Auth tokens, chat sessions, and theme preferences are all in `localStorage`.
- Gemini JSON Repair: Gemini responses frequently include trailing commas, truncated output, or markdown fences. Multiple services implement `_try_repair_json()` helpers.
- Lazy Model Loading: ML models (`SkillRecommender`, `SalaryPredictorInference`) are loaded on the first API request, not at server startup, to keep startup fast.
- Service-Oriented Backend: Routers delegate to services; services own business logic. Exception: `career_path.py` has inline scoring logic alongside service calls (technical debt).
- Dual Resume Parser: Primary: Azure AI + Gemini (high accuracy). Fallback: spaCy + NLTK + regex (works offline). Automatic failover.
- Pure Algorithmic Placement Matching: The matching engine uses zero LLM calls. A weighted formula with eligibility gates ensures deterministic, fast, and explainable results.
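
The JSON repair idea mentioned in the list above can be sketched as follows. This is a hypothetical helper, not the project's `_try_repair_json()`: it assumes three repairs (strip markdown fences, drop trailing commas, close brackets left open by truncation) and returns `None` when the text still cannot be parsed. The comma regex is deliberately naive and could alter a string value that happens to contain `,]` or `,}`.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # markdown code-fence marker, built indirectly for readability


def _missing_closers(text: str) -> str:
    """Return the quote/brackets needed to close a truncated JSON document."""
    stack = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    return ('"' if in_string else "") + "".join(reversed(stack))


def try_repair_json(text: str) -> Optional[Any]:
    """Best-effort parse of LLM output that is almost valid JSON."""
    cleaned = text.strip()
    # 1. Strip a leading fence (with optional language tag) and a trailing fence.
    cleaned = re.sub(r"^" + FENCE + r"[a-zA-Z]*\s*", "", cleaned)
    cleaned = re.sub(r"\s*" + FENCE + r"$", "", cleaned)
    # 2. Remove trailing commas that sit directly before a closing bracket.
    cleaned = re.sub(r",\s*([}\]])", r"\1", cleaned)
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass
    # 3. Assume truncation: drop a dangling comma and close what is still open.
    candidate = re.sub(r",\s*$", "", cleaned) + _missing_closers(cleaned)
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


fenced = FENCE + 'json\n{"skills": ["python", "sql",], "level": "mid",}\n' + FENCE
repaired = try_repair_json(fenced)
truncated = try_repair_json('{"skills": ["python", "sq')
```

Returning `None` instead of raising lets the caller decide whether to retry the request or fall back to a non-LLM path.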

| Collection | Purpose |
|---|---|
| `users` | User accounts (email, hashed password, role, name) |
| `profiles` | Parsed resume data (contact, skills, experience, education, projects) |
| `resumes` | Resume metadata with SHA-256 file hashes |
| `company_drives` | Placement drives with JD, criteria, package info |
| `applications` | Student applications to drives with status tracking |
| `student_academics` | CGPA, 10th/12th percentages, courses, skills, grade history |
| `course_catalog` | Course-to-skill mappings from curriculum PDFs |
| `learning_roadmaps` | Saved career paths and learning plan progress |
| `user_skills` | Saved skills, progress, and status tracking for users |

This project was built as a final-year B.Tech capstone project.