🔍 Skills Radar

An intelligent web application that extracts structured job data from URLs using LLMs, with smart caching and deduplication.

✨ Features

  • Drag & Drop or Single URL Input - Upload a .txt file of URLs, or process individual URLs one at a time
  • Smart Caching - Cached jobs aren't scraped again unless refreshed
  • Interactive UI - In-place job card expansion with full details
  • Responsive Design - Works on desktop and mobile devices

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • Playwright (for browser automation)

Local Development

  1. Clone & Setup

    git clone https://github.com/caiocrocha/SkillsRadar.git
    cd SkillsRadar
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -r requirements.txt
  2. Configure Environment

    cp .env.example .env
    # Edit .env with your API keys
  3. Install Playwright Browsers

    playwright install
  4. Run Development Server

    uvicorn app.main:app --reload

    Visit http://localhost:8000

📁 Project Structure

SkillsRadar/
├── app/                 # FastAPI application
│   ├── main.py          # App entry point
│   ├── routes.py        # API routes
│   └── settings.py      # App settings
├── scraper/             # Web scraping
│   ├── llm.py           # Prompt and LLM instantiation
│   ├── nodes.py         # LangGraph nodes
│   ├── graph.py         # LangGraph graph definition
│   └── extract.py       # LLM call and JSON validation
├── models/              # Pydantic models / JSON schema
├── normalization/       # Data normalization
├── pipelines/           # Processing pipelines
├── frontend/            # Web UI
│   ├── index.html
│   ├── app.js
│   └── styles.css
├── data/                # Cached data
├── requirements.txt     # Python dependencies
└── README.md

🔌 API Endpoints

| Method | Endpoint           | Description                      |
|--------|--------------------|----------------------------------|
| POST   | /upload            | Upload .txt file with URLs       |
| POST   | /process-urls      | Process URLs with cache checking |
| POST   | /refresh-with-urls | Force refresh (ignore cache)     |
| POST   | /clear-cache       | Delete all cached data           |
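As a sketch of how a client might call the processing endpoint with only the standard library (the `{"urls": [...]}` body shape is an assumption; check `app/routes.py` for the actual request model):

```python
import json
import urllib.request

# Hypothetical payload shape: a list of job-posting URLs to process.
payload = json.dumps({"urls": ["https://example.com/jobs/123"]}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8000/process-urls",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the dev server running, send it with:
#   response = urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```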

💾 Data Storage

  • Cache Location: data/cache.json
  • Structure:
    • files: Map of file hashes to URL lists
    • all_urls: Map of URLs to their job data
  • Persistence: Automatically saved after each operation
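A minimal sketch of inspecting the cache described above; the top-level keys (`files`, `all_urls`) come from this README, while the defaults for a missing cache file are assumptions:

```python
import json
from pathlib import Path

cache_path = Path("data/cache.json")

# Assumed empty-cache shape when no cache has been written yet.
cache = {"files": {}, "all_urls": {}}
if cache_path.exists():
    cache = json.loads(cache_path.read_text())

print(f"{len(cache['files'])} uploaded files, {len(cache['all_urls'])} cached URLs")
```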

🔄 How Deduplication Works

  1. File Upload or URL Input: Backend checks if URLs exist in cache
  2. Refresh: Forces re-scraping regardless of cache
  3. Frontend: Filters duplicate URLs by source_url before display
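The frontend's dedup step (step 3) can be sketched in Python as "keep the first job seen per `source_url`"; the `source_url` field name comes from this README, the other fields are illustrative:

```python
def dedupe_jobs(jobs):
    """Keep only the first job encountered for each source_url."""
    seen = set()
    unique = []
    for job in jobs:
        url = job.get("source_url")
        if url in seen:
            continue
        seen.add(url)
        unique.append(job)
    return unique

jobs = [
    {"source_url": "https://a.example/1", "title": "Data Engineer"},
    {"source_url": "https://a.example/1", "title": "Data Engineer (dup)"},
    {"source_url": "https://b.example/2", "title": "ML Engineer"},
]
print(len(dedupe_jobs(jobs)))  # 2
```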

🛠️ Configuration

Environment Variables

HF_API_KEY=your_huggingface_key

See .env.example for all options.


Built with: FastAPI • LangGraph • LLMs • Playwright • Vanilla JavaScript

About

LLM-powered pipeline for extracting and visualizing job market skill trends from web job postings.
