A web app to scrape, search, and download MIT Manipal previous year question papers.
This project now runs fully locally with Docker Compose:
- Frontend + API served behind one URL: `http://localhost:8080`
- Local SQLite database persisted in a Docker volume
- Optional local PDF caching (saved under `/data/pdfs` inside the backend container)
- Selenium/Chromium included for Portal 1 and Portal 2 scraping
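For orientation, a minimal sketch of the stack's shape, assuming the service and volume names used by the compose commands in this README; the shipped `docker-compose.yml` is authoritative and the build/port details here are placeholders:

```yaml
# Sketch only: service and volume names match the commands in this README,
# but build contexts and internal ports are assumptions.
services:
  backend:
    build: ./backend            # Flask + Gunicorn + Selenium/Chromium
    volumes:
      - pyq_data:/data          # SQLite DB and PDF cache
    restart: unless-stopped
  frontend:
    build: ./frontend           # SvelteKit static build served by Nginx,
                                # which also proxies API calls to the backend
    ports:
      - "${PUBLIC_PORT:-8080}:80"
    depends_on:
      - backend
    restart: unless-stopped

volumes:
  pyq_data:
```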
Requirements:
- Docker Desktop (or Docker Engine + Compose plugin)
Setup:

```bash
cp .env.example .env
```

You can run without `.env`; defaults are provided in `docker-compose.yml`.
Admin access defaults:
- Admin password is `@Yush06012002!` by default.
- You should override it in `.env` (`ADMIN_PASSWORD=...`) or use `ADMIN_PASSWORD_HASH` for production.
- You can also change the admin password from the Admin page; it is stored as a hash in SQLite and survives restarts.
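If the backend verifies `ADMIN_PASSWORD_HASH` with Werkzeug's password helpers (an assumption based on the Flask backend; check the backend code for the actual hashing scheme), a hash can be generated like this:

```python
# Assumption: the backend accepts hashes produced by
# werkzeug.security.generate_password_hash; verify against the backend code.
from werkzeug.security import generate_password_hash

print(generate_password_hash("your-new-admin-password"))
```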
Run:

```bash
docker compose up -d --build
```

Then open `http://localhost:8080`.
- Host/port check: `docker compose ps`
- Startup URL log: `docker compose logs -f frontend`
- If `8080` is occupied on your machine, set `PUBLIC_PORT` in `.env` (example: `PUBLIC_PORT=8090`) and run `docker compose up -d`; concrete steps below.
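For example, to move the stack to a free port (assumes `.env` exists and `PUBLIC_PORT` is read by `docker-compose.yml`, per the defaults noted above):

```bash
# Switch the published port to 8090 and recreate the stack.
echo "PUBLIC_PORT=8090" >> .env
docker compose up -d
curl -I http://localhost:8090   # quick reachability check
```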
Lifecycle:
- Stop the stack: `docker compose down`
- Start an existing stack: `docker compose up -d`
- Services use `restart: unless-stopped`, so they come back when Docker starts (after initial creation).
Data persistence:
- Persistent volume: `pyq_data`
- Stores:
  - SQLite DB: `/data/pyqfinder.db`
  - Local PDF cache: `/data/pdfs`
- Recreating containers does not remove data unless the volume is deleted.
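To back up the volume, the generic Docker volume-archive pattern works (not a project-provided script; depending on your Compose project name, the volume may appear as `<project>_pyq_data` in `docker volume ls`):

```bash
# Archive the SQLite DB and cached PDFs from the pyq_data volume
# into the current directory on the host.
docker run --rm -v pyq_data:/data -v "$PWD":/backup alpine \
  tar czf /backup/pyq_data-backup.tar.gz -C /data .
```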
Logs and status:

```bash
docker compose logs -f backend
docker compose logs -f frontend
docker compose ps
```

Stack:
- Frontend: SvelteKit static build served by Nginx
- Backend: Flask + Gunicorn
- Database: SQLite (local)
- Scraping:
  - Portal 1: requests + BeautifulSoup
  - Portal 2: Selenium + Chromium
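For orientation, a minimal sketch of the Portal 1 approach (requests + BeautifulSoup). The URL comes from this README, but the selector is an assumption, not the project's actual parsing logic:

```python
# Minimal sketch of requests + BeautifulSoup link scraping.
# The PDF-link selector is an assumption; the real scraper may differ.
import requests
from bs4 import BeautifulSoup

PORTAL1_URL = "https://mitmpllibportal.manipal.edu/question-papers"

resp = requests.get(PORTAL1_URL, timeout=30)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")
pdf_links = [
    a["href"]
    for a in soup.find_all("a", href=True)
    if a["href"].lower().endswith(".pdf")
]
print(f"Found {len(pdf_links)} linked PDFs")
```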
Network:
- App infrastructure and data are local.
- Scraping still needs outbound internet access to:
  - https://mitmpllibportal.manipal.edu/question-papers
  - https://libportal.manipal.edu/mit/Question%20Paper.aspx
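Portal 2 is driven with headless Chromium. The snippet below shows only the standard Selenium setup, not the project's actual navigation of the ASPX page:

```python
# Minimal sketch of headless Chromium via Selenium for Portal 2.
# Navigation/parsing of the ASPX portal is omitted; this only shows
# the standard headless setup, not the project's actual scraper.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

opts = Options()
opts.add_argument("--headless=new")
opts.add_argument("--no-sandbox")           # common in containers
opts.add_argument("--disable-dev-shm-usage")

driver = webdriver.Chrome(options=opts)
try:
    driver.get("https://libportal.manipal.edu/mit/Question%20Paper.aspx")
    print(driver.title)
finally:
    driver.quit()
```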
Scraping and admin notes:
- Link scraping and bulk caching are separate actions:
  - Run a scrape (Portal 1, Portal 2, or both) to collect linked PDFs.
  - Run "Download All Linked PDFs" from Admin to cache files locally.
- Portal 2 parallel scraping (see the sketch after this list):
  - In Admin, set Portal 2 `Workers / Year` up to `10`.
  - Selected years are sharded so each year gets its own worker pool.
  - The total worker cap is controlled by `PORTAL2_MAX_TOTAL_WORKERS` (default `300`).
- Scrape status panel now includes a live event log and per-phase worker activity.
- Scrape status events are verbose (year/session/folder/worker logs) and retained up to 3000 lines per run.
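As a rough illustration of the per-year sharding described above (names such as `scrape_year` are hypothetical stand-ins, not the project's functions):

```python
# Illustrative sketch of per-year worker pools under a global cap.
# scrape_year() and the task submission are hypothetical placeholders.
import os
from concurrent.futures import ThreadPoolExecutor

def scrape_year(year: int, workers: int) -> None:
    # Each year gets its own pool of up to `workers` threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        ...  # submit per-session/folder tasks for this year

years = [2021, 2022, 2023, 2024]
per_year = 10                                        # Admin "Workers / Year"
cap = int(os.getenv("PORTAL2_MAX_TOTAL_WORKERS", "300"))
per_year = min(per_year, max(1, cap // len(years)))  # respect the global cap

with ThreadPoolExecutor(max_workers=len(years)) as year_pool:
    for y in years:
        year_pool.submit(scrape_year, y, per_year)
```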
Runtime and security notes:
- Backend uses a single Gunicorn worker by default so scrape state/progress is consistent across API calls (see the command sketch below).
- Admin API routes are protected by cookie-session auth, CSRF header checks, and login-attempt throttling (pattern sketched below).
- Login lockouts now show a cooldown countdown in the Admin UI.
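For reference, a single-worker Gunicorn invocation looks like this; the `main:app` module path is an assumption based on `backend/main.py`, so check the actual entrypoint:

```bash
# -w 1: one worker process, so in-memory scrape state is shared across requests.
gunicorn -w 1 -b 0.0.0.0:8000 main:app
```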
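The cookie-session + CSRF-header pattern, in generic Flask terms (illustrative only; the header name and route prefix are hypothetical, not the project's actual ones):

```python
# Generic illustration of the cookie-session + CSRF-header pattern.
# "X-CSRF-Token" and the "/api/admin" prefix are hypothetical names.
import secrets
from flask import Flask, abort, request, session

app = Flask(__name__)
app.secret_key = secrets.token_hex(32)

@app.before_request
def check_csrf():
    # Mutating admin requests must echo the session's CSRF token back in a
    # header, which a cross-site request cannot read or set.
    if request.method in ("POST", "PUT", "DELETE") and request.path.startswith("/api/admin"):
        if request.headers.get("X-CSRF-Token") != session.get("csrf_token"):
            abort(403)
```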
Maintenance:
- One-time duplicate cleanup for older data (sketched conceptually below):
  - Open Admin and run "Preview (Dry Run)" in "Duplicate Cleanup".
  - If the results look correct, run "Run Cleanup" to remove old duplicates and backfill dedupe keys.
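Conceptually, the cleanup resembles a keep-one-row dedupe in SQLite. The table and column names below are hypothetical, and on real data you should use the Admin dry run rather than raw SQL:

```python
# Hypothetical illustration only: assumes a papers(dedupe_key, ...) table.
# The real schema and cleanup logic live in the backend; prefer the Admin UI.
import sqlite3

con = sqlite3.connect("/data/pyqfinder.db")  # path inside the backend container
con.execute(
    "DELETE FROM papers WHERE rowid NOT IN "
    "(SELECT MIN(rowid) FROM papers GROUP BY dedupe_key)"
)
con.commit()
con.close()
```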
Local development (without Docker):

Backend:

```bash
cd backend
pip install -r requirements.txt
python main.py
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```