JoxNeis/ScholarScraper

ScholarScraper

A simple search engine for academic papers, built on Google Scholar.

Overview

ScholarScraper is a lightweight Python-based tool designed to extract academic publication data directly from Google Scholar. It provides a simple interface to search for research papers, collect metadata such as titles, authors, publication years, citation counts, and links, and structure them into a usable format for analysis or integration into larger projects.

This project aims to simplify academic data collection for the Intelligent Information Retrieval final exam project at the University of Surabaya.

Features

  • Writer-based search – Enter a writer's name to retrieve their papers from Google Scholar.
  • Topic-based search – Enter any topic or keyword to retrieve relevant papers from Google Scholar.
  • Extract structured data – Automatically fetch and organize paper details:
    • Title
    • Authors
    • Publication year
    • Citation count
    • Source link
  • Export capability – Save extracted data into CSV or JSON formats for further processing.
  • Lightweight and fast
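
As an illustration of the structured record and export features listed above, here is a minimal sketch. The `Paper` class and the `to_csv` and `to_json` helpers are hypothetical names chosen for this example, not the project's actual code. The sketch uses only the standard library rather than Pandas so that it runs without extra dependencies; with a DataFrame, `DataFrame.to_csv` and `DataFrame.to_json` would play the same role.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class Paper:
    """One search result. Field names are illustrative assumptions."""
    title: str
    authors: str
    year: Optional[int]  # None when no year could be extracted
    citations: int
    link: str


def to_csv(papers: List[Paper]) -> str:
    """Serialize papers to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Paper)])
    writer.writeheader()
    writer.writerows(asdict(p) for p in papers)
    return buffer.getvalue()


def to_json(papers: List[Paper]) -> str:
    """Serialize papers to a JSON array of objects."""
    return json.dumps([asdict(p) for p in papers], indent=2, ensure_ascii=False)
```

Keeping each result as a flat record with fixed field names means the same rows can be written to either format without reshaping.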

How It Works

  1. The user inputs a search query.
  2. ScholarScraper sends a formatted request to Google Scholar’s search results page.
  3. It parses the HTML using Selenium to extract structured data (titles, authors, citations, etc.).
  4. Results are stored in a Pandas DataFrame, allowing easy export and analysis.
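
The request and extraction steps above can be sketched roughly as follows. This is not the project's code: the function names are hypothetical, the `author:"..."` operator and the `start` paging parameter are assumptions about how a writer search and pagination might be expressed, and the byline and "Cited by" formats are assumptions about the text Selenium would read from each result. The browser automation itself is left out so the sketch stays runnable on its own.

```python
import re
from typing import Optional
from urllib.parse import urlencode

SCHOLAR_SEARCH_URL = "https://scholar.google.com/scholar"


def build_search_url(query: str, by_writer: bool = False, start: int = 0) -> str:
    """Return a Google Scholar results URL for a topic or writer query."""
    # Assumption: a writer search is expressed with the author:"..." operator.
    q = f'author:"{query}"' if by_writer else query
    return f"{SCHOLAR_SEARCH_URL}?{urlencode({'q': q, 'start': start})}"


def parse_authors(byline: str) -> str:
    """Keep the part of the byline before the first ' - ' separator."""
    return byline.split(" - ", 1)[0].strip()


def parse_year(byline: str) -> Optional[int]:
    """Return the last four-digit year found in the byline, if any."""
    years = re.findall(r"\b(?:19|20)\d{2}\b", byline)
    return int(years[-1]) if years else None


def parse_citations(label: str) -> int:
    """Turn text such as 'Cited by 42' into 42; return 0 when absent."""
    match = re.search(r"Cited by (\d+)", label)
    return int(match.group(1)) if match else 0
```

For example, `build_search_url("Jane Doe", by_writer=True)` produces a URL whose `q` parameter is the percent-encoded form of `author:"Jane Doe"`. The parsing helpers take plain strings, so they can be checked without opening a browser.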

Evaluating How Google Scholar Works

Understanding how Google Scholar's web page is structured and how it behaves is crucial for automating information retrieval. The evaluation can be accessed here.

Tech Stack

  • Python 3.13
  • Selenium – for dynamic content scraping
  • Pandas – for data organization and export

Disclaimer

Warning

This tool is intended for educational and research purposes only. Google Scholar does not provide an official public API, so excessive or automated requests may violate its terms of service. Please use responsibly.

About

A Google Scholar scraper and similarity-based search engine
