This project scrapes and analyzes data from Goodreads.com. It covers two tasks: exploring the "Best Books" list for a specific year, and analyzing in detail the works of a specific author.
Best Books Analysis: analysis of Goodreads' "Best Books" list for a given year. This includes scraping data such as book titles, publication dates, authors, genres, and ratings.
Author-Level Analysis: detailed exploration of books by a designated author, collecting similar data points and examining how the author's works change over time.
Scraped the "Best Books of [Year]" list from Goodreads. For example, if assigned the year 2023, the URL would be https://www.goodreads.com/list/best_of_year/2023. Data points include Title, Publication Date, Author, Genre, Average Rating, Number of Ratings, Number of Pages, Rank, Language, Current Readers, and Want-to-Read counts.
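As a minimal sketch of how the list URL and the scraped fields could be represented, the snippet below builds the URL from the pattern given above and defines one record per book. The names `best_books_url` and `BookRecord` are hypothetical, chosen for illustration. Every field other than title and author is optional, on the assumption that some book pages may not expose every data point.

```python
from dataclasses import dataclass
from typing import Optional

# URL pattern taken from the example above; {year} is the assigned year.
BEST_OF_YEAR_URL = "https://www.goodreads.com/list/best_of_year/{year}"


def best_books_url(year: int) -> str:
    """Return the 'Best Books of [Year]' list URL for the given year."""
    return BEST_OF_YEAR_URL.format(year=year)


@dataclass
class BookRecord:
    """One scraped book (hypothetical container, not the project's own class)."""
    title: str
    author: str
    publication_date: Optional[str] = None
    genre: Optional[str] = None
    average_rating: Optional[float] = None
    ratings_count: Optional[int] = None
    pages: Optional[int] = None
    rank: Optional[int] = None
    language: Optional[str] = None
    current_readers: Optional[int] = None
    want_to_read: Optional[int] = None


print(best_books_url(2023))
```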
Scraped all books by a specific author from their Goodreads profile, for example Stephen King’s profile for initials A–E. The data points are similar to those in Task 1, with additional analysis of the language distribution and of the relationship between the author's age at publication and metrics such as page count and ratings.
Evaluated how average ratings vary across genres.
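One way to compare ratings across genres is to group the books by genre and average each group. The sketch below assumes each book is a dictionary with `genre` and `average_rating` keys. These key names and the function `mean_rating_by_genre` are illustrative assumptions, not the project's actual schema.

```python
from collections import defaultdict
from statistics import mean


def mean_rating_by_genre(books):
    """Return {genre: mean average rating}, highest-rated genre first."""
    ratings = defaultdict(list)
    for book in books:
        genre = book.get("genre")
        rating = book.get("average_rating")
        # Skip books where either value is missing.
        if genre and rating is not None:
            ratings[genre].append(rating)
    ordered = sorted(
        ((genre, mean(values)) for genre, values in ratings.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return dict(ordered)
```

Sorting the result from highest to lowest makes the top-rated genres easy to read off.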
Investigated the relationship between the number of ratings a book receives and its average rating.
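A simple way to quantify this relationship is the Pearson correlation coefficient between the two columns. The sketch below computes it with the standard library only. It is not necessarily the method used in the project, and the function name `pearson` is an illustrative choice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)
```

A value near +1 would suggest that books with more ratings tend to be rated higher, a value near -1 the opposite, and a value near 0 little linear relationship.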
Analyzed the change in page count and book ratings as the author aged.
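The age-based analysis needs the author's age at each publication date. The sketch below computes it in whole years and pairs it with page count and rating. It assumes `publication_date` has already been parsed into a `datetime.date`. The function names and dictionary keys are hypothetical.

```python
from datetime import date


def age_at_publication(birth: date, published: date) -> int:
    """Author's age in completed years on the publication date."""
    years = published.year - birth.year
    # Subtract one year if the birthday had not yet occurred that year.
    if (published.month, published.day) < (birth.month, birth.day):
        years -= 1
    return years


def metrics_by_age(books, birth: date):
    """Return (age, pages, average_rating) tuples sorted by age."""
    rows = []
    for book in books:
        published = book.get("publication_date")
        if published is None:
            continue  # age cannot be computed without a publication date
        rows.append((
            age_at_publication(birth, published),
            book.get("pages"),
            book.get("average_rating"),
        ))
    return sorted(rows, key=lambda row: row[0])
```

The sorted rows can be fed directly into a line graph of page count or rating against age.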
Explored correlations between reader interest (currently reading and want-to-read counts) and book ratings.
Identified the genres with the highest average ratings. Examined whether more popular books (those with a higher ratings count) receive better or worse ratings. Analyzed the publishing trends of [Author Name], including how their writing evolved over time. Found correlations between page count, ratings, and reader interest.
Scatterplots and line graphs illustrate the relationships between metrics such as ratings vs. popularity, page count over time, and the author's age against book characteristics. Tables summarize average ratings and ranks by genre, and books by language.
This project analyzes book trends on Goodreads, offering insights into reader preferences and an author's evolution over time. The findings can help publishers, writers, and marketers understand the dynamics of book popularity and author development.