Vinayak Das Gupta vinayakdasgupta

Vinayak Das Gupta

Welcome to my GitHub profile. I work on digital humanities, text analysis, and cultural infrastructure.
Most of my work focuses on computational text processing, topic modelling, and public archival systems, with a particular interest in low-resource languages like Bengali.

My current projects explore how machine learning techniques such as Latent Dirichlet Allocation (LDA), named entity recognition (NER), and keyword extraction can be adapted for literary corpora, cultural datasets, and historical archives. I work extensively with Bengali texts, using natural language processing (NLP) to model themes, classify documents, and build interpretive tools for researchers and students.

Featured Projects

anvay: A web-based topic modelling tool for Bengali text corpora. Supports custom preprocessing, visualisations, and interpretive reports. Built with Python, Gensim, Flask, and Plotly.
gridOCR: A desktop OCR tool for digitising historical printed books and periodicals.

Areas of Interest

Bengali NLP | Topic Modelling | Text Mining
Digital Humanities in India | Cultural Analytics | Archive Infrastructure
Corpus Linguistics | Open-source Research Tools | Visualisation of Text Data
