A CLI scraper for collecting product reviews from Ceneo and Amazon.pl using Puppeteer.
The tool lets you:
- choose a source (Ceneo or Amazon),
- provide a product ID,
- set how many reviews to collect,
- export results as one JSON file or many TXT files.
Requirements:
- Node.js 18+ (recommended)
- npm
- Internet connection
Install dependencies:
npm install
Run:
node index.js
Then follow the CLI prompts:
- Select service (Ceneo/Amazon)
- Enter product ID (e.g. B0D3658SHD or 163090037)
- Choose number of reviews to collect
- Choose output format
Generated files are saved in:
- reviews/ceneo/ for Ceneo
- reviews/amazon/ for Amazon
JSON output file name pattern:
data_<PRODUCT_ID>.json
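As a rough illustration of how the output directory and the file name pattern combine, here is a minimal sketch. The names `OUTPUT_DIRS` and `jsonOutputPath` are hypothetical and are not taken from index.js; only the directory names and the `data_<PRODUCT_ID>.json` pattern come from this README.

```javascript
// Hypothetical lookup table: service name -> output directory from this README.
const OUTPUT_DIRS = { Ceneo: "reviews/ceneo", Amazon: "reviews/amazon" };

// Hypothetical helper: builds the JSON output path for a product.
function jsonOutputPath(engine, productId) {
  if (!Object.prototype.hasOwnProperty.call(OUTPUT_DIRS, engine)) {
    throw new Error(`Unknown engine: ${engine}`);
  }
  return `${OUTPUT_DIRS[engine]}/data_${productId}.json`;
}

console.log(jsonOutputPath("Ceneo", "163090037"));
// reviews/ceneo/data_163090037.json
```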
Structure:
{
"productId": "...",
"productName": "...",
"engine": "Ceneo or Amazon",
"scrapedAt": "ISO date",
"totalReviews": 50,
"reviews": [
{
"score": "0.800",
"sentiment": "P",
"content": "Review text"
}
]
}
For TXT output, one file is written per review, named like:
<PRODUCT_NAME>_<SCORE>_<SENTIMENT>_<INDEX>.txt
Where:
- SCORE is normalized to 0.000 - 1.000
- SENTIMENT is:
  - P (positive)
  - N (negative)
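The naming scheme above could be sketched as follows. This is an illustration under stated assumptions, not the code in index.js: it assumes the raw rating sits on a 0 to `maxRating` scale and is normalized by simple division, and that characters unsafe in file names are replaced with underscores. The helper names `normalizeScore` and `txtFileName` are hypothetical, and the sentiment letter is passed in because the README does not say how it is derived.

```javascript
// Assumption: raw ratings are on a 0..maxRating scale (e.g. 4 out of 5).
// Returns a string with three decimals, clamped to the range 0.000 - 1.000.
function normalizeScore(rating, maxRating = 5) {
  const clamped = Math.min(Math.max(rating / maxRating, 0), 1);
  return clamped.toFixed(3);
}

// Assumption: anything other than letters, digits, "." and "-" becomes "_".
function txtFileName(productName, score, sentiment, index) {
  const safeName = productName.trim().replace(/[^\p{L}\p{N}.-]+/gu, "_");
  return `${safeName}_${score}_${sentiment}_${index}.txt`;
}

console.log(txtFileName("Example Phone 128GB", normalizeScore(4), "P", 1));
// Example_Phone_128GB_0.800_P_1.txt
```

Keeping the score as a fixed-width string makes the generated file names sort consistently and matches the "0.800" form shown in the JSON structure.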
Notes:
- The Amazon flow intentionally waits for manual user interaction ([WAITING...]) after opening the review page. This helps handle anti-bot checks or login/captcha screens before scraping continues.
- If no more pages/reviews are available, scraping stops early.
- Website layout changes may require selector updates in index.js.
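The early-stop behaviour can be pictured as a paging loop that ends either when the requested number of reviews is reached or when a page comes back empty. The sketch below is a simplified model, not the actual scraper: `collectReviews` and `fetchPage` are hypothetical names, and `fetchPage` stands in for whatever Puppeteer-based page extraction index.js performs.

```javascript
// fetchPage is a hypothetical async callback: page number -> array of reviews.
async function collectReviews(fetchPage, limit) {
  const reviews = [];
  for (let page = 1; reviews.length < limit; page++) {
    const batch = await fetchPage(page);
    if (!batch || batch.length === 0) break; // no more pages: stop early
    // Take only as many reviews as are still needed to reach the limit.
    reviews.push(...batch.slice(0, limit - reviews.length));
  }
  return reviews;
}

// Usage with a fake source that has only two pages of three reviews each.
const fakePages = { 1: ["a", "b", "c"], 2: ["d", "e", "f"] };
const fakeFetch = async (page) => fakePages[page] ?? [];

collectReviews(fakeFetch, 10).then((r) => console.log(r.length)); // 6
```

Asking for 10 reviews from a source that only has 6 returns the 6 available ones instead of failing.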
Dependencies:
- puppeteer
- puppeteer-extra
- puppeteer-extra-plugin-stealth
- inquirer
- chalk
This tool is for educational purposes only. Web scraping may violate the Terms of Service of the target websites.