Open
Labels
t-academy: Issues related to Web Scraping and Apify academies.
Description
The Python lesson on building a scraper with a framework uses methods that seem to have been removed from the framework. The lesson used `export_data_json()` mainly to pass down configuration and get indented JSON, which is easier for humans to read, in this case the course students:
...
We can also export all the items to a single file of our choice. We'll do it at the end of the `main()` function, after the crawler has finished scraping:
```py
async def main():
    ...
    await crawler.run(["https://warehouse-theme-metal.myshopify.com/collections/sales"])
    # highlight-next-line
    await crawler.export_data_json(path='dataset.json', ensure_ascii=False, indent=2)
    # highlight-next-line
    await crawler.export_data_csv(path='dataset.csv')
```
After running the scraper again, there should be two new files in your directory, `dataset.json` and `dataset.csv`, containing all the data. If we peek into the JSON file, it should have indentation.
...According to my research, there is currently no way to achieve this with Crawlee. That leads me to think I should re-open apify/crawlee-python#526, but nevertheless, this is currently a bug in the course, because the code suggested in the lesson won't work.
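One possible stopgap for the lesson, offered only as a sketch and not as a confirmed Crawlee feature: write the indented JSON with Python's standard `json` module instead of relying on the exporter. The helper name `write_items_as_json` is hypothetical, and the sketch assumes the scraped items are already available as a list of plain dictionaries. How to obtain that list from the crawler (for example through something like `crawler.get_data()`) is an assumption that would need checking against the current Crawlee release.

```py
import json
from pathlib import Path


def write_items_as_json(items, path, *, indent=2, ensure_ascii=False):
    """Write items (an iterable of dicts) to path as human-readable JSON.

    Hypothetical helper for the lesson, not part of Crawlee.
    """
    # json.dumps handles both options the lesson originally passed
    # to export_data_json(): indent and ensure_ascii.
    text = json.dumps(list(items), indent=indent, ensure_ascii=ensure_ascii)
    # Write as UTF-8 so non-ASCII characters survive when ensure_ascii=False.
    Path(path).write_text(text + "\n", encoding="utf-8")
```

Called at the end of `main()` with the collected items, this would produce the same kind of indented `dataset.json` the lesson describes, at the cost of students seeing one extra helper function.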