
Crawlee doesn't seem to provide .export_data_json() and export_data_csv() anymore #2112

@honzajavorek


The Python lesson on building a scraper with a framework uses methods that seem to have been removed from the framework. `export_data_json()` was used mainly to pass configuration down and get indented JSON that is easy for humans to read, in this particular case course students:

...
We can also export all the items to a single file of our choice. We'll do it at the end of the `main()` function, after the crawler has finished scraping:

```py
async def main():
    ...

    await crawler.run(["https://warehouse-theme-metal.myshopify.com/collections/sales"])
    # highlight-next-line
    await crawler.export_data_json(path='dataset.json', ensure_ascii=False, indent=2)
    # highlight-next-line
    await crawler.export_data_csv(path='dataset.csv')
```

After running the scraper again, there should be two new files in your directory, `dataset.json` and `dataset.csv`, containing all the data. If we peek into the JSON file, it should have indentation.
...

According to my research, there is currently no way to achieve this with Crawlee. That leads me to think I should re-open apify/crawlee-python#526. Nevertheless, this is currently a bug in the course, because the code suggested in the lesson won't work.
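In the meantime, one possible stopgap for the lesson is to write the files with the standard library instead of relying on the crawler. The sketch below is only an illustration, not Crawlee's API: `export_items_json()` and `export_items_csv()` are hypothetical helpers, and it assumes the scraped items are already available as a list of dictionaries (how to get them out of the crawler depends on the Crawlee version and is left out here).

```py
import csv
import json
from pathlib import Path


def export_items_json(items, path, **json_kwargs):
    """Hypothetical helper: write items as JSON.

    Extra keyword arguments (e.g. ensure_ascii, indent) are passed
    straight to json.dump(), mirroring how the lesson used them.
    """
    with Path(path).open("w", encoding="utf-8") as f:
        json.dump(items, f, **json_kwargs)


def export_items_csv(items, path):
    """Hypothetical helper: write items as CSV with a header row."""
    # Collect column names in first-seen order, since items may
    # not all share the same keys.
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)

    # newline="" lets the csv module control line endings itself.
    with Path(path).open("w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(items)  # missing keys become empty cells
```

With these helpers, the end of `main()` could call `export_items_json(items, 'dataset.json', ensure_ascii=False, indent=2)` and `export_items_csv(items, 'dataset.csv')`, which keeps the indented, human-readable JSON the lesson relies on.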


Labels: t-academy (Issues related to Web Scraping and Apify academies.)
