Feature Request: Add an option to edit the default loading code depending on the file extension #635

@heji3838

Request

Add an option to edit the default loading code depending on the file extension.

Background

  • When files are large, the pandas backend takes a long time to read them.
  • If your parquet files contain the Enum data type, created with polars, pandas does not recognize it.

Issue #622, which asks for polars support as a backend, would help with this.

As another solution, especially when you only want to check the parquet schema, you could rewrite the loading code for a quicker and more comfortable response, such as:

import polars as pl
df = pl.scan_parquet({filepath}).head(100).collect().to_pandas()

This requires no change to the backend engine and relies on the polars LazyFrame.

Reading only the first 100 rows instead of all rows also saves time and effort.

If you have other file types that the default engine does not support, you could edit the default loading code to read and convert them. For example, if the extension is .fgb, use geopandas to read the file and convert it to pandas.
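To make the request concrete, here is a minimal sketch of how per-extension loading code could be selected. Everything in it is an assumption for illustration, not the project's actual API: the names `LOADING_TEMPLATES`, `FALLBACK_TEMPLATE`, and `render_loading_code` are hypothetical, and the fallback template is only a placeholder for whatever the default engine generates today. The `{filepath}` placeholder follows the snippet above.

```python
from pathlib import Path

# Hypothetical user-editable templates, keyed by lowercase file extension.
# "{filepath}" is replaced with the quoted path of the file being opened.
LOADING_TEMPLATES = {
    ".parquet": (
        "import polars as pl\n"
        "df = pl.scan_parquet({filepath}).head(100).collect().to_pandas()"
    ),
    ".fgb": (
        "import geopandas as gpd\n"
        "import pandas as pd\n"
        "df = pd.DataFrame(gpd.read_file({filepath}))"
    ),
}

# Placeholder for the tool's existing default loading code (an assumption).
FALLBACK_TEMPLATE = "import pandas as pd\ndf = pd.read_csv({filepath})"


def render_loading_code(filepath: str) -> str:
    """Return the loading code for a file, chosen by its extension."""
    suffix = Path(filepath).suffix.lower()
    template = LOADING_TEMPLATES.get(suffix, FALLBACK_TEMPLATE)
    # str.replace avoids clashes with other braces a template may contain.
    return template.replace("{filepath}", repr(filepath))


if __name__ == "__main__":
    print(render_loading_code("data/sample.parquet"))
```

With a mapping like this, a user could change only the `.parquet` entry to get the quick schema check, or add an `.fgb` entry, without the backend engine itself changing.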
