Add getcolumn to parquetdataframe #266

Merged
bleakley merged 1 commit into master from getcolumn-parquetdataframe
Jun 23, 2025

@bleakley

Contributor

No description provided.

@bleakley bleakley requested review from platypii and severo June 20, 2025 23:09

@platypii platypii left a comment

Contributor

Technically, this could be made faster if we used parquetReadColumn, which would return the column data as typed arrays instead of transposing into rows and back. It would need to be added to the worker as a message type, though. This is fine for now, but something to think about for the future.
https://github.com/hyparam/hyparquet/blob/v1.16.2/src/read.js#L105
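
To make the cost concrete, here is a minimal sketch of the two access patterns. It does not call hyparquet: `columns`, `getColumnViaRows`, and `getColumnDirect` are hypothetical names, and the in-memory object only stands in for decoded columnar data.

```javascript
// Hypothetical stand-in for decoded columnar data (not hyparquet's API).
const columns = {
  id: new Int32Array([1, 2, 3]),
  score: new Float64Array([0.5, 1.5, 2.5]),
}

// Row-oriented path: materialize one object per row from every column,
// then pluck the requested field back out into a plain array.
function getColumnViaRows(cols, name) {
  const names = Object.keys(cols)
  const numRows = cols[names[0]].length
  const rows = []
  for (let i = 0; i < numRows; i++) {
    const row = {}
    for (const n of names) row[n] = cols[n][i]
    rows.push(row)
  }
  return rows.map(row => row[name])
}

// Column-oriented path: return the typed array untouched.
// No other column is read and no per-row objects are allocated.
function getColumnDirect(cols, name) {
  return cols[name]
}
```

The row path touches every column and allocates an object per row, and its result is a plain array, so the typed-array representation is lost. The direct path returns the existing `Float64Array` as-is.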

@severo severo left a comment

Contributor

I agree with @platypii that it could be more efficient by avoiding reading all the columns and by using typed arrays, but as we're reworking the DataFrame (hyparam/hightable#208), we will have to refactor again soon.

@bleakley bleakley merged commit 6b16362 into master Jun 23, 2025
8 checks passed
@bleakley bleakley deleted the getcolumn-parquetdataframe branch June 23, 2025 15:51

3 participants