Consider supporting ELLPACK format #359

@howsiyu

Many feature matrices in practice have a small number of non-zero entries per row. For example, data that comes from one-hot encoding has exactly one non-zero entry per row. Such data is handled nicely by CategoricalMatrix if all the non-zero entries are one. However, this is not always the case, e.g. for data produced by sklearn.preprocessing.SplineTransformer. Matrices like these would be well supported by the ELLPACK format, which is a natural generalization of CategoricalMatrix.
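To make the idea concrete, here is a minimal sketch of ELLPACK storage and a matrix-vector product. It uses plain Python lists for clarity; a real implementation would presumably use contiguous arrays. The function names `dense_to_ell` and `ell_matvec` are hypothetical and not part of any existing API, and the example values are made up. With a width of one and all values equal to one, the layout reduces to one column index per row, which is roughly what a categorical encoding stores.

```python
def dense_to_ell(dense):
    """Convert a list-of-lists dense matrix to ELLPACK (indices, values)."""
    # Width is the largest number of non-zeros found in any row.
    width = max(sum(1 for v in row if v != 0) for row in dense)
    indices, values = [], []
    for row in dense:
        cols = [j for j, v in enumerate(row) if v != 0]
        pad = width - len(cols)
        # Pad short rows with column 0 and value 0.0 so padding adds nothing.
        indices.append(cols + [0] * pad)
        values.append([row[j] for j in cols] + [0.0] * pad)
    return indices, values


def ell_matvec(indices, values, x):
    """Compute y = A @ x for A stored in ELLPACK form."""
    return [
        sum(v * x[j] for j, v in zip(row_idx, row_val))
        for row_idx, row_val in zip(indices, values)
    ]


# Two non-zeros per row, similar in shape to a degree-1 spline basis.
dense = [
    [0.25, 0.75, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.9, 0.1],
]
indices, values = dense_to_ell(dense)
y = ell_matvec(indices, values, [1.0, 2.0, 3.0, 4.0])
```

Every row has the same number of stored entries, so no row pointer array is needed and the inner loop has a fixed length.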

Another option is to support the Sliced ELLPACK (SELL) format, which handles general sparse matrices relatively well, so that a SplitMatrix would consist of just a dense matrix and a SELL matrix.
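A rough sketch of the SELL idea, again in plain Python with hypothetical names (`dense_to_sell`, `sell_matvec`) and made-up data: rows are grouped into slices of a fixed height, and each slice is padded only to its own widest row instead of to the widest row of the whole matrix.

```python
def dense_to_sell(dense, slice_height=2):
    """Split rows into slices and store each slice in ELLPACK form."""
    slices = []
    for start in range(0, len(dense), slice_height):
        chunk = dense[start:start + slice_height]
        # Each slice is padded to the widest row within that slice only.
        width = max(sum(1 for v in row if v != 0) for row in chunk)
        indices, values = [], []
        for row in chunk:
            cols = [j for j, v in enumerate(row) if v != 0]
            pad = width - len(cols)
            indices.append(cols + [0] * pad)
            values.append([row[j] for j in cols] + [0.0] * pad)
        slices.append((indices, values))
    return slices


def sell_matvec(slices, x):
    """Compute y = A @ x for A stored as a list of ELLPACK slices."""
    y = []
    for indices, values in slices:
        for row_idx, row_val in zip(indices, values):
            y.append(sum(v * x[j] for j, v in zip(row_idx, row_val)))
    return y


# One row with three non-zeros among rows with a single non-zero.
mixed = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 6.0],
]
sell_slices = dense_to_sell(mixed, slice_height=2)
slice_widths = [len(idx[0]) for idx, _ in sell_slices]
y_sell = sell_matvec(sell_slices, [1.0, 1.0, 1.0, 1.0])
```

In this example plain ELLPACK would pad all four rows to width 3 (12 stored entries), while the sliced layout stores 2 + 6 = 8 entries.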

Labels: enhancement
