Add ability to load and inspect individual datasets #261

@npatki

Problem Description

The SDGym library currently lets you list the datasets available for benchmarking, but it offers no way to inspect them. Users may want to see what the columns, data types, or values look like before using a dataset in a benchmarking run.

Expected behavior

Add a download_demo method that is similar to the one in the SDV library. This method would return the data and metadata so that SDGym users can inspect the dataset.

Workaround

The SDV library is a prerequisite of SDGym, so as a workaround you can load the demo datasets through SDV directly.

# download_demo returns the dataset together with its metadata
from sdv.datasets.demo import download_demo

data, metadata = download_demo(
    modality='single_table',
    dataset_name='adult'
)
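
Once the workaround has loaded the data, the inspection this issue asks for (columns, data types, values) can be done by hand. The following is only a sketch of what that could look like, not part of SDGym or SDV: `summarize_columns` and `summarize_demo_dataset` are hypothetical helper names, and the second function assumes the single-table `download_demo` call shown above returns a pandas DataFrame and a metadata object with a `to_dict()` method.

```python
from collections import Counter


def _is_missing(value):
    # Treat None and NaN as missing (NaN is the only value not equal to itself).
    return value is None or value != value


def summarize_columns(columns, sample_size=3):
    """Summarize a mapping of column name -> list of values.

    Hypothetical helper: reports the Python types seen in each column,
    the number of missing values, and the first few non-missing values.
    """
    summary = {}
    for name, values in columns.items():
        present = [v for v in values if not _is_missing(v)]
        type_counts = Counter(type(v).__name__ for v in present)
        summary[name] = {
            'types': sorted(type_counts),
            'missing': len(values) - len(present),
            'sample': present[:sample_size],
        }
    return summary


def summarize_demo_dataset(dataset_name):
    """Download a single-table SDV demo dataset and summarize it.

    Hypothetical wrapper around the workaround above. SDV is imported
    lazily so summarize_columns stays usable without it.
    """
    from sdv.datasets.demo import download_demo

    data, metadata = download_demo(
        modality='single_table',
        dataset_name=dataset_name
    )
    # to_dict('list') turns the DataFrame into {column name: [values]}.
    return summarize_columns(data.to_dict('list')), metadata.to_dict()
```

For example, `summary, metadata_dict = summarize_demo_dataset('adult')` would download the dataset and return a per-column summary next to the metadata as a plain dictionary. A built-in `download_demo` in SDGym, as proposed here, would make the wrapper unnecessary.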
