
[Data Tests] Add ability to test your data #192

@aslotte

Is your feature request related to a problem? Please describe.
Part of the model flow is to create and add data validation tests. These check the sanity of your data, for example that a column has a specific cardinality (say, only 5 distinct values) or that a numeric column falls within a specific range.

Creating these tests is pretty repetitive.

Describe the solution you'd like
Add a Fluent API to validate data structure.

I'm envisioning a syntax such as the following, in a new project, e.g. MLOps.NET.Data.Tests:

[TestMethod]
public void VerifyCardinalityOfColumn()
{
    var mlOpsTestingContext = new MLOpsTestingContext();

    mlOpsTestingContext.WithData(pathToData)
        .HasColumn(index, x => x.WithCardinality(3))
        .Assert();
}

[TestMethod]
public void VerifyRangeOfColumn()
{
    var mlOpsTestingContext = new MLOpsTestingContext();

    mlOpsTestingContext.WithData(pathToData)
        .HasColumn(index, x => x.WithRange(min: 0, max: 10000))
        .Assert();
}

[TestMethod]
public void VerifySchema()
{
    var mlOpsTestingContext = new MLOpsTestingContext();

    mlOpsTestingContext.WithData(pathToData)
        .HasNumberOfColumns(10)
        .HasMinimumNumberOfRows(5000)
        .Assert();
}

[TestMethod]
public void VerifyColumnOnlyContainsApprovedValues()
{
    var mlOpsTestingContext = new MLOpsTestingContext();

    mlOpsTestingContext.WithData(pathToData)
        .HasColumn(index, x => x.WithValues(listOfApprovedValues))
        .Assert();
}
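
To make the proposal more concrete, here is a rough sketch of how such a fluent API might be implemented. Only `MLOpsTestingContext` and the method names come from the syntax above; everything else is an assumption made for illustration: the `ColumnExpectation` helper type, the naive CSV parsing (one header row, comma-separated, no quoted fields), and the use of `InvalidOperationException` to signal failure. None of this exists in MLOps.NET today.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical helper: collects the expectations declared for a single column.
public sealed class ColumnExpectation
{
    // Each check inspects the column values and appends any error messages.
    private readonly List<Action<IReadOnlyList<string>, List<string>>> _checks =
        new List<Action<IReadOnlyList<string>, List<string>>>();

    public ColumnExpectation WithCardinality(int expected)
    {
        _checks.Add((values, errors) =>
        {
            var actual = values.Distinct().Count();
            if (actual != expected)
                errors.Add($"expected {expected} distinct values, found {actual}");
        });
        return this;
    }

    public ColumnExpectation WithRange(double min, double max)
    {
        _checks.Add((values, errors) =>
        {
            foreach (var v in values)
            {
                var parsed = double.TryParse(v, NumberStyles.Float,
                    CultureInfo.InvariantCulture, out var number);
                if (!parsed || number < min || number > max)
                {
                    errors.Add($"value '{v}' is not a number in [{min}, {max}]");
                    return; // report only the first offending value
                }
            }
        });
        return this;
    }

    public ColumnExpectation WithValues(IEnumerable<string> approved)
    {
        var allowed = new HashSet<string>(approved);
        _checks.Add((values, errors) =>
        {
            foreach (var v in values.Where(v => !allowed.Contains(v)).Take(1))
                errors.Add($"value '{v}' is not in the approved list");
        });
        return this;
    }

    internal void Evaluate(int index, IReadOnlyList<string> values, List<string> errors)
    {
        var found = new List<string>();
        foreach (var check in _checks) check(values, found);
        errors.AddRange(found.Select(e => $"column {index}: {e}"));
    }
}

// Sketch of the context from the proposal; failures are collected and raised in Assert().
public sealed class MLOpsTestingContext
{
    private readonly List<string[]> _rows = new List<string[]>();
    private readonly List<string> _errors = new List<string>();

    public MLOpsTestingContext WithData(string path)
    {
        // Assumption: comma-separated file, first line is a header, no quoted fields.
        _rows.Clear();
        _errors.Clear();
        _rows.AddRange(File.ReadLines(path).Skip(1).Select(line => line.Split(',')));
        return this;
    }

    public MLOpsTestingContext HasNumberOfColumns(int expected)
    {
        var actual = _rows.Count == 0 ? 0 : _rows[0].Length;
        if (actual != expected) _errors.Add($"expected {expected} columns, found {actual}");
        return this;
    }

    public MLOpsTestingContext HasMinimumNumberOfRows(int minimum)
    {
        if (_rows.Count < minimum)
            _errors.Add($"expected at least {minimum} rows, found {_rows.Count}");
        return this;
    }

    public MLOpsTestingContext HasColumn(int index, Action<ColumnExpectation> configure)
    {
        var expectation = new ColumnExpectation();
        configure(expectation);
        if (index < 0 || _rows.Any(row => index >= row.Length))
        {
            _errors.Add($"column {index} is missing in at least one row");
            return this;
        }
        expectation.Evaluate(index, _rows.Select(row => row[index]).ToList(), _errors);
        return this;
    }

    public void Assert()
    {
        if (_errors.Count > 0)
            throw new InvalidOperationException(
                "Data validation failed: " + string.Join("; ", _errors));
    }
}
```

Collecting all failures and throwing once in `Assert()` means a single test run reports every violated expectation instead of stopping at the first. A real implementation would probably read data through ML.NET's data loading rather than splitting lines by hand, and might throw the test framework's own assertion exception.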
