Nullable column missing fails validate(..., cast=True) #253

@mattijsdp

If a column is nullable, why doesn't dataframely create it (filled with nulls) when it is missing from the data frame and cast=True is set? I realise you've probably considered this, but I couldn't find an explicit mention of it in the docs or GitHub issues.

import dataframely as dy
import polars as pl

class TableSchema(dy.Schema):
    column_a = dy.String(nullable=False)
    column_b = dy.String(nullable=True)

df = pl.DataFrame({"column_a": 0})
TableSchema.validate(df, cast=True)

# Raises SchemaError
# SchemaError: 1 missing columns for schema 'TableSchema': 'column_b'
