schema validation #3

@colinator

I've been thinking about validation of these formats - dynamic, self-describing formats. The more I think about it, the more I think it's "non-trivial".

It'd certainly be useful: we want to validate messages against schemas that check the correctness of the structure (which in my use case might include tensor shape validation), and possibly the value content too, e.g. "this number must satisfy 10 < x < 19", or "this string must match this regex". Things like jsonschema do this now for JSON. It'd be doable, though maybe not trivial, to create a nifty compile-time schema, or some mechanism by which message code could emit a schema for itself.
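To make the kinds of checks above concrete, here is a minimal sketch in Python of a hand-rolled validator. Everything in it is an assumption made for illustration: the schema layout, the `validate` function, the `exclusive_min`/`exclusive_max`/`pattern` keys, and the representation of a tensor as a map with a `shape` list are all hypothetical, and none of it is zerialize's or jsonschema's actual API.

```python
import re

# Hypothetical schema layout, invented for this sketch.
SCHEMA = {
    "type": "map",
    "fields": {
        "count": {"type": "number", "exclusive_min": 10, "exclusive_max": 19},
        "name": {"type": "string", "pattern": r"[a-z]+_\d+"},
        # None in a shape means "any size along this dimension".
        "image": {"type": "tensor", "shape": [None, 3]},
    },
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    errors = []
    kind = schema["type"]
    if kind == "map":
        if not isinstance(value, dict):
            return [f"{path}: expected map"]
        # Structural check: every declared key must exist and validate.
        for key, sub in schema["fields"].items():
            if key not in value:
                errors.append(f"{path}.{key}: missing")
            else:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif kind == "number":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return [f"{path}: expected number"]
        lo = schema.get("exclusive_min")
        hi = schema.get("exclusive_max")
        if lo is not None and not value > lo:
            errors.append(f"{path}: must be > {lo}")
        if hi is not None and not value < hi:
            errors.append(f"{path}: must be < {hi}")
    elif kind == "string":
        if not isinstance(value, str):
            return [f"{path}: expected string"]
        pattern = schema.get("pattern")
        if pattern is not None and re.fullmatch(pattern, value) is None:
            errors.append(f"{path}: does not match {pattern!r}")
    elif kind == "tensor":
        # Assumption: a tensor is a map carrying its own "shape" list.
        shape = value.get("shape") if isinstance(value, dict) else None
        expected = schema["shape"]
        if (not isinstance(shape, list) or len(shape) != len(expected)
                or any(e is not None and e != s for e, s in zip(expected, shape))):
            errors.append(f"{path}: shape {shape} does not match {expected}")
    else:
        raise ValueError(f"unknown schema type: {kind}")
    return errors


good = {"count": 12, "name": "cam_01", "image": {"shape": [2, 3], "data": [0] * 6}}
bad = {"count": 19, "name": "Cam", "image": {"shape": [2, 4], "data": []}}
print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
```

Note that the tensor check only reads the declared shape and never touches the tensor data, so shape validation need not scale with tensor size.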

But all such validation schemes seem to involve, at minimum, validating the structure, and possibly examining each map string key, which means walking the whole message. And that might be more expensive than actually using the data, since zerialize allows us to use just what we need, without touching other parts. Of course, it depends on the use case; for example, actually using some large tensor might be way more expensive than anything else, including deserialization and/or heap allocation... So it would seem that validation is just a necessarily expensive step. Which, of course, conflicts with one of the main goals: performance. Users could use the pattern 'validate upon ingestion, trust thereafter', but that seems risky.
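A rough way to picture the cost asymmetry is to count how many nodes each approach touches. The sketch below is only an illustration under stated assumptions: a plain Python dict stands in for a lazily accessed message buffer, and `count_nodes` and `read_path` are hypothetical helpers, not part of zerialize.

```python
def count_nodes(value):
    """Number of nodes a full structural walk (e.g. a validator) must visit."""
    if isinstance(value, dict):
        return 1 + sum(count_nodes(v) for v in value.values())
    if isinstance(value, list):
        return 1 + sum(count_nodes(v) for v in value)
    return 1


def read_path(value, path):
    """Follow a key path; return the leaf and how many nodes were touched."""
    touched = 1  # the root
    for key in path:
        value = value[key]
        touched += 1
    return value, touched


# A small header next to a large payload, as a stand-in message.
message = {
    "header": {"id": 7, "tags": ["a", "b", "c"]},
    "payload": {"samples": list(range(1000))},
}

print(count_nodes(message))                # full walk: 1009 nodes
print(read_path(message, ["header", "id"]))  # selective read: (7, 3)
```

In this toy case the full walk visits every element of the payload, while reading one header field touches three nodes regardless of payload size.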

And of course, if you have a schema for validation, you might as well just use a schema-ful protocol, no? That defeats the whole purpose of dynamic protocols, the 'truly distributed development' power: you can ingest and use JSON without having to locate, compile, and use a schema. A very powerful idea. So: schemas for dynamic protocols: still useful? I still say yes. They can still bring benefit, and may be performant enough if we're careful.

Thoughts welcome.
