Schema: Croissant/schema.org Extension Design #4

@PipFoweraker

Goal

Design a JSON-LD schema that extends Croissant / schema.org to represent interrogatory model cards, maximizing interoperability with existing tooling.

Why Croissant/schema.org?

  1. JSON-LD foundation - machine-readable, web-native, supports linked data
  2. Existing adoption - Hugging Face, Kaggle, and Google Dataset Search already support Croissant
  3. Extensible - designed for domain-specific vocabularies
  4. Future-proof - places the fewest constraints on future users; JSON-LD can be rendered to other formats

Design Questions

Namespace & Vocabulary

  • Define @context extending schema.org and Croissant
  • Identify which schema.org types to reuse (e.g., SoftwareSourceCode, Dataset, CreativeWork)
  • Define new types: InterrogatoryModelCard, ModelClaim, EvidenceLink, InterrogatoryPrompt
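
As a starting point for discussion, here is a minimal sketch of what the @context and a tiny card might look like, built as Python dicts and serialized to JSON. The imc namespace IRI and the property names (claims, evidence, prompt) are placeholders, not decisions; only the schema.org and Croissant namespace IRIs are taken from existing vocabularies.

```python
import json

# Hypothetical namespace for the new vocabulary; the real IRI is undecided.
IMC_NS = "https://example.org/interrogatory-model-card#"

context = {
    "@vocab": "https://schema.org/",          # bare terms default to schema.org
    "sc": "https://schema.org/",
    "cr": "http://mlcommons.org/croissant/",  # Croissant namespace
    "imc": IMC_NS,
    # New types proposed in this issue
    "InterrogatoryModelCard": "imc:InterrogatoryModelCard",
    "ModelClaim": "imc:ModelClaim",
    "EvidenceLink": "imc:EvidenceLink",
    "InterrogatoryPrompt": "imc:InterrogatoryPrompt",
    # Hypothetical properties linking the new types together
    "claims": {"@id": "imc:claims", "@container": "@set"},
    "evidence": {"@id": "imc:evidence", "@container": "@set"},
    "prompt": {"@id": "imc:prompt"},
}

card = {
    "@context": context,
    # Reuse a schema.org type alongside the new one
    "@type": ["sc:CreativeWork", "InterrogatoryModelCard"],
    "name": "Example interrogatory model card",
    "claims": [
        {
            "@type": "ModelClaim",
            "text": "Placeholder claim text",
            "prompt": {
                "@type": "InterrogatoryPrompt",
                "text": "What data was the model evaluated on?",
            },
            "evidence": [
                {"@type": "EvidenceLink", "url": "https://example.org/eval-run"}
            ],
        }
    ],
}

print(json.dumps(card, indent=2))
```

Because @vocab points at schema.org, plain keys such as name, text, and url resolve to schema.org properties, and only the genuinely new terms need the imc prefix.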

CAN/SHOULD/MUST Representation

  • How to represent requirement levels in JSON-LD?
  • Consider RFC 2119 terminology mapping (RFC 2119 defines MUST, SHOULD, and MAY; CAN has no direct equivalent and would likely map to MAY)
  • Schema validation: which fields required vs optional at each level?
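
One possible shape for the requirement-level check, sketched in Python. The field names and their assigned levels are purely illustrative, and CAN is treated as RFC 2119 MAY, which is an assumption rather than something settled here. Missing MUST fields become errors and missing SHOULD fields become warnings.

```python
from enum import Enum


class Level(Enum):
    MUST = "MUST"
    SHOULD = "SHOULD"
    MAY = "MAY"  # assumption: the issue's "CAN" maps to RFC 2119 MAY


# Hypothetical field-to-level table; real assignments are still to be decided.
FIELD_LEVELS = {
    "name": Level.MUST,
    "claims": Level.MUST,
    "evidence": Level.SHOULD,
    "license": Level.MAY,
}


def check_card(card):
    """Return (errors, warnings) for fields that are missing or empty."""
    errors, warnings = [], []
    for field, level in FIELD_LEVELS.items():
        present = field in card and card[field] not in (None, "", [])
        if present:
            continue
        if level is Level.MUST:
            errors.append(f"missing MUST field: {field}")
        elif level is Level.SHOULD:
            warnings.append(f"missing SHOULD field: {field}")
        # Missing MAY fields are silently accepted.
    return errors, warnings


errors, warnings = check_card({"name": "Example card"})
print(errors)
print(warnings)
```

Keeping the level table as data rather than hard-coding it into the validator would let the same table drive both the JSON Schema and any human-readable documentation.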

Relationship to CycloneDX ML-BOM

  • CycloneDX v1.5+ defines a modelCard field on machine-learning-model components; should we output compatible fragments?
  • Supply chain use case: can interrogatory cards feed into ML-BOM?
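
To make the ML-BOM question concrete, here is a rough sketch of mapping a card onto a CycloneDX component. The input card layout (name, metrics, limitations) is hypothetical, and the output uses only a small subset of modelCard as I understand the 1.5 schema; it should be checked against the published schema before anyone relies on it.

```python
import json


def to_cyclonedx_component(card):
    """Map a (hypothetical) interrogatory card dict to a CycloneDX component."""
    metrics = [
        # CycloneDX stores metric values as strings.
        {"type": m["name"], "value": str(m["value"])}
        for m in card.get("metrics", [])
    ]
    return {
        "type": "machine-learning-model",
        "name": card["name"],
        "modelCard": {
            "quantitativeAnalysis": {"performanceMetrics": metrics},
            "considerations": {
                "technicalLimitations": list(card.get("limitations", []))
            },
        },
    }


# Illustrative input only; none of these values come from a real model.
example_card = {
    "name": "example-model",
    "metrics": [{"name": "accuracy", "value": 0.91}],
    "limitations": ["Evaluated on English text only"],
}

bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [to_cyclonedx_component(example_card)],
}

print(json.dumps(bom, indent=2))
```

A one-way export like this would keep the interrogatory card as the source of truth and treat the ML-BOM fragment as a lossy rendering.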

Evidence Linking Schema

  • How to represent dataset_version, eval_script_commit, run_hash?
  • Git-based provenance: reference commits, tags, releases?
  • External artifact stores (HF, Weights & Biases, DVC)?
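
A possible in-memory shape for an evidence link, sketched in Python. The three field names come from the question above; everything else (the EvidenceLink class, requiring a full 40-character SHA-1 commit, and deriving run_hash as a SHA-256 over canonicalized run configuration) is an assumption meant to prompt discussion, not a proposal for the final format.

```python
import hashlib
import json
import re
from dataclasses import asdict, dataclass

# Assumes SHA-1 object names; repositories using SHA-256 would need a wider pattern.
_FULL_SHA1 = re.compile(r"^[0-9a-f]{40}$")


def compute_run_hash(run_config):
    """Hash a run configuration in a key-order-independent way."""
    canonical = json.dumps(run_config, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class EvidenceLink:
    dataset_version: str     # e.g. a tag or release name
    eval_script_commit: str  # full Git commit hash, not a branch or short SHA
    run_hash: str            # output of compute_run_hash()

    def __post_init__(self):
        if not _FULL_SHA1.match(self.eval_script_commit):
            raise ValueError(
                "eval_script_commit must be a full 40-character Git SHA-1"
            )


# Illustrative values only.
link = EvidenceLink(
    dataset_version="v1.2.0",
    eval_script_commit="0123456789abcdef0123456789abcdef01234567",
    run_hash=compute_run_hash({"seed": 0, "batch_size": 32}),
)
print(json.dumps(asdict(link), indent=2))
```

Pinning to a full commit hash rather than a tag or branch avoids references that can silently move, at the cost of being less readable.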

Deliverables

  1. schema/interrogatory-model-card.jsonld - JSON-LD context definition
  2. schema/interrogatory-model-card.schema.json - JSON Schema for validation
  3. Example card in both formats demonstrating all field types

Resources

Open Questions

  1. Should the schema be strict (all MUST fields required) or permissive (flag missing fields)?
  2. How to handle proprietary models with limited disclosure?
  3. Versioning strategy for the schema itself?
