Cobjectric

Complex Object Metric - A Python library for computing metrics on complex objects (JSON, dictionaries, lists, etc.).

📖 Description

Cobjectric helps developers compute metrics on complex objects such as JSON documents, dictionaries, and lists. It was originally created for Machine Learning projects, where comparing generated JSON structures against ground truth data was a repetitive manual task.

📦 Installation

pip install cobjectric

🚀 Core Features

Cobjectric provides three main features for analyzing complex structured data:

1. Fill Rate - Measure Data Completeness

Compute how "complete" your data is by measuring which fields are filled and which are missing.

from cobjectric import BaseModel

class Person(BaseModel):
    name: str
    age: int
    email: str

person = Person.from_dict({
    "name": "John Doe",
    "age": 30,
    # email is missing
})

result = person.compute_fill_rate()
print(result.fields.name.value)   # 1.0 (present)
print(result.fields.age.value)    # 1.0 (present)
print(result.fields.email.value)  # 0.0 (missing)
print(result.mean())              # 0.667 (2 out of 3 fields filled)

Use cases: Data quality assessment, completeness scoring, field-level statistics.
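
To make the arithmetic behind the 0.667 figure concrete, here is a minimal plain-Python sketch that does not use Cobjectric at all. The helper name `fill_rate` and the rule "a field counts as filled when it is present and not None" are assumptions made for illustration, not the library's implementation.

```python
from statistics import mean


def fill_rate(data: dict, fields: list[str]) -> dict[str, float]:
    """Score each field 1.0 if present and not None, otherwise 0.0 (assumed rule)."""
    return {field: 1.0 if data.get(field) is not None else 0.0 for field in fields}


rates = fill_rate({"name": "John Doe", "age": 30}, ["name", "age", "email"])
print(rates)                                # {'name': 1.0, 'age': 1.0, 'email': 0.0}
print(round(mean(list(rates.values())), 3))  # 0.667
```

The overall score is simply the mean of the per-field scores, which is why two filled fields out of three give 0.667.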

2. Fill Rate Accuracy - Compare Completeness States

Compare the completeness of two models (got vs expected). This metric looks only at field state (filled or missing), not at actual values.

got = Person.from_dict({"name": "John", "age": 30})           # email missing
expected = Person.from_dict({"name": "Jane", "age": 25, "email": "jane@example.com"})

accuracy = got.compute_fill_rate_accuracy(expected)
print(accuracy.fields.name.value)   # 1.0 (both filled)
print(accuracy.fields.age.value)    # 1.0 (both filled)
print(accuracy.fields.email.value)  # 0.0 (got missing, expected filled)
print(accuracy.mean())              # 0.667 (2 out of 3 states match)

Note: Fill Rate Accuracy compares state only (field present/missing), not values. To validate actual values, use Similarity.

Use cases: Validation pipelines, comparing generated vs expected data structures, quality control.
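
The state comparison described above can be sketched in plain Python. This is an illustration only: the helper name `fill_state_accuracy` is hypothetical, and the sketch assumes that two fields which are both missing also count as a match.

```python
def fill_state_accuracy(got: dict, expected: dict, fields: list[str]) -> dict[str, float]:
    """Score 1.0 when a field is filled (or missing) in both inputs, else 0.0."""

    def is_filled(data: dict, field: str) -> bool:
        # Assumed rule: present and not None means "filled".
        return data.get(field) is not None

    return {
        field: 1.0 if is_filled(got, field) == is_filled(expected, field) else 0.0
        for field in fields
    }


got = {"name": "John", "age": 30}  # email missing
expected = {"name": "Jane", "age": 25, "email": "jane@example.com"}

scores = fill_state_accuracy(got, expected, ["name", "age", "email"])
print(scores)  # {'name': 1.0, 'age': 1.0, 'email': 0.0}
```

Note that "John" vs "Jane" still scores 1.0 here, because only the filled/missing state is compared.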

3. Similarity - Compare Values with Fuzzy Matching

Compare field values between two models with support for fuzzy text matching via rapidfuzz and intelligent list alignment strategies.

from cobjectric import BaseModel, Spec, ListCompareStrategy
from cobjectric.similarity import fuzzy_similarity_factory

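# Illustrative nested model for the list items; its single field is an assumption
class Tag(BaseModel):
    name: str

class Person(BaseModel):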
    name: str = Spec(similarity_func=fuzzy_similarity_factory("WRatio"))
    tags: list[Tag] = Spec(list_compare_strategy=ListCompareStrategy.OPTIMAL_ASSIGNMENT)

got = Person.from_dict({"name": "John Doe", "tags": [...]})
expected = Person.from_dict({"name": "john doe", "tags": [...]})

similarity = got.compute_similarity(expected)
print(similarity.fields.name.value)  # 0.99 (fuzzy match despite case difference)
print(similarity.fields.tags.mean()) # Uses optimal assignment for best matching

Key features:

- Fuzzy text matching via rapidfuzz: handles typos, case differences, and word order
- List alignment strategies:
  - PAIRWISE: Compare by index (default)
  - LEVENSHTEIN: Order-preserving alignment based on similarity
  - OPTIMAL_ASSIGNMENT: Hungarian algorithm for best one-to-one matching
- Numeric similarity: Gradual similarity based on difference thresholds
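
As a rough idea of what "gradual" numeric similarity can look like, here is one possible scoring rule: a linear decay from 1.0 to 0.0 as the absolute difference approaches a threshold. The function name, the `max_difference` parameter, and the linear shape are all assumptions for illustration; Cobjectric's actual formula may differ.

```python
def numeric_similarity(a: float, b: float, max_difference: float) -> float:
    """Linear decay: 1.0 when equal, 0.0 once |a - b| reaches max_difference."""
    if max_difference <= 0:
        # Degenerate threshold: fall back to exact equality.
        return 1.0 if a == b else 0.0
    return max(0.0, 1.0 - abs(a - b) / max_difference)


print(numeric_similarity(10, 10, 5))    # 1.0
print(numeric_similarity(10, 12.5, 5))  # 0.5
print(numeric_similarity(10, 20, 5))    # 0.0
```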

Use cases: ML model evaluation, fuzzy matching, comparing generated text with ground truth, list item matching.
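
The difference between PAIRWISE and OPTIMAL_ASSIGNMENT can be illustrated without the library. The sketch below is a stand-in built on stated assumptions: it uses `difflib` from the standard library instead of rapidfuzz, and brute-force permutations instead of the Hungarian algorithm, so it is only practical for short lists. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import permutations


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stdlib stand-in for a fuzzy scorer)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pairwise_score(got: list[str], expected: list[str]) -> float:
    """Compare items index by index; unmatched trailing items count as 0."""
    longest = max(len(got), len(expected))
    if longest == 0:
        return 1.0
    return sum(text_similarity(g, e) for g, e in zip(got, expected)) / longest


def optimal_assignment_score(got: list[str], expected: list[str]) -> float:
    """Try every one-to-one matching and keep the best total (brute force)."""
    longest = max(len(got), len(expected))
    if longest == 0:
        return 1.0
    short, long_ = (got, expected) if len(got) <= len(expected) else (expected, got)
    best = max(
        sum(text_similarity(s, c) for s, c in zip(short, candidate))
        for candidate in permutations(long_, len(short))
    )
    return best / longest


got = ["red", "green", "blue"]
expected = ["blue", "red", "green"]  # same items, different order

print(pairwise_score(got, expected) < 1.0)      # True: index-wise pairs mismatch
print(optimal_assignment_score(got, expected))  # 1.0: every item finds its match
```

The Hungarian algorithm reaches the same optimum as the brute-force search in polynomial time, which matters once lists grow beyond a handful of items.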

Additional Features

Pre-defined Specs: Optimized Specs for common types (KeywordSpec, TextSpec, NumericSpec, BooleanSpec, DatetimeSpec)
Contextual Normalizers: Normalizers that receive field context for intelligent type coercion
Statistical Aggregation: mean(), std(), var(), min(), max(), quantile() on all results
Nested Models: Recursive computation on complex structures
List Aggregation: Access aggregated statistics across list items via items.aggregated_fields.name.mean()
Path Access: result["address.city"] or result["items[0].name"]
Pandas Export: Export results to pandas Series and DataFrames for analysis (requires cobjectric[pandas])
Custom Functions: Define your own fill rate, accuracy, or similarity functions per field
Field Normalizers: Transform values before validation

See the documentation for complete details.
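
To show what path strings such as "address.city" and "items[0].name" refer to, here is a small sketch that resolves them against plain nested dictionaries and lists. It is not Cobjectric's implementation; the `resolve_path` helper and its parsing rules are assumptions for illustration.

```python
import re

# Matches either a key name or a bracketed list index such as [0].
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve_path(data, path: str):
    """Walk nested dicts/lists following a dotted path with [index] segments."""
    current = data
    for key, index in _TOKEN.findall(path):
        current = current[int(index)] if index else current[key]
    return current


record = {
    "address": {"city": "Paris"},
    "items": [{"name": "first"}, {"name": "second"}],
}

print(resolve_path(record, "address.city"))   # Paris
print(resolve_path(record, "items[1].name"))  # second
```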

📚 Full Documentation

📖 https://cobjectric.nigiva.com

The documentation includes:

Quick Start - Get started in 5 minutes
Examples - Real-world usage examples
Complete API Reference - All classes and functions
Feature Guides - In-depth guides for all features

🛠️ Development

Getting Started

Prerequisites

Python 3.13.9 or higher
uv - Fast Python package installer

Install dependencies with uv (including optional extras for testing):

uv sync --dev --all-extras

Install pre-commit hooks:

uv run pre-commit install --hook-type pre-push

Available Commands

The project uses invoke for task management.

To see all available commands:

uv run inv --list
# or shorter:
uv run inv -l

To get help on a specific command:

uv run inv --help <command>
# Example:
uv run inv --help precommit

Release Guide

See the RELEASE.md file for the release guide.

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

Citing Cobjectric

If you use Cobjectric in your research or projects, please consider citing it:

@software{cobjectric2025,
  author = {Nigiva},
  title = {Cobjectric: A Library for Computing Metrics on Complex Objects},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nigiva/cobjectric}},
  version = {3.0.0}
}
