Merged
11 changes: 10 additions & 1 deletion .github/workflows/build-docs.yml
@@ -35,7 +35,16 @@ jobs:
version: "0.9.28"
- name: Install dependencies
run: |
-uv sync --extra docs
uv sync --extra docs --extra dev

- name: Generate ontology visualization
run: |
uv run python docs/scripts/build_ontology_viz.py
uv run pre-commit run --files docs/assets/graflo-ontology-viz/*.json || true

- name: Verify committed viz assets are fresh
run: |
git diff --exit-code docs/assets/graflo-ontology-viz/

- name: Build site
run: |
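
The "Verify committed viz assets are fresh" step relies on `git diff --exit-code`, which exits non-zero when the regenerated files differ from what is committed. The same idea can be sketched outside git as a comparison of two directory trees. This is an illustration only, not what the workflow runs; `digest_tree` and `stale_assets` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def stale_assets(committed: Path, regenerated: Path) -> list[str]:
    """Return relative paths that are missing, extra, or changed between the trees."""
    old, new = digest_tree(committed), digest_tree(regenerated)
    # A path is stale when either side lacks it or the digests disagree.
    return sorted(name for name in old.keys() | new.keys() if old.get(name) != new.get(name))
```

An empty result corresponds to the exit-code-0 case: the generator output matches the committed assets, so the job can continue to the site build.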
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,20 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.8.0]

### Added

- **GraFlo meta-ontology** — OWL vocabulary at `https://ontology.growgraph.dev/graflo` (`owl:versionIRI` `…/1.0.0`, `owl:versionInfo` `1.0.0`) describing `GraphManifest`, `Schema`, `IngestionModel`, `ProtoTransform`, pipeline actor steps, bindings, and related enumerations. Shipped as `graflo/rdf/ontology/graflo.ttl` plus JSON-LD context `graflo-context.jsonld`.
- **`graflo.rdf`** — `ManifestRdfSerializer` / `ManifestRdfDeserializer` for bidirectional conversion between `GraphManifest` (YAML/Pydantic) and RDF (Turtle, JSON-LD, N-Triples, RDF/XML).
- **CLI** — `manifest-to-rdf` and `rdf-to-manifest` console scripts (`graflo.rdf.cli`).

### Documentation

- **[GraFlo ontology](docs/model/graflo_ontology.md)** — meta-model vs user-domain RDF (`RdfInferenceManager`), versioning, URI layout, CLI, and round-trip semantics.
- **Interactive ontology visualization** — custom hierarchical class graph (rectangular nodes, subClassOf and optional property edges, pan/zoom) embedded on the GraFlo ontology page; built via `docs/scripts/build_ontology_viz.py` with committed assets under `docs/assets/graflo-ontology-viz/`.
- **README** and **docs index** — feature overview and quick links for manifest ↔ RDF workflows.

## [1.7.33]

### Added
9 changes: 6 additions & 3 deletions CITATION.cff
@@ -1,5 +1,8 @@
-abstract: <p><span>A framework for transforming</span><span>&nbsp;</span><strong>tabular</strong><span>&nbsp;</span><span>data</span><span>&nbsp;(</span><span>CSV,</span><span>&nbsp;</span><span>SQL</span><span>)&nbsp;</span><span>and</span><span>&nbsp;</span><strong>hierarchical</strong><span>&nbsp;</span><span>data</span><span>&nbsp;(</span><span>JSON,</span><span>&nbsp;</span><span>XML</span><span>)&nbsp;</span><span>into
-property graphs and ingesting them into graph databases</span><span>&nbsp;(</span><span>ArangoDB,</span><span>&nbsp;</span><span>Neo4j</span><span>)</span><span>.</span></p>
abstract: >-
Manifest-driven graph schema and ingestion for labeled property graphs:
define schemas in GraphManifest (YAML/Python), ingest from CSV/JSON/Parquet/SQL/RDF/SPARQL/API,
infer from PostgreSQL 3NF or OWL/RDFS, apply schema migrations, and project to ArangoDB, Neo4j,
TigerGraph, FalkorDB, Memgraph, or NebulaGraph.
authors:
- affiliation: GrowGraph
family-names: Belikov
@@ -11,5 +14,5 @@ doi: 10.5281/zenodo.15446131
license: []
license-url: https://github.com/growgraph/graflo/blob/main/LICENSE
message: If you use this software, please cite it using the metadata from this file.
-title: graflo
title: GraFlo — Graph Schema & Transformation Language (GSTL)
type: software
41 changes: 38 additions & 3 deletions README.md
@@ -8,13 +8,19 @@
[![pre-commit](https://github.com/growgraph/graflo/actions/workflows/pre-commit.yml/badge.svg)](https://github.com/growgraph/graflo/actions/workflows/pre-commit.yml)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15446131.svg)]( https://doi.org/10.5281/zenodo.15446131)

-**GraFlo** is a manifest-driven toolkit for **labeled property graphs (LPGs)**: describe vertices, edges, and ingestion (`GraphManifest` — YAML or Python), then project and load into a target graph database.
**GraFlo** is a manifest-driven schema and ingestion layer for **labeled property graphs (LPGs)**.
Write a `GraphManifest` (YAML or Python) once — it defines vertices, edges, typed properties,
identities, and DB profile — then infer, validate, migrate, and load into any supported graph engine.

### What you get

- **One pipeline, several graph databases** — The same manifest targets ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph, or NebulaGraph; `DatabaseProfile` and DB-aware types absorb naming, defaults, and indexing differences.
- **Explicit identities** — Vertex identity fields and indexes back upserts so reloads merge on keys instead of blindly duplicating nodes.
- **Reusable ingestion** — `Resource` actor pipelines (including **vertex_router** / **edge** steps) bind to files, SQL, SPARQL/RDF, APIs, or in-memory batches via `Bindings` and the `DataSourceRegistry`.
- **Schema as the contract** — `GraphManifest` is the single source of truth: vertex/edge definitions,
typed properties, identity fields, and DB profile are validated at `finish_init` time, not at
write time. Schema migrations are first-class (`graflo migrate_schema`).
- **Manifest as linked data** — The [GraFlo ontology](https://growgraph.github.io/graflo/model/graflo_ontology/) (`gf:` at `ontology.growgraph.dev`) lets you export manifests to RDF and round-trip them for tooling, provenance, and SPARQL-facing catalogs.
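
The "Explicit identities" point above can be illustrated with a minimal key-based upsert. This is a sketch under stated assumptions, not GraFlo's implementation: the `upsert` function, the in-memory `store` dict, and the field names are all hypothetical, and a real backend would rely on database indexes instead.

```python
def upsert(store: dict, docs: list[dict], identity: tuple[str, ...]) -> dict:
    """Merge each document into store, keyed by the values of its identity fields."""
    for doc in docs:
        key = tuple(doc[field] for field in identity)
        # An existing key is updated in place; a new key creates a new entry.
        store.setdefault(key, {}).update(doc)
    return store


store: dict = {}
batch = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Linus"}]
upsert(store, batch, identity=("id",))
# Reloading an overlapping batch merges on "id" instead of duplicating vertex 1.
upsert(store, [{"id": 1, "name": "Ada", "email": "ada@example.org"}], identity=("id",))
```

After both loads the store still holds two vertices, and the one with `id` 1 has gained the `email` field.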

### What’s in the manifest

@@ -56,7 +62,8 @@ The graph engines listed in **What you get** are the supported **output** `DBTyp

## More capabilities

-- **SPARQL & RDF** — Endpoints and RDF files (`.ttl`, `.rdf`, `.n3`, …); optional OWL/RDFS schema inference (`rdflib`, `SPARQLWrapper` in the default install).
- **GraFlo ontology (manifest RDF)** — Serialize any `GraphManifest` to RDF (Turtle, JSON-LD) using the published vocabulary at [`https://ontology.growgraph.dev/graflo`](https://ontology.growgraph.dev/graflo) (`owl:versionInfo` **1.0.0**). Covers schema, ingestion (resources, transforms, pipeline actors), and bindings. Round-trip via `graflo.rdf` or the `manifest-to-rdf` / `rdf-to-manifest` CLI. This is the **meta-model** of GraFlo itself — distinct from importing a **domain** OWL ontology into an LPG schema (`RdfInferenceManager`). Details: [docs — GraFlo ontology](https://growgraph.github.io/graflo/model/graflo_ontology/).
- **SPARQL & RDF** — Endpoints and RDF files (`.ttl`, `.rdf`, `.n3`, …); optional OWL/RDFS **domain** schema inference (`rdflib`, `SPARQLWrapper` in the default install).
- **Schema inference** — From PostgreSQL-style 3NF layouts (PK/FK heuristics) or from OWL/RDFS (`owl:Class` → vertices, `owl:ObjectProperty` → edges, `owl:DatatypeProperty` → vertex fields).
- **Schema migrations** — Plan and apply guarded schema deltas (`migrate_schema` console script → `graflo.cli.migrate_schema`; library in `graflo.migrate`; see docs).
- **Typed `properties`** — Optional field types (`INT`, `FLOAT`, `STRING`, `DATETIME`, `BOOL`) on vertices and edges.
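
The OWL/RDFS mapping named in the "Schema inference" bullet (`owl:Class` to vertices, `owl:ObjectProperty` to edges, `owl:DatatypeProperty` to vertex fields) can be sketched over plain triples. This is an illustration of the mapping only, not the behavior of `RdfInferenceManager`: the `infer_lpg_schema` function, the tuple-based triple format, the `ex:` terms, and the use of `rdfs:domain` / `rdfs:range` to attach properties are assumptions.

```python
from collections import defaultdict

RDF_TYPE, RDFS_DOMAIN, RDFS_RANGE = "rdf:type", "rdfs:domain", "rdfs:range"


def infer_lpg_schema(triples):
    """Derive (vertices, edges, vertex fields) from (subject, predicate, object) triples."""
    # Simplification: each subject is assumed to carry a single rdf:type.
    types = {s: o for s, p, o in triples if p == RDF_TYPE}
    domain = {s: o for s, p, o in triples if p == RDFS_DOMAIN}
    range_ = {s: o for s, p, o in triples if p == RDFS_RANGE}

    vertices = sorted(s for s, t in types.items() if t == "owl:Class")
    # An object property becomes an edge from its domain class to its range class.
    edges = sorted(
        (domain[p], p, range_[p])
        for p, t in types.items()
        if t == "owl:ObjectProperty" and p in domain and p in range_
    )
    # A datatype property becomes a field on its domain class.
    fields = defaultdict(list)
    for p, t in sorted(types.items()):
        if t == "owl:DatatypeProperty" and p in domain:
            fields[domain[p]].append(p)
    return vertices, edges, dict(fields)


ontology = [
    ("ex:Person", RDF_TYPE, "owl:Class"),
    ("ex:Company", RDF_TYPE, "owl:Class"),
    ("ex:worksFor", RDF_TYPE, "owl:ObjectProperty"),
    ("ex:worksFor", RDFS_DOMAIN, "ex:Person"),
    ("ex:worksFor", RDFS_RANGE, "ex:Company"),
    ("ex:name", RDF_TYPE, "owl:DatatypeProperty"),
    ("ex:name", RDFS_DOMAIN, "ex:Person"),
]
vertices, edges, fields = infer_lpg_schema(ontology)
```

Properties without a declared domain (or, for edges, range) are skipped here; a production inference step would need a policy for those cases.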
@@ -212,7 +219,35 @@ caster = Caster(schema=schema, ingestion_model=ingestion_model)
# ... continue with ingestion
```

-### RDF / SPARQL Ingestion
### Manifest ↔ RDF (GraFlo ontology)

```bash
# Serialize manifest YAML to Turtle (embeds gf: vocabulary when --include-ontology is default)
uv run manifest-to-rdf manifest.yaml \
--base-uri https://growgraph.dev/manifests/mygraph/v1 \
--format turtle \
--output mygraph.ttl

# Restore YAML from RDF
uv run rdf-to-manifest mygraph.ttl \
--manifest-uri https://growgraph.dev/manifests/mygraph/v1 \
--output manifest.restored.yaml
```

```python
from graflo import GraphManifest
from graflo.rdf import ManifestRdfDeserializer, ManifestRdfSerializer

manifest = GraphManifest.from_yaml("manifest.yaml")
base = "https://growgraph.dev/manifests/mygraph/v1"

ttl = ManifestRdfSerializer().to_turtle(manifest, base)
restored = ManifestRdfDeserializer().from_turtle(ttl, base.rstrip("/"))
```

Ontology source: `graflo/rdf/ontology/graflo.ttl`. See [GraFlo ontology](https://growgraph.github.io/graflo/model/graflo_ontology/).

### RDF / SPARQL Ingestion (domain ontology → LPG)

```python
from pathlib import Path