growgraph · alexander-belikov · May 18, 2026 · May 17, 2026 · May 18, 2026 · May 18, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,14 +5,44 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.7.30]
+
+### Added
+
+- **`tolerate_transform_errors`** on **`ResourceConfig`** (default **`true`**) — a failing transform step sets its declared output fields to **`None`**, records a **`failure_kind=transform`** row in the doc error sink, and the rest of the resource pipeline (vertices, edges, later transforms) continues for that document. Set **`tolerate_transform_errors: false`** to fail fast on transform exceptions.
+
+### Changed
+
+- **`VertexActor` + `from_doc`** — transform-buffer projection is selective: only **`TransformPayload`** entries whose **`named`** keys cover the **`from_doc`** source fields are consumed, so dressed or pivot outputs for other vertex types are not stolen. Dressed dict payloads (`__transformed_value#*`) are handled consistently with passthrough from the merged observation doc.
+- **Blank vertices in `VertexConfig`** — mark placeholder types with **`blank: true`** on each **`Vertex`** (identity defaults to **`id`**). **`VertexConfig.blank_vertices`** is now a derived name list, not a separate manifest field. Runtime **`ResourceRuntime`** scopes **`VertexConfig`** to vertices referenced by the resource pipeline only; unreferenced blank types are no longer injected automatically.
+- **Ingestion contract layout** — declarative **`ResourceConfig`** lives under **`graflo.architecture.contract.ingestion`**; schema-bound execution is **`ResourceRuntime`** / **`build_resource_runtime`** under **`graflo.architecture.contract.runtime`**. **`Resource`** remains an internal alias for **`ResourceConfig`**.
+
+### Breaking
+
+- **Top-level `blank_vertices` on `vertex_config`** — no longer read from manifests; set **`blank: true`** on the corresponding **`vertices`** entries instead (silent ignore under `extra="ignore"` if the old key is left in place).
+- **Runtime blank vertex scope** — blank vertex types must appear in the resource pipeline (or edge inference selectors) to be present in the per-resource runtime **`VertexConfig`**; relying on schema-wide blank placeholders without a matching actor step will not add them at cast time.
+- **Imports** — prefer **`ResourceConfig`** from **`graflo.architecture.contract`** (or **`graflo.architecture.contract.ingestion`**); **`graflo.architecture.contract.declarations.resource`** is not the canonical module path.
+
+### Documentation
+
+- **[Document cast errors](docs/concepts/ingestion_doc_errors.md)** — **`tolerate_transform_errors`** and transform failure records.
+- **[Core components](docs/concepts/core_components.md)** — **`ResourceConfig`** / **`ResourceRuntime`**, per-vertex **`blank`**, **`from_doc`** with dressed transforms, identity defaults.
+- **[Architecture diagrams](docs/concepts/architecture_diagrams.md)** — contract and blank-vertex model aligned with 1.7.30.
+- **[Creating a manifest](docs/getting_started/creating_manifest.md)** — **`tolerate_transform_errors`** and blank vertex YAML.
+
+## [1.7.29]
+
+### Added
+
+- **Empty-identity filter on cast batches** — after resource casting, **`Caster`** can drop vertex docs and edge tuples whose schema identity fields are all missing, `null`, or `""` before **`DBWriter`** (identity rules from **`VertexConfig`**, not **`GraphContainer`**). Controlled by **`IngestionParams.drop_empty_identity_docs`** (default **`true`**). Blank vertex collections are exempt.
+
 ## [1.7.27]
 
 ### Added
 
 - **`ColumnTimeFilter`** — shared pandas-like time window on a single column (`column`, optional `start` / `end`, optional `interval` as a **`pandas.Timedelta`** string such as `"7D"` or `"2h"` for day/hour windows, optional `not_equals`, optional `start_inclusive` / `end_inclusive`). Rendered to SQL via **`FilterExpression`** (same path as other pushdown filters). Calendar-style offsets (for example month arithmetic) are not supported when `pandas.Timedelta` rejects the string; use explicit `start` / `end` ISO bounds instead.
 - **`FileConnector.time_filter`** and **`TableConnector.time_filter`** — canonical field replacing duplicated `date_field` / `date_filter` / `date_range_*` fields on the wire.
 - **Bindings — runtime connector patches**: **`ConnectorUpdate`**, **`Bindings.apply_connector_update`**, and **`Bindings.replace_connector`** so defining-field changes re-hash and reindex correctly while preserving **`conn_proxy`** wiring. Patches are applied **after** manifest load (not stored on `GraphManifest`).
-- **Empty-identity filter on cast batches** — after resource casting, **`Caster`** can drop vertex docs and edge tuples whose schema identity fields are all missing, `null`, or `""` before **`DBWriter`** (identity rules from **`VertexConfig`**, not **`GraphContainer`**). Controlled by **`IngestionParams.drop_empty_identity_docs`** (default **`true`**). Blank vertex collections are exempt.
 
 ### Breaking
 

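Reviewer note: the `tolerate_transform_errors` behaviour described in the 1.7.30 entry can be illustrated with a small standalone sketch. It does not use graflo's classes; `TransformStep`, `ErrorSink`, and `run_transforms` are hypothetical names invented here, and the shape of the error row beyond `failure_kind` is an assumption. Only the semantics stated in the changelog are modelled: a failing step sets its declared outputs to `None`, records a `failure_kind=transform` row, and later steps still run.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class TransformStep:
    """Hypothetical stand-in for a transform step (not a graflo class)."""
    name: str
    outputs: list[str]  # declared output fields of this step
    fn: Callable[[dict[str, Any]], dict[str, Any]]


@dataclass
class ErrorSink:
    """Hypothetical in-memory stand-in for the doc error sink."""
    rows: list[dict[str, Any]] = field(default_factory=list)


def run_transforms(
    doc: dict[str, Any],
    steps: list[TransformStep],
    sink: ErrorSink,
    tolerate_transform_errors: bool = True,
) -> dict[str, Any]:
    """Apply steps in order; on failure either re-raise or null the outputs."""
    out = dict(doc)
    for step in steps:
        try:
            out.update(step.fn(out))
        except Exception as exc:
            if not tolerate_transform_errors:
                raise  # fail fast, as with `tolerate_transform_errors: false`
            for name in step.outputs:
                out[name] = None  # declared outputs become None
            # Row layout is an assumption; only `failure_kind` comes from the changelog.
            sink.rows.append(
                {"failure_kind": "transform", "step": step.name, "error": repr(exc)}
            )
    return out


steps = [
    TransformStep("parse_age", ["age"], lambda d: {"age": int(d["age_raw"])}),
    TransformStep("upper_name", ["name_upper"], lambda d: {"name_upper": d["name"].upper()}),
]
sink = ErrorSink()
result = run_transforms({"age_raw": "n/a", "name": "ada"}, steps, sink)
```

In this sketch the first step fails on `int("n/a")`, so `age` ends up as `None` while the second step still produces `name_upper`. Passing `tolerate_transform_errors=False` lets the original exception propagate instead.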
diff --git a/docs/concepts/architecture_diagrams.md b/docs/concepts/architecture_diagrams.md
@@ -122,10 +122,10 @@ classDiagram
     }
 
     class IngestionModel {
-        +resources: list~Resource~
+        +resources: list~ResourceConfig~
         +transforms: list~ProtoTransform~
         +finish_init(core_schema)
-        +fetch_resource(name) Resource
+        +fetch_resource_config(name) ResourceConfig
     }
 
     class GraphMetadata {
@@ -136,13 +136,15 @@ classDiagram
 
     class VertexConfig {
         +vertices: list~Vertex~
-        +blank_vertices: list~Vertex~
+        +identity_from_all_properties: bool
+        +blank_vertices: list~str~
     }
 
     class Vertex {
         +name: str
         +identity: list~str~
         +properties: list~Field~
+        +blank: bool
         +filters: FilterExpression?
     }
 
@@ -165,11 +167,16 @@ classDiagram
         +filters: FilterExpression?
     }
 
-    class Resource {
+    class ResourceConfig {
         +name: str
-        +root: ActorWrapper
+        +pipeline: list~dict~
+        +tolerate_transform_errors: bool
+    }
+
+    class ResourceRuntime {
+        +config: ResourceConfig
+        +vertex_config: VertexConfig
         +executor: ActorExecutor
-        +finish_init(vertex_config, edge_config, transforms)
     }
 
     class ActorWrapper {
@@ -224,7 +231,7 @@ classDiagram
     Schema *-- CoreSchema : core_schema
     CoreSchema *-- VertexConfig : vertex_config
     CoreSchema *-- EdgeConfig : edge_config
-    IngestionModel *-- "0..*" Resource : resources
+    IngestionModel *-- "0..*" ResourceConfig : resources
     IngestionModel *-- "0..*" ProtoTransform : transforms
 
     VertexConfig *-- "0..*" Vertex : vertices
@@ -235,8 +242,9 @@ classDiagram
     Edge *-- "0..*" Field : properties
     Edge --> FilterExpression : filters
 
-    Resource *-- ActorWrapper : root
-    Resource *-- ActorExecutor : runtime orchestration
+    ResourceRuntime *-- ResourceConfig : config
+    ResourceRuntime *-- ActorWrapper : root
+    ResourceRuntime *-- ActorExecutor : runtime orchestration
     ActorWrapper --> Actor : actor
     ActorExecutor ..> ExtractionContext : produces
     ActorExecutor ..> AssemblyContext : consumes
@@ -378,5 +386,5 @@ These are the two key abstractions that decouple *data retrieval* from *graph tr
 
 - **DataSources** (`AbstractDataSource` subclasses) — handle *where* and *how* data is read. Each carries a `DataSourceType` (`FILE`, `SQL`, `SPARQL`, `API`, `IN_MEMORY`). Many DataSources can bind to the same Resource by name via the `DataSourceRegistry`.
 
-- **Resources** (`Resource`) — handle *what* the data becomes in the LPG. Each Resource is a reusable actor pipeline (descend → transform → vertex → edge) that maps raw records to graph elements. Because DataSources bind to Resources by name, the same transformation logic applies regardless of whether data arrives from a file, an API, or a SPARQL endpoint.
+- **Resources** (`ResourceConfig` → `ResourceRuntime`) — handle *what* the data becomes in the LPG. Each resource is a reusable actor pipeline (descend → transform → vertex → edge) that maps raw records to graph elements. Because DataSources bind to resources by name, the same transformation logic applies regardless of whether data arrives from a file, an API, or a SPARQL endpoint.
   - Optional **`drop_trivial_input_fields`** (default `false` on the model): when `true`, each record is preprocessed by dropping **top-level** keys whose value is `null` or the empty string `""` before actors run. This trims sparse wide rows (many unused columns) without extra transforms; nested dicts and lists are not walked.
diff --git a/docs/concepts/core_components.md b/docs/concepts/core_components.md
@@ -83,11 +83,14 @@ A `Vertex` describes vertices and their logical identity. It supports:
   - If one duplicate is typed and the other is untyped, the typed definition wins
   - Conflicting non-null types for the same field name are rejected
 - Filtering conditions
-- Optional blank vertex configuration
+- **`blank: true`** — placeholder vertex with no natural key; identity defaults to **`id`** when omitted
 
-Identity defaults are strict by default at schema level:
-- `VertexConfig.identity_from_all_properties: false` (default) do not require explicit vertex `identity`, defaults to all properties
-- `VertexConfig.identity_from_all_properties: false` disables compatibility fallback where missing identity uses all property names
+Identity defaults at schema level (`VertexConfig`):
+
+- **`identity_from_all_properties: true`** (default) — vertices without explicit **`identity`** use all **`properties`** names as the logical key.
+- **`identity_from_all_properties: false`** — each non-blank vertex must declare **`identity`** explicitly; blank vertices still default to **`id`**.
+
+**Blank vertices:** set **`blank: true`** on the vertex entry under **`schema.graph.vertex_config.vertices`**. **`VertexConfig.blank_vertices`** is a derived list of names (not a separate YAML field). At runtime, **`ResourceRuntime`** keeps only vertex types referenced by that resource’s pipeline (and edge-inference selectors). Blank types that are declared in the schema but not used by the resource are not injected automatically, so include a **`vertex`** (or edge) step when the placeholder must be populated.
 
 ### Edge
 An `Edge` describes edges and their logical identities. It allows:
@@ -166,19 +169,20 @@ An `AbstractDataSource` subclass defines where data comes from and how it is ret
 
 Data sources handle retrieval only. They bind to Resources by name via the `DataSourceRegistry`, so the same `Resource` can ingest data from multiple sources without modification.
 
-### Resource
-A `Resource` is the central abstraction that bridges data sources and the graph schema. Each Resource defines a reusable actor pipeline (descend → transform → vertex → edge) that maps raw records to graph elements:
+### Resource (`ResourceConfig` / `ResourceRuntime`)
+
+Ingestion resources split into two layers:
 
-- How data structures map to vertices and edges
-- What transformations to apply
-- The actor pipeline for processing documents
+- **`ResourceConfig`** — declarative contract in **`ingestion_model.resources`** (YAML/Python): pipeline steps, encoding, type casters, edge-inference flags, **`tolerate_transform_errors`**, and related options. Serialized in manifests; validated by **`IngestionModel`**.
+- **`ResourceRuntime`** — schema-bound executor built via **`build_resource_runtime`**: filtered **`VertexConfig`**, bound transforms, and **`ActorExecutor`** for document casting.
 
-Because DataSources bind to Resources by name, the same transformation logic applies regardless of whether data arrives from a file, an API, a SQL table, or a SPARQL endpoint.
+The name **`Resource`** in manifests and docs usually means **`ResourceConfig`**. Data sources bind to resources by name, so the same pipeline applies whether data arrives from a file, API, SQL table, or SPARQL endpoint.
 
-Resource-level edge inference controls:
+Resource-level controls:
 - **`infer_edges`**: Global toggle for inferred edge emission during assembly (default: `true`).
 - **`infer_edge_only`**: Allow-list of inferred edges (`source`, `target`, optional `relation`).
 - **`infer_edge_except`**: Deny-list of inferred edges (`source`, `target`, optional `relation`).
+- **`tolerate_transform_errors`** (default **`true`**): when a transform step fails, set its declared output fields to `None`, record the failure, and continue the pipeline for that document; see [Document cast errors](ingestion_doc_errors.md).
 - `infer_edge_only` and `infer_edge_except` are mutually exclusive and validated against declared schema edges.
 - These controls apply to inferred edges only; explicit edge actors in the pipeline are still emitted.
 - **Auto-exclusion**: When a resource pipeline contains any EdgeActor for edges of type `(source, target)`, `(source, target, None)` is automatically added to `infer_edge_except` for that resource, so inferred edges do not duplicate edges produced by explicit edge actors.
@@ -192,7 +196,7 @@ An `Actor` describes how the current level of the document should be mapped/tran
 - `TransformActor`: Applies data transformations
 - `VertexActor`: Creates vertices from the current level. Key options:
   - **`role`** (optional): named accumulator slot. When set the vertex is stored at `lindex.extend((role, 0))` instead of bare `lindex`, so multiple vertices of the same type in one row (e.g. `role: self`, `role: parent`, `role: child`) occupy distinct slots and can be addressed individually by a downstream edge step.
-  - **`from`**: rename map `{vertex_field: doc_field}`. Only mismatched column names need listing; remaining vertex schema properties are absorbed from the doc automatically (passthrough).
+  - **`from`** (`from_doc`): rename map `{vertex_field: doc_field}`. Only mismatched column names need listing; remaining vertex schema properties are absorbed from the doc and transform buffer automatically (passthrough). When multiple **`TransformPayload`** entries share a location, **`from_doc`** consumes only payloads whose **`named`** keys include all mapped source fields, so dressed metrics or pivot rows for other vertex types are left for their own **`vertex`** steps.
   - **`keep_fields`**: restrict passthrough to this field subset. Use on role-vertex steps to prevent shared row columns from leaking into placeholder vertices that only carry an ID.
 - `EdgeActor`: Creates edges between vertices. Operates in three modes:
   - **Static mode** (`from`/`to` set on both sides): vertex types declared at config time.

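Reviewer note: the selective `from_doc` rule in the `VertexActor` hunk above can be sketched in a few lines. This is an illustration under stated assumptions, not graflo's implementation: `Payload` is a hypothetical stand-in for `TransformPayload` that models only its `named` mapping, and `project_from_doc` is an invented helper. A payload is consumed only when its `named` keys cover every source field in the `from_doc` map; everything else is left in the buffer.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Payload:
    """Hypothetical stand-in for TransformPayload; only `named` is modelled."""
    named: dict[str, Any]


def project_from_doc(
    from_doc: dict[str, str],
    payloads: list[Payload],
) -> tuple[list[dict[str, Any]], list[Payload]]:
    """Split payloads into projected vertex docs and untouched leftovers.

    `from_doc` maps vertex_field -> doc_field, as in the docs above.
    """
    needed = set(from_doc.values())
    consumed: list[dict[str, Any]] = []
    leftover: list[Payload] = []
    for payload in payloads:
        if needed.issubset(payload.named):
            # All mapped source fields are present: project and rename them.
            consumed.append({vf: payload.named[df] for vf, df in from_doc.items()})
        else:
            # Not a match: leave the payload for another vertex step.
            leftover.append(payload)
    return consumed, leftover


payloads = [
    Payload({"metric_name": "pe", "metric_value": 12.5}),
    Payload({"ticker": "ACME", "isin": "X1"}),
]
consumed, leftover = project_from_doc({"symbol": "ticker"}, payloads)
```

Here only the second payload carries `ticker`, so it alone is projected to `{"symbol": "ACME"}`, and the metric payload stays available for the vertex step that owns it.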
diff --git a/docs/concepts/features_and_practices.md b/docs/concepts/features_and_practices.md
@@ -106,7 +106,7 @@ Schema comparison gives you a predictable transition path between versions. Inst
 
 ## Best Practices
 1. Use compound identity fields for natural keys, and **`schema.db_profile`** secondary indexes for query performance
-2. Leverage blank vertices for complex relationship modeling
+2. Leverage blank vertices (`blank: true` on the vertex definition) for complex relationship modeling; include them in the resource pipeline when they must be populated at cast time
 3. Define reusable transforms in **`ingestion_model.transforms`** and reference them from resource steps
 4. Configure appropriate batch sizes based on your data volume
 5. Enable parallel processing for large datasets

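Reviewer note: since the per-vertex `blank: true` flag replaces the old top-level `blank_vertices` key, a one-off manifest migration might look like the sketch below. It works on plain dicts and is not part of graflo; `migrate_blank_vertices` is a hypothetical helper, and it assumes the legacy key held a list of vertex names (the old manifest shape is not shown in this diff, so treat that as an assumption).

```python
import copy
from typing import Any


def migrate_blank_vertices(vertex_config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `vertex_config` with legacy blank names moved onto vertices.

    Assumption: the legacy `blank_vertices` value is a list of vertex names.
    """
    cfg = copy.deepcopy(vertex_config)
    legacy_names = cfg.pop("blank_vertices", None) or []
    vertices = cfg.setdefault("vertices", [])
    by_name = {v["name"]: v for v in vertices}
    for name in legacy_names:
        if name in by_name:
            by_name[name]["blank"] = True
        else:
            # Legacy blank type with no vertex entry: add a minimal one.
            entry = {"name": name, "blank": True}
            vertices.append(entry)
            by_name[name] = entry
    return cfg


old = {
    "vertices": [{"name": "person", "identity": ["id"]}, {"name": "visit"}],
    "blank_vertices": ["visit", "session"],
}
new = migrate_blank_vertices(old)
```

The input is left untouched, and the result carries `blank: True` on `visit` plus a new minimal `session` entry. Migrated blank types still need a matching step in the resource pipeline to be populated at cast time.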
diff --git a/docs/concepts/index.md b/docs/concepts/index.md
@@ -96,10 +96,10 @@ flowchart LR
 
 - **Bindings** (`FileConnector`, `TableConnector`, `SparqlConnector`) describe *where* data comes from (file paths, SQL tables, SPARQL endpoints). Multiple connectors may attach to the same ingestion resource name; optional **`connector_connection`** entries assign each SQL/SPARQL connector a **`conn_proxy`** by **connector `name` or `hash`** (not by resource name). The `ConnectionProvider` turns that label into real connection config at runtime so manifests stay credential-free.
 - **DataSources** (`AbstractDataSource` subclasses) handle *how* to read data in batches. Each carries a `DataSourceType` and is registered in the `DataSourceRegistry`.
-- **Resources** define *what* to extract — each `Resource` is a reusable actor pipeline (descend → transform → vertex → edge) that maps raw records to graph elements. Optional **`drop_trivial_input_fields`: `true`** removes top-level keys whose value is `null` or `""` **before** actors run (shallow only; `0` and `false` stay). **TigerGraph** physical defaults for missing attributes belong in **`schema.db_profile.default_property_values`** (GSQL `DEFAULT` at DDL time), not in the covariant `GraphContainer` assembly path.
+- **Resources** define *what* to extract — each **`ResourceConfig`** (manifest `ingestion_model.resources`) is a reusable actor pipeline (descend → transform → vertex → edge) executed at cast time by **`ResourceRuntime`**. Optional **`drop_trivial_input_fields`: `true`** removes top-level keys whose value is `null` or `""` **before** actors run (shallow only; `0` and `false` stay). Optional **`tolerate_transform_errors`: `true`** (default) continues the pipeline when a transform step fails. **TigerGraph** physical defaults for missing attributes belong in **`schema.db_profile.default_property_values`** (GSQL `DEFAULT` at DDL time), not in the covariant `GraphContainer` assembly path.
 - **GraphContainer** (covariant graph representation) collects the resulting vertices and edges in a database-independent format.
 - **DBWriter** pushes the graph data into the target LPG store (ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph, NebulaGraph).
-- **Document cast errors** — when a single source document fails inside a resource, **`IngestionParams.on_doc_error`** chooses skip vs fail-the-batch; optional **gzip JSONL** persistence uses **`doc_error_sink_path`** (CLI **`ingest --doc-error-sink`**). Details: [Document cast errors and doc error sink](ingestion_doc_errors.md).
+- **Document cast errors** — when a single source document fails inside a resource, **`IngestionParams.on_doc_error`** chooses skip vs fail-the-batch; optional **gzip JSONL** persistence uses **`doc_error_sink_path`** (CLI **`ingest --doc-error-sink`**). Per-resource **`tolerate_transform_errors`** (default **`true`**) lets a single transform step fail without aborting the rest of the pipeline for that document. Details: [Document cast errors and doc error sink](ingestion_doc_errors.md).
 
 ### Minimal canonical config contract