This page explains how to think about InQL's execution model as it works today.
There are two distinct phases:
- Build deferred relational work in
LazyFrame[T] - Ask a
Sessionto bind and run that work
LazyFrame[T] is not local data in hand. It is deferred relational intent. DataFrame[T] is local materialized data.
Session owns the parts that are not just logical plan shape:
- source registration
- logical-name to physical-source binding
- backend execution
- materialization
- writing to sinks
That keeps the carrier model clean. A LazyFrame[T] describes work. A Session makes that work run.
These two APIs are related but not interchangeable.
Use execute(...) when you want an execution checkpoint:
- the plan binds successfully
- lowering succeeds
- the backend can run it
It returns LazyFrame[T] again because the point is validation and execution success, not local materialization.
Use collect(...) when you want local data:
- it runs the same backend path
- it materializes a
DataFrame[T] - it records structured materialization metadata such as resolved columns and row count
- it may also retain preview text for display/debugging
This is the boundary where deferred relational work becomes local data in hand.
Some convenience APIs are nicer when they do not force the session parameter through every call site. lazy.collect() is one of those cases.
That convenience still needs a real execution context underneath, so it resolves through the active session at call time.
session.activate()sets the current active sessionlazy.collect()uses that active session
If there is no active session, the convenience API fails clearly instead of pretending execution context can be ambient without definition.
session.write_csv(...) and session.write_parquet(...) remain explicit Session methods because writing is not just a carrier concern. It requires binding, execution, and sink ownership.
So the current ergonomic split is:
- convenience materialization:
lazy.collect() - explicit writes:
session.write_csv(...),session.write_parquet(...)
This is a current package ergonomics choice, not a statement that all future convenience APIs must keep the same shape.
from pub::inql import Session
from pub::inql.functions import col, gt, int_expr, int_lit, mul
from models import Order
session = Session.default()
orders = session.read_csv[Order]("orders", "orders.csv")?
enriched = orders.with_column("amount_x2", mul(col("amount"), int_expr(2)))
filtered = enriched.filter(gt(col("amount"), int_lit(100))).limit(10)
session.activate()
preview = filtered.collect()?
session.write_csv(filtered, "orders_out.csv")?
This pattern is intentionally simple:
- read returns deferred work
- projection/filter/aggregate transforms stay declarative
- transforms stay deferred
- collect materializes when needed
- writes remain explicit on the session
For the exact method surface, see Dataset methods (Reference).
DataFrame[T] is already the materialized carrier, but its row-level user API is still intentionally narrow. The important current semantic distinction is already in place:
LazyFrame[T]= deferredDataFrame[T]= local materialized
Today that materialized carrier exposes structured collection metadata first:
- resolved columns
- row count
- preview text
For exact API shape, see Execution context (Reference).