An Erlang graph library — digraph and then some
graffeo wraps Erlang's two stdlib graph modules, digraph and digraph_utils,
and carries them the rest of the way toward "batteries included." Where the
stdlib stops — topological sort, components, reachability, a handful of
connectivity predicates — graffeo continues: weighted shortest paths,
composable traversal, richer connectivity, and the assorted algorithms one ends
up hand-rolling on real graph projects. The benchmark it measures itself
against is Rust's petgraph, which set the recent bar
for what a graph library should give you out of the box.
graffeo is, in large part, a wrapper around Erlang's two standard-library graph
modules, digraph and digraph_utils — bringing them together behind a single
module so that graph-theoretic programming in Erlang has one obvious front door.
It embraces digraph and digraph_utils wholesale. graffeo does not redesign
or reinterpret them: every ported function keeps the stdlib's exact name, arity,
argument order, return shape, and semantics, so a digraph user is immediately at
home. graffeo only adds — it never changes what the standard library already got
right.
Structurally, it adds a thin seam and a choice of storage. A graph-access
behaviour lets each algorithm be written once and run over any conforming
backend: the immutable, map-backed value backend graffeo_map (copyable,
pattern-matchable, message-passable); an ETS-backed, process-owned handle backend
graffeo_ets (mutable, implemented over the stdlib digraph); and a persistent,
DETS-backed handle backend graffeo_dets (on disk, survives restarts); and a
transactional, replicated handle backend graffeo_mnesia (Mnesia-backed — atomic
multi-mutation and multi-node graphs). The design rationale lives in
docs/architecture.md.
And it adds the graph-theoretic functions you end up hand-rolling on real projects —
the ones neither digraph nor digraph_utils provide:
| Function | What it adds |
|---|---|
dijkstra/2,3 |
Weighted shortest paths (Dijkstra), with a pluggable cost function |
astar/3,4 |
Weighted shortest paths (A*), with a pluggable cost and an admissible heuristic |
bfs/2,3 |
Breadth-first traversal with direction (out/in/both), an edge filter, and distances |
degree/2 |
Combined in+out degree (the stdlib exposes only in_degree/out_degree) |
degree_centrality/2 |
Normalised degree centrality |
top_k_by_degree/2 |
The k most-connected vertices |
source_vertices/1 |
Vertices with no incoming edges |
sink_vertices/1 |
Vertices with no outgoing edges |
graffeo also carries the entire digraph / digraph_utils surface — topological
sort, (strongly) connected components, reachability, path and cycle queries, subgraph
and condensation — over all four backends, each checked for exact parity with its
stdlib counterpart and cross-tier parity between backends. "One algorithm layer,
many backends" is enforced, not merely asserted.
On the roadmap: minimum spanning trees, negative-weight shortest paths (Bellman-Ford), and multi-edge support — graffeo currently models simple directed graphs (at most one edge per ordered pair).
Both tiers share one algorithm layer: you build a graph one of two ways, then
call the same graffeo:* functions over it.
%% Every build step returns a NEW graph; the original is untouched.
G0 = graffeo:new(),
G1 = graffeo:add_edge(G0, a, b, #{weight => 1}),
G2 = graffeo:add_edge(G1, b, c, #{weight => 2}),
G3 = graffeo:add_edge(G2, a, c, #{weight => 10}),
G = graffeo:add_edge(G3, c, d, #{weight => 3}),
{ok, _Order} = graffeo:topsort(G), %% a valid topological order, e.g. [a, b, c, d]
{Dist, _Prev} = graffeo:dijkstra(G, a), %% #{a => 0, b => 1, c => 3, d => 6}
{ok, _Path, 6} = graffeo:astar(G, a, d), %% weighted A*: cheapest a→d is a,b,c,d (6), not a,c,d (13)
3 = graffeo:degree(G, c), %% in: a, b (2) + out: d (1)
[a, b, c, d] = lists:sort(graffeo:reachable(G, [a])), %% the digraph_utils family, on the same value
true = graffeo:is_acyclic(G),
[{a, 0} | _] = graffeo:bfs(G, a). %% [{Vertex, Distance}], breadth-first from the sourceBecause the graph is a plain value, G0 still has zero edges after all of the
above — nothing was mutated, and G can be pattern-matched or sent between
processes like any other term.
%% A mutable handle, ETS-backed (over stdlib digraph) and owned by your process.
%% One value, one namespace: build, run algorithms, then clean up.
G = graffeo_ets:new(),
graffeo_ets:add_edge(G, a, b, #{weight => 1}),
graffeo_ets:add_edge(G, b, c, #{weight => 2}),
graffeo_ets:add_edge(G, a, c, #{weight => 10}),
graffeo_ets:add_edge(G, c, d, #{weight => 3}),
%% The SAME graffeo:* algorithm calls — no wrapping needed.
{ok, _Order} = graffeo:topsort(G),
{Dist, _Prev} = graffeo:dijkstra(G, a), %% #{a => 0, b => 1, c => 3, d => 6}
graffeo_ets:delete(G). %% lifecycle stays in graffeo's namespaceIf you already have a bare digraph handle, graffeo_ets:wrap/1 lifts it into the
envelope so the algorithm layer works on it. unwrap/1 hands the bare handle back
when you need raw digraph:* access.
graffeo_dets is the same handle experience as graffeo_ets, backed by DETS on disk.
new/0 gives an ephemeral graph (just like graffeo_ets); open/1 gives a named,
persistent one that outlives the VM.
%% Ephemeral — identical experience to graffeo_ets, just on disk.
G = graffeo_dets:new(),
graffeo_dets:add_edge(G, a, b, #{weight => 1}),
{Dist, _Prev} = graffeo:dijkstra(G, a), %% the SAME graffeo:* algorithms
graffeo_dets:delete(G). %% close + remove the files%% Persistent — build once, reopen later, query without rebuilding.
G = graffeo_dets:open(my_graph),
graffeo_dets:add_edge(G, a, b, #{weight => 1}),
graffeo_dets:add_edge(G, b, c, #{weight => 2}),
graffeo_dets:close(G), %% flush + keep the files on disk
%% ... a VM restart later ...
G2 = graffeo_dets:open(my_graph), %% the data is still there
{ok, _Order} = graffeo:topsort(G2),
graffeo_dets:close(G2).To persist an in-memory graph, graffeo:copy/2 materialises any graph into an opened
backend: graffeo:copy(MapG, graffeo_dets:open(snapshot)).
Where files live. On-disk backends store under a configurable data_dir, resolved
as: an explicit data_dir in sys.config → priv/data (if writable) → a CWD-based
graffeo_data/ → the OS cache dir. In a release priv is typically read-only, so
production should set data_dir in sys.config:
%% sys.config
[{graffeo, [{data_dir, "/var/lib/myapp/graffeo"}]}].graffeo_mnesia is the same handle experience again, backed by Mnesia — reach for it
when you need atomic multi-mutation transactions or multi-node replicated
graphs (for plain persistence, graffeo_dets is lighter). new/0 is ephemeral
(ram_copies); open/1 is persistent (disc_copies).
G = graffeo_mnesia:new(),
graffeo_mnesia:add_edge(G, a, b, #{weight => 1}),
{Dist, _Prev} = graffeo:dijkstra(G, a), %% the SAME graffeo:* algorithms
graffeo_mnesia:delete(G).Wrap a batch of mutations — or a read — in transaction/1 for atomicity and a
consistent snapshot:
graffeo_mnesia:transaction(fun() ->
graffeo_mnesia:add_edge(G, a, b, #{weight => 1}),
graffeo_mnesia:add_edge(G, b, c, #{weight => 2})
end), %% atomic: both edges, or neither
{atomic, {ok, _Order}} =
graffeo_mnesia:transaction(fun() -> graffeo:topsort(G) end).Inside transaction/1 the read and write paths automatically use locked Mnesia
operations; outside, they're dirty (fast, lock-free) — the same code, the context
decides. open/2 takes #{storage => disc_copies | ram_copies | disc_only_copies, nodes => [node()], majority => boolean()}; multi-node replication rides the nodes
option, but its partition behaviour is Mnesia's own — use it knowingly. Mnesia's data
directory comes from the same data_dir config as graffeo_dets.
examples/erlang-concepts
is a guided tour, not a toy: it ingests the Erlang concept-card knowledge base — ~1,400
concepts distilled from twelve Erlang/OTP books — and walks from degree centrality
through strongly-connected-component condensation to a derived learning order, finding
real defects in the source data along the way. The whole tour runs identically over all
four backends, checked in CI against an independent oracle.
rebar3 compileApache License 2.0. See LICENSE.
