Precise Incremental Datalog for Embedded and Enterprise Environments
wirelog is a pure C11 Datalog engine designed for high-performance incremental evaluation across embedded systems and enterprise applications. It compiles Datalog programs into optimized columnar execution plans and evaluates them using delta-seeded semi-naive iteration, delivering a 2.55x speedup on incremental updates (CSPA benchmark) while maintaining strict memory safety and portability.
Wirelog is a declarative logic programming engine that brings Datalog, a language for expressing complex queries and analyses, to performance-critical applications. Unlike Datalog implementations that recompute entire results on every update, wirelog uses incremental evaluation to propagate only new facts, so unchanged results are not re-derived.
Use cases include:
- Pointer analysis (C/C++ static analysis, vulnerability detection)
- Program analysis (data-flow analysis, security policies, reachability)
- Graph algorithms (transitive closure, strongly connected components, shortest paths)
- Configuration analysis (policy compliance, access control, dependency resolution)
- Network analysis (reachability, routing, audit log correlation)
Key features:
- 2.55x faster incremental updates on the CSPA benchmark via delta-seeded semi-naive evaluation
- Minimal memory footprint — columnar layout (nanoarrow) with on-demand materialization
- Cache-efficient execution — columnar batching reduces memory bandwidth
- Frontier-driven optimization — skips unnecessary iterations on unchanged data
- Strict C11 compliance — no Rust, no virtual machines, no GC pauses
- Minimal external dependencies — core libs vendored (nanoarrow, xxHash); optional: mbedTLS, zlib, pthreads
- Cross-platform — Unix/Linux/macOS/Windows with identical semantics
- Memory-safe — AddressSanitizer + UndefinedBehaviorSanitizer validated
- CI/CD hardened — three-phase PR gates (lint, build, sanitizers) + main branch monitoring
- Declarative syntax — express logic once, optimize automatically
- Session API — incremental snapshot/update/query workflow
- Symbol interning — efficient string-to-integer mapping for large vocabularies
- Optimization passes — Logic Fusion, Join-Project Planning, Semijoin Information Passing
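The symbol interning item above maps strings to dense integer ids, so relations can store fixed-width integers instead of strings. A minimal sketch of the idea follows. The names (`intern_table`, `intern`) and the fixed-capacity open-addressing table are assumptions made for illustration, not wirelog's API, and FNV-1a stands in for xxHash to keep the example dependency-free.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: none of these names come from wirelog's API. */
#define INTERN_CAP 1024u /* fixed power-of-two slot count, for brevity */

typedef struct {
    char *keys[INTERN_CAP];  /* owned string copies; NULL marks an empty slot */
    int64_t ids[INTERN_CAP]; /* id stored next to each key */
    int64_t next_id;         /* ids are dense: 0, 1, 2, ... */
} intern_table;

/* FNV-1a hash, standing in for xxHash so the sketch has no dependencies. */
static uint64_t fnv1a(const char *s) {
    uint64_t h = 14695981039346656037ULL;
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

static void intern_init(intern_table *t) {
    *t = (intern_table){0};
}

/* Return the id for s, assigning the next free id on first sight.
 * Returns -1 if the table is full or allocation fails. */
static int64_t intern(intern_table *t, const char *s) {
    assert(t != NULL && s != NULL);
    size_t i = (size_t)(fnv1a(s) & (INTERN_CAP - 1));
    while (t->keys[i] != NULL) { /* linear probing */
        if (strcmp(t->keys[i], s) == 0) return t->ids[i];
        i = (i + 1) & (INTERN_CAP - 1);
    }
    /* Keep one slot empty so the probe loop always terminates. */
    if (t->next_id >= INTERN_CAP - 1) return -1;
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy == NULL) return -1;
    memcpy(copy, s, n);
    t->keys[i] = copy;
    t->ids[i] = t->next_id;
    return t->next_id++;
}

static void intern_free(intern_table *t) {
    for (size_t i = 0; i < INTERN_CAP; ++i) free(t->keys[i]);
    intern_init(t);
}
```

Interning the same string twice returns the same id, so tuple comparisons reduce to integer comparisons. A production table would grow on demand instead of returning -1 when full.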
| Property | Value |
|---|---|
| Language | C11 (strict compliance) |
| Build | Meson + Ninja |
| Backend | Pure columnar (nanoarrow) |
| Threading | Cross-platform abstraction (POSIX pthreads / MSVC) |
| Platforms | Unix/Linux/macOS (primary), Windows (MSVC) |
| SIMD Support | AVX2 (x86-64), ARM NEON (ARM64) |
| Phase | 4C — Incremental evaluation with SIMD + Memory backpressure |
| Tests | 83/84 passing (1 expected failure), ASan/UBSan validated |
| CSPA benchmark | 2.55x speedup (baseline 36s → incremental 8.9s) with delta-seeding |
| Latest Version | 0.30.0-dev (0.21.0 was released with SIMD + backpressure) |
- Delta-seeded propagation: Pre-seed `$d$<name>` delta relations from new rows only; avoids full IDB re-derivation
- Per-stratum frontier tracking: Each rule layer remembers its convergence iteration; unchanged strata skip unnecessary re-evaluation
- Selective stratum frontier reset: Compute affected strata via data dependency graph; only affected rules re-evaluate with reset frontier
- Semi-naive iteration: Optimized delta computation strategy from Datalog theory
- Stride-based evaluation: Efficient iteration over columnar arrangements in cache-friendly strides
- Pure nanoarrow backend: Apache Arrow columnar format for memory efficiency and SIMD friendliness
- Cache-efficient batching: Columnar memory layout reduces cache misses and memory bandwidth
- SIMD vectorization: AVX2 (x86-64) and ARM NEON (ARM64) for hash operations, key comparisons, and filter predicates
- No intermediate materialization: Results computed on-demand without storing full intermediate relations
- Logic Fusion: Merge adjacent FILTER+PROJECT operations into efficient FLATMAP instructions
- Join-Project Planning (JPP): Greedy join reordering to minimize intermediate cardinalities
- Semijoin Information Passing (SIP): Pre-filter joins with semijoin keys to reduce data flowing through the pipeline
- Memory safety checks: AddressSanitizer + UndefinedBehaviorSanitizer pass on Linux (GCC, Clang) and macOS (Clang)
- Portable C11: No Rust, no vendored bytecode; single portable C codebase
- Memory backpressure system: Thread-safe memory ledger tracking with JOIN budget enforcement and graceful EOVERFLOW truncation
- Comprehensive test suite: 83+ unit tests, regression suite, and 15+ benchmarks
- Symbol interning: Efficient string→i64 mapping for large vocabularies (Issue #56)
- Worker queue support: Cross-platform threading (POSIX pthreads / MSVC)
- Stratification analysis: Automatic rule ordering for negation handling
- CLI tools: wirelog-cli for standalone execution; examples for programmatic use
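The memory backpressure item above describes a thread-safe ledger with budget enforcement and graceful truncation. One way such a ledger could work is sketched below with C11 atomics. The names (`mem_ledger`, `ledger_reserve`, `emit_rows_budgeted`) and the per-row reservation policy are assumptions for illustration; the sketch signals truncation with a flag rather than modelling wirelog's actual EOVERFLOW handling.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ledger; names and policy are assumptions, not wirelog's API. */
typedef struct {
    atomic_size_t used; /* bytes currently reserved */
    size_t budget;      /* hard cap in bytes */
} mem_ledger;

static void ledger_init(mem_ledger *l, size_t budget) {
    atomic_init(&l->used, 0);
    l->budget = budget;
}

/* Try to reserve n bytes without ever exceeding the budget.
 * Safe to call from several threads: the compare-and-swap loop
 * retries when another thread changed `used` in the meantime. */
static bool ledger_reserve(mem_ledger *l, size_t n) {
    size_t cur = atomic_load(&l->used);
    for (;;) {
        /* Invariant: cur <= budget, so the subtraction cannot wrap. */
        if (n > l->budget - cur) return false;
        if (atomic_compare_exchange_weak(&l->used, &cur, cur + n)) return true;
        /* A failed exchange refreshed cur; loop and re-check. */
    }
}

static void ledger_release(mem_ledger *l, size_t n) {
    size_t prev = atomic_fetch_sub(&l->used, n);
    assert(prev >= n); /* releasing more than was reserved is a bug */
    (void)prev;
}

/* Emit up to `want` output rows of `row_bytes` each, stopping early
 * (and setting *truncated) once the ledger refuses a reservation. */
static size_t emit_rows_budgeted(mem_ledger *l, size_t want,
                                 size_t row_bytes, bool *truncated) {
    size_t emitted = 0;
    *truncated = false;
    while (emitted < want) {
        if (!ledger_reserve(l, row_bytes)) {
            *truncated = true;
            break;
        }
        ++emitted; /* a real join would write the output row here */
    }
    return emitted;
}
```

With a 64-byte budget and 16-byte rows, a request for ten rows stops after four and reports truncation, so the caller can surface the overflow instead of exhausting memory.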
# Clone the repository
git clone https://github.com/justinjoy/wirelog.git
cd wirelog
# Build (requires Meson + C11 compiler)
meson setup build
meson compile -C build
# Run all tests
meson test -C build --print-errorlogs
# Run a simple graph benchmark (transitive closure)
./build/bench/bench_flowlog \
--workload tc \
--data bench/data/graph_100.csv
# Run the CSPA (pointer analysis) benchmark
./build/bench/bench_flowlog \
--workload cspa \
--data-cspa bench/data/cspa
Wirelog's incremental evaluation shines when processing updates to derived relations. The CSPA (Demand-Driven Context-Sensitive Pointer Analysis) benchmark demonstrates real-world gains with delta-seeded propagation:
| Metric | Baseline (Full Re-eval) | Incremental + Delta-Seeding | Gain |
|---|---|---|---|
| Evaluation time | 36.0s | 8.9s | 2.55x faster |
| Peak memory | 13.5GB | 6.2GB | 54% reduction |
| Iterations evaluated | 6 | 5 | 1 iteration skipped |
Why the speedup? The CSPA workload inserts derived facts incrementally. Wirelog:
- Pre-seeds delta relations (`$d$<name>`) with only new tuples (not full re-derivation)
- Identifies affected strata (rules depending on inserted facts)
- Re-evaluates only affected strata; unaffected rules skip to their frontier
- Result: Only delta facts propagate through evaluation, avoiding full IDB recomputation
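The affected-strata step in the list above amounts to a reachability computation over the stratum dependency graph. The sketch below shows one way to do it. The `strata_graph` layout, the `feeds` adjacency matrix, and the use of `-1` to mean "frontier reset, must re-run" are all assumptions for illustration and are not taken from wirelog's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STRATA 16 /* small fixed bound, for the sketch only */

/* Hypothetical layout; not wirelog's internal representation. */
typedef struct {
    size_t n;                           /* number of strata in use */
    bool feeds[MAX_STRATA][MAX_STRATA]; /* feeds[i][j]: stratum j reads output of stratum i */
    int frontier[MAX_STRATA];           /* iteration at which each stratum converged; -1 = reset */
} strata_graph;

/* Mark every stratum reachable from the seed strata along `feeds` edges.
 * seed[i] is true when stratum i reads a relation that received new facts. */
static void mark_affected(const strata_graph *g, const bool *seed, bool *affected) {
    size_t stack[MAX_STRATA];
    size_t top = 0;
    for (size_t i = 0; i < g->n; ++i) {
        affected[i] = seed[i];
        if (seed[i]) stack[top++] = i;
    }
    /* Depth-first walk; each stratum is pushed at most once. */
    while (top > 0) {
        size_t i = stack[--top];
        for (size_t j = 0; j < g->n; ++j) {
            if (g->feeds[i][j] && !affected[j]) {
                affected[j] = true;
                stack[top++] = j;
            }
        }
    }
}

/* Reset the frontier of affected strata only; unaffected strata keep
 * their recorded convergence point. Returns how many were reset. */
static size_t reset_affected_frontiers(strata_graph *g, const bool *seed) {
    bool affected[MAX_STRATA];
    size_t count = 0;
    assert(g->n <= MAX_STRATA);
    mark_affected(g, seed, affected);
    for (size_t i = 0; i < g->n; ++i) {
        if (affected[i]) {
            g->frontier[i] = -1;
            ++count;
        }
    }
    return count;
}
```

For a chain of strata 0 → 1 → 2 plus an independent stratum 3, new facts that feed stratum 1 reset strata 1 and 2 and leave 0 and 3 at their stored frontiers.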
Workload portfolio: 15+ benchmarks across graph analysis (TC, Reach, CC, SSSP), pointer analysis (Andersen, CSPA, CSDA), and program analysis (DOOP, Polonius, DDISASM)
Pipeline Stages:
- Parser: C11 hand-written recursive descent parses `.dl` files into IR
- Optimization Passes: IR → Fusion → JPP → SIP (logic fusion, join planning, semijoin filtering)
- Columnar Executor: Apache Arrow operators (FILTER, PROJECT, JOIN, FLATMAP, ARRANGE)
- Evaluation Modes: Baseline (full re-eval) or Incremental (delta-seeded with selective frontier skip)
- Result Callback: Query results delivered via session callback or written to output files
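To make the Fusion stage of the pipeline concrete, the sketch below contrasts a separate FILTER followed by PROJECT with a single fused pass over plain `int64_t` columns. The function names and the fixed "greater than threshold" predicate are assumptions for illustration; wirelog's operators work on nanoarrow arrays and are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unfused step 1 (FILTER): record the row indices that pass the predicate. */
static size_t filter_gt(const int64_t *col, size_t n, int64_t threshold,
                        size_t *sel) {
    size_t k = 0;
    for (size_t i = 0; i < n; ++i) {
        if (col[i] > threshold) sel[k++] = i;
    }
    return k;
}

/* Unfused step 2 (PROJECT): gather the wanted column through the selection. */
static void project_gather(const int64_t *col, const size_t *sel, size_t k,
                           int64_t *out) {
    for (size_t i = 0; i < k; ++i) out[i] = col[sel[i]];
}

/* Fused FLATMAP: test the predicate on column a and emit column b in the
 * same pass, so no intermediate selection vector is written or re-read. */
static size_t flatmap_gt_project(const int64_t *a, const int64_t *b, size_t n,
                                 int64_t threshold, int64_t *out) {
    size_t k = 0;
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (a[i] > threshold) out[k++] = b[i];
    }
    return k;
}
```

Both paths produce the same output rows. The fused version touches each input row once and skips the intermediate buffer, which is the kind of saving a fusion pass aims for.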
After an initial session_step() call, wirelog tracks the frontier (convergence point) for each stratum. On subsequent calls with new EDB facts:
- Delta pre-seeding: Populate `$d$<name>` delta relations from only new rows (not full re-derivation of entire IDB)
- Affected strata detection: Compute which strata depend on the inserted relations via data dependency graph
- Selective frontier reset: Reset only affected strata frontiers; unaffected strata retain convergence point and skip unnecessary iterations
- Semi-naive re-evaluation: Only delta tuples propagate through evaluation pipeline with selective frontier skipping
This delta-seeding strategy produces the 2.55x CSPA speedup: the delta path avoids re-computing the full IDB and only propagates new information through affected strata.
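The delta-seeding idea can be illustrated on transitive closure (`path(x,y) :- edge(x,y).` and `path(x,z) :- path(x,y), edge(y,z).`). The sketch below uses tiny dense boolean matrices instead of columnar storage, and every name in it (`rel`, `propagate`, `tc_initial`, `tc_insert`) is hypothetical. It is a textbook semi-naive loop under those assumptions, not wirelog's evaluator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define N 6 /* node ids are 0..N-1; tiny on purpose */

/* A binary relation as a dense boolean matrix (stand-in for columnar storage). */
typedef struct { bool m[N][N]; } rel;

static size_t rel_count(const rel *r) {
    size_t c = 0;
    for (int x = 0; x < N; ++x)
        for (int y = 0; y < N; ++y)
            if (r->m[x][y]) ++c;
    return c;
}

/* Semi-naive loop: merge delta into path, then derive new tuples from
 * delta only. Returns the number of rounds that processed a non-empty delta. */
static int propagate(rel *path, const rel *edge, rel *delta) {
    int rounds = 0;
    while (rel_count(delta) > 0) {
        rel next;
        memset(&next, 0, sizeof next);
        for (int x = 0; x < N; ++x)
            for (int y = 0; y < N; ++y)
                if (delta->m[x][y]) path->m[x][y] = true;
        for (int x = 0; x < N; ++x)
            for (int y = 0; y < N; ++y) {
                if (!delta->m[x][y]) continue;
                for (int z = 0; z < N; ++z)
                    if (edge->m[y][z] && !path->m[x][z]) next.m[x][z] = true;
            }
        *delta = next;
        ++rounds;
    }
    return rounds;
}

/* Initial evaluation: the base rule seeds delta with every edge. */
static int tc_initial(rel *path, const rel *edge) {
    rel delta = *edge;
    memset(path, 0, sizeof *path);
    return propagate(path, edge, &delta);
}

/* Incremental update: seed delta from the new edges only. The seeds are
 * the new edges themselves plus existing paths extended by one new edge. */
static int tc_insert(rel *path, rel *edge, const rel *new_edges) {
    rel delta;
    assert(path != NULL && edge != NULL && new_edges != NULL);
    memset(&delta, 0, sizeof delta);
    for (int y = 0; y < N; ++y)
        for (int z = 0; z < N; ++z) {
            if (!new_edges->m[y][z]) continue;
            edge->m[y][z] = true;
            if (!path->m[y][z]) delta.m[y][z] = true;
            for (int x = 0; x < N; ++x)
                if (path->m[x][y] && !path->m[x][z]) delta.m[x][z] = true;
        }
    return propagate(path, edge, &delta);
}
```

On the chain 0 → 1 → 2, inserting the edge 2 → 3 seeds the delta with three tuples and converges in one round. Re-running `tc_initial` on the enlarged edge set reaches the same closure but re-derives every tuple.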
| Pass | Description |
|---|---|
| Fusion | Merge adjacent FILTER+PROJECT into FLATMAP |
| JPP | Greedy join reorder for 3+ atom chains to minimize intermediate sizes |
| SIP | Insert semijoin pre-filters in join chains to reduce intermediate cardinality |
| Stride Evaluation | Iterate over arrangements in efficient strides to improve cache locality |
| Magic Sets | Demand-driven optimization pass for bottom-up evaluation |
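As an illustration of the SIP row in the table above, the sketch below pre-filters the left side of a join by the set of keys that occur on the right side, so rows that cannot match never enter the join. The bitset `keyset`, the assumption that keys are small integers below `KEY_DOMAIN`, and all function names are hypothetical; a real engine would more likely use a hash set or filter over arbitrary keys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_DOMAIN 256u /* assumption: join keys are integers in [0, 256) */

/* Hypothetical key set: one bit per possible key value. */
typedef struct {
    uint64_t bits[KEY_DOMAIN / 64];
} keyset;

/* Build the semijoin key set from the join column of the right relation. */
static void keyset_build(keyset *s, const uint32_t *keys, size_t n) {
    memset(s, 0, sizeof *s);
    for (size_t i = 0; i < n; ++i) {
        assert(keys[i] < KEY_DOMAIN);
        s->bits[keys[i] >> 6] |= (uint64_t)1 << (keys[i] & 63u);
    }
}

static bool keyset_has(const keyset *s, uint32_t k) {
    return k < KEY_DOMAIN && ((s->bits[k >> 6] >> (k & 63u)) & 1u) != 0;
}

/* Semijoin pre-filter: keep only left rows whose key occurs on the right.
 * Compacts both left columns in place and returns the new row count. */
static size_t semijoin_filter(uint32_t *lkey, int64_t *lval, size_t n,
                              const keyset *s) {
    size_t kept = 0;
    for (size_t i = 0; i < n; ++i) {
        if (keyset_has(s, lkey[i])) {
            lkey[kept] = lkey[i];
            lval[kept] = lval[i];
            ++kept;
        }
    }
    return kept;
}
```

With left keys {1, 2, 3, 4} and right keys {2, 4, 9}, only the rows keyed 2 and 4 survive, so the subsequent join sees half as many left rows.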
| Category | Workloads |
|---|---|
| Graph | TC, Reach, CC, SSSP, SG, Bipartite |
| Pointer Analysis | Andersen, CSPA, CSDA, Dyck-2 |
| Program Analysis | Galen, Polonius, CRDT, DDISASM, DOOP |
wirelog uses a two-track CI strategy: strict blocking gates on pull requests, and comprehensive non-blocking monitoring on the main branch (Issues #116, #117).
Every PR targeting main runs three sequential phases. Failure in an earlier phase stops later phases.
Phase 1: Lint (lint-pr.yml)
editorconfig-check
|
clang-format-18 [blocks if formatting differs from .clang-format]
|
clang-tidy-18 [blocks if static analysis warnings or errors found]
|
Phase 2: Build and Test (ci-pr.yml)
Linux / GCC -+
Linux / Clang +-- fail-fast: first failure cancels remaining matrix jobs
macOS / Clang-+
|
Phase 3: Sanitizers (ci-pr.yml)
Linux / GCC (ASan + UBSan) -+
Linux / Clang (ASan + UBSan) -+ fail-fast
A PR cannot be merged unless all three phases pass.
Pushes to main trigger broader coverage that runs to completion regardless of individual failures.
All phases run in parallel, continue-on-error: true
Lint monitor (lint-main.yml)
editorconfig-check
clang-format-18
clang-tidy-18
Build monitor (ci-main.yml)
Linux / GCC
Linux / Clang
macOS / Clang
Windows / MSVC <-- additional platform vs PR workflow
Sanitizers monitor (ci-main.yml)
Linux / GCC (ASan + UBSan)
Linux / Clang (ASan + UBSan)
macOS / Clang (ASan + UBSan) <-- additional platform vs PR workflow
| File | Trigger | Mode | Purpose |
|---|---|---|---|
| `lint-pr.yml` | PR to main | Blocking | Sequential lint gates |
| `ci-pr.yml` | PR to main | Blocking | Build + sanitizer gates |
| `lint-main.yml` | Push to main | Non-blocking | Lint health monitoring |
| `ci-main.yml` | Push to main | Non-blocking | Comprehensive build + sanitizer monitoring |
Before opening a PR, run these checks locally:
# 1. Format all modified C files (required -- CI hard-gates on this)
/opt/homebrew/opt/llvm@18/bin/clang-format --style=file -i wirelog/*.c wirelog/*.h
# 2. Verify no formatting diff remains
git diff wirelog/ tests/
# 3. Run the full test suite
meson test -C build --print-errorlogs
# 4. Optional: run with sanitizers locally
meson setup build-san \
-Db_sanitize=address,undefined \
-Db_lundef=false \
-Dtests=true \
--buildtype=debug
meson test -C build-san --print-errorlogs
Pre-commit hook (recommended): Add this to .git/hooks/pre-commit:
#!/bin/sh
# Format staged C sources with clang-format 18 and re-stage them.
CLANG_FORMAT=/opt/homebrew/opt/llvm@18/bin/clang-format
# Staged files that were added, copied, or modified, limited to .c and .h
changed=$(git diff --cached --name-only --diff-filter=ACM | grep -E '\.(c|h)$')
if [ -n "$changed" ]; then
    echo "$changed" | xargs "$CLANG_FORMAT" --style=file -i
    echo "$changed" | xargs git add
fi
PR merge requirements:
- All three CI phases must be green (lint, build, sanitizers)
- clang-format 18 must report zero violations
- clang-tidy 18 must report zero warnings or errors
- All tests must pass on Linux GCC, Linux Clang, and macOS Clang
Interpreting main branch CI failures:
- Failures in `CI Main` and `Lint Main` are informational and do not block subsequent commits
- They should be investigated and addressed in a follow-up PR
- The Windows MSVC job and macOS sanitizer job exist only in the main monitor
- ARCHITECTURE.md -- Internal system design (developer reference)
- LICENSE.md -- Licensing information
- CONTRIBUTING.md -- Contribution guidelines
- SECURITY.md -- Security policy
Wirelog uses dual licensing to serve both open-source and enterprise needs.
Wirelog is distributed under the GNU Lesser General Public License v3.0 (LGPL-3.0):
You can:
- ✓ Use wirelog as a library in proprietary applications (no source disclosure required)
- ✓ Modify and distribute modified versions (under LGPL-3.0)
- ✓ Deploy in closed-source products
- ✓ Link with proprietary code
You must:
- Document use of wirelog and provide a copy of LGPL-3.0
- Allow recipients to relink against modified versions of wirelog
- Disclose modifications to wirelog itself (not your application)
For full details, see LICENSE.md.
For use cases requiring different terms or proprietary extensions:
Contact: inquiry@cleverplant.com
Commercial licensing covers:
- Closed-source OEM embedding
- Custom feature development
- Priority support agreements
- Proprietary extensions (no LGPL obligations)
- Volume licensing discounts
Wirelog is designed for security-critical applications. We take security seriously.
Security Commitment:
- All pull requests are validated with AddressSanitizer (ASan) and UndefinedBehaviorSanitizer (UBSan)
- Memory safety is enforced through strict C11 and sanitizer passes in CI
- Zero-copy columnar architecture eliminates many attack surfaces
Report Security Issues: Please review SECURITY.md for vulnerability disclosure procedures.
- Do NOT open public GitHub issues for security vulnerabilities
- Email security concerns to the maintainers (see SECURITY.md)
- Wirelog follows responsible disclosure practices
Wirelog is open source under LGPL-3.0. Contributions are welcome and essential to the project's growth.
- Read the guidelines: Review CONTRIBUTING.md for development workflow
- Understand the code of conduct: See CODE_OF_CONDUCT.md
- Review security policy: Check SECURITY.md for vulnerability handling
- Sign the CLA: By submitting a PR, you agree to the Contributor License Agreement (CLA)
  - Required due to dual licensing (LGPL-3.0 + Commercial)
  - Protects both contributors and users
- Bug reports: File issues with reproduction steps and test case
- Performance improvements: Profile, optimize, and benchmark (CSPA workload preferred)
- New optimizations: Propose fusion rules, join orders, or query transformations
- Platform support: Add Windows/ARM/exotic platform handling
- Documentation: Improve guides, API docs, architecture notes
- Test coverage: Add regression tests, property-based tests, or benchmarks
# 1. Fork and clone
git clone https://github.com/YOUR_USERNAME/wirelog.git
cd wirelog
# 2. Create a feature branch
git checkout -b feat/your-feature
# 3. Make changes; ensure linting passes
/opt/homebrew/opt/llvm@18/bin/clang-format --style=file -i wirelog/*.c wirelog/*.h
# 4. Run tests locally
meson setup build
meson compile -C build
meson test -C build --print-errorlogs
# 5. Push and open a PR
git push origin feat/your-feature
PR Requirements:
- All three CI phases pass (lint, build+test, sanitizers)
- clang-format 18 validation ✓
- clang-tidy 18 clean ✓
- Tests pass on Linux (GCC + Clang) and macOS (Clang)
| Resource | Purpose |
|---|---|
| ARCHITECTURE.md | Internal design, optimizer pipeline, execution model |
| CONTRIBUTING.md | Development guidelines, PR workflow, CI expectations |
| SECURITY.md | Vulnerability disclosure, security policy |
| LICENSE.md | Full LGPL-3.0 text and licensing details |
| CODE_OF_CONDUCT.md | Community standards and expected behavior |
| CLA.md | Contributor License Agreement for dual licensing |
- FlowLog Paper: "FlowLog: Efficient and Extensible Datalog via Incrementality" — PVLDB 2025 — The foundational research behind wirelog's incremental evaluation strategy
- Apache Arrow / nanoarrow: apache/arrow-nanoarrow — Columnar memory format powering wirelog's backend
- Issues: File bugs, request features, or ask questions on GitHub Issues
- Discussions: Join the community on GitHub Discussions
- Commercial Support: Contact inquiry@cleverplant.com for enterprise support and consulting
Wirelog — Precise incremental Datalog for embedded and enterprise environments.
Built with performance, safety, and portability in mind.