This document defines the current technical architecture of CCT and organizes it by phase evolution.
CCT is a compiled language toolchain that transforms .cct sources into:
- executable binaries (via generated C + host C compiler)
- deterministic sigil artifacts (.svg + .sigil)
This architecture document preserves the detailed host-compiler architecture that was stabilized through FASE 20, and extends it with the validated bootstrap, self-hosting, and operational-platform architecture delivered in FASE 21 through FASE 30.
.cct source(s)
-> module closure/resolution (ADVOCARE)
-> lexer
-> parser + AST
-> semantic analysis
-> sigilo generation (local/system)
-> C code generation (.cgen.c)
-> host C compiler (cc/gcc/clang)
-> executable
- Backend remains C-hosted: .cgen.c + host C compiler is the official backend.
- Determinism-first for sigilo and generic materialization naming.
- Metadata-first sigilo model: structural counters and status fields are first-class outputs.
- Single-entry module closure with deterministic import graph traversal.
- Incremental phase evolution: no large redesign between phases unless explicitly planned.
- Project workflow layer: canonical build/run/test/bench/clean orchestration over the same compiler backend.
- Documentation layer: canonical cct doc API generation over project/module closure.
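The single-entry closure principle above can be illustrated with a small sketch. It is not the compiler's actual algorithm: the graph representation, the function name `module_closure`, and the choice to sort imports before visiting them are all assumptions made to show one way a traversal can be made independent of declaration order.

```python
from typing import Dict, List


def module_closure(entry: str, imports: Dict[str, List[str]]) -> List[str]:
    """Return every module reachable from `entry`, dependencies first."""
    order: List[str] = []
    visited = set()

    def visit(module: str) -> None:
        if module in visited:
            return  # already handled; this also stops import cycles
        visited.add(module)
        # Sorting (an assumption of this sketch) makes the visit order
        # independent of the order in which imports were declared.
        for dep in sorted(imports.get(module, [])):
            visit(dep)
        order.append(module)

    visit(entry)
    return order


# Hypothetical import graph; module names are illustrative only.
graph = {
    "main.cct": ["util.cct", "cct/fmt"],
    "util.cct": ["cct/fmt", "cct/verbum"],
}
closure = module_closure("main.cct", graph)
print(closure)
```

Because every module is visited once and imports are ordered before traversal, declaring the same imports in a different order yields the same closure list, which is the property a deterministic sigilo and codegen pipeline depends on.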
Purpose: shared types and error infrastructure.
Main files:
- types.h
- errors.h
- errors.c
- diagnostic.h
- diagnostic.c
- fuzzy.h
- fuzzy.c
Status: stable foundation with 12A diagnostic infrastructure (location snippets + suggestions).
Purpose: command parsing and user-facing compiler interface.
Main files:
- cli.h
- cli.c
Status: supports compile/check/ast/tokens/sigilo commands, formatter/linter, and 12F project workflow commands.
Purpose: canonical project discovery/cache/runner orchestration (build/run/test/bench/clean).
Main files:
- project.h
- project.c
- project_discovery.h
- project_discovery.c
- project_cache.h
- project_cache.c
- project_runner.h
- project_runner.c
Status: introduced in 12F with deterministic root discovery, basic incremental cache, and canonical project command execution.
Purpose: API documentation generation (cct doc) with markdown/html output.
Main files:
- doc.h
- doc.c
Status: introduced in 12G with deterministic module/symbol page generation and strict warning gate mode.
Purpose: module discovery, graph closure, symbol visibility boundaries, inter-module resolution.
Main files:
- module.h
- module.c
Status: stable through 9A/9B/9C, integrated with 9D/9E sigilo system behavior and 10C/10D contract/constraint visibility checks, extended in 11A with canonical stdlib namespace resolution (cct/...).
Purpose: lexical analysis.
Main files:
- lexer.h
- lexer.c
- keywords.h
Status: complete for current language surface.
Purpose: recursive-descent parsing and AST construction.
Main files:
- parser.h
- parser.c
- ast.h
- ast.c
Status: supports current syntax through 19D.4, including ELIGE (CUM legacy alias), FORMA, payload ORDO, and expanded ITERUM arity forms in supported contexts.
Purpose: symbol registration, scope checks, type checks, contract/constraint validation.
Main files:
- semantic.h
- semantic.c
Status: consolidated through 19D.4 for the defined subset, including ORDO exhaustiveness diagnostics and ITERUM arity validation by collection kind.
Purpose: executable code generation and generic materialization.
Main files:
- codegen.h
- codegen.c
- codegen_internal.h
- codegen_contract.c
- codegen_runtime_bridge.c
Status: pragmatic executable subset with deterministic monomorphization and dedup behavior.
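As a rough illustration of deterministic monomorphization with dedup, the sketch below collapses repeated instantiation requests into one materialized name each and emits them in a stable order. The mangling scheme (`name__Arg1_Arg2`), the type names, and the function names are hypothetical; the real naming contract lives in the code generator and is not reproduced here.

```python
from typing import Dict, List, Tuple

# (generic name, type arguments); both are illustrative placeholders.
Instantiation = Tuple[str, Tuple[str, ...]]


def mangle(name: str, type_args: Tuple[str, ...]) -> str:
    """Hypothetical naming scheme: depends only on the name and its arguments."""
    return name + "__" + "_".join(type_args)


def materialize(requests: List[Instantiation]) -> List[str]:
    """Deduplicate instantiation requests and return names in a stable order."""
    unique: Dict[str, Instantiation] = {}
    for name, type_args in requests:
        # The first request wins; later identical requests are dropped.
        unique.setdefault(mangle(name, type_args), (name, type_args))
    # Sorting removes any dependence on the order requests were discovered.
    return sorted(unique)


requests = [
    ("pair", ("Int", "Text")),
    ("box", ("Int",)),
    ("pair", ("Int", "Text")),  # duplicate instantiation
]
emitted = materialize(requests)
print(emitted)
```

The point of the sketch is that the emitted set is a pure function of the requested instantiations, so two builds of the same sources produce the same generated symbols.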
Purpose: generated-code runtime helpers and low-level support paths.
Main files:
runtime.hruntime.c
Status: minimal but functional runtime layer for current executable subset.
Purpose: physical root of Bibliotheca Canonica modules distributed with the compiler.
Status: introduced in 11A as foundation (namespace/resolution contract + test stub module), expanded in later 11.x subphases.
Purpose: deterministic local and system sigilo generation.
Main files:
- sigilo.h
- sigilo.c
Status: mature through 9D/9D2/9E and extended in FASE 14T with source-context extraction, tooltip normalization, conditional SVG wrapper/title emission, deterministic local node/call-edge data-*, and root-level semantic toggles.
- FASE 0: project and build foundation
- FASE 1: lexer
- FASE 2A/2B: parser + AST hardening
- FASE 3: semantic core
- FASE 4A/4B/4C: executable codegen subset establishment
- FASE 5/5B/6A: sigilo v1 -> v2 -> style/metadata refinement
- FASE 6B/6C/6D: executable language expansion + backend/runtime hardening
- FASE 7A/7B/7C/7D: pointer/memory subset + advanced SIGILLUM subset + phase-7 consolidation
- FASE 8A/8B/8C/8D: failure-control model (IACE, TEMPTA/CAPE/SEMPER) and final consolidation
- FASE 9A: module discovery (ADVOCARE)
- FASE 9B: inter-module resolution and linking policy
- FASE 9C: visibility boundary (ARCANUM internal)
- FASE 9D: official local + system sigilo model with essential/complete modes
- FASE 9D2: system sigilo as vector inline sigil-of-sigils composition
- FASE 9E: modular architecture closure and CLI contract stabilization
- FASE 10A: GENUS core model
- FASE 10B: executable monomorphization and dedup
- FASE 10C: PACTUM semantic conformance
- FASE 10D: GENUS + PACTUM constraints (GENUS(T PACTUM C)) in supported scope
- FASE 10E: final consolidation, harmonized diagnostics, frozen subset contract
- FASE 11A: reserved namespace cct/..., canonical stdlib path resolution, distribution contract, and stdlib/runtime/builtin boundary documentation
- FASE 11B.1 / 11B.2: canonical text and formatting modules (cct/verbum, cct/fmt)
- FASE 11C: canonical static collections and baseline algorithms (cct/series, cct/alg)
- FASE 11D.1 / 11D.2 / 11D.3: canonical memory and dynamic-vector stack (cct/mem, runtime storage core, cct/fluxus)
- FASE 11E.1 / 11E.2: canonical IO/filesystem/path stack (cct/io, cct/fs, cct/path)
- FASE 11F.1: canonical numeric/random baseline (cct/math, cct/random)
- FASE 11F.2: canonical parse/compare layer and moderate algorithm expansion (cct/parse, cct/cmp, alg_binary_search, alg_sort_insertion)
- FASE 11G: canonical showcase/public face consolidation + stdlib usage metadata integration in sigilo
- FASE 11H: final stdlib subset freeze, stability governance, packaging/install closure
- FASE 12A: structured diagnostics with source snippets, optional correction hints, typo fuzzy matching, and --no-color CLI switch
- FASE 12B: explicit numeric cast baseline (cast GENUS(T)(expr)) for executable subset
- FASE 12C: canonical Option/Result pointer-backed baseline in stdlib/runtime bridge
- FASE 12D.1: canonical hash-backed containers (cct/map, cct/set) integrated with Option and sigilo counters
- FASE 12D.2: canonical collection combinators (cct/collection_ops) for FLUXUS/SERIES map/filter/fold/find/any/all with callback-pointer bridge
- FASE 12D.3: iterator statement baseline (ITERUM item IN collection COM ... FIN ITERUM) over FLUXUS/SERIES and collection-op outputs
- FASE 12E.1: standalone formatter command (cct fmt) with in-place, check, and diff modes
- FASE 12E.2: canonical linter command (cct lint) with stable rule IDs, strict mode, and safe-fix subset
- FASE 13A.1: canonical .sigil parser/reader runtime suite (test_sigil_parse)
- FASE 13A.2: structural diff engine runtime suite (test_sigil_diff)
- FASE 13A.3: sigilo inspect|diff|check operational CLI contract
- FASE 13A.4: baseline check/update contract for local and CI workflows
- FASE 13B.1/13B.2: local + project workflow opt-in integration (build|test|bench --sigilo-check)
- FASE 13B.3/13B.4: CI profile gates (advisory|gated|release) and operator-facing report/explain outputs
- FASE 13C.1–13C.4: schema governance, compatibility profiles, and strict/tolerant validator contract (sigilo validate)
- FASE 13D.1: dedicated regression matrix runner:
  - tests/run_phase13_regression.sh
  - canonical fixture tree tests/integration/phase13_regression_13d1/
  - integrated in the global runner (tests/run_tests.sh)
- FASE 13D.2: determinism audit runner and release audit evidence:
  - tests/run_phase13_determinism_audit.sh
- FASE 13D.3: release-document consolidation package:
  - snapshot + stability/compatibility/limits/release-notes documents under docs/release/
- FASE 13D.4: final closure gate and residual-risk register for phase exit governance
- FASE 13M.A1: scope freeze and semantic contract (**, //, %%; ^ deferred)
- FASE 13M.A2: compiler implementation in lexer/parser/semantic/codegen/runtime bridge
- FASE 13M.B1: deep test matrix and non-regression proof integrated in tests/run_tests.sh
- FASE 13M.B2: documentation/release closure with executable examples and phase closure gate
- FASE 14A.1: canonical diagnostic taxonomy (error|warning|note|hint) with compatibility preservation (suggestion: + hint:)
- FASE 14A.2: canonical exit-code contract harmonized across CLI and strict gates (0|1|2|3|4)
- FASE 14A.3: explain/troubleshooting mode (--explain) on sigilo operational commands with actionable guidance
- FASE 14A.4: deterministic sigilo diagnostic ordering and log-noise hygiene for repeatable CI outputs
- FASE 14B.1: public contract harmonization across README, spec, architecture, roadmap, and CLI status strings
- FASE 14B.2: consolidated sigilo operations guide for high-value workflows:
  - docs/sigilo_operations_14b2.md
- FASE 14B.3: publication visibility policy (md_out private vs curated docs public) with manifests:
- FASE 14B.4: release-doc reference pack templates:
- FASE 14C.1: expanded regression matrix with composed risk blocks:
  - tests/run_phase14c1_regression_matrix.sh
- FASE 14C.2: stress/soak operational stability gate:
  - tests/run_phase14c2_stress_soak.sh
- FASE 14C.3: performance baseline/budget gate:
  - tests/run_phase14c3_perf_budget.sh
- FASE 14C.4: residual risk + known limits hardening:
  - public summary in docs/release/FASE_14_RELEASE_NOTES.md
- source-context extraction normalizes LF/CRLF input and indexes semantic elements by original source location
- tooltip builder prepares deterministic, XML-safe, clipped hover text for ritual nodes, structural nodes, local edges, and system nodes/edges
- local SVG emission gained conditional wrappers for hoverable elements plus additive deterministic data-* on local nodes and call edges
- local/system roots gained lightweight semantic metadata (role, aria-label, desc) when instrumentation is enabled
- CLI controls --sigilo-no-titles and --sigilo-no-data allow explicit generation of partially instrumented or plain pre-14T SVGs without changing the underlying layout engine
- practical effect: the rendered SVG itself became an inspection surface, where circles and lines can be hovered directly to reveal semantic context
- detailed governance artifacts in internal release records
- FASE 14D.1: reproducible packaging gate:
  - tests/run_phase14d1_packaging_repro.sh
- FASE 14D.2: release candidate validation matrix:
  - tests/run_phase14d2_rc_validation.sh
- FASE 14D.3: final release artifact consolidation:
  - docs/release/FASE_14_RELEASE_NOTES.md
  - consolidated internal release artifacts under private governance records
- FASE 14D.4: closure gate and rollback:
- C-hosted backend is official.
- Sigilo architecture is dual-level and stable:
  - local sigilo per module
  - system sigilo (.system) as composed sigil-of-sigils
- Generic execution requires explicit instantiation.
- Contract conformance is explicit (SIGILLUM ... PACTUM ...).
- Constraint form is intentionally limited to the stabilized subset.
- Bibliotheca Canonica foundational contract is active:
  - reserved namespace cct/...
  - canonical stdlib path resolution
  - stdlib modules remain observable in --ast-composite and --check
- Bibliotheca Canonica operational baseline now includes:
  - text/format layer (verbum, fmt)
  - collection/memory layer (series, alg, mem, fluxus, collection_ops, map, set)
  - external interaction layer (io, fs, path)
  - utility baseline (math, random, parse, cmp)
- IO/FS/PATH boundaries are explicit:
  - io: terminal input/output primitives
  - fs: whole-file operations
  - path: textual path composition/decomposition (no mini-OS scope)
- MATH/RANDOM boundaries are explicit:
  - math: minimal deterministic numeric helpers
  - random: runtime-backed PRNG baseline (seed, random_int, random_real)
- PARSE/CMP boundaries are explicit:
  - parse: strict textual conversion for primitive values
  - cmp: canonical comparator contract (<0, 0, >0) across core scalar/text domains
- Option/Result + hash-backed collection boundaries are explicit:
  - option/result: pointer-backed value-absence and success/failure wrappers
  - map/set: hash-backed opaque containers with explicit lifecycle (*_free)
- Public showcase layer is now part of platform architecture:
  - curated canonical examples in examples/showcase_stdlib_*_11g.cct
  - mirrored integration fixtures in tests/integration/showcase_*_11g.cct
  - multi-module showcase validated with --ast-composite and modular sigilo modes
- Sigilo metadata now carries explicit stdlib usage inventory/counters used by public showcase and module workflows
- Release packaging contract is now defined:
  - make dist builds relocatable bundle (bin, lib/cct, docs, examples)
  - wrapper binary can inject stdlib root through CCT_STDLIB_DIR
  - make install / make uninstall are prefix-driven and install wrapper + stdlib tree
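The cmp comparator contract (<0, 0, >0) listed above is the familiar three-way comparison shape. The sketch below shows that shape in Python; it is only an analogy, and the names `cmp_int` and `cmp_text` are illustrative rather than the cct/cmp API.

```python
from functools import cmp_to_key


def cmp_int(a: int, b: int) -> int:
    """Negative if a < b, zero if equal, positive if a > b."""
    return (a > b) - (a < b)


def cmp_text(a: str, b: str) -> int:
    """Same contract over text, using code-point ordering."""
    return (a > b) - (a < b)


# Any consumer that only inspects the sign of the result can work with
# every comparator that honours the contract, e.g. a generic sort.
numbers = sorted([3, 1, 2], key=cmp_to_key(cmp_int))
words = sorted(["rex", "lux", "pax"], key=cmp_to_key(cmp_text))
print(numbers, words)
```

Callers should test only the sign of the result, never a specific magnitude, since the contract promises <0, 0, or >0 and nothing more.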
- Type argument inference
- Multi-contract constraints per type parameter
- Advanced constraint solver and dynamic contract dispatch
- Full ownership/lifetime system
Architecture direction:
- introduce a coherent stdlib surface over the stabilized language core
- define packaging/import conventions for reusable modules
- keep compiler core contracts stable while expanding library ergonomics
Architecture direction:
- performance and codegen/runtime quality improvements
- possible backend specialization paths while preserving public compiler behavior
- stronger runtime structure without breaking existing subset guarantees
Architecture direction:
- richer tooling around sigilo analysis and workflows
- persisted, versioned baseline contract for sigilo drift checks
- explicit schema governance for .sigil evolution (13C.1), including additive-only policy in FASE 13
- canonical local workflow profiles (minimal/strict) to operationalize sigilo checks without coupling to CI gates
- optional visual/metadata enhancements while preserving deterministic model
- no regression of local/system sigilo contracts
Architecture direction:
- release engineering, diagnostics polish, compatibility audits
- documentation and operational consistency across compiler and tooling
Architecture direction:
- close the loop-control and operator gap in the executable subset with full regression coverage
- preserve deterministic and compatibility-first behavior while expanding semantics
Implemented closure set:
- 15A.1/15A.2/15A.3/15A.4: FRANGE and RECEDE stabilized across DUM, DONEC, REPETE, and ITERUM, including outside-loop diagnostics and REPETE increment safety
- 15B.1/15B.2/15B.3/15B.4: logical ET/VEL finalized with short-circuit guarantees, precedence (NON > ET > VEL), parentheses stress support, and comparator integration
- 15C.1/15C.2/15C.3/15C.4: bitwise and shift family finalized (ET_BIT, VEL_BIT, XOR, NON_BIT, SINISTER, DEXTER) with integer-only enforcement and closure integration checks
- 15D.1/15D.2/15D.3/15D.4: CONSTANS finalized for locals and rituale parameters, const codegen guarantees, pointer-binding (CONSTANS SPECULUM) behavior, edge-case closure tests, and phase-closure governance
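The 15B precedence and short-circuit guarantees can be pictured with an analogy. Python's not/and/or bind in the same relative order as NON > ET > VEL, so the sketch below uses them as stand-ins; it is not CCT code, and the `operand` helper exists only to record which operands were actually evaluated.

```python
from itertools import product

calls = []


def operand(name: str, value: bool) -> bool:
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value


def same_grouping() -> bool:
    """Check that `not a or b and c` groups as `(not a) or (b and c)`."""
    return all(
        (not a or b and c) == ((not a) or (b and c))
        for a, b, c in product([False, True], repeat=3)
    )


# Analogue of: NON a VEL b ET c
# The left side of the VEL is already true, so short-circuit evaluation
# must skip both `b` and `c`.
result = (not operand("a", False)) or (operand("b", True) and operand("c", True))
print(result, calls)
```

Skipping the right-hand operands matters whenever they have side effects or guard against invalid access, which is why short-circuiting is stated as a guarantee rather than an optimization.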
Architecture direction:
- start the bootstrap-oriented track while preserving host defaults
- keep CCT host-executed while introducing freestanding/ASM delivery contracts for LBOS integration
- enforce ABI/profile compatibility gates before broader bootstrap expansion
Reference architecture dossier:
md_out/FASE_16_CCT.md
Architecture direction:
- unlock bootstrap/tooling workflows with practical text, cursor, args, variant, AST, and utility modules.
- preserve deterministic module/tooling behavior while expanding canonical host APIs.
Architecture direction:
- deliver broad host utility surface (fs, io, path, process) and deeper canonical algorithms/collections.
- extend deterministic utility baseline (hash, bit, random expansions) without destabilizing compiler contracts.
Architecture direction:
- add first-class selection/pattern ergonomics (ELIGE, with CUM kept for compatibility) while preserving deterministic codegen contracts
- add interpolated string construction (FORMA) through runtime-backed builders in host profile
- add payload-capable ORDO representation with union-tagged lowering and scoped destructuring bindings
- unify collection iteration ergonomics by extending ITERUM to map/set with insertion-order semantics
- close documentation/release/handoff artifacts for the full phase (19D2, 19D3, 19D4)
Codegen/runtime notes:
- ELIGE lowers to integer/string dispatch chains and ORDO-tag switch forms with payload extraction locals.
- FORMA lowers to runtime helper sequences (molde_begin/append/end) and keeps ownership behavior explicit.
- payload ORDO lowers to tagged C structs with variant-specific payload unions.
- ITERUM now emits dedicated runtime iterator helpers for map and set (cct_rt_map_iter_*, cct_rt_set_iter_*), preserving insertion order.
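To make the payload ORDO lowering more concrete, the sketch below generates the kind of tagged struct plus payload union described in the notes above. The generator function, the emitted identifier scheme (`Shape_tag`, `payload`), and the example variants are all hypothetical; the actual codegen output may be laid out and named differently. The sketch assumes at least one variant carries a payload.

```python
from typing import Dict, List, Tuple

Field = Tuple[str, str]  # (C type, field name)


def lower_ordo(name: str, variants: Dict[str, List[Field]]) -> str:
    """Emit C text for a tag enum and a struct holding a payload union."""
    tags = ", ".join(f"{name}_{variant}" for variant in variants)
    lines = [
        f"typedef enum {{ {tags} }} {name}_tag;",
        "typedef struct {",
        f"    {name}_tag tag;",
        "    union {",
    ]
    for variant, fields in variants.items():
        if not fields:
            continue  # payload-free variants are represented by the tag alone
        members = " ".join(f"{ctype} {fname};" for ctype, fname in fields)
        lines.append(f"        struct {{ {members} }} {variant};")
    lines += ["    } payload;", f"}} {name};"]
    return "\n".join(lines)


# Illustrative enum with two payload variants and one plain variant.
c_text = lower_ordo(
    "Shape",
    {
        "Circle": [("double", "radius")],
        "Rect": [("double", "w"), ("double", "h")],
        "Empty": [],
    },
)
print(c_text)
```

With this shape, an ELIGE over the enum can lower to a switch on `tag`, reading only the union member that matches the selected case.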
Architecture direction:
- turn the stabilized host subset into a practical application platform while keeping the language core unchanged
- keep protocol/model/config logic in CCT wherever possible
- keep runtime C additions thin and limited to host boundary surfaces such as sockets and SQLite
Library/runtime notes:
- cct/json, cct/http, cct/config, and most of cct/net are implemented as canonical CCT modules over existing runtime services.
- cct/socket and cct/db_sqlite extend the generated-host runtime with narrow OS/library bindings only.
- generated C now links -lsqlite3 on demand when SQLite builtins are actually used by the source program.
- host-only boundaries remain explicit: sockets, HTTP, config over host FS/env, and SQLite are not part of the freestanding profile.
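The on-demand SQLite link can be sketched as a small decision in the step that assembles the host compiler invocation. Everything below is an assumption for illustration: the function name, the `db_sqlite_` builtin prefix, and the use of a plain `cc` command line. Only the `-lsqlite3` flag and the "link only when used" rule come from the notes above.

```python
from typing import List, Set


def link_command(c_file: str, output: str, used_builtins: Set[str]) -> List[str]:
    """Build a host compiler command, adding -lsqlite3 only when needed."""
    cmd = ["cc", c_file, "-o", output]
    # Hypothetical convention: SQLite builtins share a common name prefix.
    if any(name.startswith("db_sqlite_") for name in used_builtins):
        cmd.append("-lsqlite3")
    return cmd


plain = link_command("app.cgen.c", "app", {"io_print"})
with_db = link_command("store.cgen.c", "store", {"io_print", "db_sqlite_open"})
print(plain)
print(with_db)
```

Keeping the flag conditional means programs that never touch SQLite still build on hosts where the library is not installed.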
Operational closure:
- documentation, examples, release notes, and handoff artifacts are part of the phase definition of done.
- examples now cover HTTP JSON client/server, config + SQLite integration, and UDP loopback/demo flows.
Architecture quality is enforced by a phase-accumulated regression suite (tests/run_tests.sh) covering:
- lexical, parser, semantic, codegen, runtime, sigilo, module behavior
- deterministic output constraints
- boundary diagnostics and subset-policy enforcement
The architecture is considered stable only when full historical regression remains green.
Latest closure evidence (FASE 20F final gate): Passed: 1181, Failed: 0.
This file is an architectural snapshot and organization layer. For detailed language behavior, execution subset boundaries, and roadmap specifics, also see:
- docs/spec.md
- docs/roadmap.md
- README.md
From FASE 21 onward, CCT gained a second compiler implementation written in CCT itself. The bootstrap compiler mirrors the host architecture:
- bootstrap lexer in src/bootstrap/lexer/
- bootstrap parser/AST in src/bootstrap/parser/
- bootstrap semantic analyzer in src/bootstrap/semantic/
- bootstrap code generator in src/bootstrap/codegen/
- self-hosted compiler entrypoints in src/bootstrap/main_*.cct
This was not introduced as a prototype-only experiment. The bootstrap stack is a validated architectural layer with dedicated regression gates and stage convergence requirements.
The host backend remains generated C plus a host C compiler. The bootstrap compiler converges to the same backend model:
CCT source
-> bootstrap front-end in CCT
-> bootstrap semantic model in CCT
-> bootstrap C code generation in CCT
-> host C compiler
-> executable
This preserves a single executable artifact strategy while allowing self-hosted evolution.
The post-FASE-20 architecture gained the following validated backend milestones:
- FASE 26: foundational bootstrap code generation
- FASE 27: structural code generation for SIGILLUM, ORDO, and ELIGE
- FASE 28: generic materialization, advanced control flow, failure-control lowering, and FORMA
- FASE 29: stage0/stage1/stage2 convergence
- FASE 30: operational self-host workflows and mature application modules
CCT now has multiple validation planes:
- historical legacy suites
- rebased legacy compatibility suites
- bootstrap compiler suites
- self-host convergence suites
- operational self-host platform suites
The authoritative whole-project gate is make test-all-0-30.
CCT should now be understood as a dual-implementation compiler platform:
- host compiler in C for baseline production stability
- bootstrap/self-host compiler in CCT for language self-sufficiency and future primary-toolchain evolution
This duality is intentional and is part of the current architecture, not technical debt.
The architecture after FASE 31 adds an entry-layer abstraction on top of the already-complete dual-implementation model.
The repository should now be read as four distinct layers:
- host compiler binary (cct.bin and the cct-host wrapper)
- explicit self-host wrapper (cct-selfhost)
- default wrapper (cct)
- mode state controlling which compiler path the default wrapper exposes
This layer exists to preserve CLI continuity while changing the operational implementation path.
The wrapper is not cosmetic. It solves a real architectural problem:
- users should not need different top-level commands for normal work depending on the active compiler implementation
- the project must preserve an emergency fallback path while promoting the self-hosted compiler
- promotion and demotion must be reversible and testable without replacing or destroying the host implementation
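The state-driven dispatch behind the default wrapper can be pictured as follows. This is a minimal sketch, not the wrapper's real logic: the mode names (`host`, `selfhost`), the in-memory state mapping, and the policy of falling back to the host compiler on a missing or unknown mode are assumptions. Only the command names cct-host and cct-selfhost come from the layer list above.

```python
from typing import List, Mapping

# Hypothetical mode names mapped to the explicit wrappers.
BACKENDS = {"host": "cct-host", "selfhost": "cct-selfhost"}


def resolve_command(state: Mapping[str, str], args: List[str]) -> List[str]:
    """Pick the compiler path for the default wrapper from the mode state."""
    mode = state.get("mode", "host")
    # Assumed policy: an unknown mode degrades to the host implementation,
    # which keeps the emergency fallback path reachable.
    backend = BACKENDS.get(mode, BACKENDS["host"])
    return [backend, *args]


promoted = resolve_command({"mode": "selfhost"}, ["build"])
demoted = resolve_command({"mode": "host"}, ["build"])
fallback = resolve_command({}, ["build"])
print(promoted, demoted, fallback)
```

Because promotion and demotion only change the stored mode, neither compiler is replaced or removed, and the same top-level command keeps working in both states.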
The architecture evolution should now be read as:
- FASE 29: self-host convergence achieved
- FASE 30: self-host operational workflows validated
- FASE 31: self-hosted compiler promoted to the default operational compiler path
CCT is no longer only a dual-implementation compiler platform. It is now a dual-implementation platform with an explicit operational entry layer and state-driven mode switching.
The authoritative whole-project gate therefore moves conceptually from make test-all-0-30 toward make test-all-0-31 for publication and post-promotion auditing.