diff --git a/CHANGELOG.md b/CHANGELOG.md index f2df0cb4..ab172280 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -29,6 +29,7 @@ while in pre-1.0 mode (`v0.x.y`). ### Changed - **BREAKING (`pkg/nucleus` rewrite — ADR-010 Phase 1 Foundation): the legacy fluent chain is replaced wholesale.** The pre-Phase-1 surface (`nucleus.New().Port().Host().SQLite().Postgres().MySQL().WithAdmin().SPA().Templates().Static().Cors().Provide().Model().AutoMigrate().Run()`, the `Resource(path, controller)` shape requiring a five-method controller, the `RouterGroup` struct, the legacy `Load(path)` that panicked on error) is removed entirely. The new surface — canonical `nucleus.App{}` struct embedding `app.Config`, generic `nucleus.Module[C any]` with `Build() ModuleSpec`, `nucleus.Router` interface with three coexisting registration styles (flat, REST `Resource(path, controller, nucleus.Methods(...))` with explicit verb registration, and nested `Group(prefix, func(g Router))`), three coexisting entry surfaces (fluent builder, direct struct, bootstrap pattern) producing equal `App{}` values per `pkg/nucleus/equivalence_test.go` — lands in this PR. `FromConfigFile` is shape-only in Phase 1 and returns `ErrConfigLoaderNotImplemented` at `Build`/`Start`/`Serve` time; the five-layer validator and merge engine arrive in Phase 2. Pre-`v1.0` clean break per the ADR-006 / ADR-008 precedent — no DEP/MA artefacts, no WARN-wrapped legacy methods. See [ADR-010](docs/adrs/ADR-010-fluent-api-v2-pkg-nucleus.md). +- **BREAKING (`pkg/nucleus.FromConfigFile` is now operational — ADR-010 Phase 2a): the Phase 1 stub is replaced by a real single-file YAML loader.** `AppBuilder.FromConfigFile(path)` now loads the named file via koanf, applies struct defaults from `app.DefaultConfig()`, and returns a populated `nucleus.App` when called through `Build`/`Start`/`Serve`. 
Three validation guards land alongside: a **1 MiB per-file size cap** (`MaxConfigFileBytes`) enforced before parsing — eliminates anchor-expansion / deep-nesting DoS classes against `gopkg.in/yaml.v3`; **strict-unknown-fields schema validation** against `app.ContractConfigKeyPatterns()` — unknown keys surface as `ErrUnknownConfigKeys` with did-you-mean hints for likely typos (Levenshtein distance ≤3 on the final segment); and **extension-based parser inference** — `.yaml`/`.yml` work today, `.toml`/`.json` produce a targeted `ErrUnsupportedConfigFormat` referencing Phase 2b. Multi-file `FromConfigFile(a.yaml, b.yaml)` fails fast with a Phase 2b reference until the merge engine lands. `ErrConfigLoaderNotImplemented` is removed (clean break — pre-`v1.0` Phase-1 stub now retired). Three new exported sentinels (`ErrConfigFileTooLarge`, `ErrUnsupportedConfigFormat`, `ErrUnknownConfigKeys`) and one constant (`MaxConfigFileBytes`) join the `pkg/nucleus` baseline. New dep: `github.com/knadh/koanf/providers/rawbytes` (zero-go, sibling of the YAML provider already in tree). See [ADR-010](docs/adrs/ADR-010-fluent-api-v2-pkg-nucleus.md) §2. - **BREAKING (`examples/*` removed): every example application is removed; new reference applications will land in v0.9.X.** Owner decision dated 2026-05-16, recorded in ADR-010: the original Phase 1 plan rewrote the two `examples/ecommerce_dashboard/backend/*` consumers in the same PR. Instead, the entire `examples/*` tree (`admin-quickstart`, `balancer`, `ecommerce_dashboard`, `fleetmanager`, `ministore`, `mvc_api`, `plugins`) was removed, alongside the runnable lab scripts (`scripts/cluster-{start,stop}.sh`, `scripts/dev/run_admin_cluster_lab.{sh,ps1}`) and the example-dependent docs (`docs/ADMIN_CLUSTER_LAB.md`, `docs/reference/PLUGIN_EXAMPLES.md`). 
The compatibility harness loses its three fixture profiles (`minimal-api`, `admin-heavy`, `plugin-heavy`) for this window and runs a `core-build` placeholder; the fixture profiles return with the new reference applications in v0.9.X (ADR-010 Phase 4). The `Dockerfile` now builds and ships the `nucleus` CLI rather than the previous `examples/mvc_api` server. Migration is empty for external users (there were none); operators downstream that previously consumed the example-server Docker image should pin to a pre-2026-05-16 tag until v0.9.X. - **Behaviour change (`pkg/observe`, stable surface): `NewLogger` redacts secret-keyed attributes by default.** A deployment that intentionally logged a field under a denylisted key (e.g. an opaque non-secret named `token`) now sees `[REDACTED]` there. This is the intended security default per [ADR-007](docs/adrs/ADR-007-slog-secret-redaction.md); the escape hatch is `observe.NewLoggerWithRedaction` with the key omitted, a renamed attribute, or `RedactionConfig.Disabled`. No `DEP-` entry (no symbol removed or renamed). - **BREAKING (CSRF XSRF-cookie config): `EncryptionKey` is mandatory and must be exactly 32 bytes when `EnableXSRFCookie` is `true`.** `pkg/router` is a `stable` surface; this is a deliberate behaviour change per [ADR-006](docs/adrs/ADR-006-csrf-hardening.md). An application that called `CSRFMiddleware` with `EnableXSRFCookie: true` and no (or a non-32-byte) `EncryptionKey` previously started successfully with a weak/truncated key; it now **panics at startup** (or, via `NewCSRFMiddleware`, returns `router.ErrCSRFEncryptionKey`). Migration: set `EncryptionKey` to exactly 32 bytes, sourced from the environment or a secret manager — see `docs/guides/CSRF_GUIDE.md`. Deployments with `EnableXSRFCookie: false` (the default) are unaffected: `EncryptionKey` stays optional and unvalidated for them. 
diff --git a/contracts/baseline/api_exported_symbols.txt b/contracts/baseline/api_exported_symbols.txt index 9ac58f30..857c51ba 100644 --- a/contracts/baseline/api_exported_symbols.txt +++ b/contracts/baseline/api_exported_symbols.txt @@ -642,6 +642,7 @@ github.com/jcsvwinston/nucleus/pkg/model type:QueryOpts github.com/jcsvwinston/nucleus/pkg/model type:Registry github.com/jcsvwinston/nucleus/pkg/model type:SQLQueryEvent github.com/jcsvwinston/nucleus/pkg/model type:SQLQueryObserver +github.com/jcsvwinston/nucleus/pkg/nucleus const:MaxConfigFileBytes github.com/jcsvwinston/nucleus/pkg/nucleus field:App.Lifecycle github.com/jcsvwinston/nucleus/pkg/nucleus field:App.Middleware github.com/jcsvwinston/nucleus/pkg/nucleus field:App.Modules @@ -742,7 +743,9 @@ github.com/jcsvwinston/nucleus/pkg/nucleus type:ServiceRegistration github.com/jcsvwinston/nucleus/pkg/nucleus type:Shower github.com/jcsvwinston/nucleus/pkg/nucleus type:Updater github.com/jcsvwinston/nucleus/pkg/nucleus type:WebhookRegistry -github.com/jcsvwinston/nucleus/pkg/nucleus var:ErrConfigLoaderNotImplemented +github.com/jcsvwinston/nucleus/pkg/nucleus var:ErrConfigFileTooLarge +github.com/jcsvwinston/nucleus/pkg/nucleus var:ErrUnknownConfigKeys +github.com/jcsvwinston/nucleus/pkg/nucleus var:ErrUnsupportedConfigFormat github.com/jcsvwinston/nucleus/pkg/observe const:RedactionPlaceholder github.com/jcsvwinston/nucleus/pkg/observe field:RedactionConfig.Disabled github.com/jcsvwinston/nucleus/pkg/observe field:RedactionConfig.ExtraKeys diff --git a/go.mod b/go.mod index 19159cc5..549bcfc2 100644 --- a/go.mod +++ b/go.mod @@ -7,6 +7,8 @@ require ( github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.6.4 github.com/alexedwards/scs/v2 v2.9.0 github.com/alicebob/miniredis/v2 v2.37.0 + github.com/aws/aws-sdk-go-v2/config v1.32.17 + github.com/aws/aws-sdk-go-v2/service/secretsmanager v1.41.7 github.com/bradfitz/gomemcache v0.0.0-20260422231931-4d751bb6e37c github.com/casbin/casbin/v2 v2.135.0 
github.com/go-playground/validator/v10 v10.25.0 @@ -18,10 +20,12 @@ require ( github.com/knadh/koanf/parsers/yaml v1.1.0 github.com/knadh/koanf/providers/env v1.1.0 github.com/knadh/koanf/providers/file v1.2.1 + github.com/knadh/koanf/providers/rawbytes v1.0.0 github.com/knadh/koanf/providers/structs v1.0.0 github.com/knadh/koanf/v2 v2.1.2 github.com/microsoft/go-mssqldb v1.10.0 github.com/minio/minio-go/v7 v7.0.100 + github.com/prometheus/client_golang v1.23.2 github.com/redis/go-redis/v9 v9.14.1 github.com/robfig/cron/v3 v3.0.1 github.com/shirou/gopsutil/v3 v3.24.5 @@ -29,6 +33,7 @@ require ( go.opentelemetry.io/otel v1.43.0 go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.35.0 go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.43.0 + go.opentelemetry.io/otel/exporters/prometheus v0.65.0 go.opentelemetry.io/otel/metric v1.43.0 go.opentelemetry.io/otel/sdk v1.43.0 go.opentelemetry.io/otel/sdk/metric v1.43.0 @@ -54,7 +59,6 @@ require ( github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/metric v0.55.0 // indirect github.com/GoogleCloudPlatform/opentelemetry-operations-go/internal/resourcemapping v0.55.0 // indirect github.com/aws/aws-sdk-go-v2 v1.41.7 // indirect - github.com/aws/aws-sdk-go-v2/config v1.32.17 // indirect github.com/aws/aws-sdk-go-v2/credentials v1.19.16 // indirect github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.23 // indirect github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.23 // indirect @@ -62,7 +66,6 @@ require ( github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.24 // indirect github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.9 // indirect github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.23 // indirect - github.com/aws/aws-sdk-go-v2/service/secretsmanager v1.41.7 // indirect github.com/aws/aws-sdk-go-v2/service/signin v1.0.11 // indirect github.com/aws/aws-sdk-go-v2/service/sso v1.30.17 // indirect github.com/aws/aws-sdk-go-v2/service/ssooidc 
v1.35.21 // indirect @@ -116,7 +119,6 @@ require ( github.com/philhofer/fwd v1.2.0 // indirect github.com/planetscale/vtprotobuf v0.6.1-0.20240319094008-0393e58bdf10 // indirect github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect - github.com/prometheus/client_golang v1.23.2 // indirect github.com/prometheus/client_model v0.6.2 // indirect github.com/prometheus/common v0.67.5 // indirect github.com/prometheus/otlptranslator v1.0.0 // indirect @@ -137,7 +139,6 @@ require ( go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.67.0 // indirect go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.67.0 // indirect go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.43.0 // indirect - go.opentelemetry.io/otel/exporters/prometheus v0.65.0 // indirect go.opentelemetry.io/proto/otlp v1.10.0 // indirect go.yaml.in/yaml/v2 v2.4.4 // indirect go.yaml.in/yaml/v3 v3.0.4 // indirect diff --git a/go.sum b/go.sum index 41c53f30..2a3a8753 100644 --- a/go.sum +++ b/go.sum @@ -202,6 +202,8 @@ github.com/knadh/koanf/providers/env v1.1.0 h1:U2VXPY0f+CsNDkvdsG8GcsnK4ah85WwWy github.com/knadh/koanf/providers/env v1.1.0/go.mod h1:QhHHHZ87h9JxJAn2czdEl6pdkNnDh/JS1Vtsyt65hTY= github.com/knadh/koanf/providers/file v1.2.1 h1:bEWbtQwYrA+W2DtdBrQWyXqJaJSG3KrP3AESOJYp9wM= github.com/knadh/koanf/providers/file v1.2.1/go.mod h1:bp1PM5f83Q+TOUu10J/0ApLBd9uIzg+n9UgthfY+nRA= +github.com/knadh/koanf/providers/rawbytes v1.0.0 h1:MrKDh/HksJlKJmaZjgs4r8aVBb/zsJyc/8qaSnzcdNI= +github.com/knadh/koanf/providers/rawbytes v1.0.0/go.mod h1:KxwYJf1uezTKy6PBtfE+m725NGp4GPVA7XoNTJ/PtLo= github.com/knadh/koanf/providers/structs v1.0.0 h1:DznjB7NQykhqCar2LvNug3MuxEQsZ5KvfgMbio+23u4= github.com/knadh/koanf/providers/structs v1.0.0/go.mod h1:kjo5TFtgpaZORlpoJqcbeLowM2cINodv8kX+oFAeQ1w= github.com/knadh/koanf/v2 v2.1.2 h1:I2rtLRqXRy1p01m/utEtpZSSA6dcJbgGVuE27kW2PzQ= diff --git a/pkg/nucleus/config.go b/pkg/nucleus/config.go new file mode 100644 index 
00000000..d481d3ba --- /dev/null +++ b/pkg/nucleus/config.go @@ -0,0 +1,315 @@ +// Package nucleus — config.go implements the configuration loader +// surfaced by `AppBuilder.FromConfigFile`. ADR-010 §2 names this as +// Phase 2 work. This file lands Phase 2a (single-file load with +// size cap, schema strict-unknown-fields, and did-you-mean hints +// for typos); multi-file merge with the `_append`/`_remove` suffix +// operators ships in Phase 2b. The package-level `Run(App)` and the +// direct-struct surface never traverse this loader — only the +// builder-chain `FromConfigFile` does. +package nucleus + +import ( + "errors" + "fmt" + "io" + "os" + "path/filepath" + "sort" + "strings" + + "github.com/jcsvwinston/nucleus/pkg/app" + "github.com/knadh/koanf/parsers/yaml" + "github.com/knadh/koanf/providers/rawbytes" + "github.com/knadh/koanf/providers/structs" + "github.com/knadh/koanf/v2" +) + +// MaxConfigFileBytes is the per-file size cap enforced by +// FromConfigFile before invoking the YAML parser. The cap is the +// ADR-010 §17 compliance item — it eliminates the parser-DoS class +// (anchor expansion / deep nesting) that `gopkg.in/yaml.v3` is not +// hardened against by itself. 1 MiB is generous for application +// configuration in practice while still small enough to make a +// pathological file fail loud rather than wedge the process. +const MaxConfigFileBytes = 1 << 20 // 1 MiB + +// ErrConfigFileTooLarge is returned when a configuration file exceeds +// MaxConfigFileBytes. Callers can errors.Is against this sentinel to +// distinguish a configuration-management problem (file is genuinely +// too big — split it) from a parser-side problem (bad YAML). +var ErrConfigFileTooLarge = errors.New("nucleus: configuration file exceeds the per-file size cap") + +// ErrUnsupportedConfigFormat is returned when FromConfigFile is asked +// to parse a file whose extension is not yet supported. 
Phase 2a +// supports .yaml / .yml only; .toml and .json land in a Phase 2b +// follow-up that adds the corresponding parsers and updates this +// sentinel's call sites. +var ErrUnsupportedConfigFormat = errors.New("nucleus: unsupported configuration file format") + +// ErrUnknownConfigKeys is returned when strict schema validation +// (the default for FromConfigFile) finds keys in the loaded file +// that do not map to any field on `app.Config` or its nested +// structs. The error's Error() reproduces the offending keys with +// "did you mean …?" hints when a close match exists. +var ErrUnknownConfigKeys = errors.New("nucleus: unknown configuration key(s)") + +// loadFromFile is the Phase 2a single-file loader used by +// AppBuilder.FromConfigFile. It returns an *app.Config populated from +// the file plus the framework's struct defaults — the same precedence +// chain LoadConfig in pkg/app uses, minus the environment-variable +// layer (env vars stay attached to LoadConfig's path; FromConfigFile +// is exclusively about files for now). +// +// Validation: +// +// - Layer 1 (syntactic): YAML parse with a 1 MiB file size cap +// enforced before reading. +// - Layer 2 (schema, strict): every yaml key must map to a +// `koanf:"..."`-tagged field on `app.Config`. Unknown keys +// produce ErrUnknownConfigKeys with did-you-mean hints. +// +// Layers 3–5 (semantic, referential, module-specific) land in later +// Phase 2 sub-iterations. 
+func loadFromFile(path string) (*app.Config, error) { + if path == "" { + return nil, errors.New("nucleus: FromConfigFile path is empty") + } + + ext := strings.ToLower(filepath.Ext(path)) + switch ext { + case ".yaml", ".yml": + // supported + case ".toml", ".json": + return nil, fmt.Errorf("%w: %s parsing is a Phase 2b deliverable (path=%q)", ErrUnsupportedConfigFormat, ext, path) + default: + return nil, fmt.Errorf("%w: extension %q (path=%q)", ErrUnsupportedConfigFormat, ext, path) + } + + data, err := readFileWithCap(path, MaxConfigFileBytes) + if err != nil { + return nil, err + } + + // Phase 2a uses two koanf instances: one for the file content alone + // (to enumerate keys for the strict-schema check) and one for the + // final layered load (defaults < file). Layering through a single + // koanf would merge before the strict check could fire — we'd lose + // the visibility into which keys came from the user. + fileK := koanf.New(".") + if err := fileK.Load(rawbytes.Provider(data), yaml.Parser()); err != nil { + return nil, fmt.Errorf("nucleus: parse %s: %w", path, err) + } + + // Layer 2: strict schema. Anything in the file that does not appear + // in the framework's schema-key set is a typo or stale config; fail + // loud with did-you-mean hints. + schemaKeys := app.ContractConfigKeyPatterns() + if unknown := unknownKeys(fileK.All(), schemaKeys); len(unknown) > 0 { + return nil, formatUnknownKeys(unknown, schemaKeys) + } + + // Combined load: struct defaults < file. Mirrors app.LoadConfig but + // scoped to file-only (the env-var layer remains owned by + // app.LoadConfig for callers that want it). 
+ k := koanf.New(".") + if err := k.Load(structs.Provider(defaultsForConfig(), "koanf"), nil); err != nil { + return nil, fmt.Errorf("nucleus: load defaults: %w", err) + } + if err := k.Load(rawbytes.Provider(data), yaml.Parser()); err != nil { + return nil, fmt.Errorf("nucleus: re-parse %s: %w", path, err) + } + + var cfg app.Config + if err := k.Unmarshal("", &cfg); err != nil { + return nil, fmt.Errorf("nucleus: unmarshal %s: %w", path, err) + } + return &cfg, nil +} + +// readFileWithCap reads up to capBytes+1 bytes from path. When the +// file is larger than capBytes (the +1 is the overshoot signalling), +// it returns ErrConfigFileTooLarge wrapped with the path. Stat is not +// used as the only check because some filesystems (procfs, FUSE) lie +// about file size; reading is the source of truth. +func readFileWithCap(path string, capBytes int64) ([]byte, error) { + f, err := os.Open(path) + if err != nil { + return nil, fmt.Errorf("nucleus: open %s: %w", path, err) + } + defer f.Close() + + limited := io.LimitReader(f, capBytes+1) + data, err := io.ReadAll(limited) + if err != nil { + return nil, fmt.Errorf("nucleus: read %s: %w", path, err) + } + if int64(len(data)) > capBytes { + return nil, fmt.Errorf("%w (path=%q, cap=%d bytes)", ErrConfigFileTooLarge, path, capBytes) + } + return data, nil +} + +// defaultsForConfig returns the same defaults app.LoadConfig uses, +// reached through the public app.DefaultConfig accessor so this +// package does not need to import pkg/app's internals. +func defaultsForConfig() app.Config { + return app.DefaultConfig() +} + +// unknownKeys returns the leaf keys present in the file-koanf's +// flattened map that do NOT appear in any schemaKey prefix. The +// `app.ContractConfigKeyPatterns()` set is the canonical schema +// surface — it enumerates the koanf-bindable keys +// `pkg/app.Config` and its nested structs expose. 
+// +// A key matches the schema if any schemaKey is either equal to the +// key or is a prefix that the koanf flattening expanded into. Map- +// typed schema slots (like `databases..url`) are represented +// in the patterns set as `databases.*.url`; we recognise these via a +// segment-by-segment match where `*` is a wildcard. +func unknownKeys(loaded map[string]any, schemaKeys []string) []string { + patterns := compileKeyPatterns(schemaKeys) + var unknown []string + for k := range loaded { + if !keyMatchesAny(k, patterns) { + unknown = append(unknown, k) + } + } + sort.Strings(unknown) + return unknown +} + +// compiledKeyPattern is the segment-by-segment shape used by +// keyMatchesAny. `*` segments are wildcards; everything else must +// match literally. +type compiledKeyPattern []string + +func compileKeyPatterns(patterns []string) []compiledKeyPattern { + out := make([]compiledKeyPattern, 0, len(patterns)) + for _, p := range patterns { + out = append(out, strings.Split(p, ".")) + } + return out +} + +// keyMatchesAny reports whether key matches at least one of the +// supplied patterns. Matching is segment-by-segment with `*` as a +// single-segment wildcard. +func keyMatchesAny(key string, patterns []compiledKeyPattern) bool { + segments := strings.Split(key, ".") + for _, pat := range patterns { + if len(pat) != len(segments) { + continue + } + match := true + for i, p := range pat { + if p == "*" { + continue + } + if p != segments[i] { + match = false + break + } + } + if match { + return true + } + } + return false +} + +// formatUnknownKeys produces an ErrUnknownConfigKeys-wrapped error +// listing every unknown key with a did-you-mean hint when a close +// match exists in the schema (within a Levenshtein-style edit +// distance of 3 on the deepest-segment basis). 
+func formatUnknownKeys(unknown, schemaKeys []string) error { + var b strings.Builder + for _, k := range unknown { + b.WriteString("\n - ") + b.WriteString(k) + if hint := didYouMean(k, schemaKeys); hint != "" { + b.WriteString(" (did you mean ") + b.WriteString(hint) + b.WriteString("?)") + } + } + // Wrap the sentinel — its Error() text already names the + // "unknown configuration key(s)" preamble; we append the bullet + // list rather than re-stating the preamble. + return fmt.Errorf("%w:%s", ErrUnknownConfigKeys, b.String()) +} + +// didYouMean returns the closest schema key to `unknown` within an +// edit-distance threshold of 3 on the final segment, or the empty +// string when no schema key is close enough. The intent is to catch +// typos like `loging.level` → `logging.level` without producing +// noisy false-positive hints. +func didYouMean(unknown string, schemaKeys []string) string { + uTail := lastSegment(unknown) + if uTail == "" { + return "" + } + best := "" + bestDist := 4 // accept distance ≤3; reject 4+ + for _, k := range schemaKeys { + sTail := lastSegment(k) + if sTail == "" { + continue + } + d := levenshtein(uTail, sTail) + if d < bestDist { + bestDist = d + best = k + } + } + return best +} + +func lastSegment(k string) string { + if i := strings.LastIndex(k, "."); i >= 0 { + return k[i+1:] + } + return k +} + +// levenshtein computes the edit distance between two ASCII strings. +// Simple O(n*m) DP — config keys are short (rarely >30 chars), so +// the allocation cost is negligible compared with the readability +// win of a textbook implementation. 
+func levenshtein(a, b string) int { + if a == b { + return 0 + } + if len(a) == 0 { + return len(b) + } + if len(b) == 0 { + return len(a) + } + prev := make([]int, len(b)+1) + curr := make([]int, len(b)+1) + for j := range prev { + prev[j] = j + } + for i := 1; i <= len(a); i++ { + curr[0] = i + for j := 1; j <= len(b); j++ { + cost := 1 + if a[i-1] == b[j-1] { + cost = 0 + } + del := prev[j] + 1 + ins := curr[j-1] + 1 + sub := prev[j-1] + cost + curr[j] = del + if ins < curr[j] { + curr[j] = ins + } + if sub < curr[j] { + curr[j] = sub + } + } + prev, curr = curr, prev + } + return prev[len(b)] +} diff --git a/pkg/nucleus/config_test.go b/pkg/nucleus/config_test.go new file mode 100644 index 00000000..4589708a --- /dev/null +++ b/pkg/nucleus/config_test.go @@ -0,0 +1,274 @@ +package nucleus + +import ( + "errors" + "os" + "path/filepath" + "strings" + "testing" +) + +// writeTempConfig writes content to a temp file with the given +// extension and returns the path. Cleanup happens via t.TempDir(). 
+func writeTempConfig(t *testing.T, ext, content string) string { + t.Helper() + dir := t.TempDir() + path := filepath.Join(dir, "config"+ext) + if err := os.WriteFile(path, []byte(content), 0o600); err != nil { + t.Fatalf("write temp config: %v", err) + } + return path +} + +func TestLoadFromFile_HappyPathYAML(t *testing.T) { + t.Parallel() + + yamlBody := ` +host: 0.0.0.0 +port: 9090 +log_level: warn +` + path := writeTempConfig(t, ".yaml", yamlBody) + cfg, err := loadFromFile(path) + if err != nil { + t.Fatalf("loadFromFile: %v", err) + } + if cfg.Host != "0.0.0.0" { + t.Errorf("Host: got %q want %q", cfg.Host, "0.0.0.0") + } + if cfg.Port != 9090 { + t.Errorf("Port: got %d want %d", cfg.Port, 9090) + } + if cfg.LogLevel != "warn" { + t.Errorf("LogLevel: got %q want %q", cfg.LogLevel, "warn") + } +} + +func TestLoadFromFile_PreservesDefaultsForUnsetKeys(t *testing.T) { + t.Parallel() + + // Body sets only Port; every other field should come from + // app.DefaultConfig() (struct defaults applied first). + path := writeTempConfig(t, ".yaml", "port: 1234\n") + cfg, err := loadFromFile(path) + if err != nil { + t.Fatalf("loadFromFile: %v", err) + } + if cfg.Port != 1234 { + t.Errorf("Port: got %d want 1234", cfg.Port) + } + // LogLevel is part of app.DefaultConfig and must survive the load. 
+ if cfg.LogLevel == "" { + t.Error("LogLevel was reset to zero value; defaults not applied") + } +} + +func TestLoadFromFile_RejectsUnsupportedExtension(t *testing.T) { + t.Parallel() + + path := writeTempConfig(t, ".ini", "[server]\nport = 80\n") + _, err := loadFromFile(path) + if err == nil { + t.Fatal("expected an error for .ini extension") + } + if !errors.Is(err, ErrUnsupportedConfigFormat) { + t.Errorf("want ErrUnsupportedConfigFormat, got %v", err) + } +} + +func TestLoadFromFile_TOMLAndJSONReportPhase2b(t *testing.T) { + t.Parallel() + + for _, ext := range []string{".toml", ".json"} { + path := writeTempConfig(t, ext, "port = 80\n") + _, err := loadFromFile(path) + if !errors.Is(err, ErrUnsupportedConfigFormat) { + t.Errorf("ext=%s want ErrUnsupportedConfigFormat, got %v", ext, err) + } + if !strings.Contains(err.Error(), "Phase 2b") { + t.Errorf("ext=%s error should reference Phase 2b, got %q", ext, err.Error()) + } + } +} + +func TestLoadFromFile_FileTooLarge(t *testing.T) { + t.Parallel() + + // Build a YAML body larger than MaxConfigFileBytes. Pad with + // comment-only lines; the content itself is irrelevant because the + // cap is enforced BEFORE the parser ever runs. + big := strings.Repeat("# padding line that takes some bytes per line\n", (MaxConfigFileBytes/40)+1) + path := writeTempConfig(t, ".yaml", big) + _, err := loadFromFile(path) + if !errors.Is(err, ErrConfigFileTooLarge) { + t.Fatalf("want ErrConfigFileTooLarge, got %v", err) + } + if !strings.Contains(err.Error(), "cap=") { + t.Errorf("error should mention the cap, got %q", err.Error()) + } +} + +func TestLoadFromFile_FileAtCapBoundaryIsAccepted(t *testing.T) { + t.Parallel() + + // Exactly at the cap should succeed. Use a body sized so the + // final file is at or just below MaxConfigFileBytes. The body + // must remain parseable, so we keep one valid key + padding + // comment lines. 
+ header := "port: 8080\n" + want := MaxConfigFileBytes + pad := want - len(header) + if pad < 0 { + t.Skip("MaxConfigFileBytes is smaller than the header; skipping boundary test") + } + body := header + strings.Repeat("#", pad) + if len(body) > MaxConfigFileBytes { + body = body[:MaxConfigFileBytes] + } + path := writeTempConfig(t, ".yaml", body) + cfg, err := loadFromFile(path) + if err != nil { + t.Fatalf("expected boundary-sized file to load, got %v", err) + } + if cfg.Port != 8080 { + t.Errorf("Port: got %d want 8080", cfg.Port) + } +} + +func TestLoadFromFile_StrictUnknownKey(t *testing.T) { + t.Parallel() + + // `prot` is a likely typo for `port`. Strict mode rejects it. + path := writeTempConfig(t, ".yaml", "prot: 80\n") + _, err := loadFromFile(path) + if !errors.Is(err, ErrUnknownConfigKeys) { + t.Fatalf("want ErrUnknownConfigKeys, got %v", err) + } +} + +func TestLoadFromFile_DidYouMeanHint(t *testing.T) { + t.Parallel() + + // `loging_level` is three insertions away from `log_level`, + // right at the distance-3 limit, so the hint should surface. 
+ path := writeTempConfig(t, ".yaml", "loging_level: warn\n") + _, err := loadFromFile(path) + if !errors.Is(err, ErrUnknownConfigKeys) { + t.Fatalf("want ErrUnknownConfigKeys, got %v", err) + } + if !strings.Contains(err.Error(), "did you mean") { + t.Errorf("error should include a did-you-mean hint, got %q", err.Error()) + } +} + +func TestLoadFromFile_MissingFile(t *testing.T) { + t.Parallel() + + _, err := loadFromFile("/nonexistent/path/nucleus.yaml") + if err == nil { + t.Fatal("expected an error for missing file") + } + if errors.Is(err, ErrUnknownConfigKeys) || errors.Is(err, ErrConfigFileTooLarge) { + t.Errorf("missing-file error should not be wrapped as a config-content error, got %v", err) + } +} + +func TestLoadFromFile_MalformedYAML(t *testing.T) { + t.Parallel() + + path := writeTempConfig(t, ".yaml", "port: : bad\n - mixed: types\n") + _, err := loadFromFile(path) + if err == nil { + t.Fatal("expected a parse error for malformed YAML") + } +} + +func TestLoadFromFile_EmptyPath(t *testing.T) { + t.Parallel() + + _, err := loadFromFile("") + if err == nil { + t.Fatal("expected an error for empty path") + } +} + +func TestAppBuilder_FromConfigFile_Happy(t *testing.T) { + t.Parallel() + + path := writeTempConfig(t, ".yaml", "port: 7777\n") + a, err := New().FromConfigFile(path).Build() + if err != nil { + t.Fatalf("Build: %v", err) + } + if a.Port != 7777 { + t.Errorf("Port: got %d want 7777", a.Port) + } + if a.Modules == nil { + t.Error("Modules map should be non-nil after Build") + } +} + +func TestAppBuilder_FromConfigFile_PreservesPriorMount(t *testing.T) { + t.Parallel() + + // Mount BEFORE FromConfigFile and confirm the file load does not + // drop the registered module. 
+ mod := Module[struct{}]{Name: "articles", Prefix: "/articles"}.Build() + path := writeTempConfig(t, ".yaml", "port: 7777\n") + a, err := New().Mount(mod).FromConfigFile(path).Build() + if err != nil { + t.Fatalf("Build: %v", err) + } + if _, ok := a.Modules["articles"]; !ok { + t.Error("Modules registered before FromConfigFile were dropped by the loader") + } +} + +func TestLevenshtein_Basics(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + a, b string + want int + }{ + {"", "", 0}, + {"port", "port", 0}, + {"prot", "port", 2}, // one adjacent swap; plain Levenshtein counts it as two substitutions + {"port", "ports", 1}, // one insertion + {"port", "", 4}, + {"", "port", 4}, + } { + got := levenshtein(tc.a, tc.b) + if got != tc.want { + t.Errorf("levenshtein(%q, %q): got %d want %d", tc.a, tc.b, got, tc.want) + } + } +} + +func TestKeyMatchesAny_Wildcards(t *testing.T) { + t.Parallel() + + patterns := compileKeyPatterns([]string{ + "port", + "databases.*.url", + "jwt_keys.*.kid", + }) + cases := []struct { + key string + want bool + }{ + {"port", true}, + {"databases.default.url", true}, + {"databases.analytics.url", true}, + {"databases.default.user", false}, // *.user not in patterns + {"jwt_keys.signing.kid", true}, + {"jwt_keys.signing.algorithm", false}, + {"unknown", false}, + } + for _, tc := range cases { + got := keyMatchesAny(tc.key, patterns) + if got != tc.want { + t.Errorf("keyMatchesAny(%q): got %v want %v", tc.key, got, tc.want) + } + } +} diff --git a/pkg/nucleus/equivalence_test.go b/pkg/nucleus/equivalence_test.go index 83a0150a..9133a089 100644 --- a/pkg/nucleus/equivalence_test.go +++ b/pkg/nucleus/equivalence_test.go @@ -211,19 +211,25 @@ func sameFunc(a, b any) bool { return false } -// TestFromConfigFile_Phase1Stub verifies that calling FromConfigFile -// in the Phase 1 build records a deferred ErrConfigLoaderNotImplemented -// that surfaces on Build / Start. The full multi-layer loader lands in -// Phase 2. 
-func TestFromConfigFile_Phase1Stub(t *testing.T) { +// TestFromConfigFile_NoPaths verifies the empty-paths guard surfaces a +// clean error rather than a panic or a silent no-op. +func TestFromConfigFile_NoPaths(t *testing.T) { t.Parallel() - b := New().FromConfigFile("nucleus.yaml") + b := New().FromConfigFile() if b.Err() == nil { - t.Fatal("expected FromConfigFile to record a deferred error in Phase 1") + t.Fatal("expected FromConfigFile() with no paths to record an error") } - if _, err := b.Build(); err != ErrConfigLoaderNotImplemented { - t.Errorf("Build error: got %v, want ErrConfigLoaderNotImplemented", err) +} + +// TestFromConfigFile_MultiPathIsPhase2b verifies that passing multiple +// paths today fails loud, deferring to Phase 2b for the merge engine. +func TestFromConfigFile_MultiPathIsPhase2b(t *testing.T) { + t.Parallel() + + b := New().FromConfigFile("a.yaml", "b.yaml") + if b.Err() == nil { + t.Fatal("expected multi-path FromConfigFile to fail in Phase 2a") } } diff --git a/pkg/nucleus/nucleus.go b/pkg/nucleus/nucleus.go index d2f24d71..e3f442a4 100644 --- a/pkg/nucleus/nucleus.go +++ b/pkg/nucleus/nucleus.go @@ -34,11 +34,15 @@ // pkg/nucleus): it pins the canonical struct shape, the `Module[C any]` // generic constructor, the `Router` interface with three coexisting // registration styles, and the three-surface equivalence guarantee. -// Configuration loading (`FromConfigFile`) lands shape-only in this -// phase — the five-layer validator and the suffix-operator merge -// engine arrive in Phase 2; until then, calling `FromConfigFile` -// surfaces `ErrConfigLoaderNotImplemented` when the builder is -// realised via `Build`, `Start`, or `Serve`. +// Configuration loading lands progressively: Phase 2a (this state) +// ships `FromConfigFile` against a single YAML file with the 1 MiB +// size cap (MaxConfigFileBytes), strict-unknown-fields schema +// validation, and did-you-mean hints for likely typos. 
Multi-file +// merge with the `_append` / `_remove` suffix operators is the +// Phase 2b deliverable — passing more than one path today returns a +// targeted error referencing that phase. TOML / JSON parsers and +// the deeper semantic / referential / module-specific validator +// layers follow in subsequent Phase 2 sub-iterations. package nucleus import ( @@ -79,17 +83,6 @@ func WithExtensions(exts ...Extension) Option { return app.WithExtensions(exts.. // option is active so the choice is visible in operational telemetry. func WithOpenAuthz() Option { return app.WithOpenAuthz() } -// ErrConfigLoaderNotImplemented is returned by `AppBuilder.Build`, -// `AppBuilder.Start`, and `AppBuilder.Serve` when the application was -// constructed via `FromConfigFile` and the Phase 1 build of -// `pkg/nucleus` is in use. The full multi-layer config loader and -// merge engine lands in ADR-010 Phase 2; until then, callers should -// either drop the `FromConfigFile(...)` call (configure -// programmatically) or pass a pre-loaded configuration via the -// direct-struct surface — the package-level `Run(App)` never goes -// through the loader and therefore never surfaces this sentinel. -var ErrConfigLoaderNotImplemented = errors.New("nucleus: FromConfigFile is not implemented in Phase 1 (see ADR-010 §Implementation phases)") - // LifecycleHooks holds app-level callbacks that fire before the // HTTP listener starts and after the listener returns. Module-level // `OnStart` / `OnShutdown` continue to live on `ModuleSpec`; the @@ -161,14 +154,31 @@ func New() *AppBuilder { } } -// FromConfigFile records the intent to load configuration from one or -// more files. The full implementation — five-layer validator, -// suffix-operator merge engine, mixed-format support — lands in -// ADR-010 Phase 2. In Phase 1 the call sets a deferred error on the -// builder; `Build` / `Start` / `Serve` surface -// `ErrConfigLoaderNotImplemented`. 
Including the method on the -// builder today keeps the public shape stable across phases so module -// authors can integrate against the canonical signature now. +// FromConfigFile loads configuration from one or more files. The +// first path is read via the Phase 2a single-file loader +// (`loadFromFile` in config.go) which enforces: +// +// - 1 MiB per-file size cap (see MaxConfigFileBytes) — eliminates +// parser-DoS classes against the underlying YAML parser. +// - YAML format only (extension `.yaml` or `.yml`). TOML and JSON +// parsers ship in Phase 2b. +// - Strict-unknown-fields schema validation against +// `app.ContractConfigKeyPatterns()`. Unknown keys surface as +// `ErrUnknownConfigKeys` with did-you-mean hints for likely +// typos. +// +// Multi-file merge (with the `_append` / `_remove` suffix operators +// and `null`-resets-to-default semantics) is the Phase 2b deliverable; +// passing more than one path today records a targeted error that +// references Phase 2b and surfaces when the builder is realised. It +// is a plain formatted error, not an exported sentinel. The signature +// accepts variadic paths now so that module authors can write the +// canonical call site +// (`FromConfigFile("base.yaml", "override.yaml")`) before the merge +// engine lands. +// +// The first error is recorded on the builder and surfaces at `Build` / +// `Start` / `Serve` — the bufio.Scanner pattern. Later chain calls +// are no-ops once an error is set. `Err()` exposes the recorded +// error for callers that want to inspect chain status before +// realising. func (b *AppBuilder) FromConfigFile(paths ...string) *AppBuilder { if b.err != nil { return b @@ -177,7 +187,23 @@ func (b *AppBuilder) FromConfigFile(paths ...string) *AppBuilder { b.err = errors.New("nucleus: FromConfigFile requires at least one path") return b } - b.err = ErrConfigLoaderNotImplemented + if len(paths) > 1 { + // Phase 2b guard: the merge engine isn't ready yet, so + // failing loud is safer than silently using only the first + // path. The error message names the phase so the call-site + // hint is actionable.
+ b.err = fmt.Errorf("nucleus: multi-file FromConfigFile is a Phase 2b deliverable (received %d paths)", len(paths)) + return b + } + cfg, err := loadFromFile(paths[0]) + if err != nil { + b.err = err + return b + } + // Preserve fluent-chain Modules/Middleware/Services/etc. that the + // caller registered before FromConfigFile — only the embedded + // app.Config slot is replaced. + b.a.Config = *cfg return b }