- ANE forward/backward pass pipeline
- Adam optimizer with gradient accumulation
- Checkpoint save/resume (survives exec() restarts)
- NDJSON app-CLI communication protocol
- SwiftUI macOS app with live dashboard
- BPE tokenizer (encode + decode)
- CLI commands: train, tokenize, export, info, benchmark
- GGUF export (for llama.cpp)
- CoreML export
- Basic test suite (43 CLI + 32 Swift)
Six major features added:
- E: LR Scheduler — Cosine annealing with linear warmup
  - `--warmup N`, `--lr-min`, `--lr-schedule cosine`
  - LR displayed in dashboard, included in step JSON
- F: Data Pipeline — Multi-shard, shuffle, train/val split
  - `--val-data`, `--val-every N`, `--shuffle`
  - Validation loss tracked and charted
- C: Live Charts — EMA smoothing, TFLOPS chart, val loss overlay
  - EMA toggle (alpha=0.98), chart window picker (All/500/1K/2K)
  - TFLOPS over time, validation loss dashed overlay
- D: Text Generation — Autoregressive inference with sampling
  - `neuralforge generate --prompt "..." --temperature 0.8 --top-p 0.9`
  - Streaming token output via NDJSON
  - GenerateView in app with parameter controls
- A: LoRA Fine-Tuning — Low-rank adaptation
  - Rank 4-64, configurable alpha, target selection (Q/K/V/O)
  - Tiny checkpoints (~2MB), merge-on-export support
- B: Multi-Model Support — Runtime dimensions
  - ModelConfig replaces compile-time #defines
  - All MIL generators and CPU ops parameterized
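
The LR scheduler item above (cosine annealing with linear warmup) can be illustrated with a minimal sketch. The function name and argument names are hypothetical and only mirror the `--warmup` and `--lr-min` flags; the project's actual implementation is not shown in this document.

```python
import math

def lr_at_step(step, total_steps, lr_max, lr_min=0.0, warmup=0):
    """Learning rate for a 0-indexed step: linear warmup, then cosine decay."""
    if warmup > 0 and step < warmup:
        # Ramp linearly from lr_max / warmup up to lr_max over the warmup steps.
        return lr_max * (step + 1) / warmup
    # Fraction of the post-warmup schedule completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup)
    progress = min(1.0, (step - warmup) / decay_steps)
    # Cosine annealing from lr_max (progress 0) down to lr_min (progress 1).
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```

With `warmup=10`, the rate rises over the first ten steps, peaks at `lr_max`, and then decays smoothly toward `lr_min` by `total_steps`.
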
Additional v2.0 work:
- Security audit (input validation, bounds checking, NDJSON escaping)
- Compile timer UX (orange banner with seconds counter)
- Expanded test suite (109 CLI + 119 Swift tests)
- Replace O(n^2) BPE with priority queue (max-heap) algorithm — O(n log n)
- Target: tokenize 1MB text in <1 second — achieved: 936ms for 1MB
- Maintain compatibility with existing tokenizer.bin format — all 112 tests pass
- Speed tests added: 10K (8.7ms), 100K (86ms), 1M (936ms)
- CLI `tokenize` command now works on large files (previously hung on >10KB)
- Status: Complete.
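
As an illustration of the priority-queue approach described above, here is a minimal sketch of heap-based BPE merging with lazy deletion. It assumes merges are given as a score table where higher scores merge first; the `merges` dictionary and function name are hypothetical, and the real `tokenizer.bin` layout is not modeled.

```python
import heapq

def bpe_encode(symbols, merges):
    """Merge adjacent symbols by score using a max-heap.

    symbols: list of initial tokens (strings).
    merges: dict mapping (left, right) -> score; higher scores merge first.
    """
    toks = list(symbols)
    n = len(toks)
    prev = list(range(-1, n - 1))   # index of left neighbour, -1 if none
    nxt = list(range(1, n + 1))     # index of right neighbour, n if none
    alive = [True] * n
    heap = []

    def push(i):
        # Queue the pair starting at position i if it is a known merge.
        if not 0 <= i < n:
            return
        j = nxt[i]
        if j < n and (toks[i], toks[j]) in merges:
            # Negate the score because heapq is a min-heap.
            heapq.heappush(heap, (-merges[(toks[i], toks[j])], i, toks[i], toks[j]))

    for i in range(n - 1):
        push(i)

    while heap:
        _, i, left, right = heapq.heappop(heap)
        j = nxt[i]
        # Lazy deletion: skip entries invalidated by an earlier merge.
        if not alive[i] or j >= n or toks[i] != left or toks[j] != right:
            continue
        toks[i] = left + right      # merged token replaces the left symbol
        alive[j] = False            # right symbol is unlinked from the list
        nxt[i] = nxt[j]
        if nxt[j] < n:
            prev[nxt[j]] = i
        push(prev[i])               # new pair with the left neighbour
        push(i)                     # new pair with the right neighbour

    return [toks[i] for i in range(n) if alive[i]]
```

Each merge costs one pop plus at most two pushes, which is where the O(n log n) bound comes from; stale heap entries are discarded when popped instead of being removed eagerly.
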
- `neuralforge ingest` CLI subcommand with full pipeline
- PDF text extraction (via PDFKit/Quartz)
- DOCX text extraction (via macOS textutil)
- Plain text (.txt, .md, .csv, .json, code files) support
- Code file support (.py, .js, .c, .m, .h, .swift, .rs, .go, .java, .ts, .tsx, .jsx)
- Manifest file for shard tracking (JSON with version, timestamps, processed files)
- Incremental mode (--incremental) — skips unchanged files based on mtime
- Configurable shard size (--max-shard-mb, default 50MB)
- App UI: IngestView with source/output folder pickers, shard size picker, incremental toggle
- CLIRunner.ingest() with streaming per-file progress via NDJSON
- Audit log integration for ingest events
- 16 CLI tests + 12 Swift tests covering extraction, scanning, manifests, JSON parsing
- Status: Complete.
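
The incremental mode described above skips files whose mtime has not changed since the last run. A minimal sketch of that decision follows, assuming a manifest with a hypothetical `"files"` map of path to recorded mtime; the real manifest schema may differ.

```python
def plan_incremental(current_mtimes, manifest):
    """Split files into (changed, skipped) by comparing mtimes to the manifest.

    current_mtimes: dict of path -> mtime, e.g. gathered with os.stat().
    manifest: dict with an assumed "files" map of path -> recorded mtime.
    """
    recorded = manifest.get("files", {})
    # A file is reprocessed if it is new or its mtime differs from the record.
    changed = {p for p, m in current_mtimes.items() if recorded.get(p) != m}
    skipped = set(current_mtimes) - changed
    return sorted(changed), sorted(skipped)
```

For example, with a manifest that recorded `a.txt` at mtime 100, a scan that finds `a.txt` unchanged and a new `b.txt` would reprocess only `b.txt`.
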
- Append-only JSONL log file (`~/Library/Logs/NeuralForge/audit.jsonl`)
- Log: training start/stop, config used, checkpoint saves, exports, generation
- Log: who ran what, when, with which data (user, timestamp, model, data paths)
- Tamper detection — SHA-256 hash chain (prev_hash → hash per entry)
- Hash chain verification function (`nf_audit_verify`) with tamper location detection
- Convenience functions: `nf_audit_training_start/stop`, `checkpoint`, `export`, `generate`
- 11 CLI tests + 11 Swift tests covering chain integrity, tamper detection, format validation
- Status: Complete.
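
To show how a SHA-256 hash chain with tamper location detection can work, here is a minimal sketch. The entry fields (`seq`, `payload`), the all-zero genesis hash, and the exact bytes that are hashed are assumptions for illustration; they are not taken from `nf_audit_verify`.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first entry

def entry_hash(prev_hash, payload):
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_entry(log, payload):
    """Append a payload to the in-memory log, linking it to the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"seq": len(log), "prev_hash": prev_hash, "payload": payload}
    entry["hash"] = entry_hash(prev_hash, payload)
    log.append(entry)
    return entry

def verify_chain(log):
    """Return the index of the first inconsistent entry, or -1 if intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        if entry["prev_hash"] != prev_hash:
            return i
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return i
        prev_hash = entry["hash"]
    return -1
```

Because every hash covers the previous hash, editing any earlier entry invalidates that entry and the verifier reports its position.
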
- Model registry with 5 built-in models (SmolLM 135M/360M/1.7B, TinyLlama 1.1B/1.1B-base)
- `neuralforge models` CLI command (text + JSON output)
- `neuralforge download` CLI command with streaming NDJSON progress
- Python converter: HuggingFace safetensors → llama2.c format (convert_hf.py)
- GQA → MHA KV head expansion for cross-architecture compatibility
- Tokenizer conversion (HF tokenizer.json → tokenizer.bin)
- Model card JSON metadata (model_card.json per download)
- ModelCardView in app: browse models, download with progress, architecture details
- 13 CLI tests + 8 Swift tests covering registry, search, JSON emission, download events
- Status: Complete.
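
The GQA → MHA expansion mentioned above amounts to repeating each key/value head so that every query head gets its own copy. The sketch below works on a plain list of per-head blocks; the function name is hypothetical, and how convert_hf.py slices the actual weight tensors is not shown here.

```python
def expand_kv_heads(kv_heads, n_heads):
    """Repeat each KV head so the result has one KV head per query head.

    kv_heads: list of per-head weight blocks, length n_kv_heads.
    n_heads: number of query heads; must be a multiple of n_kv_heads.
    """
    n_kv = len(kv_heads)
    if n_kv == 0 or n_heads % n_kv != 0:
        raise ValueError("n_heads must be a positive multiple of n_kv_heads")
    group = n_heads // n_kv
    # In GQA, query heads [g * group, (g + 1) * group) share KV head g.
    return [kv_heads[h // group] for h in range(n_heads)]
```

The expanded list references the same blocks repeatedly; a converter that writes weights to disk would copy them out at that point.
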
- Claude API client (NFIntelligence.swift) — async URLSession, rate limiting (5 req/min), retry
- API key management — macOS Keychain storage (save/load/delete), settings UI
- Training assistant chat view (AssistantView.swift) — full chat UI with message history
- System prompt with live training context (model info, loss curve, config, TFLOPS)
- Auto hyperparameter suggestions — analyzes setup, returns structured JSON, one-click apply
- Generated text evaluation — fluency/coherence/creativity scoring with grammar analysis
- Privacy-first design: only metadata sent, never weights or training data, 100% optional
- 12 Swift tests covering message types, JSON parsing, context building, rate limiting
- Status: Complete.
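
The 5 requests per minute limit listed above could be enforced with a sliding-window limiter. This is a sketch under that assumption; the class name is hypothetical and NFIntelligence.swift may use a different strategy.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds interval."""

    def __init__(self, max_requests=5, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.sent = deque()  # timestamps of accepted requests, oldest first

    def try_acquire(self, now):
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            self.sent.append(now)
            return True
        return False
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the limiter easy to test without sleeping.
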
- AuditLogReader.swift — JSONL parser with SHA-256 hash chain verification
- AuditEntry model with computed properties (eventIcon, eventColor, summary, date)
- AuditStats aggregation (entry counts, users, training time, best loss)
- AuditVerification with chain integrity status and tamper location detection
- AuditDashboardView.swift — full compliance audit log viewer
- Stats bar (entries, trainings, checkpoints, exports, generations, train time, best loss)
- Filter bar with event type picker and text search
- Scrollable entry list with icons, seq numbers, event types, timestamps, truncated hashes
- Hash chain verification sheet with visual pass/fail indicators
- Entry detail sheet showing all audit fields + hash chain info
- CSV export via NSSavePanel
- 12 Swift tests covering entry parsing, chain format, verification, stats, CSV export
- Status: Complete.
- SyncService.swift — checkpoint sync engine with configurable shared directory
- Automatic checkpoint detection and sync across all projects
- Shared model registry — browse synced checkpoints and models
- LaunchAgent plist generation with configurable interval (5-120 min)
- LaunchAgent install/uninstall via launchctl
- Restore checkpoints from shared directory to any project
- Sync status tracking (idle, syncing, success, error)
- Pending sync detection (unsynced checkpoints)
- SyncDashboardView.swift — full sync UI with setup, status, shared browser
- Sync history with file sizes and step numbers
- Project name sanitization for safe directory names
- Sync tab added to ProjectDetailView
- 12 Swift tests covering config, codable, paths, plist, sanitization
- Status: Complete. CloudKit/S3 backend deferred to future release.
- CloudKit or S3 backend for remote sync
- Conflict resolution for concurrent training
- ComplianceReportGenerator.swift — generates structured reports from audit data
- Three compliance frameworks: General Audit, HIPAA, SOX
- HIPAA sections: §164.312(a-e) — audit controls, access control, integrity, authentication, transmission
- SOX sections: §302 management assessment, §404 internal controls, separation of duties, config changes
- Date range filtering for report period selection
- Hash chain integrity verification integrated into all report types
- Multi-user access tracking with threshold-based warnings
- ComplianceReportView.swift — full report UI with framework picker, preview, export
- Report status badges: Compliant (green), Needs Review (orange), Non-Compliant (red)
- Section severity indicators: INFO, PASS, REVIEW, CRITICAL
- Text export (plain text with formatted sections)
- PDF export (via NSPrintOperation)
- Reports tab added to ProjectDetailView
- 12 Swift tests covering frameworks, statuses, sections, filtering, thresholds
- Status: Complete.
- Web dashboard for audit log aggregation across machines
- Multi-user compliance reporting with merged logs
- ComputeClusterService.swift — Bonjour-based multi-Mac ANE compute cluster
- DeviceCapabilities model — chip detection, ANE TFLOPS estimation, memory, CPU/GPU cores
- IOPlatformUUID-based stable device identification
- Chip family database: M1/M2/M3/M4 (base/Pro/Max/Ultra) with TFLOPS + GPU core estimates
- NWListener-based service advertisement with TXT record metadata
- NWBrowser-based automatic device discovery on local network
- ClusterNode model with status tracking (discovered/available/training/syncing/error/offline)
- TFLOPS-weighted shard distribution algorithm for data parallelism
- Cluster metrics aggregation (total TFLOPS, total memory, node count)
- ComputeClusterView.swift — full cluster dashboard UI
- Local device info card with chip, memory, ANE TFLOPS, CPU cores
- Discovered nodes list with status badges and capability display
- Shard distribution visualization with proportional bars
- Cluster tab added to ProjectDetailView
- 12 Swift tests covering service type, status, TFLOPS/memory formatting, GPU/ANE estimates, shard distribution, device model
- Status: Complete. Gradient aggregation protocol deferred to future release.
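
One way to realize the TFLOPS-weighted shard distribution listed above is the largest-remainder method, which keeps the total shard count exact. This is a sketch with hypothetical names and is not necessarily the algorithm in ComputeClusterService.swift.

```python
def distribute_shards(num_shards, node_tflops):
    """Assign shard counts to nodes in proportion to their TFLOPS.

    node_tflops: dict of node name -> estimated TFLOPS (must sum to > 0).
    Returns a dict of node name -> shard count summing to num_shards.
    """
    total = sum(node_tflops.values())
    if total <= 0:
        raise ValueError("total TFLOPS must be positive")
    exact = {n: num_shards * t / total for n, t in node_tflops.items()}
    counts = {n: int(x) for n, x in exact.items()}  # floor of each share
    leftover = num_shards - sum(counts.values())
    # Hand remaining shards to the largest fractional parts; ties go by name.
    order = sorted(exact, key=lambda n: (-(exact[n] - counts[n]), n))
    for name in order[:leftover]:
        counts[name] += 1
    return counts
```

A node with twice the estimated throughput receives roughly twice as many shards, so nodes should finish an epoch at about the same time.
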
- Dedicated SettingsView.swift with 4-tab layout (General, Training, API Keys, About)
- CLI binary path management with browse, auto-detect, and status indicator
- Default training hyperparameters (steps, LR, accumulation, checkpoint, grad clip, seed)
- Default scheduler settings (warmup, LR schedule, shuffle, LoRA rank)
- API key management — Claude API + HuggingFace token via macOS Keychain
- Export format defaults (GGUF, llama2c, CoreML)
- Auto-save interval and max history entries settings
- About tab with version info, CLI status, and feature summary
- 12 Swift tests covering defaults, ranges, identifiers, options
- Status: Complete.
- TrainingHistoryService.swift — persist completed training runs to JSON
- TrainingRun model with full metadata: project, model, config snapshot, results, timestamps
- TrainingRunConfig snapshot preserving all hyperparameters at time of run
- LossPoint model for serializable loss curve data
- Loss curve downsampling for efficient storage (500 train + 200 val points max)
- Auto-save on training completion via `recordCompletedRun()`
- Run queries: by project, best run, recent, search (name/notes/model/LoRA)
- CRUD operations: add, delete, batch delete, update notes, clear
- TrainingHistoryView.swift — browse runs with sortable table (date, steps, loss, duration, LR, LoRA, TFLOPS)
- Search bar with text filtering
- Sort orders: newest, oldest, best loss, longest
- Run detail sheet with loss chart, config, model info, notes
- ComparisonSheet — side-by-side run comparison with overlaid loss curves
- CSV export via NSSavePanel
- Best run trophy indicator
- History tab added to ProjectDetailView
- 12 Swift tests covering model, formatting, codable, downsample, search, export
- Status: Complete.
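
The loss curve downsampling above caps stored points (500 train, 200 val). A simple approach is to keep evenly spaced samples including the first and last point, as sketched here; the function name is hypothetical and the app may use a different selection rule.

```python
def downsample(points, max_points):
    """Return at most max_points evenly spaced items, keeping both endpoints."""
    n = len(points)
    if max_points <= 0:
        return []
    if n <= max_points:
        return list(points)
    if max_points == 1:
        return [points[-1]]
    # Map output slot i to an input index spread evenly over [0, n - 1].
    return [points[round(i * (n - 1) / (max_points - 1))] for i in range(max_points)]
```

Keeping the endpoints preserves the initial and final loss, which are usually the values a reader checks first.
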
- BenchmarkService.swift — perplexity evaluation engine with persistent results
- BenchmarkResult model: perplexity, avg loss, tokens, eval time, checkpoint info
- Perplexity scoring via CLI `evaluatePerplexity` command
- CLIRunner.evaluatePerplexity() — streaming batch evaluation with NDJSON progress
- Checkpoint-to-checkpoint comparison with trend detection (improving/stable/degrading)
- BenchmarkStats aggregation (best/worst/avg perplexity, trend analysis)
- Automated quality regression detection with configurable thresholds
- RegressionAlert model with warning (>0.5) and critical (>2.0) severity levels
- BenchmarkView.swift — full evaluation UI with stats bar, perplexity chart, results table
- Evaluation controls: data path picker, run button, streaming progress
- BenchmarkDetailSheet with metrics, checkpoint info, metadata
- CSV export for benchmark results
- Best result star indicator
- Benchmarks tab added to ProjectDetailView
- 12 Swift tests covering perplexity math, formatting, trends, regression, stats, export
- Status: Complete.
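
Perplexity is the exponential of the average per-token negative log-likelihood. The sketch below shows that calculation together with a threshold check; it assumes the 0.5 and 2.0 thresholds above refer to an absolute perplexity increase between checkpoints, which the document does not state explicitly.

```python
import math

def perplexity(total_nll, num_tokens):
    """Perplexity from summed negative log-likelihood (natural log)."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return math.exp(total_nll / num_tokens)

def regression_severity(previous_ppl, current_ppl, warning=0.5, critical=2.0):
    """Classify a perplexity increase; returns None when within tolerance."""
    delta = current_ppl - previous_ppl
    if delta > critical:
        return "critical"
    if delta > warning:
        return "warning"
    return None
```

Lower perplexity is better, so only increases are flagged; an improvement always returns `None`.
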
- OnboardingView.swift — 4-page first-run wizard (Welcome, Setup, Goal, Ready)
- CLI binary auto-detection from common paths + manual browse
- HuggingFace token input with macOS Keychain storage
- Training goal selection (Experiment / Fine-tune / Production) with per-goal defaults
- Goal-based default hyperparameters (steps: 1K/5K/10K, LR: 3e-4/2e-4/1e-4)
- First project creation on completion
- @AppStorage("onboardingComplete") conditional routing in app entry point
- Dynamic window sizing (620×520 onboarding, 1200×800 main)
- NFKeychain extension for arbitrary service/account key pairs
- 12 Swift tests covering goals, defaults, page nav, path validation
- Status: Complete.
- MenuBarManager.swift — @MainActor singleton for training status
- Real-time tracking: step, total, loss, best loss, TFLOPS, ms/step
- Progress percentage and ETA calculation with timer-based elapsed tracking
- Dynamic menu bar icon (bolt.fill when training, cpu when idle)
- MenuBarView with training metrics grid (loss, TFLOPS, ms/step, elapsed, ETA)
- Idle state display with "Open NeuralForge" and "Quit" actions
- MenuBarExtra scene with .window style in app entry point
- 10 Swift tests covering progress, ETA, formatting, icons, loss tracking
- Status: Complete.
- QuantizationService.swift — GGUF quantization and CoreML conversion pipeline
- 8 quantization types: F16, Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, Q3_K_M, Q2_K
- Per-type metadata: bits/weight, quality rating (1-10), descriptions
- Size estimation: estimateSize(modelParams:quantType:) and formatSize()
- QuantizationJob model with status tracking (pending/running/success/failed)
- CoreMLConfig with compute unit selection (All/CPU+GPU/CPU) and precision (F16/F32)
- QuantizationService @MainActor singleton: quantizeGGUF(), convertCoreML()
- ExportView updated with quantization picker, size estimates, export history
- 18 Swift tests covering types, ordering, size estimation, formatting, jobs
- Status: Complete.
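
The size estimate above is essentially parameters times bits per weight. The sketch below uses the block-level bits per weight of the llama.cpp legacy formats (for example, a Q4_0 block stores 32 weights in 18 bytes, so 4.5 bits each); the K-quant types are left out, and the Python names are hypothetical stand-ins for `estimateSize(modelParams:quantType:)` and `formatSize()`.

```python
# Bits per weight, including per-block scale overhead. The app's own
# metadata table may use different values; treat these as assumptions.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q5_1": 6.0,
    "Q5_0": 5.5,
    "Q4_1": 5.0,
    "Q4_0": 4.5,
}

def estimate_size_bytes(model_params, quant_type):
    """Approximate file size in bytes, ignoring headers and metadata."""
    return int(model_params * BITS_PER_WEIGHT[quant_type] / 8)

def format_size(num_bytes):
    """Human-readable size using 1024-based units."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{int(size)} B" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1024
```

Real files come out slightly larger because some tensors are typically kept at higher precision and the file carries metadata.
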
- Full eval pipeline in AssistantView.requestEvaluation()
- Auto-detect tokenizer from model directory (tokenizer.bin, tokenizer.model)
- Generate 3 text samples with diverse prompts via CLIRunner.generate()
- Evaluate samples via Claude API (NFIntelligence.evaluateGeneratedText)
- Display eval report and collected samples in AssistantView
- 9 Swift tests covering prompts, tokenizer detection, path construction, params
- Status: Complete.
- Fixed 3 HIGH bugs: orphan CLIRunner, swallowed taps, broken selection checkboxes
- Fixed 5 MEDIUM bugs: @StateObject → @ObservedObject for singletons, CSV export filtering, tokenizer auto-detection, SyncDashboardView status enum matching
- Fixed 4 LOW bugs: deprecated onChange, sync config save, unused env object, eval stub
- Status: Complete. All 15 bugs verified fixed, BUILD SUCCEEDED.
- @AppStorage("onboardingComplete") conditional routing
- Dynamic defaultSize based on onboarding state
- MenuBarExtra scene with MenuBarView and dynamic status icon/title
- EnvironmentObject injection for projectManager + cliRunner on all views
- 8 Swift tests covering routing, window sizing, env objects
- Status: Complete.
- CloudSyncProvider protocol (upload, download, list, delete, testConnection)
- S3SyncProvider with AWS Signature V4 (HMAC-SHA256), presigned URLs
- CloudKitSyncProvider with CKContainer/CKDatabase/CKAsset (iCloud private DB)
- CloudSyncConfig (Codable) with S3/CloudKit settings, Keychain credential storage
- CloudSyncManager (@MainActor singleton): upload/download/list/sync/testConnection
- 12 Swift tests covering config, errors, URL construction, credential handling
- Status: Complete.
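
AWS Signature V4, used by the S3 provider above, derives a signing key through a chain of HMAC-SHA256 operations over the date, region, and service. The sketch below covers only that derivation and the final signature; building the canonical request and string to sign is omitted, and the function names are illustrative.

```python
import hashlib
import hmac

def _hmac_sha256(key, message):
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()

def sigv4_signing_key(secret_key, date_stamp, region, service="s3"):
    """Derive the SigV4 signing key; date_stamp is YYYYMMDD."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")

def sigv4_signature(signing_key, string_to_sign):
    """Hex signature placed in the Authorization header or presigned URL."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the key is scoped to a single day, region, and service, a leaked signature cannot be reused outside that scope.
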
- GradientMessage wire protocol (Codable): assignWork, gradientReady, aggregated, heartbeat, syncCheckpoint
- AggregationConfig: AllReduce, ParameterServer, GossipProtocol strategies
- StragglerPolicy: Wait, Skip, Timeout modes
- GradientAggregator (@MainActor singleton): coordinator/worker modes, all-reduce averaging, ring-reduce
- GradientMetrics (ObservableObject): rounds, throughput, straggler/failure counts, rolling averages
- Gradient compression (threshold-based sparsification) and checksum verification
- 15 Swift tests covering strategies, metrics, compression, ring topology
- Status: Complete.
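
Two of the ideas above, all-reduce averaging and threshold-based sparsification, can be sketched on plain Python lists. The function names are hypothetical and the GradientMessage wire format is not modeled.

```python
def average_gradients(worker_grads):
    """Element-wise mean of equally sized gradient vectors, one per worker."""
    if not worker_grads:
        raise ValueError("need at least one worker gradient")
    n = len(worker_grads)
    return [sum(values) / n for values in zip(*worker_grads)]

def sparsify(grad, threshold):
    """Keep only entries with magnitude >= threshold as (index, value) pairs."""
    return [(i, g) for i, g in enumerate(grad) if abs(g) >= threshold]

def densify(pairs, length):
    """Rebuild a dense vector from (index, value) pairs; dropped entries are 0."""
    dense = [0.0] * length
    for i, g in pairs:
        dense[i] = g
    return dense
```

Sparsification trades accuracy for bandwidth: small entries are dropped before sending, and the receiver treats them as zero.
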
- WebDashboardConfig (Codable): port, bind address, auth token, refresh interval
- AuditAggregator: local log scanning, multi-machine sync directory scanning, entry merging
- AuditAPIHandler: HTTP request parsing, 6 REST routes (/, /api/entries, /api/stats, /api/verify, /api/machines, /health)
- Full HTML dashboard with dark theme, stats cards, filter bar, audit entry table, auto-refresh
- AuditWebServer (@MainActor singleton): NWListener-based HTTP server, CORS support, bearer token auth
- Thread-safe connection handling with ObjectIdentifier-based tracking
- 18 Swift tests covering config, URL generation, request parsing, response serialization, auth
- Status: Complete.
- XCUITest target added to Xcode project (NeuralForgeUITests)
- 22 UI test cases covering onboarding flow, main view, project creation, settings, menus, window sizing
- Launch argument support for `-onboardingComplete` to test both onboarding and main flows
- Accessibility validation tests
- Status: Complete. UI test target builds and compiles.
- Fixed `CLIRunner.swift` unused `[weak self]` capture in CoreML export callback
- Fixed `ComputeClusterService.swift` non-exhaustive switch on NWTXTRecord.Entry
- Fixed `BenchmarkService.swift` unused `batchCount` variable in eval callback
- Fixed AppIcon.appiconset — added 3 unassigned children (64x64, 64x64@2x, 1024x1024) to Contents.json
- Status: Complete. Zero warnings on clean build.
- Comprehensive README rewrite with all v5.x features
- Updated test counts (508 total: 152 CLI + 356 Swift)
- Added generate command documentation and examples
- Added macOS app feature list (16 features)
- Updated architecture diagram (39 source files, UITests)
- Status: Complete.
- TrainingProfile model (Codable): name, description, config, tags, lastUsed
- 5 built-in presets: Quick Test, Standard, Long Run, LoRA Fine-Tune, Conservative
- TrainingProfileService (@MainActor singleton): CRUD, search, filter by tag, recent tracking
- Profile diff computation — show config differences between two profiles
- Apply profile to project — one-click config swap
- Create profile from project — extract current config into reusable preset
- Import/export profiles as JSON for sharing across machines
- Duplicate profiles with auto-naming
- 15 Swift tests covering presets, serialization, search, diff, apply, recent tracking
- Status: Complete.
- DragDropDataService (@MainActor singleton): batch file processing with progress tracking
- 9 supported file types: txt, md, json, jsonl, csv, pdf, swift, py, html
- File validation: size limits (100MB), empty file detection, UTF-8 encoding check
- DroppedFileResult model with success/skipped/error status tracking
- IngestBatch aggregation: success/error/skip counts, total characters, line counts
- Staging directory workflow: validate → stage → concatenate → tokenize
- Token count estimation (chars ÷ 4 for English text)
- File size formatting (B/KB/MB)
- Configurable batch limit (1000 files per batch)
- 15 Swift tests covering extensions, limits, formatting, batching, staging, progress
- Status: Complete.
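
The validation rules and token estimate above can be sketched as follows for text-based file types. Which status each failure maps to (skipped versus error) is an assumption, as are the function names; binary formats such as PDF would need extraction before the UTF-8 check applies.

```python
MAX_FILE_BYTES = 100 * 1024 * 1024  # 100MB limit from the list above

def validate_file(data, max_bytes=MAX_FILE_BYTES):
    """Return (status, reason) for raw file bytes."""
    if len(data) == 0:
        return ("skipped", "empty file")
    if len(data) > max_bytes:
        return ("error", "file exceeds size limit")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return ("error", "not valid UTF-8")
    return ("success", "")

def estimate_tokens(text):
    """Rough token count: about 4 characters per token for English text."""
    return len(text) // 4
```

The estimate is only a planning aid; the real count comes from running the tokenizer on the staged text.
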
- WebhookNotificationService (@MainActor singleton): multi-provider webhook delivery
- 3 providers: Slack (attachments), Discord (embeds), Generic (JSON)
- 7 event types: training started/completed/failed, checkpoint, validation improved, loss target, export
- WebhookConfig (Codable): per-endpoint event filtering, metrics toggle, custom message
- Provider-specific payload formatting with color-coded status indicators
- Delivery tracking with success/failure history (max 100 entries)
- Test webhook functionality for verification
- Success rate monitoring and per-webhook delivery history
- Thread-safe nonisolated network calls with URLSession
- 15 Swift tests covering providers, events, payloads, delivery tracking, serialization
- Status: Complete.
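
Provider-specific payloads differ mainly in shape: Slack webhooks accept `attachments` with a hex color string, while Discord webhooks accept `embeds` with an integer color. The sketch below builds such payloads; the color palette, the generic payload fields, and the function name are assumptions and not taken from WebhookNotificationService.

```python
# Assumed status palette as RGB integers.
STATUS_COLORS = {"success": 0x2ECC71, "failure": 0xE74C3C, "info": 0x3498DB}

def build_payload(provider, title, message, status="info"):
    """Return a JSON-serializable webhook body for the given provider."""
    color = STATUS_COLORS[status]
    if provider == "slack":
        # Slack attachments take the color as a "#rrggbb" string.
        return {"attachments": [{"color": f"#{color:06x}", "title": title, "text": message}]}
    if provider == "discord":
        # Discord embeds take the color as a decimal integer.
        return {"embeds": [{"title": title, "description": message, "color": color}]}
    if provider == "generic":
        return {"title": title, "message": message, "status": status}
    raise ValueError(f"unknown provider: {provider}")
```

The returned dictionary can be serialized with `json.dumps` and sent as the POST body with a `Content-Type: application/json` header.
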
- MLXBackendService (@MainActor singleton): compute backend selection and management
- 3 compute backends: ANE (Neural Engine), MLX (Metal GPU), CPU (Accelerate)
- MLX availability detection via Python subprocess (version check)
- MLXModelInfo model: param count formatting, memory estimation per quantization
- Backend compatibility matrix per model format (bin/safetensors/gguf/npz)
- Performance multiplier estimates (ANE ~10x, MLX ~7x, CPU baseline)
- CLI argument generation per backend (--backend mlx, --no-ane-extras)
- MLX training/generate command generation (mlx_lm.lora, mlx_lm.generate)
- Backend benchmarking with forward/backward pass timing, TFLOPS, memory
- System capabilities query (CPU count, memory, OS version)
- 15 Swift tests covering backends, formatting, memory estimation, commands, benchmarks
- Status: Complete.
| Component | Tests | Last Verified |
|---|---|---|
| CLI (test_cli.m) | 152 | 2026-03-07 |
| Swift (NeuralForgeTests.swift) | 416 | 2026-03-07 |
| Xcode build (43 source files) | SUCCEEDED | 2026-03-07 |
| Real training (50 steps) | PASSED | 2025-03-07 |
| Real generation (100 tokens) | PASSED | 2025-03-07 |
| CLI tokenize (45KB file) | PASSED | 2025-03-07 |

| Issue | Severity | Status |
|---|---|---|
| O(n^2) BPE tokenizer | | Fixed — replaced with O(n log n) heap |
| Training data may be Git LFS placeholder (15 bytes) | Medium | Workaround: regenerate with Python |
| `tokenize` command hangs on large files | | Fixed — same fix, 1MB in <1s |
| First ANE compile takes 20-30s (no visible progress) | Low | Fixed — compile timer added |