chore(O): remove SIMD/NEON backend and BackendType::Metal #21
Merged
dexwritescode merged 5 commits into main on May 6, 2026
Conversation
Delete cpu_buffer.h, compute_backend.mm (an abandoned Obj-C++ draft), and the simd/ directory. Remove the misleadingly-named simd_graph() and metal_graph() helpers, and drop BackendType::Metal from the enum. Apple Silicon → MLX only. Linux/Windows → CUDA/ROCm (future phases). No CPU SIMD or bare Metal backend will be added for LLM-scale inference.
…backs
Phase O cleanup: the Tensor/BackendBuffer and ComputeGraph/ComputeGraphBuilder
abstractions were bypassed entirely on Apple Silicon (all three model families
used mlx_weights_ directly). Removing them deletes ~7,100 lines of dead code and
leaves ComputeBackend as a thin lifecycle handle only.
Removed:
- core/tensor.{h,cpp}, core/graph.{h,cpp}
- backends/mlx/mlx_buffer.h, mlx_utils.h
- model/kv_cache.h
- model/gemma_model{,_base}.{h,cpp}
- model/qwen3_moe_model{,_base}.{h,cpp}
- tests/compute/test_symbolic_api.cpp, test_mlx_backend.cpp
Simplified:
- ComputeBackend: 5 lifecycle methods only (type/name/is_available/initialize/cleanup)
- MlxBackend: implements those 5 methods; ~730 lines of Tensor ops deleted
- LlamaModel, GemmaModelMLX, Qwen3MoeModelMLX: removed inheritance from base
Tensor-path classes; MLX classes own config_ and tokenizer_ directly
- ModelLoader: load_model()/load_all_safetensors() removed; load_model_mlx() kept
- language_model.cpp: Gemma/Qwen3MoE dispatch is now MLX-only
- BackendType::Metal removed (vestigial, never instantiated)
- Tests updated to remove calls to deleted APIs (forward(), attention_layer(),
wrap_native_tensor(), load_model(backend))
…ethods

ComputeBackend is now a pure lifecycle abstraction: type(), name(), is_available(), initialize(), cleanup(). All ~40 Tensor-based math methods (matmul, dequantize, rope, softmax, sdpa, etc.) are removed from the interface and from MlxBackend. GemmaModelMLX and Qwen3MoeModelMLX no longer inherit from their Tensor-based base classes; config_ and tokenizer_ are owned directly. ModelLoader no longer exposes load_model() or load_all_safetensors().
- ErrorCode: remove InvalidArgument, InsufficientMemory, TensorNotFound, NotImplemented — none were ever returned in production code
- ComputeBackend: remove preferred_batch_size() and supports_async() — declared and overridden in MlxBackend but never called by any client
- ModelConfig: remove name_or_path and transformers_version — parsed from JSON but never read after parsing
- LlamaModel: remove context_size_ member — set in mlx_setup(), never read
- Qwen3MoeModelMLX: remove context_size_ member — same pattern
- Delete tinyllama_inference.h/.cpp — Phase D compatibility alias no longer needed; update 5 test files to use LlamaModel directly
- Delete test_attention_qkv_trace.cpp — became an empty placeholder after attention_layer() was removed in Phase O
Summary
- Delete cpu_buffer.h, compute_backend.mm (abandoned Obj-C++ draft), and the simd/ directory
- Remove simd_graph() and metal_graph() convenience helpers from graph.h
- Drop BackendType::Metal from the enum — updates compute_backend.cpp, neurons_service.cpp, and the non-MLX mock in test_model_loader.cpp
- Apple Silicon → MLX only. Linux/Windows → CUDA/ROCm (future phases). No CPU SIMD fallback or bare Metal backend will exist for LLM-scale inference.