Default continuous_batching breaks inference: ArraysCache.__init__() missing 'size' (vllm-mlx 0.2.6) #51

### Summary
Stacks generated by mlx-stack enable `continuous_batching: true` by default. With **vllm-mlx v0.2.6 + mlx 0.31.1**, every inference request fails inside the engine loop. The server keeps accepting connections, so requests hang instead of returning an error.

### Error (server log, repeats per request)
```
ERROR:vllm_mlx.engine_core:Engine loop error: ArraysCache.__init__() missing 1 required positional argument: 'size'
```

### Isolation (scratch port, single flag at a time)
| flag | result |
|---|---|
| `--continuous-batching` | ❌ inference fails, 198 ArraysCache errors logged |
| `--use-paged-cache` | ✅ inference OK, 0 errors |

So **`continuous_batching` is the trigger**; `use_paged_cache` is fine in isolation.
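For anyone repeating the isolation run, the error count can be taken from the server log with a small script. This is a minimal sketch: the marker string is copied from the log line quoted above, while the function names and the idea of reading the log from a file are assumptions for illustration (mlx-stack's actual log location is not stated here).

```python
from typing import Iterable

# Substring copied from the server log line quoted in this issue.
ERROR_MARKER = (
    "ArraysCache.__init__() missing 1 required positional argument: 'size'"
)


def count_arrays_cache_errors(lines: Iterable[str]) -> int:
    """Count log lines that contain the ArraysCache engine-loop error."""
    return sum(1 for line in lines if ERROR_MARKER in line)


def count_in_file(log_path: str) -> int:
    """Count errors in a log file; log_path is wherever your server logs to."""
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        return count_arrays_cache_errors(fh)
```

Running `count_in_file(...)` once per flag configuration gives the per-flag counts in the same form as the table.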

### Repro
1. Stack tier with `vllm_flags: { continuous_batching: true }`
2. `mlx-stack up`
3. `curl localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"...","messages":[{"role":"user","content":"hi"}],"max_tokens":8}'` → hangs; the log fills with the error above.
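Because the failure shows up as a hang rather than an HTTP error, a client-side timeout makes the repro easier to script. The sketch below sends the same request as step 3 using only the Python standard library. The helper names and the 30-second default are assumptions; the endpoint path and payload come from the curl command above.

```python
import json
import socket
import urllib.error
import urllib.request


def build_chat_request(base_url: str, model: str) -> urllib.request.Request:
    """Build the same chat-completions request as the curl repro."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "hi"}],
        "max_tokens": 8,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def probe(base_url: str, model: str, timeout_s: float = 30.0) -> str:
    """Return 'ok', 'timeout', or 'error: <reason>' for one request."""
    req = build_chat_request(base_url, model)
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            resp.read()  # drain the body so a stalled response also times out
            return "ok"
    except (socket.timeout, TimeoutError):
        # Connection accepted, but no response arrived in time.
        return "timeout"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return f"error: {exc.reason}"
```

With the stack running, `probe("http://localhost:8000", "<model id>")` should return `"timeout"` if the request hangs as described here, and `"ok"` once `continuous_batching` is removed from the tier flags.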

### Notes / suggested fix
The root `ArraysCache` defect is almost certainly in vllm-mlx, but mlx-stack enabling `continuous_batching` **by default** means every generated stack is broken on this (current) vllm-mlx/mlx combo. Suggest one of: make it opt-in, gate the default behind a vllm-mlx/mlx version check, or validate it during `up` (see companion issue on health-check false positives). Removing `continuous_batching` from the tier flags fully resolves it.
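The version-gate option could look roughly like the sketch below. It is not mlx-stack's code: the function names and the deny-list approach are hypothetical, and the only entry is the combination reported in this issue, since it is not known here which other versions are affected or fixed.

```python
from typing import Tuple

# Only the combination reported in this issue; extend as others are confirmed.
# Each entry is (vllm-mlx version, mlx version).
KNOWN_BAD = {((0, 2, 6), (0, 31, 1))}


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse 'v0.2.6' or '0.31.1' into ints. Pre-release suffixes are not handled."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def default_continuous_batching(vllm_mlx_version: str, mlx_version: str) -> bool:
    """Hypothetical default: False for combos known to hit the ArraysCache error."""
    combo = (parse_version(vllm_mlx_version), parse_version(mlx_version))
    return combo not in KNOWN_BAD
```

The installed versions could be read with `importlib.metadata.version(...)`, assuming the distributions are named `vllm-mlx` and `mlx`. A deny-list keeps the current default for unaffected combinations; making the flag opt-in would be the more conservative alternative.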

### Environment
- mlx-stack 0.3.8
- vllm-mlx v0.2.6
- mlx 0.31.1
- macOS 26.2 (arm64), Apple M4 Pro, 64 GB
- model: `mlx-community/Qwen3.5-9B-4bit`
