GitHub - jskoiz/yemma: Gemma 4 right on your iPhone.

AI that lives on your iPhone.
Good for notes, writing, questions, and image help. Private by design, offline after setup, and honest about what local AI does best.

What it's good for · Screenshots · Structure · Gemma 4 MLX Port · Model Bundle · Build

This repo contains the iOS app, landing page, and brand assets.

Yemma runs Gemma 4 E2B locally through a Swift-native MLX multimodal runtime. One model bundle handles both text and image flows. After a one-time download (~4.2 GB), the app works entirely offline — no cloud inference, no accounts, no telemetry.

What it's good for

Quick rewrites and everyday writing help
Personal notes and thinking out loud
Everyday questions answered on-device
Image explanations and visual help
Offline use — planes, commutes, anywhere without signal
Low-friction, no-account AI when you just need a hand

Yemma is not trying to replace frontier cloud models. For deep reasoning, broad world knowledge, or large multi-step workflows, cloud AI is still better. For something local, private, and always available, Yemma is a good fit.

Features

Streaming chat with markdown rendering, image attachments, and conversation history
Resumable background model bundle download (~4.2 GB first-time setup)
On-device multimodal text and image inference via MLXVLM
Local model-bundle validation before the app marks setup complete
Configurable response style, temperature, and response limits
Light / Dark / System appearance modes
Built-in diagnostics, debug probes, and simulator mock mode

Screenshots


Advanced controls: temperature, context window, flash attention, response length.
Debug probes: Markdown and renderer test scenarios.
Diagnostics: event log, copyable logs, runtime metadata.

Structure

ContentView.swift — root state machine (onboarding vs chat)
LLMService.swift — MLX multimodal load, generation, streaming, and runtime lifecycle
MLXModelSupport.swift — model directory validation and Gemma 4 asset contract checks
ModelDownloader.swift — single-repository download, resume, cleanup, and local validation
ConversationStore.swift — chat history persistence
YemmaPromptPlanner.swift — prompt shaping for the chat experience
Gemma4SmokeAutomation.swift — smoke checks for the shipped model path
SettingsView.swift / AdvancedSettingsView.swift — runtime tuning, diagnostics, debug probes
Appearance.swift — theme system
website/ — landing page and brand assets

Gemma 4 MLX Port

Yemma originally ran Gemma 4 through two separate GGUF assets: a text model plus a standalone mmproj vision projector. The current MLX integration replaces that with one Swift-native multimodal bundle and one runtime container.

The important distinction is that MLX Swift already provided the general model-loading, tokenizer, and VLM infrastructure. The missing work was Gemma 4 support on the Swift side, plus Yemma-specific integration around download, validation, prompt shaping, and runtime lifecycle.

Validated upstream baseline:

mlx-swift-lm at 3.31.3 for Gemma 4 model, processor, and parity fixes
mlx-swift-examples at 31b6cf6 for app-side smoke validation and request-shaping patterns

How the current Yemma integration works:

Package.swift pulls in MLX, MLXLMCommon, MLXVLM, Hub, and Tokenizers, so the runtime stays inside Swift instead of bridging through llama.cpp or Objective-C++ vision code.
ModelDownloader pulls one MLX model repository, currently mlx-community/gemma-4-e2b-it-4bit, using *.safetensors, *.json, and *.jinja patterns instead of downloading a text GGUF and a second mmproj file. Yemma also recognizes legacy local bundles from EZCon/gemma-4-E2B-it-4bit-mlx.
ModelDirectoryValidator proves the downloaded bundle is structurally usable by checking required metadata files, processor config, tokenizer files, weight shards, and safetensors index references before the app accepts setup as complete.
Gemma4MLXSupport enforces the Gemma 4 multimodal asset contract in Swift by cross-checking processor and model values like soft-token budgets, patch size, and pooling kernel size. It also normalizes a known compatibility gap when a bundle is missing a top-level pad_token_id.
LLMService converts each conversation turn into structured Chat.Message and UserInput values with optional image URLs, then calls context.processor.prepare(input:) so MLX performs the image and text preprocessing directly inside the same runtime path as inference.
The current implementation uses VLMModelFactory.shared._load(...) to load the entire Gemma 4 VLM from one local directory, so text generation and image understanding live in one ModelContainer instead of separate GGUF and projector runtimes.
Yemma still adds app-side stability logic around the MLX runtime, including prompt shaping, smoke checks, and output filtering for noisy hidden-channel and control-token responses.
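
As a rough illustration of the single-repository download filter described in the list above, the sketch below keeps only files matching the three patterns the README names. The helper name `matchesBundlePattern` and the constant `bundlePatterns` are hypothetical and are not taken from ModelDownloader.swift; the matching is a plain suffix check, not a full glob engine.

```swift
import Foundation

// The three glob patterns named in the README.
let bundlePatterns = ["*.safetensors", "*.json", "*.jinja"]

/// Hypothetical helper (not Yemma's actual API): true when the file name
/// matches one of the "*.ext" patterns. Only the leading-star suffix form
/// is handled; any other pattern must equal the file name exactly.
func matchesBundlePattern(_ path: String, patterns: [String] = bundlePatterns) -> Bool {
    // Compare against the last path component so nested files also match.
    guard let last = path.split(separator: "/").last else { return false }
    let name = String(last)
    return patterns.contains { pattern in
        guard pattern.hasPrefix("*") else { return name == pattern }
        return name.hasSuffix(String(pattern.dropFirst()))
    }
}

// Example: filter a remote file listing down to the files worth downloading.
let remoteFiles = ["config.json", "model.safetensors", "README.md", "chat_template.jinja"]
let wantedFiles = remoteFiles.filter { matchesBundlePattern($0) }
// wantedFiles keeps config.json, model.safetensors, and chat_template.jinja.
```

Filtering by pattern rather than by fixed file names means the same code would tolerate sharded weights or an added template file, at the cost of possibly pulling JSON files the runtime never reads.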

What that buys us:

no standalone mmproj download
no Objective-C++ multimodal bridge
one model bundle to download, validate, load, unload, and delete
one Swift runtime path for both text-only and image-assisted turns

Model Bundle

Current default download source: mlx-community/gemma-4-e2b-it-4bit
Legacy-compatible local bundle ID: EZCon/gemma-4-E2B-it-4bit-mlx
Approximate first-download size: 4.2 GB
Downloaded file classes: safetensors weights, tokenizer/config JSON, processor config, and chat template files
Runtime contract: config.json, tokenizer.json, tokenizer_config.json, processor_config.json or preprocessor_config.json, plus one or more readable .safetensors weight files and any referenced safetensors index entries

After the bundle is downloaded, Yemma can load, unload, and run it entirely on device.
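
The runtime contract listed above could be checked with something like the following minimal sketch. It is not the code in MLXModelSupport.swift: the error type, the function name `validateBundle(at:)`, and the assumption that a sharded bundle uses the common `model.safetensors.index.json` file with a `weight_map` key are all illustrative.

```swift
import Foundation

/// Hypothetical error type for this sketch; not Yemma's actual API.
enum BundleValidationError: Error, Equatable {
    case missingFile(String)
    case missingProcessorConfig
    case noWeights
    case missingShard(String)
}

/// Checks a local bundle directory against the runtime contract in the README.
func validateBundle(at directory: URL) throws {
    let fm = FileManager.default
    func exists(_ name: String) -> Bool {
        fm.fileExists(atPath: directory.appendingPathComponent(name).path)
    }

    // Required metadata and tokenizer files.
    for name in ["config.json", "tokenizer.json", "tokenizer_config.json"] where !exists(name) {
        throw BundleValidationError.missingFile(name)
    }

    // Either processor config name is acceptable.
    guard exists("processor_config.json") || exists("preprocessor_config.json") else {
        throw BundleValidationError.missingProcessorConfig
    }

    // At least one readable .safetensors weight file.
    let names = try fm.contentsOfDirectory(atPath: directory.path)
    let weights = names.filter { $0.hasSuffix(".safetensors") }
    let hasReadableWeights = weights.contains {
        fm.isReadableFile(atPath: directory.appendingPathComponent($0).path)
    }
    guard hasReadableWeights else { throw BundleValidationError.noWeights }

    // If an index is present, every shard it references must exist.
    // Assumption: the index file name and "weight_map" key follow the
    // common sharded-safetensors layout.
    let indexURL = directory.appendingPathComponent("model.safetensors.index.json")
    if let data = try? Data(contentsOf: indexURL),
       let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
       let weightMap = json["weight_map"] as? [String: String] {
        for shard in Set(weightMap.values) where !exists(shard) {
            throw BundleValidationError.missingShard(shard)
        }
    }
}
```

A caller would run this once after the download finishes and only mark setup complete when it returns without throwing. The sketch checks presence and readability only; it does not parse the weights or cross-check processor and model values.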

Build

Open Yemma4.xcodeproj in a recent Xcode with Swift 6.1 support.
Run on a physical iPhone with iOS 17+ for real MLX inference.
Use ./scripts/sim_run.sh for simulator testing with mocked replies.
Use ./scripts/device_startup_probe.sh when you need a clean first-launch timing probe on device.

Release

App Store Connect deployment via asc-cli.

License

MIT. See LICENSE.

