[codex] Migrate Android to LiteRT-LM and refresh Gemma 4 example#9
[codex] Migrate Android to LiteRT-LM and refresh Gemma 4 example#9riderx wants to merge 8 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds dual-backend LLM support and platform defaults: iOS defaults to Apple Intelligence (optional MediaPipe), Android prefers LiteRT‑LM (.litertlm) with legacy MediaPipe (.task) fallback, and Web remains MediaPipe-based. Implements backend selection, model loading/resolution, streaming/error event emission, docs/examples updates, and adds LiteRT dependency. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant UI as Example App\n(HomePage.vue)
participant Plugin as LLMPlugin\n(Capacitor)
participant Backend as Backend\nSelector
participant LiteRT as LiteRT\nEngine
participant MediaPipe as MediaPipe\nInference
participant Model as Model\nFile
User->>UI: Select model & send message
UI->>Plugin: setModel(path, modelType, options)
Plugin->>Backend: Determine backend by modelType/extension
alt modelType == 'litertlm'
Backend->>Model: Resolve .litertlm path (/android_asset/... or download)
Backend->>LiteRT: Initialize Engine & Conversation
LiteRT-->>Backend: Engine ready
else modelType == 'task' or iOS system model
Backend->>Model: Resolve .task or system model
Backend->>MediaPipe: Initialize LlmInference
MediaPipe-->>Backend: Inference ready
end
Backend-->>Plugin: onModelLoaded()
Plugin-->>UI: notify readiness
User->>UI: sendMessage(chatId, text)
UI->>Plugin: sendMessage(chatId, text)
alt Using LiteRT
Plugin->>LiteRT: conversation.sendMessageAsync()
LiteRT-->>Plugin: streamed chunk events
Plugin-->>UI: streaming updates (aiFinished with chatId)
else Using MediaPipe
Plugin->>MediaPipe: Run inference with prompt/history
MediaPipe-->>Plugin: stream/final text
Plugin-->>UI: streaming updates or final message
end
Plugin->>UI: emit generationError on failure (with chatId if partial output)
UI-->>User: display response or error
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3bdc17f5d8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
android/src/main/java/ee/forgr/capgo_llm/LLM.java (1)
330-401:⚠️ Potential issue | 🟠 MajorClose
LlmInferenceSessionon the error path too.
inferenceSession.close()only runs in thedonebranch. If anything throws aftercreateFromOptions()succeeds but before completion, the catch exits without releasing the native session.🛠️ Suggested fix
- com.google.mediapipe.tasks.genai.llminference.LlmInferenceSession inferenceSession = - com.google.mediapipe.tasks.genai.llminference.LlmInferenceSession.createFromOptions(llmInference, sessionOptions); - - inferenceSession.addQueryChunk(fullPrompt); + com.google.mediapipe.tasks.genai.llminference.LlmInferenceSession inferenceSession = null; + inferenceSession = + com.google.mediapipe.tasks.genai.llminference.LlmInferenceSession.createFromOptions(llmInference, sessionOptions); + + inferenceSession.addQueryChunk(fullPrompt); @@ - } catch (Exception exception) { - callback.onError(exception.getMessage()); + } catch (Exception exception) { + try { + if (inferenceSession != null) { + inferenceSession.close(); + } + } catch (Exception closeException) { + android.util.Log.e("LLM", "Failed to close MediaPipe session", closeException); + } + callback.onError(exception.getMessage()); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@android/src/main/java/ee/forgr/capgo_llm/LLM.java` around lines 330 - 401, The LlmInferenceSession created by com.google.mediapipe.tasks.genai.llminference.LlmInferenceSession.createFromOptions (stored in inferenceSession) is only closed in the done branch, so if an exception occurs after creation the native session leaks; ensure inferenceSession is closed on the error path by tracking the created inferenceSession reference and calling inferenceSession.close() from the catch (or better, a finally) block while null-checking it and swallowing or logging any close exceptions (reference symbols: createFromOptions, inferenceSession, generateResponseAsync, resultListener).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@android/src/main/java/ee/forgr/capgo_llm/LLM.java`:
- Around line 63-95: setModel currently stores request parameters on instance
fields (modelPath, modelType, maxTokens, topk, temperature, randomSeed), calls
resetModelState(), then calls initializeModel which queues work that reads those
mutable fields allowing race between rapid setModel calls; to fix,
capture/snapshot all request parameters into a local immutable request object
(or final local variables) inside setModel and pass that snapshot into
initializeModel (e.g., initializeModel(requestSnapshot, callback)), add a
monotonic load/version token (e.g., long loadId) on the instance that you
increment in setModel and include in the snapshot, and then have initializeModel
and its queued/completion logic verify the loadId matches the instance's current
loadId before committing state or invoking the callback so out-of-order tasks
are discarded.
- Around line 316-321: sendMessageWithMediaPipe creates a new
LlmInferenceSession per send but calls session.buildPrompt(message) which only
returns the latest user text, so prior turns in session.history never reach the
model; fix by constructing the full prompt from the accumulated session history
(e.g., use session.getHistory() or change session.buildPrompt to include
history) and pass that full prompt into the LlmInferenceSession invocation (or
reuse the existing LlmInferenceSession instead of creating a fresh one) so the
model sees prior turns and the .task fallback preserves context.
In `@example-app/src/views/HomePage.vue`:
- Around line 484-488: Currently the UI state (selectedModel.value,
chatId.value, setConversationIntro) is changed before the model is actually
ready; instead, perform downloadModel(), setModel(), and initializeChat() first
inside a try/catch and only update selectedModel.value, chatId.value and call
setConversationIntro(...) after those succeed; on failure restore the previous
selectedModel (capture previousSelected = selectedModel.value), do not clear
chatId or change intro, and surface the error (e.g., via showToast/log) so the
usable conversation isn’t lost and the failed model isn’t shown as selected.
- Around line 87-95: The send button in HomePage.vue is icon-only (ion-button
with ion-icon using sendIcon) and needs an accessible name so screen readers
announce its purpose; add an explicit label by setting an aria-label (e.g.,
"Send message") on the ion-button or include visually hidden text inside the
button, ensuring the existing bindings (`@click`="sendMessage",
:disabled="!canSendMessage", class="send-button") remain unchanged so behavior
and styling are preserved.
---
Outside diff comments:
In `@android/src/main/java/ee/forgr/capgo_llm/LLM.java`:
- Around line 330-401: The LlmInferenceSession created by
com.google.mediapipe.tasks.genai.llminference.LlmInferenceSession.createFromOptions
(stored in inferenceSession) is only closed in the done branch, so if an
exception occurs after creation the native session leaks; ensure
inferenceSession is closed on the error path by tracking the created
inferenceSession reference and calling inferenceSession.close() from the catch
(or better, a finally) block while null-checking it and swallowing or logging
any close exceptions (reference symbols: createFromOptions, inferenceSession,
generateResponseAsync, resultListener).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4309c9c5-3bb9-47a8-bf92-a227c86400fe
📒 Files selected for processing (10)
README.mdandroid/build.gradleandroid/src/main/java/ee/forgr/capgo_llm/LLM.javaandroid/src/main/java/ee/forgr/capgo_llm/LLMPlugin.javaexample-app/README.mdexample-app/src/views/HomePage.vueexample-app/tests/unit/example.spec.tsios/Sources/LLMPlugin/LLMPlugin.swiftpackage.jsonsrc/definitions.ts
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
android/src/main/java/ee/forgr/capgo_llm/LLMPlugin.java (1)
104-116:⚠️ Potential issue | 🟡 MinorAlign Android’s implicit
randomSeedwith the other platforms.Android now starts honoring
randomSeed, but the fallback here is101whilesrc/web.tsuses0. That makes omittedrandomSeedvalues behave differently by platform, which is surprising now that the examples do not set it explicitly.Suggested fix
- Integer randomSeed = call.getInt("randomSeed", 101); + Integer randomSeed = call.getInt("randomSeed", 0);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@android/src/main/java/ee/forgr/capgo_llm/LLMPlugin.java` around lines 104 - 116, The Android code uses call.getInt("randomSeed", 101) which defaults randomSeed to 101 and causes platform divergence; change the default to 0 so the local variable randomSeed and the llm.setModel(...) invocation use 0 when omitted (matching src/web.ts). Update the getInt call for "randomSeed" to use 0 as the fallback so llm.setModel(path, modelType, maxTokens, topk, temperature, randomSeed, ...) behaves consistently across platforms.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Around line 372-379: The generated README table marks fields like modelType,
randomSeed, companionUrl, filename, and GenerationErrorEvent.chatId as required
even though they are optional in src/definitions.ts; update the JSDoc/docgen
source so optional properties are emitted as optional (e.g., use the JSDoc
optional syntax with square brackets or the `@property` optional marker in the
typedefs in src/definitions.ts or adjust the docgen config that transforms JSDoc
into markdown) so the generated markdown tables preserve optionality for those
symbols (modelType, randomSeed, companionUrl, filename,
GenerationErrorEvent.chatId).
In `@src/web.ts`:
- Around line 74-80: The catch block currently both emits
notifyListeners('generationError', ...) and rethrows the error, causing web
callers to receive duplicate failure signals; in the method that contains this
block (e.g., sendMessage / the response-generation path), remove the rethrow and
return early after calling notifyListeners('generationError', ...) so post-start
generation failures are only surfaced via the generationError event (ensure the
function returns the appropriate void/Promise resolution consistent with its
signature instead of throwing).
---
Outside diff comments:
In `@android/src/main/java/ee/forgr/capgo_llm/LLMPlugin.java`:
- Around line 104-116: The Android code uses call.getInt("randomSeed", 101)
which defaults randomSeed to 101 and causes platform divergence; change the
default to 0 so the local variable randomSeed and the llm.setModel(...)
invocation use 0 when omitted (matching src/web.ts). Update the getInt call for
"randomSeed" to use 0 as the fallback so llm.setModel(path, modelType,
maxTokens, topk, temperature, randomSeed, ...) behaves consistently across
platforms.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e8a07369-1671-4904-afa3-782f8243c122
📒 Files selected for processing (6)
README.mdandroid/src/main/java/ee/forgr/capgo_llm/LLMPlugin.javaexample-app/src/views/HomePage.vueios/Sources/LLMPlugin/LLMPlugin.swiftsrc/definitions.tssrc/web.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- ios/Sources/LLMPlugin/LLMPlugin.swift
- src/definitions.ts
- example-app/src/views/HomePage.vue
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@android/src/main/java/ee/forgr/capgo_llm/LLM.java`:
- Around line 35-45: Multiple threads read/write shared fields (isReady,
activeBackend, engine, llmInference, modelPath, modelType, currentLoadId) and
the chatSessions map while only applyLoadedModel() is synchronized, causing data
races; fix by making shared state access thread-safe: replace HashMap
chatSessions with a ConcurrentHashMap, mark simple status fields (isReady,
activeBackend, modelPath, modelType, currentLoadId) as volatile, and guard
compound operations and mutations (applyLoadedModel(), resetModelState(),
createChat(), sendMessage(), and any methods referenced around lines 270-292 and
308-328) with a single ReentrantReadWriteLock (use readLock for reads in
createChat()/sendMessage() and writeLock for mutations in
applyLoadedModel()/resetModelState()), ensuring engine and llmInference are read
under the lock or published safely to avoid races.
- Around line 138-141: When a load is superseded after calling
initializeMediaPipeModel, explicitly close the created LlmInference instance
before dropping it and ensure resetModelState() also closes and clears the
llmInference field; update the block after initializeMediaPipeModel(actualPath,
request) (and the notifySupersededLoad path) to call llmInference.close() or
loadedInference.close() if not null when a supersede is detected (use the same
close pattern used for LlmInferenceSession), and modify resetModelState() to
safely close llmInference and set it to null to avoid native resource leaks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7854fb8b-f7e8-4251-b05f-cd5322f5cd54
📒 Files selected for processing (2)
android/src/main/java/ee/forgr/capgo_llm/LLM.javaexample-app/src/views/HomePage.vue
🚧 Files skipped from review as they are similar to previous changes (1)
- example-app/src/views/HomePage.vue
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
android/src/main/java/ee/forgr/capgo_llm/LLM.java (1)
253-269: Consider skipping asset copy if destination already exists.The method copies the asset unconditionally. For repeated model loads with the same asset path, checking if the destination file exists could avoid redundant I/O.
Proposed optimization
private void copyAssetToFile(String assetPath, File destination) throws Exception { + if (destination.exists()) { + return; + } + File parent = destination.getParentFile(); if (parent != null && !parent.exists()) { parent.mkdirs(); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@android/src/main/java/ee/forgr/capgo_llm/LLM.java` around lines 253 - 269, The copyAssetToFile method currently always copies the asset; modify it to first check if the destination file already exists (destination.exists()) and return early if present to avoid redundant I/O. Keep the existing parent directory creation logic (parent.mkdirs()) but add the existence check before opening the asset stream; if you need stronger validation, compare file length or checksum of destination vs. asset (using context.getAssets().open(assetPath).available() or a quick checksum) and only perform the copy when sizes/checks differ. Ensure this change is applied inside the copyAssetToFile method.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@android/src/main/java/ee/forgr/capgo_llm/LLM.java`:
- Around line 190-202: In initializeLiteRtModel replace the incorrect
EngineConfig positional call that supplies three Backend.CPU() and uses
maxTokens with a call matching LiteRT-LM 0.10.0: construct EngineConfig with
modelPath, a single Backend (new Backend.CPU()), the cache directory string
(context.getCacheDir().getAbsolutePath()), and explicit nulls for visionBackend
and audioBackend (or supply appropriate backends if used), and ensure any
maxTokens usage is renamed/removed to match the library’s maxNumTokens API;
update the EngineConfig invocation in LLM.initializeLiteRtModel and any related
variables to match the com.google.ai.edge.litertlm 0.10.0 constructor signature.
---
Nitpick comments:
In `@android/src/main/java/ee/forgr/capgo_llm/LLM.java`:
- Around line 253-269: The copyAssetToFile method currently always copies the
asset; modify it to first check if the destination file already exists
(destination.exists()) and return early if present to avoid redundant I/O. Keep
the existing parent directory creation logic (parent.mkdirs()) but add the
existence check before opening the asset stream; if you need stronger
validation, compare file length or checksum of destination vs. asset (using
context.getAssets().open(assetPath).available() or a quick checksum) and only
perform the copy when sizes/checks differ. Ensure this change is applied inside
the copyAssetToFile method.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: acb3b933-ac7e-4b0b-9cfd-a55bf6290534
📒 Files selected for processing (1)
android/src/main/java/ee/forgr/capgo_llm/LLM.java
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e28f85f366
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
What
.litertlmmodels while keeping a MediaPipe fallback for legacy.taskfilesWhy
.taskexpectationsHow
com.google.ai.edge.litertlm:litertlm-android:0.10.0and rewrote the Android model loader/chat flow to route.litertlmmodels through LiteRT-LM conversations.taskhandling through the existing MediaPipe path so old assets still loadlitert-communitysendMessageimplementation into helpers so the repo lint gate stays greenTesting
bun run lintbun run check:wiringbun run verifycd example-app && bun install && bun run buildcd example-app && bun run test:unit -- --runNot Tested
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Refactor
Tests
Chores