Flutter Gemma with Gemma 3 Nano & Multimodal Support

Run Gemma AI models locally in Flutter apps with official MediaPipe GenAI v0.10.24.

Supports: Gemma 2B, Gemma 7B, Gemma-2 2B, Gemma-3 1B, Gemma 3 Nano (E2B/E4B), and DeepSeek R1

✅ What's New in v0.9.0

  • 🛠️ Function Calling - Models can call external functions (Gemma 3 Nano, DeepSeek)
  • 🧠 Thinking Mode - View reasoning process with DeepSeek models
  • 🖼️ Multimodal Support - Text + Image input with vision models
  • 🚀 Full Gemma 3 Nano Support with MediaPipe GenAI v0.10.24
  • Official CocoaPods - No custom pods needed!
  • 🎯 GPU/CPU Backend Selection - Now works correctly
  • 📱 iOS & Android Compatible

Quick Start

1. Installation

dependencies:
  flutter_gemma: ^0.9.0
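
Then fetch the package:

flutter pub get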

2. iOS Setup

Automatic Dependencies: The plugin automatically uses official MediaPipe CocoaPods:

  • MediaPipeTasksGenAI: 0.10.24
  • MediaPipeTasksGenAIC: 0.10.24

Required iOS Configuration:

  1. Set the minimum iOS version in ios/Podfile:
platform :ios, '16.0'  # Required for MediaPipe GenAI
  2. Add memory entitlements for large models in ios/Runner/Runner.entitlements:
<key>com.apple.developer.kernel.extended-virtual-addressing</key>
<true/>
<key>com.apple.developer.kernel.increased-memory-limit</key>
<true/>
  3. Enable file sharing in ios/Runner/Info.plist:
<key>UIFileSharingEnabled</key>
<true/>

3. Android Setup (Automatic)

Uses official MediaPipe GenAI:

  • com.google.mediapipe:tasks-genai:0.10.24
  • com.google.mediapipe:tasks-vision:0.10.24 (for multimodal support)
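
If the Android build fails with a minimum SDK error, raise minSdkVersion in android/app/build.gradle. The exact floor depends on the MediaPipe artifacts you resolve; API 24 is an assumption that matches current MediaPipe GenAI releases:

android {
    defaultConfig {
        minSdkVersion 24  // assumption: adjust to whatever the MediaPipe artifacts require
    }
}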

4. Basic Text Usage

import 'package:flutter_gemma/flutter_gemma.dart';

// Initialize
final gemma = FlutterGemmaPlugin.instance;

// Set model path
await gemma.modelManager.setModelPath('path/to/gemma-3n-model.task');

// Create model with Gemma 3n optimized settings
final model = await gemma.createModel(
  modelType: ModelType.gemmaIt,
  preferredBackend: PreferredBackend.gpu, // or cpu
  maxTokens: 2048,
);

// Create chat
final chat = await model.createChat(
  temperature: 0.8, // default: 0.8
  randomSeed: 1, // default: 1
  topK: 40, // default: 1
  topP: 0.9, // optional nucleus sampling
  // tokenBuffer: 256, // default: 256
);

// Generate response
await chat.addQueryChunk(Message.text(text: "Hello!", isUser: true));
final response = await chat.generateChatResponse();
if (response is TextResponse) {
  print(response.token);
}
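
The chat also supports streaming via generateChatResponseAsync(), which emits partial responses as they are generated (the same response types appear in the function-calling section below). A minimal sketch that accumulates streamed tokens:

// Stream the reply token by token
final buffer = StringBuffer();
await chat.addQueryChunk(Message.text(text: "Tell me a short story.", isUser: true));
chat.generateChatResponseAsync().listen(
  (response) {
    if (response is TextResponse) {
      buffer.write(response.token); // append each streamed chunk
    }
  },
  onDone: () => print(buffer.toString()), // full reply once the stream completes
);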

5. 🖼️ Multimodal Usage (NEW!)

import 'dart:typed_data'; // For Uint8List

// Create model with image support
final model = await gemma.createModel(
  modelType: ModelType.gemmaIt,
  supportImage: true, // Enable multimodal support
  maxNumImages: 1,
  maxTokens: 4096,
);

// Create chat with image support
final chat = await model.createChat(
  temperature: 0.8,
  randomSeed: 1,
  topK: 1,
  supportImage: true, // Enable images in chat
);

// Send text + image message
final imageBytes = await loadImageBytes(); // Your image loading method
await chat.addQueryChunk(Message.withImage(
  text: "What's in this image?",
  imageBytes: imageBytes,
  isUser: true,
));

final response = await chat.generateChatResponse();
if (response is TextResponse) {
  print(response.token);
}
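
loadImageBytes() above stands in for your own image loading. A minimal sketch that reads an image bundled as a Flutter asset (assets/sample.jpg is a placeholder path you would declare in pubspec.yaml):

import 'dart:typed_data';
import 'package:flutter/services.dart' show rootBundle;

Future<Uint8List> loadImageBytes() async {
  // Load the asset and copy out exactly the bytes that belong to it
  final data = await rootBundle.load('assets/sample.jpg');
  return data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);
}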

6. 🛠️ Function Calling (NEW!)

// 1. Define tools (functions the model can call)
final List<Tool> tools = [
  const Tool(
    name: 'change_color',
    description: 'Changes the background color',
    parameters: {
      'type': 'object',
      'properties': {
        'color': {'type': 'string', 'description': 'Color name'},
      },
      'required': ['color'],
    },
  ),
];

// 2. Create chat with tools (works with Gemma 3 Nano and DeepSeek models)
final chat = await model.createChat(
  temperature: 0.8,
  randomSeed: 1,
  topK: 1,
  tools: tools,
  supportsFunctionCalls: true, // Auto-detected for supported models
  isThinking: true, // Enable thinking mode for DeepSeek
  modelType: ModelType.deepSeek, // Specify model type for DeepSeek
);

// 3. Handle different response types
chat.generateChatResponseAsync().listen((response) {
  if (response is TextResponse) {
    // Regular text from model
    print('Text: ${response.token}');
  } else if (response is FunctionCallResponse) {
    // Model wants to call a function
    print('Function: ${response.name}');
    print('Args: ${response.args}');
    
    // Execute function and send response back
    _handleFunctionCall(response);
  } else if (response is ThinkingResponse) {
    // Model's reasoning process (DeepSeek only)
    print('Thinking: ${response.content}');
    
    // Show thinking bubble in UI
    _showThinkingBubble(response.content);
  }
});
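
_handleFunctionCall above is your own code. A minimal sketch for the change_color tool defined earlier; _setBackgroundColor is a placeholder for your UI logic, and how the tool result is returned to the model depends on the plugin's tool-response message type, so check the flutter_gemma docs for that step:

Future<void> _handleFunctionCall(FunctionCallResponse call) async {
  switch (call.name) {
    case 'change_color':
      // args is assumed to be a map matching the tool's JSON schema
      final color = call.args['color'] as String?;
      _setBackgroundColor(color); // placeholder: your own UI code
      // Then send the result back to the model using the plugin's
      // tool-response message type (see the package documentation).
      break;
    default:
      print('Unknown tool requested: ${call.name}');
  }
}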

7. 🧠 Thinking Mode (DeepSeek Models)

// Create model with thinking support
final model = await gemma.createModel(
  modelType: ModelType.deepSeek,
  maxTokens: 2048,
);

// Create chat with thinking mode enabled
final chat = await model.createChat(
  temperature: 0.8,
  randomSeed: 1,
  topK: 1,
  isThinking: true, // Enable thinking mode
  modelType: ModelType.deepSeek, // Required for DeepSeek
);

// Handle thinking responses
chat.generateChatResponseAsync().listen((response) {
  if (response is ThinkingResponse) {
    // Show model's reasoning process
    print('Model thinking: ${response.content}');
    _showThinkingBubble(response.content);
    
  } else if (response is TextResponse) {
    // Final answer after thinking
    print('Final answer: ${response.token}');
    _updateResponse(response.token);
  }
});

Thinking Mode Features:

  • ✅ See the model's reasoning process in real-time
  • ✅ Interactive thinking bubbles in UI
  • ✅ Works with function calling
  • ✅ DeepSeek models only
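
A minimal sketch of collecting the two streams separately so the UI can render a thinking bubble next to the final answer; the ValueNotifiers are placeholders for your own state management:

import 'package:flutter/foundation.dart';

final thinking = ValueNotifier<String>('');
final answer = ValueNotifier<String>('');

chat.generateChatResponseAsync().listen((response) {
  if (response is ThinkingResponse) {
    thinking.value += response.content; // drives the thinking bubble
  } else if (response is TextResponse) {
    answer.value += response.token; // drives the visible reply
  }
});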

8. 📱 Message Types

// Text only
final textMsg = Message.text(text: "Hello!", isUser: true);

// Text + Image
final multimodalMsg = Message.withImage(
  text: "Describe this image",
  imageBytes: imageBytes,
  isUser: true,
);

// Image only
final imageMsg = Message.imageOnly(imageBytes: imageBytes, isUser: true);

// Check if message has image
if (message.hasImage) {
  print('Message contains an image');
}

🎯 Supported Models

Text-Only Models

| Model | Size | Backend | Function Calls | Thinking Mode | Download |
|---|---|---|---|---|---|
| Gemma 2B | 2B | CPU/GPU | ❌ | ❌ | HuggingFace |
| Gemma 7B | 7B | CPU/GPU | ❌ | ❌ | HuggingFace |
| Gemma-2 2B | 2B | CPU/GPU | ❌ | ❌ | HuggingFace |
| Gemma-3 1B | 1B | CPU/GPU | ❌ | ❌ | HuggingFace |
| DeepSeek R1 | 1.5B | CPU/GPU | ✅ | ✅ | HuggingFace |

🖼️ Multimodal Models (Vision + Text)

| Model | Size | Backend | Vision Support | Function Calls | Thinking Mode | Download |
|---|---|---|---|---|---|---|
| Gemma 3n E2B | 1.5B | CPU/GPU | ✅ | ✅ | ❌ | HuggingFace |
| Gemma 3n E4B | 1.5B | CPU/GPU | ✅ | ✅ | ❌ | HuggingFace |

🔧 MediaPipe Dependencies

This plugin uses official MediaPipe libraries:

iOS:

# Automatically included in your Podfile
pod 'MediaPipeTasksGenAI', '0.10.24'
pod 'MediaPipeTasksGenAIC', '0.10.24'

Android:

// Automatically included
implementation 'com.google.mediapipe:tasks-genai:0.10.24'
implementation 'com.google.mediapipe:tasks-vision:0.10.24' // For multimodal support

Web:

// Automatically loaded from CDN
https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm

🌐 Platform Support

| Feature | Android | iOS | Web |
|---|---|---|---|
| Text Generation | ✅ | ✅ | ✅ |
| Function Calling | ✅ | ✅ | ✅ |
| Thinking Mode | ✅ | ✅ | ✅ |
| Image Input | ✅ | ⚠️ | ⚠️ |
| GPU Acceleration | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ |
  • ✅ = Fully supported
  • ⚠️ = Coming soon / Limited support

🚀 Why This Works

  • Official MediaPipe Support - No custom frameworks
  • Version 0.10.24 - Includes Gemma 3 Nano + vision support
  • Function Calling - External function integration (Gemma 3 Nano, DeepSeek)
  • Thinking Mode - Transparent AI reasoning (DeepSeek models)
  • Automatic Integration - Flutter handles CocoaPods/Gradle
  • Cross-Platform - Same API for iOS/Android/Web
  • Multimodal Ready - Text + Image input support
  • Simple API - One parameter to enable images/functions/thinking

🖼️ Multimodal Examples

Basic Image Analysis

final model = await gemma.createModel(
  modelType: ModelType.gemmaIt,
  supportImage: true,
);

final session = await model.createSession();
await session.addQueryChunk(Message.withImage(
  text: "What do you see in this image?",
  imageBytes: imageBytes,
  isUser: true,
));

// Note: session.getResponse() returns String directly
final response = await session.getResponse();
print(response);
await session.close();

Chat with Images

final chat = await model.createChat(
  temperature: 0.8,
  supportImage: true,
);

// Add text message
await chat.addQueryChunk(Message.text(text: "Hello!", isUser: true));
final textResponse = await chat.generateChatResponse();
if (textResponse is TextResponse) {
  print(textResponse.token);
}

// Add image message
await chat.addQueryChunk(Message.withImage(
  text: "Can you analyze this image?",
  imageBytes: imageBytes,
  isUser: true,
));
final imageResponse = await chat.generateChatResponse();
if (imageResponse is TextResponse) {
  print(imageResponse.token);
}

📝 Example

Check the example app for a complete implementation with Gemma 3 Nano models and multimodal support.

🛟 Troubleshooting

iOS Setup Issues:

# Clean and reinstall pods
cd ios && pod install --repo-update

# If memory issues occur, ensure entitlements are added
# Check ios/Runner/Runner.entitlements contains memory entitlements

Memory Issues on iOS:

  • Ensure Runner.entitlements contains memory entitlements
  • Use smaller models (1B-2B parameters) for devices with <6GB RAM
  • Enable GPU backend for better performance: PreferredBackend.gpu
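
If GPU initialization fails on older devices, one simple pattern is to request the GPU backend first and fall back to CPU; a sketch using the createModel call shown earlier:

// Try GPU first, fall back to CPU if the GPU delegate is unavailable
dynamic model; // substitute the plugin's model type for a static type
try {
  model = await gemma.createModel(
    modelType: ModelType.gemmaIt,
    preferredBackend: PreferredBackend.gpu,
    maxTokens: 2048,
  );
} catch (_) {
  model = await gemma.createModel(
    modelType: ModelType.gemmaIt,
    preferredBackend: PreferredBackend.cpu,
    maxTokens: 2048,
  );
}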

Android Build Issues:

flutter clean && flutter pub get

Image Support Issues:

  • Ensure you're using a multimodal model (Gemma 3n E2B/E4B)
  • Set supportImage: true when creating model and chat
  • Images are automatically processed when included in Message.withImage()

Web Platform:

  • Image support is in development for web platform
  • Text-only models work fully on web

📄 License

MIT License - Use official MediaPipe libraries with confidence!


BRO-GRRAMMER APPROVED ✅ - Simple, clean, uses official MediaPipe with multimodal support!