Feature/issue 275 gemma3 tpu v5e8 #283
Conversation
…evert unrelated README/docs changes
- Delete `examples/lookahead_usage.py` (redundant and not notebook-style)
- Revert unrelated note additions in `README.md` and `docs/development.md`
- Improve `examples/lookahead_mnist.ipynb` with a detailed explanation, initialization steps, an annotated training loop, and a summary usage pattern for the Lookahead optimizer
…sm notebook
- Implements Gemma 3 (270M) with the Keras 3 JAX backend
- Uses the Keras Distribution API for modern data parallelism
- Targets Kaggle TPU v5e-8 (8-core) for accessible multi-core training
- Replaces the outdated Flax/HuggingFace approach with a future-proof stack
- Includes comprehensive examples, benchmarks, and best practices
- Addresses deprecated legacy classes from the original notebook

Features:
- TPU mesh configuration and device detection
- Data parallel inference with performance comparison
- Batch size scaling experiments
- Advanced mesh topology examples
- Memory monitoring and troubleshooting guide
- Kaggle-specific optimizations and setup instructions
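The data-parallel pattern the notebook describes can be sketched at the JAX level (the Keras Distribution API builds on the same device-mesh machinery). This is a minimal illustration, not code from the notebook: `model_fn` and all shapes are hypothetical, and on a CPU host it simply runs on however many local devices exist.

```python
import jax
import jax.numpy as jnp

# Hypothetical stand-in for a model forward pass; any per-shard function works.
def model_fn(x):
    return jnp.tanh(x @ jnp.ones((4, 2)))

n_dev = jax.local_device_count()   # 8 on a TPU v5e-8, 1 on a plain CPU host
batch = jnp.ones((n_dev, 16, 4))   # leading axis is the device axis
run = jax.pmap(model_fn)           # replicate the function across local devices
out = run(batch)                   # each device processes its own batch shard
print(out.shape)                   # (n_dev, 16, 2)
```

On a v5e-8 the leading axis is 8, so each core sees a per-device batch of 16; the same code runs unchanged on one device.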
Summary of Changes
Hello @Solventerritory, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request aims to introduce a new, modernized Jupyter notebook for data parallel inference with Gemma 3 on Kaggle's TPU v5e-8. It uses a contemporary stack, Keras 3 with JAX and the Keras Distribution API, to provide a scalable solution for multi-core TPU environments. The notebook is designed to replace outdated implementations and offers comprehensive guidance on TPU usage and optimization. However, the provided patch content does not align with this description: instead, it adds an Optax lookahead optimizer example notebook and an Optax subproject.
Code Review
This pull request adds a new example notebook for the Optax lookahead optimizer. However, the PR description seems to be for a different change related to Gemma 3 and TPUs, which is confusing.
My review of the new notebook lookahead_mnist.ipynb has found a few critical issues:
- The notebook will fail to run due to a shape mismatch error in the `loss_fn`.
- The dummy data is initialized in a way that results in zero loss and zero gradients, meaning the model will not learn and the optimizer's effect cannot be demonstrated.
- The core purpose of the notebook, demonstrating a bug and its fix, is not achieved: the code presented to 'reproduce the bug' is actually correct, and it is identical to the code in the 'fix' section.
I've left specific comments with suggestions to fix these issues. Addressing them will make the notebook functional and a valuable example for users.
| "\n", | ||
| "# Dummy data for demonstration\n", | ||
| "x = jnp.ones((32, 784))\n", | ||
| "y = jnp.zeros((32,), dtype=jnp.int32)\n", |
The dummy data setup with y as all zeros, combined with zero-initialized parameters, results in an initial loss of 0 and zero gradients. Consequently, the optimizer will not update the parameters, and the loss will not decrease. This prevents the notebook from demonstrating that the optimizer is working. To fix this, initialize y with non-zero values to ensure there is a non-zero loss and gradient at the start of training.
y = jnp.ones((32,), dtype=jnp.int32)
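The reviewer's claim is easy to verify in isolation: with zero-initialized weights and all-zero targets, both the MSE loss and its gradient vanish, so no optimizer can make progress. A small self-contained check, using a plain linear model as a stand-in for the notebook's model:

```python
import jax
import jax.numpy as jnp

# Zero-initialized weights and all-zero targets: logits are all zero,
# so the MSE loss and its gradient are exactly zero and nothing is learned.
W = jnp.zeros((784, 10))
x = jnp.ones((32, 784))
y = jnp.zeros((32,), dtype=jnp.int32)

def mse(W):
    logits = x @ W                                  # (32, 10), all zeros
    return jnp.mean((logits - y[:, None]) ** 2)

grad = jax.grad(mse)(W)
print(float(mse(W)), float(jnp.abs(grad).max()))    # 0.0 0.0
```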
| "\n", | ||
| "def loss_fn(params, x, y):\n", | ||
| " logits = model(params, x)\n", | ||
| " return jnp.mean((logits - y) ** 2)\n", |
The loss_fn will raise a ValueError because of a shape mismatch during subtraction. logits has a shape of (32, 10), while y has a shape of (32,). These shapes are not compatible for broadcasting. To fix this, you should reshape y to (32, 1) to make it a column vector, which can then be broadcast correctly across the logits matrix.
return jnp.mean((logits - y[:, None]) ** 2)
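NumPy-style broadcasting aligns trailing dimensions, so `(32, 10)` against `(32,)` compares 10 with 32 and fails, while reshaping to `(32, 1)` broadcasts across the class axis. A quick standalone check, with shapes taken from the review comment:

```python
import jax.numpy as jnp

logits = jnp.ones((32, 10))
y = jnp.ones((32,))

# Trailing dimensions 10 vs 32 are incompatible, so this subtraction raises.
try:
    _ = logits - y
    broadcast_ok = True
except (TypeError, ValueError):
    broadcast_ok = False

diff = logits - y[:, None]    # (32, 10) - (32, 1) broadcasts cleanly
print(broadcast_ok, diff.shape)
```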
| "\n", | ||
| "# Incorrect usage: not updating lookahead state properly in a loop\n", | ||
| "for step in range(5):\n", | ||
| " params, opt_state = update(params, opt_state, x, y)\n", |
This line correctly updates the opt_state, which contradicts the section's goal of demonstrating a bug. To properly illustrate the 'incorrect usage', you should simulate a common error, such as failing to update the optimizer state. For example, you could discard the new state returned from the update function.
params, _ = update(params, opt_state, x, y) # Bug: opt_state is not updated for the next iteration
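Putting the three fixes together, the corrected loop threads the optimizer state through every iteration. The sketch below substitutes a hand-rolled momentum optimizer and a hypothetical linear model for `optax.lookahead` and the notebook's model, purely to stay self-contained; the state-threading pattern is identical for any stateful optimizer.

```python
import jax
import jax.numpy as jnp

def init_opt(params):
    return jnp.zeros_like(params)             # momentum buffer as the state

def update(params, opt_state, x, y, lr=1e-4, mu=0.9):
    def loss_fn(p):
        logits = x @ p                        # hypothetical linear model
        return jnp.mean((logits - y[:, None]) ** 2)   # broadcastable target
    grads = jax.grad(loss_fn)(params)
    opt_state = mu * opt_state + grads        # state changes every step
    return params - lr * opt_state, opt_state

x = jnp.ones((32, 784))
y = jnp.ones((32,), dtype=jnp.int32)          # non-zero labels -> non-zero loss
params = jnp.zeros((784, 10))
opt_state = init_opt(params)

loss_before = float(jnp.mean((x @ params - y[:, None]) ** 2))
for step in range(5):
    # Re-bind BOTH params and opt_state each iteration.
    params, opt_state = update(params, opt_state, x, y)
loss_after = float(jnp.mean((x @ params - y[:, None]) ** 2))
print(loss_before, loss_after)
```

With the state threaded correctly and non-zero targets, the loss visibly decreases over the five steps.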
Your CL has an unrelated notebook.
This PR addresses issue #275 by adding a modernized notebook for data parallel inference with Gemma 3 on TPU v5e-8.
Changes
New Notebook:
`[Gemma_3]Data_Parallel_Inference_JAX_TPU_v5e8.ipynb`
This notebook replaces the outdated `[Gemma_1]data_parallel_inference_in_jax_tpu.ipynb` with a modern, future-proof implementation.
Key Improvements
Modern Stack
Accessible Hardware
Comprehensive Content
Why This Matters
The original notebook relied on:
This new notebook provides:
Testing
Related Issues
Fixes #275
Additional Notes
This is a new notebook addition, not a modification of the existing one. The old notebook can be deprecated separately to maintain backward compatibility.
Ready for testing on Kaggle TPU v5e-8 environment.
How to Test:
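As a first sanity check before running the notebook (an assumed step, not the PR author's official instructions), confirm that JAX sees the accelerator:

```python
import jax

devices = jax.devices()
print(len(devices), devices[0].platform)   # expect 8 and 'tpu' on a v5e-8 VM
```

On a plain CPU host this reports a single `cpu` device, which is still enough to smoke-test the notebook's code paths.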