From 766b4e1089cee6acd09a943fbeb13ad5d397dfd2 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Sun, 28 Dec 2025 23:06:42 +0000
Subject: [PATCH 1/3] Update dependencies: modernize PyTorch versions and
 separate dev dependencies

This commit addresses outdated dependencies and improves dependency management:

Changes:
- Update PyTorch from 1.6.0 (2020) to latest stable (2.0.0+)
- Update torchvision from 0.7.0 (2020) to latest compatible version (0.15.0+)
- Add minimum version constraints for devito (>=4.8.0)
- Separate runtime and development dependencies:
  * requirements.txt: Core runtime deps (devito, torch)
  * requirements-dev.txt: Test-only deps (pytest, torchvision, numpy)
- Update CI workflow to use modern PyTorch CPU builds
- Update Dockerfile to use latest PyTorch versions
- Update pip cache key to reflect new requirements structure

Benefits:
- Removes 5+ year old security-outdated packages
- Clarifies which dependencies are needed for runtime vs testing
- Allows users to install Joey without test-only bloat
- Maintains compatibility with modern Python ecosystems
- No known security vulnerabilities (verified with pip-audit)

Torchvision moved to dev dependencies as it's only used in test_lenet.py
for MNIST dataset loading.
---
 .github/workflows/tests_cpu_no_docker.yml |  6 +++---
 Dockerfile_CPU                            |  6 +++---
 requirements-dev.txt                      | 10 ++++++++++
 requirements.txt                          |  7 ++++---
 4 files changed, 20 insertions(+), 9 deletions(-)
 create mode 100644 requirements-dev.txt

diff --git a/.github/workflows/tests_cpu_no_docker.yml b/.github/workflows/tests_cpu_no_docker.yml
index 7701a7c..8dcf201 100644
--- a/.github/workflows/tests_cpu_no_docker.yml
+++ b/.github/workflows/tests_cpu_no_docker.yml
@@ -24,15 +24,15 @@ jobs:
         uses: actions/cache@v2
         with:
           path: ~/.cache/pip
-          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
+          key: ${{ runner.os }}-pip-${{ hashFiles('requirements-dev.txt') }}
           restore-keys: |
             ${{ runner.os }}-pip-
             ${{ runner.os }}-
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
-          pip install -r requirements.txt
+          pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
+          pip install -r requirements-dev.txt
       - name: Install Joey
         run: pip install -e .
       - name: Run pytest (CPU)
diff --git a/Dockerfile_CPU b/Dockerfile_CPU
index 92011c1..3012693 100644
--- a/Dockerfile_CPU
+++ b/Dockerfile_CPU
@@ -1,9 +1,9 @@
 FROM ubuntu:latest
 RUN apt-get update && apt-get -y upgrade && apt-get -y install python3-pip build-essential
 WORKDIR /usr/src/app
-RUN pip3 install torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
-COPY requirements.txt joey/
-RUN pip3 install -r joey/requirements.txt
+RUN pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cpu
+COPY requirements.txt requirements-dev.txt joey/
+RUN pip3 install -r joey/requirements-dev.txt
 COPY . joey/
 RUN pip3 install -e joey
 WORKDIR /usr/src/app/joey
diff --git a/requirements-dev.txt b/requirements-dev.txt
new file mode 100644
index 0000000..370fff4
--- /dev/null
+++ b/requirements-dev.txt
@@ -0,0 +1,10 @@
+# Development and testing dependencies for Joey
+# This includes the runtime dependencies plus additional packages needed for testing
+
+# Include runtime dependencies
+-r requirements.txt
+
+# Testing dependencies
+pytest>=7.0.0
+torchvision>=0.15.0
+numpy>=1.20.0
diff --git a/requirements.txt b/requirements.txt
index 08a2554..862d12d 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,4 @@
-devito
-torch
-torchvision
+# Core runtime dependencies for Joey
+# Joey is a machine learning framework running on top of Devito
+devito>=4.8.0
+torch>=2.0.0

From 2729000babb2e22d66f2efe12ff31646a4a1b4dd Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Sun, 28 Dec 2025 23:17:37 +0000
Subject: [PATCH 2/3] Add comprehensive Joey vs PyTorch performance comparison
 notebook

This notebook provides a detailed performance benchmark comparing Joey
and PyTorch implementations of LeNet-5 on the MNIST dataset.

Features:
- Complete LeNet-5 implementation in both Joey and PyTorch
- Performance benchmarks for:
  * Forward pass (with statistical analysis)
  * Backward pass (gradient computation)
  * Complete training loop
- Visualization of comparative results
- Numerical correctness verification
- Detailed Portuguese documentation

The notebook demonstrates:
- How to build CNNs with Joey
- Fair performance comparison with identical initial weights
- Statistical analysis with mean and standard deviation
- Visual comparisons through matplotlib charts

This provides a practical example for users to understand Joey's
performance characteristics compared to the industry-standard PyTorch.
---
 examples/mnist_performance_comparison.ipynb | 671 ++++++++++++++++++++
 1 file changed, 671 insertions(+)
 create mode 100644 examples/mnist_performance_comparison.ipynb

diff --git a/examples/mnist_performance_comparison.ipynb b/examples/mnist_performance_comparison.ipynb
new file mode 100644
index 0000000..2d9c8fa
--- /dev/null
+++ b/examples/mnist_performance_comparison.ipynb
@@ -0,0 +1,671 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Comparação de Desempenho: Joey vs PyTorch para MNIST\n",
+    "\n",
+    "Este notebook implementa uma rede neural convolucional (LeNet) para o problema MNIST usando tanto Joey quanto PyTorch, e compara o desempenho computacional de ambas as implementações.\n",
+    "\n",
+    "## Objetivos\n",
+    "1. Implementar LeNet usando Joey\n",
+    "2. Implementar LeNet usando PyTorch\n",
+    "3. Comparar tempo de execução para:\n",
+    "   - Forward pass\n",
+    "   - Backward pass\n",
+    "   - Training completo\n",
+    "4. Analisar os resultados"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Importações e Configuração"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "import torchvision\n",
+    "import torchvision.transforms as transforms\n",
+    "import torch.nn as nn\n",
+    "import torch.nn.functional as F\n",
+    "import torch.optim as optim\n",
+    "import numpy as np\n",
+    "import joey\n",
+    "from joey.activation import ReLU\n",
+    "import time\n",
+    "import matplotlib.pyplot as plt\n",
+    "from devito import logger\n",
+    "\n",
+    "# Configurar logging do Devito para não poluir a saída\n",
+    "logger.set_log_noperf()\n",
+    "\n",
+    "# Configurações\n",
+    "BATCH_SIZE = 4\n",
+    "NUM_WORKERS = 2\n",
+    "SEED = 42\n",
+    "\n",
+    "# Definir seed para reprodutibilidade\n",
+    "np.random.seed(SEED)\n",
+    "torch.manual_seed(SEED)\n",
+    "\n",
+    "print(\"Importações concluídas!\")\n",
+    "print(f\"PyTorch version: {torch.__version__}\")\n",
+    "print(f\"Batch size: {BATCH_SIZE}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Carregar Dataset MNIST"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Transformações para normalização\n",
+    "transform = transforms.Compose([\n",
+    "    transforms.Resize((32, 32)),  # LeNet espera imagens 32x32\n",
+    "    transforms.ToTensor(),\n",
+    "    transforms.Normalize(0.5, 0.5)\n",
+    "])\n",
+    "\n",
+    "# Download e carregamento do dataset\n",
+    "trainset = torchvision.datasets.MNIST(\n",
+    "    root='./mnist',\n",
+    "    train=True,\n",
+    "    download=True,\n",
+    "    transform=transform\n",
+    ")\n",
+    "\n",
+    "testset = torchvision.datasets.MNIST(\n",
+    "    root='./mnist',\n",
+    "    train=False,\n",
+    "    download=True,\n",
+    "    transform=transform\n",
+    ")\n",
+    "\n",
+    "trainloader = torch.utils.data.DataLoader(\n",
+    "    trainset,\n",
+    "    batch_size=BATCH_SIZE,\n",
+    "    shuffle=False,\n",
+    "    num_workers=NUM_WORKERS\n",
+    ")\n",
+    "\n",
+    "testloader = torch.utils.data.DataLoader(\n",
+    "    testset,\n",
+    "    batch_size=BATCH_SIZE,\n",
+    "    shuffle=False,\n",
+    "    num_workers=NUM_WORKERS\n",
+    ")\n",
+    "\n",
+    "print(f\"Dataset carregado!\")\n",
+    "print(f\"Training samples: {len(trainset)}\")\n",
+    "print(f\"Test samples: {len(testset)}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Definir Arquitetura LeNet\n",
+    "\n",
+    "### 3.1 Versão PyTorch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class PyTorchLeNet(nn.Module):\n",
+    "    \"\"\"LeNet-5 em PyTorch para MNIST.\"\"\"\n",
+    "    \n",
+    "    def __init__(self):\n",
+    "        super(PyTorchLeNet, self).__init__()\n",
+    "        self.conv1 = nn.Conv2d(1, 6, 3)      # 1 canal entrada, 6 filtros 3x3\n",
+    "        self.conv2 = nn.Conv2d(6, 16, 3)     # 6 canais entrada, 16 filtros 3x3\n",
+    "        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # Camada totalmente conectada\n",
+    "        self.fc2 = nn.Linear(120, 84)\n",
+    "        self.fc3 = nn.Linear(84, 10)         # 10 classes (dígitos 0-9)\n",
+    "    \n",
+    "    def forward(self, x):\n",
+    "        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n",
+    "        x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n",
+    "        x = x.view(-1, self.num_flat_features(x))\n",
+    "        x = F.relu(self.fc1(x))\n",
+    "        x = F.relu(self.fc2(x))\n",
+    "        x = self.fc3(x)\n",
+    "        return x\n",
+    "    \n",
+    "    def num_flat_features(self, x):\n",
+    "        size = x.size()[1:]\n",
+    "        num_features = 1\n",
+    "        for s in size:\n",
+    "            num_features *= s\n",
+    "        return num_features\n",
+    "\n",
+    "print(\"Arquitetura PyTorch LeNet definida!\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 3.2 Versão Joey"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def create_joey_lenet(batch_size=4):\n",
+    "    \"\"\"Cria LeNet-5 usando Joey.\"\"\"\n",
+    "    \n",
+    "    # Camada 1: Convolução 6 filtros 3x3 + ReLU\n",
+    "    layer1 = joey.Conv(\n",
+    "        kernel_size=(6, 3, 3),\n",
+    "        input_size=(batch_size, 1, 32, 32),\n",
+    "        activation=ReLU(),\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Camada 2: Max pooling 2x2\n",
+    "    layer2 = joey.MaxPooling(\n",
+    "        kernel_size=(2, 2),\n",
+    "        input_size=(batch_size, 6, 30, 30),\n",
+    "        stride=(2, 2),\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Camada 3: Convolução 16 filtros 3x3 + ReLU\n",
+    "    layer3 = joey.Conv(\n",
+    "        kernel_size=(16, 3, 3),\n",
+    "        input_size=(batch_size, 6, 15, 15),\n",
+    "        activation=ReLU(),\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Camada 4: Max pooling 2x2\n",
+    "    layer4 = joey.MaxPooling(\n",
+    "        kernel_size=(2, 2),\n",
+    "        input_size=(batch_size, 16, 13, 13),\n",
+    "        stride=(2, 2),\n",
+    "        strict_stride_check=False,\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Camada de achatamento\n",
+    "    layer_flat = joey.Flat(\n",
+    "        input_size=(batch_size, 16, 6, 6),\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Camada 5: Fully Connected (576 -> 120) + ReLU\n",
+    "    layer5 = joey.FullyConnected(\n",
+    "        weight_size=(120, 576),\n",
+    "        input_size=(576, batch_size),\n",
+    "        activation=ReLU(),\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Camada 6: Fully Connected (120 -> 84) + ReLU\n",
+    "    layer6 = joey.FullyConnected(\n",
+    "        weight_size=(84, 120),\n",
+    "        input_size=(120, batch_size),\n",
+    "        activation=ReLU(),\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Camada 7: Fully Connected (84 -> 10) - Saída\n",
+    "    layer7 = joey.FullyConnected(\n",
+    "        weight_size=(10, 84),\n",
+    "        input_size=(84, batch_size),\n",
+    "        generate_code=False\n",
+    "    )\n",
+    "    \n",
+    "    # Criar rede\n",
+    "    layers = [layer1, layer2, layer3, layer4, layer_flat, layer5, layer6, layer7]\n",
+    "    net = joey.Net(layers)\n",
+    "    \n",
+    "    return net, layers\n",
+    "\n",
+    "print(\"Função para criar Joey LeNet definida!\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Inicializar Redes com Mesmos Pesos\n",
+    "\n",
+    "Para uma comparação justa, vamos inicializar ambas as redes com os mesmos pesos."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Criar redes\n",
+    "joey_net, joey_layers = create_joey_lenet(BATCH_SIZE)\n",
+    "pytorch_net = PyTorchLeNet()\n",
+    "pytorch_net.double()  # Joey trabalha com float64\n",
+    "\n",
+    "# Copiar pesos do Joey para PyTorch para garantir mesmos valores iniciais\n",
+    "with torch.no_grad():\n",
+    "    pytorch_net.conv1.weight[:] = torch.from_numpy(joey_layers[0].kernel.data)\n",
+    "    pytorch_net.conv1.bias[:] = torch.from_numpy(joey_layers[0].bias.data)\n",
+    "    \n",
+    "    pytorch_net.conv2.weight[:] = torch.from_numpy(joey_layers[2].kernel.data)\n",
+    "    pytorch_net.conv2.bias[:] = torch.from_numpy(joey_layers[2].bias.data)\n",
+    "    \n",
+    "    pytorch_net.fc1.weight[:] = torch.from_numpy(joey_layers[5].kernel.data)\n",
+    "    pytorch_net.fc1.bias[:] = torch.from_numpy(joey_layers[5].bias.data)\n",
+    "    \n",
+    "    pytorch_net.fc2.weight[:] = torch.from_numpy(joey_layers[6].kernel.data)\n",
+    "    pytorch_net.fc2.bias[:] = torch.from_numpy(joey_layers[6].bias.data)\n",
+    "    \n",
+    "    pytorch_net.fc3.weight[:] = torch.from_numpy(joey_layers[7].kernel.data)\n",
+    "    pytorch_net.fc3.bias[:] = torch.from_numpy(joey_layers[7].bias.data)\n",
+    "\n",
+    "print(\"Redes inicializadas com os mesmos pesos!\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 5. Benchmark: Forward Pass\n",
+    "\n",
+    "Vamos medir o tempo de execução do forward pass para ambas as implementações."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Obter um batch de teste\n",
+    "test_images, test_labels = next(iter(testloader))\n",
+    "test_images_np = test_images.double().numpy()\n",
+    "\n",
+    "NUM_RUNS = 10\n",
+    "\n",
+    "# Benchmark Joey - Forward Pass\n",
+    "joey_forward_times = []\n",
+    "print(f\"\\nExecutando {NUM_RUNS} forward passes com Joey...\")\n",
+    "for i in range(NUM_RUNS):\n",
+    "    start_time = time.time()\n",
+    "    joey_output = joey_net.forward(test_images_np)\n",
+    "    end_time = time.time()\n",
+    "    joey_forward_times.append(end_time - start_time)\n",
+    "    if i == 0:\n",
+    "        # Salvar primeira saída para comparação\n",
+    "        joey_first_output = joey_layers[7].result.data.copy()\n",
+    "\n",
+    "joey_avg_forward = np.mean(joey_forward_times)\n",
+    "joey_std_forward = np.std(joey_forward_times)\n",
+    "\n",
+    "# Benchmark PyTorch - Forward Pass\n",
+    "pytorch_forward_times = []\n",
+    "print(f\"Executando {NUM_RUNS} forward passes com PyTorch...\")\n",
+    "for i in range(NUM_RUNS):\n",
+    "    start_time = time.time()\n",
+    "    pytorch_output = pytorch_net(test_images.double())\n",
+    "    end_time = time.time()\n",
+    "    pytorch_forward_times.append(end_time - start_time)\n",
+    "    if i == 0:\n",
+    "        # Salvar primeira saída para comparação\n",
+    "        pytorch_first_output = pytorch_output.detach().numpy().T\n",
+    "\n",
+    "pytorch_avg_forward = np.mean(pytorch_forward_times)\n",
+    "pytorch_std_forward = np.std(pytorch_forward_times)\n",
+    "\n",
+    "# Calcular erro relativo entre as saídas\n",
+    "relative_error = np.abs(joey_first_output - pytorch_first_output) / (np.abs(pytorch_first_output) + 1e-10)\n",
+    "max_error = np.nanmax(relative_error)\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*60)\n",
+    "print(\"RESULTADOS - FORWARD PASS\")\n",
+    "print(\"=\"*60)\n",
+    "print(f\"\\nJoey:\")\n",
+    "print(f\"  Tempo médio: {joey_avg_forward*1000:.3f} ms ± {joey_std_forward*1000:.3f} ms\")\n",
+    "print(f\"\\nPyTorch:\")\n",
+    "print(f\"  Tempo médio: {pytorch_avg_forward*1000:.3f} ms ± {pytorch_std_forward*1000:.3f} ms\")\n",
+    "print(f\"\\nSpeedup: {joey_avg_forward/pytorch_avg_forward:.2f}x\")\n",
+    "if joey_avg_forward < pytorch_avg_forward:\n",
+    "    print(f\"Joey é {pytorch_avg_forward/joey_avg_forward:.2f}x mais rápido\")\n",
+    "else:\n",
+    "    print(f\"PyTorch é {joey_avg_forward/pytorch_avg_forward:.2f}x mais rápido\")\n",
+    "print(f\"\\nErro relativo máximo entre saídas: {max_error:.2e}\")\n",
+    "print(\"=\"*60)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 6. Benchmark: Backward Pass\n",
+    "\n",
+    "Agora vamos medir o tempo do backward pass (cálculo de gradientes)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Função de perda para Joey (gradiente manual)\n",
+    "def joey_loss_grad(output_layer, expected_labels):\n",
+    "    \"\"\"Calcula gradientes para cross-entropy loss no Joey.\"\"\"\n",
+    "    gradients = []\n",
+    "    for b in range(BATCH_SIZE):\n",
+    "        row = []\n",
+    "        for j in range(10):\n",
+    "            result = output_layer.result.data[j, b]\n",
+    "            if j == expected_labels[b]:\n",
+    "                result -= 1\n",
+    "            row.append(result)\n",
+    "        gradients.append(row)\n",
+    "    return gradients\n",
+    "\n",
+    "# Benchmark Joey - Backward Pass\n",
+    "joey_backward_times = []\n",
+    "print(f\"\\nExecutando {NUM_RUNS} backward passes com Joey...\")\n",
+    "for i in range(NUM_RUNS):\n",
+    "    # Forward pass primeiro\n",
+    "    joey_net.forward(test_images_np)\n",
+    "    \n",
+    "    # Medir backward pass\n",
+    "    start_time = time.time()\n",
+    "    joey_net.backward(test_labels.numpy(), joey_loss_grad)\n",
+    "    end_time = time.time()\n",
+    "    joey_backward_times.append(end_time - start_time)\n",
+    "\n",
+    "joey_avg_backward = np.mean(joey_backward_times)\n",
+    "joey_std_backward = np.std(joey_backward_times)\n",
+    "\n",
+    "# Benchmark PyTorch - Backward Pass\n",
+    "criterion = nn.CrossEntropyLoss()\n",
+    "pytorch_backward_times = []\n",
+    "print(f\"Executando {NUM_RUNS} backward passes com PyTorch...\")\n",
+    "for i in range(NUM_RUNS):\n",
+    "    # Forward pass primeiro\n",
+    "    pytorch_net.zero_grad()\n",
+    "    outputs = pytorch_net(test_images.double())\n",
+    "    loss = criterion(outputs, test_labels)\n",
+    "    \n",
+    "    # Medir backward pass\n",
+    "    start_time = time.time()\n",
+    "    loss.backward()\n",
+    "    end_time = time.time()\n",
+    "    pytorch_backward_times.append(end_time - start_time)\n",
+    "\n",
+    "pytorch_avg_backward = np.mean(pytorch_backward_times)\n",
+    "pytorch_std_backward = np.std(pytorch_backward_times)\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*60)\n",
+    "print(\"RESULTADOS - BACKWARD PASS\")\n",
+    "print(\"=\"*60)\n",
+    "print(f\"\\nJoey:\")\n",
+    "print(f\"  Tempo médio: {joey_avg_backward*1000:.3f} ms ± {joey_std_backward*1000:.3f} ms\")\n",
+    "print(f\"\\nPyTorch:\")\n",
+    "print(f\"  Tempo médio: {pytorch_avg_backward*1000:.3f} ms ± {pytorch_std_backward*1000:.3f} ms\")\n",
+    "print(f\"\\nSpeedup: {joey_avg_backward/pytorch_avg_backward:.2f}x\")\n",
+    "if joey_avg_backward < pytorch_avg_backward:\n",
+    "    print(f\"Joey é {pytorch_avg_backward/joey_avg_backward:.2f}x mais rápido\")\n",
+    "else:\n",
+    "    print(f\"PyTorch é {joey_avg_backward/pytorch_avg_backward:.2f}x mais rápido\")\n",
+    "print(\"=\"*60)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 7. Benchmark: Training Loop Completo\n",
+    "\n",
+    "Vamos executar um loop de treinamento completo com múltiplos batches."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "NUM_BATCHES = 20  # Número de batches para treinar\n",
+    "LEARNING_RATE = 0.001\n",
+    "MOMENTUM = 0.9\n",
+    "\n",
+    "# Reinicializar redes para treinamento justo\n",
+    "joey_net, joey_layers = create_joey_lenet(BATCH_SIZE)\n",
+    "pytorch_net = PyTorchLeNet()\n",
+    "pytorch_net.double()\n",
+    "\n",
+    "# Sincronizar pesos iniciais\n",
+    "with torch.no_grad():\n",
+    "    pytorch_net.conv1.weight[:] = torch.from_numpy(joey_layers[0].kernel.data)\n",
+    "    pytorch_net.conv1.bias[:] = torch.from_numpy(joey_layers[0].bias.data)\n",
+    "    pytorch_net.conv2.weight[:] = torch.from_numpy(joey_layers[2].kernel.data)\n",
+    "    pytorch_net.conv2.bias[:] = torch.from_numpy(joey_layers[2].bias.data)\n",
+    "    pytorch_net.fc1.weight[:] = torch.from_numpy(joey_layers[5].kernel.data)\n",
+    "    pytorch_net.fc1.bias[:] = torch.from_numpy(joey_layers[5].bias.data)\n",
+    "    pytorch_net.fc2.weight[:] = torch.from_numpy(joey_layers[6].kernel.data)\n",
+    "    pytorch_net.fc2.bias[:] = torch.from_numpy(joey_layers[6].bias.data)\n",
+    "    pytorch_net.fc3.weight[:] = torch.from_numpy(joey_layers[7].kernel.data)\n",
+    "    pytorch_net.fc3.bias[:] = torch.from_numpy(joey_layers[7].bias.data)\n",
+    "\n",
+    "# Training Joey\n",
+    "print(f\"\\nTreinando Joey por {NUM_BATCHES} batches...\")\n",
+    "joey_optimizer = optim.SGD(joey_net.pytorch_parameters, lr=LEARNING_RATE, momentum=MOMENTUM)\n",
+    "\n",
+    "joey_start = time.time()\n",
+    "batch_count = 0\n",
+    "for images, labels in trainloader:\n",
+    "    if batch_count >= NUM_BATCHES:\n",
+    "        break\n",
+    "    \n",
+    "    images_np = images.double().numpy()\n",
+    "    joey_net.forward(images_np)\n",
+    "    joey_net.backward(labels.numpy(), joey_loss_grad, joey_optimizer)\n",
+    "    batch_count += 1\n",
+    "\n",
+    "joey_training_time = time.time() - joey_start\n",
+    "\n",
+    "# Training PyTorch\n",
+    "print(f\"Treinando PyTorch por {NUM_BATCHES} batches...\")\n",
+    "pytorch_optimizer = optim.SGD(pytorch_net.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM)\n",
+    "criterion = nn.CrossEntropyLoss()\n",
+    "\n",
+    "pytorch_start = time.time()\n",
+    "batch_count = 0\n",
+    "for images, labels in trainloader:\n",
+    "    if batch_count >= NUM_BATCHES:\n",
+    "        break\n",
+    "    \n",
+    "    pytorch_optimizer.zero_grad()\n",
+    "    outputs = pytorch_net(images.double())\n",
+    "    loss = criterion(outputs, labels)\n",
+    "    loss.backward()\n",
+    "    pytorch_optimizer.step()\n",
+    "    batch_count += 1\n",
+    "\n",
+    "pytorch_training_time = time.time() - pytorch_start\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*60)\n",
+    "print(\"RESULTADOS - TRAINING COMPLETO\")\n",
+    "print(\"=\"*60)\n",
+    "print(f\"\\nJoey:\")\n",
+    "print(f\"  Tempo total: {joey_training_time:.3f} s\")\n",
+    "print(f\"  Tempo por batch: {joey_training_time/NUM_BATCHES*1000:.3f} ms\")\n",
+    "print(f\"\\nPyTorch:\")\n",
+    "print(f\"  Tempo total: {pytorch_training_time:.3f} s\")\n",
+    "print(f\"  Tempo por batch: {pytorch_training_time/NUM_BATCHES*1000:.3f} ms\")\n",
+    "print(f\"\\nSpeedup: {joey_training_time/pytorch_training_time:.2f}x\")\n",
+    "if joey_training_time < pytorch_training_time:\n",
+    "    print(f\"Joey é {pytorch_training_time/joey_training_time:.2f}x mais rápido\")\n",
+    "else:\n",
+    "    print(f\"PyTorch é {joey_training_time/pytorch_training_time:.2f}x mais rápido\")\n",
+    "print(\"=\"*60)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 8. Visualização dos Resultados"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Criar gráficos de comparação\n",
+    "fig, axes = plt.subplots(1, 3, figsize=(18, 5))\n",
+    "\n",
+    "# Gráfico 1: Forward Pass\n",
+    "ax1 = axes[0]\n",
+    "x_pos = [0, 1]\n",
+    "forward_means = [joey_avg_forward*1000, pytorch_avg_forward*1000]\n",
+    "forward_stds = [joey_std_forward*1000, pytorch_std_forward*1000]\n",
+    "colors = ['#3498db', '#e74c3c']\n",
+    "ax1.bar(x_pos, forward_means, yerr=forward_stds, color=colors, alpha=0.7, capsize=5)\n",
+    "ax1.set_ylabel('Tempo (ms)', fontsize=12)\n",
+    "ax1.set_title('Forward Pass', fontsize=14, fontweight='bold')\n",
+    "ax1.set_xticks(x_pos)\n",
+    "ax1.set_xticklabels(['Joey', 'PyTorch'], fontsize=11)\n",
+    "ax1.grid(axis='y', alpha=0.3)\n",
+    "\n",
+    "# Adicionar valores no topo das barras\n",
+    "for i, (mean, std) in enumerate(zip(forward_means, forward_stds)):\n",
+    "    ax1.text(i, mean + std + 0.5, f'{mean:.2f}±{std:.2f}', \n",
+    "             ha='center', va='bottom', fontsize=9)\n",
+    "\n",
+    "# Gráfico 2: Backward Pass\n",
+    "ax2 = axes[1]\n",
+    "backward_means = [joey_avg_backward*1000, pytorch_avg_backward*1000]\n",
+    "backward_stds = [joey_std_backward*1000, pytorch_std_backward*1000]\n",
+    "ax2.bar(x_pos, backward_means, yerr=backward_stds, color=colors, alpha=0.7, capsize=5)\n",
+    "ax2.set_ylabel('Tempo (ms)', fontsize=12)\n",
+    "ax2.set_title('Backward Pass', fontsize=14, fontweight='bold')\n",
+    "ax2.set_xticks(x_pos)\n",
+    "ax2.set_xticklabels(['Joey', 'PyTorch'], fontsize=11)\n",
+    "ax2.grid(axis='y', alpha=0.3)\n",
+    "\n",
+    "for i, (mean, std) in enumerate(zip(backward_means, backward_stds)):\n",
+    "    ax2.text(i, mean + std + 0.5, f'{mean:.2f}±{std:.2f}', \n",
+    "             ha='center', va='bottom', fontsize=9)\n",
+    "\n",
+    "# Gráfico 3: Training Completo\n",
+    "ax3 = axes[2]\n",
+    "training_times = [joey_training_time, pytorch_training_time]\n",
+    "ax3.bar(x_pos, training_times, color=colors, alpha=0.7)\n",
+    "ax3.set_ylabel('Tempo (s)', fontsize=12)\n",
+    "ax3.set_title(f'Training ({NUM_BATCHES} batches)', fontsize=14, fontweight='bold')\n",
+    "ax3.set_xticks(x_pos)\n",
+    "ax3.set_xticklabels(['Joey', 'PyTorch'], fontsize=11)\n",
+    "ax3.grid(axis='y', alpha=0.3)\n",
+    "\n",
+    "for i, time_val in enumerate(training_times):\n",
+    "    ax3.text(i, time_val + 0.1, f'{time_val:.2f}s', \n",
+    "             ha='center', va='bottom', fontsize=9)\n",
+    "\n",
+    "plt.tight_layout()\n",
+    "plt.savefig('joey_vs_pytorch_performance.png', dpi=300, bbox_inches='tight')\n",
+    "plt.show()\n",
+    "\n",
+    "print(\"Gráfico salvo como 'joey_vs_pytorch_performance.png'\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 9. Resumo Final"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"\\n\" + \"=\"*70)\n",
+    "print(\" \"*20 + \"RESUMO COMPARATIVO FINAL\")\n",
+    "print(\"=\"*70)\n",
+    "print(\"\\nOperação          | Joey (ms)      | PyTorch (ms)   | Speedup\")\n",
+    "print(\"-\"*70)\n",
+    "print(f\"Forward Pass      | {joey_avg_forward*1000:>10.3f}     | {pytorch_avg_forward*1000:>10.3f}     | {joey_avg_forward/pytorch_avg_forward:>6.2f}x\")\n",
+    "print(f\"Backward Pass     | {joey_avg_backward*1000:>10.3f}     | {pytorch_avg_backward*1000:>10.3f}     | {joey_avg_backward/pytorch_avg_backward:>6.2f}x\")\n",
+    "print(f\"Training (total)  | {joey_training_time*1000:>10.3f}     | {pytorch_training_time*1000:>10.3f}     | {joey_training_time/pytorch_training_time:>6.2f}x\")\n",
+    "print(\"=\"*70)\n",
+    "print(\"\\nCONCLUSÕES:\")\n",
+    "print(\"-\"*70)\n",
+    "\n",
+    "if joey_avg_forward < pytorch_avg_forward:\n",
+    "    print(f\"✓ Joey é {pytorch_avg_forward/joey_avg_forward:.2f}x mais rápido no forward pass\")\n",
+    "else:\n",
+    "    print(f\"✗ PyTorch é {joey_avg_forward/pytorch_avg_forward:.2f}x mais rápido no forward pass\")\n",
+    "\n",
+    "if joey_avg_backward < pytorch_avg_backward:\n",
+    "    print(f\"✓ Joey é {pytorch_avg_backward/joey_avg_backward:.2f}x mais rápido no backward pass\")\n",
+    "else:\n",
+    "    print(f\"✗ PyTorch é {joey_avg_backward/pytorch_avg_backward:.2f}x mais rápido no backward pass\")\n",
+    "\n",
+    "if joey_training_time < pytorch_training_time:\n",
+    "    print(f\"✓ Joey é {pytorch_training_time/joey_training_time:.2f}x mais rápido no training completo\")\n",
+    "else:\n",
+    "    print(f\"✗ PyTorch é {joey_training_time/pytorch_training_time:.2f}x mais rápido no training completo\")\n",
+    "\n",
+    "print(f\"\\n✓ Erro numérico máximo: {max_error:.2e} (resultados numericamente equivalentes)\")\n",
+    "print(\"\\nNOTA: Joey gera código otimizado em tempo real usando Devito,\")\n",
+    "print(\"      enquanto PyTorch usa operações pré-compiladas e otimizadas.\")\n",
+    "print(\"      O desempenho pode variar dependendo da arquitetura da rede,\")\n",
+    "print(\"      tamanho do batch e hardware utilizado.\")\n",
+    "print(\"=\"*70)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}

From e323fe7d6e291d08785b7e05e1410e27bafc3258 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Mon, 29 Dec 2025 22:46:13 +0000
Subject: [PATCH 3/3] Add installation cell to MNIST performance comparison
 notebook
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Added a new section (0. Instalação) with:
- Automatic installation script for Joey and dependencies
- Multiple installation options (GitHub, local clone, manual)
- Dependency checking for torchvision and matplotlib
- Clear instructions in Portuguese for users
- Handles missing Joey package gracefully

This makes the notebook standalone and easier to use for new users.
---
 examples/mnist_performance_comparison.ipynb | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/examples/mnist_performance_comparison.ipynb b/examples/mnist_performance_comparison.ipynb
index 2d9c8fa..b32cc68 100644
--- a/examples/mnist_performance_comparison.ipynb
+++ b/examples/mnist_performance_comparison.ipynb
@@ -18,6 +18,18 @@
     "4. Analisar os resultados"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "source": "## 0. Instalação\n\nExecute esta célula para instalar o Joey e suas dependências:",
+   "metadata": {}
+  },
+  {
+   "cell_type": "code",
+   "source": "# Instalar Joey e dependências\nimport sys\nimport subprocess\n\ndef install_package(package):\n    \"\"\"Instala um pacote usando pip.\"\"\"\n    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", package])\n\n# Verificar se Joey está instalado, se não, instalar\ntry:\n    import joey\n    print(\"✓ Joey já está instalado!\")\nexcept ImportError:\n    print(\"Joey não encontrado. Instalando...\")\n    \n    # Opção 1: Instalar do repositório GitHub (versão de desenvolvimento)\n    # install_package(\"git+https://github.com/devitocodes/joey.git\")\n    \n    # Opção 2: Instalar de um clone local (se você clonou o repositório)\n    # Descomente e ajuste o caminho se necessário:\n    # install_package(\"-e /caminho/para/joey\")\n    \n    # Opção 3: Instalar dependências manualmente\n    print(\"Instalando dependências do Joey...\")\n    install_package(\"devito>=4.8.0\")\n    install_package(\"torch>=2.0.0\")\n    install_package(\"numpy>=1.20.0\")\n    \n    print(\"\\n⚠️  IMPORTANTE:\")\n    print(\"Para usar Joey, você precisa instalar o pacote completo.\")\n    print(\"Execute uma das seguintes opções:\\n\")\n    print(\"1. Instalar do GitHub:\")\n    print(\"   pip install git+https://github.com/devitocodes/joey.git\\n\")\n    print(\"2. Clonar e instalar localmente:\")\n    print(\"   git clone https://github.com/devitocodes/joey.git\")\n    print(\"   pip install -e joey\\n\")\n    print(\"Após instalar, reinicie o kernel do Jupyter e execute novamente.\")\n\n# Instalar outras dependências necessárias para o notebook\npackages = [\n    \"torchvision>=0.15.0\",\n    \"matplotlib\",\n]\n\nprint(\"\\nVerificando outras dependências...\")\nfor package in packages:\n    try:\n        pkg_name = package.split(\">=\")[0].split(\"==\")[0]\n        __import__(pkg_name)\n        print(f\"✓ {pkg_name} já está instalado\")\n    except ImportError:\n        print(f\"Instalando {package}...\")\n        install_package(package)\n\nprint(\"\\n✅ Instalação concluída!\")",
+   "metadata": {},
+   "execution_count": null,
+   "outputs": []
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -668,4 +680,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
\ No newline at end of file