fix(backend): Backend Docker build fixes for AutoModelForImageTextToText#642
This PR fixes ModuleNotFoundError: AutoModelForImageTextToText that was breaking backend deployments.
## Changes
1. **Add transformers[vision] extra (pyproject.toml + poetry.lock)**
   - Changed: transformers (>=4.46.0) → transformers[vision] (>=4.46.0)
   - Reason: Docling's CodeFormulaModel requires vision-text model dependencies
2. **Preserve numpy._core.tests (backend/Dockerfile.backend)**
   - Added exclusion: ! -path "*/numpy/*" to tests cleanup
   - Reason: numpy._core.tests is a required module, not test code
   - Was causing cascading import failures:
     - numpy.testing → numpy._core.tests._natype
     - scipy → numpy
     - sklearn → scipy
     - transformers → sklearn
     - Result: AutoModelForImageTextToText import failed
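The effect of the numpy exclusion can be sketched in Python. This is an illustration only, not the Dockerfile's actual cleanup command, which is not shown in this PR text: the function name `removable_tests_dirs` and the idea of walking a `site-packages` directory are assumptions made for the example. It selects every `tests` directory for deletion except those that sit anywhere under a `numpy` directory, which is roughly what a `! -path "*/numpy/*"` filter does.

```python
import os


def removable_tests_dirs(site_packages: str) -> list:
    """Return `tests` directories under site_packages that are safe to delete.

    Hypothetical helper: anything below a directory named `numpy` is skipped,
    mirroring a `! -path "*/numpy/*"` exclusion in a find-based cleanup.
    """
    selected = []
    for dirpath, dirnames, _filenames in os.walk(site_packages):
        rel_parts = os.path.relpath(dirpath, site_packages).split(os.sep)
        if "numpy" in rel_parts:
            # Never touch numpy internals such as numpy/_core/tests.
            dirnames[:] = []
            continue
        for name in list(dirnames):
            if name == "tests":
                selected.append(os.path.join(dirpath, name))
                # The whole directory would be removed, so do not descend.
                dirnames.remove(name)
    return sorted(selected)
```

With a tree containing `scipy/tests`, `sklearn/utils/tests`, and `numpy/_core/tests`, only the first two would be returned, so `numpy._core.tests._natype` stays importable.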
## Testing
Validated locally with ARM64 build:
```bash
docker build -f backend/Dockerfile.backend -t backend:test .
docker run --rm backend:test python -c \
"from transformers import AutoModelForImageTextToText; print('✓')"
```
Output: ✓
## Fixes
- Resolves AutoModelForImageTextToText import errors
- Fixes 50+ failed deployments caused by missing vision dependencies
- Prevents accidental deletion of required numpy modules
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
🚀 Development Environment Options
This repository supports Dev Containers for a consistent development environment.
Option 1: GitHub Codespaces (Recommended)
Create a cloud-based development environment:
Option 2: VS Code Dev Containers (Local)
Use Dev Containers on your local machine:
Option 3: Traditional Local Setup
Set up the development environment manually:
# Clone the repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo
git checkout fix/backend-docker-build
# Initialize development environment
make dev-init
make dev-build
make dev-up
make dev-validate
Available Commands
Once in your development environment:
make help # Show all available commands
make dev-validate # Validate environment setup
make test-atomic # Run atomic tests
make test-unit # Run unit tests
make lint # Run linting
Services Available
When running
This automated message helps reviewers quickly set up the development environment.
Code Review - PR #642: Backend Docker Build Fixes
Thank you for this well-documented PR! This addresses a critical deployment issue with a focused, surgical fix. Here's my comprehensive review:
✅ Strengths
1. Excellent Problem Analysis
2. Minimal, Targeted Changes
3. Strong Documentation
🔍 Technical Analysis
Change 1: Add
…tion errors
This PR fixes Pydantic validation errors that were occurring when the SKIP_AUTH secret was empty.
## Problem
When SKIP_AUTH secret is not set or empty, the backend receives an empty string '', causing:
```
Input should be a valid boolean, unable to interpret input
[type=bool_parsing, input_value='', input_type=str]
```
This was causing backend deployments to fail during the Code Engine application startup.
## Solution
Added default value 'false' to SKIP_AUTH environment variable:
**Before**:
```yaml
SKIP_AUTH: ${{ secrets.SKIP_AUTH }}
```
**After**:
```yaml
SKIP_AUTH: ${{ secrets.SKIP_AUTH || 'false' }}
```
Now when the secret is empty, the backend receives 'false' instead of '', which Pydantic can parse as a boolean.
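The failure and the fix can be reproduced with a small stdlib-only sketch. It approximates, rather than calls, Pydantic's boolean parsing: the accepted-string sets below follow Pydantic's documented lax-mode values, and the helper names `parse_bool` and `with_default` are hypothetical. `with_default` stands in for the `|| 'false'` fallback in the workflow expression, where an empty string is treated as falsy.

```python
from typing import Optional

# Strings accepted as booleans (an approximation of Pydantic's lax mode).
TRUTHY = {"1", "true", "t", "yes", "y", "on"}
FALSY = {"0", "false", "f", "no", "n", "off"}


def parse_bool(raw: str) -> bool:
    """Parse an environment string into a bool, rejecting anything else."""
    value = raw.lower()
    if value in TRUTHY:
        return True
    if value in FALSY:
        return False
    raise ValueError(f"unable to interpret input {raw!r} as a boolean")


def with_default(secret: Optional[str], default: str = "false") -> str:
    """Mimic `secrets.SKIP_AUTH || 'false'`: empty or unset falls back."""
    return secret or default
```

Calling `parse_bool("")` raises, which corresponds to the `bool_parsing` error above, while `parse_bool(with_default(""))` returns `False`.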
## Testing
This fix will be validated in the next deployment workflow run. Expected behavior:
- If SKIP_AUTH secret is set: uses that value
- If SKIP_AUTH secret is empty/unset: defaults to 'false'
- Backend starts successfully without Pydantic validation errors
## Related
- Part of deployment fixes series (breaking down PR #641)
- Related to PR #642 (backend Docker fixes)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…odeengine
This PR updates the GitHub Actions workflow to use the correct backend Dockerfile.
## Problem
The workflow was using `Dockerfile.codeengine` which:
- Used `poetry install` that pulled CUDA PyTorch from poetry.lock (6-8GB NVIDIA libs)
- Caused massive Docker image bloat
- Led to deployment failures
## Solution
Changed the workflow to use `backend/Dockerfile.backend` which:
- Parses `pyproject.toml` directly with pip
- Uses CPU-only PyTorch index `--extra-index-url https://download.pytorch.org/whl/cpu`
- Significantly reduces image size
- Works with the fixes from PR #642 (transformers[vision] + numpy cleanup)
**Before**:
```yaml
file: ./Dockerfile.codeengine
```
**After**:
```yaml
file: ./backend/Dockerfile.backend
```
## Changes
- `.github/workflows/deploy_complete_app.yml` (line 215): Updated Dockerfile path
## Testing
This fix will be validated in the CI pipeline. Expected behavior:
✅ **Builds use correct Dockerfile**: backend/Dockerfile.backend
✅ **CPU-only PyTorch**: No CUDA libraries in image
✅ **Smaller image size**: ~500MB vs 6-8GB
✅ **Successful deployment**: No import errors
## Type of Change
- [x] Bug fix (non-breaking change which fixes an issue)
- [x] Deployment fix
## Related PRs
This is part of the focused PR strategy to replace PR #641:
- **PR #642**: Backend Docker fixes (transformers[vision] + numpy cleanup)
- **PR #643**: SKIP_AUTH default value fix
- **PR #644** (this PR): Workflow Dockerfile path fix
## Checklist
- [x] Code follows the style guidelines of this project
- [x] Change is focused and addresses a single issue
- [x] Commit message follows conventional commits format
- [x] No breaking changes introduced
- [x] CI workflows will validate the change
---
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Summary
Fixes `ModuleNotFoundError: AutoModelForImageTextToText` that was causing 50+ failed backend deployments.
This PR contains two critical Docker build fixes:
## Changes
1. **Add transformers[vision] extra**
   - Files: `pyproject.toml`, `poetry.lock`
   - Why: Docling's CodeFormulaModel requires `transformers[vision]` to access vision-text model dependencies (pillow, torchvision, etc.)
2. **Preserve numpy._core.tests**
   - File: `backend/Dockerfile.backend`
   - Why: `numpy._core.tests` is a required module (not test code) that was being deleted by cleanup, causing cascading import failures:
     - `numpy.testing` imports `numpy._core.tests._natype`
     - `scipy` imports `numpy`
     - `sklearn` imports `scipy`
     - `transformers` imports `sklearn`
## Testing
✅ Local validation with ARM64 build:
Output: ✓
## Fixes
## Related
## Test Plan
🤖 Generated with Claude Code