feat: fully deployed environment with generated handlers #68
…dler Replace is_mothership with is_lb_endpoint throughout lb_handler.py. Add _is_lb_endpoint() helper that checks FLASH_ENDPOINT_TYPE=lb first, with backward compat for legacy FLASH_IS_MOTHERSHIP=true (logs deprecation warning). Update all log messages and ping endpoint from "mothership" to "LB endpoint" / "QB endpoint" terminology.
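A minimal sketch of the detection order this commit describes: check FLASH_ENDPOINT_TYPE=lb first, then honor legacy FLASH_IS_MOTHERSHIP=true with a deprecation warning. The function name `is_lb_endpoint`, the injectable `environ` parameter, and the case/whitespace normalization are illustrative assumptions, not the PR's actual `_is_lb_endpoint()` code. A later commit in this PR drops the legacy branch entirely.

```python
import logging
import os

logger = logging.getLogger(__name__)


def is_lb_endpoint(environ=None) -> bool:
    """Return True when the worker should run as an LB endpoint.

    `environ` is a hypothetical injection point for tests; it defaults
    to the real process environment.
    """
    env = os.environ if environ is None else environ

    # New variable takes priority. Normalizing case is an assumption.
    if env.get("FLASH_ENDPOINT_TYPE", "").strip().lower() == "lb":
        return True

    # Legacy flag is still honored, but logs a deprecation warning.
    if env.get("FLASH_IS_MOTHERSHIP", "").strip().lower() == "true":
        logger.warning(
            "FLASH_IS_MOTHERSHIP is deprecated; set FLASH_ENDPOINT_TYPE=lb instead"
        )
        return True

    return False
```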
- Add FLASH_ENDPOINT_TYPE check (lb/qb) to is_flash_deployment()
- Retain FLASH_IS_MOTHERSHIP for backward compatibility
- Update constants.py comments to use LB endpoint terminology
- Add tests for new env var, legacy compat, and missing endpoint ID
Add parallel tests using FLASH_ENDPOINT_TYPE=lb for all existing FLASH_IS_MOTHERSHIP tests. Legacy tests preserved as backward compatibility regression tests. Update docstring comment in unpack_volume.py to reflect new env var priority.
Drop legacy FLASH_IS_MOTHERSHIP env var support and deprecation warnings. All detection now uses FLASH_ENDPOINT_TYPE exclusively. Remove duplicate test classes that existed for parallel coverage of both old and new env vars. Remove dead serialization_format/json_result code path in _execute_flash_function and corresponding tests -- these fields do not exist on the current FunctionRequest/FunctionResponse protocol models.
When FLASH_RESOURCE_NAME is set and a generated handler_<name>.py exists at /app (extracted from the build tarball), use it instead of the FunctionRequest handler. This enables deployed QB endpoints to accept plain JSON input without cloudpickle serialization.

Falls back gracefully to FunctionRequest handler when:
- FLASH_RESOURCE_NAME not set (Live Serverless mode)
- No generated handler file found
- Generated handler fails to import (with actionable log message)
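The discovery-and-fallback behavior described above could look roughly like the sketch below. It assumes the generated module exposes a callable named `handler`; that attribute name, the function name `load_generated_handler`, and the `app_dir`/`environ` parameters are hypothetical and not taken from the PR. Returning `None` stands in for "use the FunctionRequest handler".

```python
import importlib.util
import logging
import os
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def load_generated_handler(
    app_dir: Path = Path("/app"), environ=None
) -> Optional[Callable]:
    """Return the generated handler, or None to signal fallback."""
    env = os.environ if environ is None else environ
    name = env.get("FLASH_RESOURCE_NAME")
    if not name:
        return None  # Live Serverless mode: nothing to discover

    path = app_dir / f"handler_{name}.py"
    if not path.is_file():
        logger.warning("No generated handler at %s; falling back", path)
        return None

    try:
        spec = importlib.util.spec_from_file_location(f"handler_{name}", path)
        if spec is None or spec.loader is None:
            return None
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    except Exception:
        # Covers ImportError, SyntaxError, and anything raised at import time.
        logger.warning("Failed to import %s; falling back", path, exc_info=True)
        return None

    # Assumption: the generated module exposes a callable named `handler`.
    handler = getattr(module, "handler", None)
    return handler if callable(handler) else None
```

A caller would then pick the handler with something like `load_generated_handler() or function_request_handler`, where `function_request_handler` is a placeholder for the existing FunctionRequest entry point.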
Replace FLASH_MAIN_FILE/FLASH_APP_VARIABLE env vars with
FLASH_RESOURCE_NAME-based handler discovery. The LB handler now
derives the handler file path as handler_{resource_name}.py,
matching the convention used by flash build's codegen.
- handler.py: deployed QB endpoints (FLASH_ENDPOINT_TYPE=qb) now fail fast instead of silently falling back to FunctionRequest mode. Split broad except into ImportError/SyntaxError/Exception with appropriate log levels. Add warnings for missing handler file and missing handler attribute.
- lb_handler.py: extract _discover_lb_app() as testable function with handler_dir parameter. Fix f-string in logger.error to %s-style.
- unpack_volume.py: remove premature _UNPACKED=True before extraction succeeds. Fix off-by-one in retry sleep that caused 30s delay after final failed attempt.
- manifest_reconciliation.py: split broad except in _fetch_and_save into expected network errors (warning) vs unexpected errors (error with traceback). Add debug logging to OSError in _is_manifest_stale.
- test_lb_handler.py: eliminate standalone function copies by importing production code directly with mocked module-level side effects.
- test_handler.py: add tests for SyntaxError branch, generic Exception branch, deployed QB hard failures, and missing handler attr warning.
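To illustrate the unpack_volume.py retry fix, here is a hedged sketch of a retry loop that sleeps only between attempts and reports success only after the action completes. The helper name `run_with_retries`, the default attempt count, and the injectable `sleep` parameter are assumptions for illustration; only the 30s delay comes from the commit message.

```python
import time
from typing import Callable


def run_with_retries(
    action: Callable[[], None],
    attempts: int = 3,
    delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `action` up to `attempts` times; return True on first success."""
    for attempt in range(1, attempts + 1):
        try:
            action()
            # Success is recorded only after the action finishes, which
            # mirrors not setting an "unpacked" flag prematurely.
            return True
        except Exception:
            # Sleep only if another attempt follows; sleeping after the
            # final failure would add a pointless delay.
            if attempt < attempts:
                sleep(delay_s)
    return False
```

With three attempts that all fail, this sleeps twice rather than three times.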
Pull request overview
This PR updates worker-flash’s deployed-runtime behavior by replacing the legacy FLASH_IS_MOTHERSHIP flag with FLASH_ENDPOINT_TYPE and adding auto-discovery of generated handlers (QB + LB) via FLASH_RESOURCE_NAME, enabling deployed endpoints to accept plain JSON without cloudpickle serialization.
Changes:
- Replace FLASH_IS_MOTHERSHIP detection with FLASH_ENDPOINT_TYPE (lb/qb) across runtime logic, docs, and tests.
- Add generated-handler loading in src/handler.py (QB) and generated FastAPI app discovery in src/lb_handler.py (LB).
- Expand unit/integration test coverage around endpoint-type detection, handler discovery, and manifest refresh behavior.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| uv.lock | Bumps worker-flash version to 1.0.1. |
| src/handler.py | Adds generated QB handler loader and fallback behavior. |
| src/lb_handler.py | Adds LB endpoint detection and generated FastAPI app auto-discovery. |
| src/manifest_reconciliation.py | Switches Flash deployment detection to FLASH_ENDPOINT_TYPE and improves error logging/handling. |
| src/unpack_volume.py | Updates Flash-deployment detection messaging and retry logging/behavior. |
| src/remote_executor.py | Clarifies comments around cloudpickle arg/kwarg deserialization and execution path. |
| src/constants.py | Updates unpack retry constant docstrings to new endpoint terminology. |
| docs/Runtime_Execution_Paths.md | Updates documentation for new deployment-mode detection environment variables. |
| tests/unit/test_handler.py | Adds tests for generated handler loading and fallback/raise behavior. |
| tests/unit/test_lb_handler.py | Adds tests for LB mode detection and generated FastAPI app discovery. |
| tests/unit/test_manifest_reconciliation.py | Updates/extends tests for Flash deployment detection with FLASH_ENDPOINT_TYPE. |
| tests/unit/test_unpack_volume.py | Updates tests to use FLASH_ENDPOINT_TYPE instead of FLASH_IS_MOTHERSHIP. |
| tests/integration/test_manifest_state_manager.py | Updates integration tests to use FLASH_ENDPOINT_TYPE. |
| CLAUDE.md | Refreshes repository guidance content to reflect current architecture and commands. |
Generated handler files are a new capability introduced in this PR. Existing deployed QB endpoints have FLASH_ENDPOINT_TYPE=qb and FLASH_RESOURCE_NAME set but no generated handler files yet. The hard failure (FileNotFoundError/RuntimeError) killed the module import at load time, preventing runpod.serverless.start() from executing and leaving workers unhealthy. Revert to warning-and-fallback for all discovery failures. Keep the improved logging (SyntaxError/Exception split, missing-attr warning) but always fall back to the FunctionRequest handler gracefully.
Pull request overview
Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.
- Validate FLASH_RESOURCE_NAME cannot resolve outside /app (handler.py) or handler_dir (lb_handler.py) using Path.resolve().is_relative_to()
- Add callable() check on loaded handler attribute in handler.py
- Fix docstrings: ImportError -> FileNotFoundError for missing file, remove RUNPOD_POD_ID reference, correct "LB endpoint" terminology
- Add tests for path traversal and non-callable handler scenarios
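The path-traversal check named above can be sketched as follows. The helper name `resolve_handler_path` and its `None`-on-rejection contract are hypothetical; `Path.is_relative_to()` requires Python 3.9 or later.

```python
from pathlib import Path
from typing import Optional


def resolve_handler_path(resource_name: str, base_dir: Path) -> Optional[Path]:
    """Build handler_<name>.py under base_dir, rejecting escapes.

    Returns None when the resolved path would land outside base_dir,
    for example when resource_name contains "../" segments.
    """
    base = base_dir.resolve()
    candidate = (base / f"handler_{resource_name}.py").resolve()
    if not candidate.is_relative_to(base):
        return None
    return candidate
```

Resolving both sides before comparing matters: a purely lexical prefix check would accept a name such as `x/../../etc/passwd`, because the unresolved string still begins with the base directory.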
Pull request overview
Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Live Serverless workers never have flash_manifest.json on disk, so eagerly constructing ServiceRegistry caused unnecessary errors. Guard with Path.exists() and only initialize in Flash Deployed mode.
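A minimal sketch of the guard described above, assuming the registry is built from a manifest path. Because the actual ServiceRegistry constructor signature is not shown in this PR text, the sketch takes a generic `factory` callable; `init_if_manifest_exists` is a hypothetical name.

```python
from pathlib import Path
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def init_if_manifest_exists(
    manifest_path: Path, factory: Callable[[Path], T]
) -> Optional[T]:
    """Construct the registry lazily, only when the manifest is on disk.

    Live Serverless workers have no flash_manifest.json, so the factory
    is never called for them and no construction error can occur.
    """
    if not manifest_path.exists():
        return None
    return factory(manifest_path)
```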
Pin minimum version and update lock file from 1.1.1 to 1.4.0.
Summary
- … FLASH_IS_MOTHERSHIP to FLASH_ENDPOINT_TYPE for clarity, removing backward compatibility shim
- … FLASH_RESOURCE_NAME, enabling deployed endpoints to accept plain JSON (no cloudpickle serialization)
- … FLASH_RESOURCE_NAME instead of requiring explicit configuration

Test plan
- make quality-check passes
- make test-handler passes with all test JSON files
- … FLASH_RESOURCE_NAME … RemoteExecutor …