fix: migrate to env var for API key handling and logging improvements#67
Closed
fix: migrate to env var for API key handling and logging improvements#67
Conversation
Implements better retry logging for volume unpacking with clearer distinction between transient and terminal failures: - 1-indexed retry attempts for user-friendliness - WARNING level for expected retry failures (no traceback) - ERROR level only on final failure (with traceback via exc_info=True) - Clearer retry wait messages Adds logger suppression pattern following runpod-python conventions: - Suppress urllib3 and uvicorn loggers to reduce console noise - Set to WARNING level in lb_handler at module import time - Lets uvicorn handle its own logging with defaults This avoids the earlier attempt to configure uvicorn handlers at import time (which failed due to race condition with handler creation). Simple suppression is cleaner and follows proven patterns from runpod-python. Closes #65 (reverted broken changes but kept good retry improvements)
- Create rp_logger_adapter.py with FlashLoggerAdapter wrapping RunPodLogger - Update logger.py with backward-compatible wrapper functions - Rewrite log_streamer.py to capture stdout instead of logging handlers - Replace logging.getLogger() with get_flash_logger() in 10 source files - Add comprehensive unit tests for adapter layer (38 tests) - Enable automatic JSON logging in production (RUNPOD_ENDPOINT_ID) Key benefits: - Visual consistency with runpod-python ecosystem - Simplified output without redundant timestamps - Automatic structured logging in production - Namespace prefixes for better traceability All 474+ tests pass, 76.64% code coverage, quality checks pass.
Update environment variable detection throughout flash-worker to use the current
FLASH_MOTHERSHIP_ID variable instead of the obsolete FLASH_IS_MOTHERSHIP.
Changes:
- src/lb_handler.py: Update detection logic and docstring, clarify log messages
- src/manifest_reconciliation.py: Use new variable for Flash deployment check
- src/unpack_volume.py: Update documentation comment
- Test files: Update all test fixtures to use FLASH_MOTHERSHIP_ID
- docs/Runtime_Execution_Paths.md: Update documentation and examples
Detection logic:
- OLD: os.getenv("FLASH_IS_MOTHERSHIP") == "true"
- NEW: os.getenv("FLASH_MOTHERSHIP_ID") is not None
This ensures the mothership endpoint correctly detects its mode and logs display
accurate deployment information instead of "Queue-based mode" when running on
the mothership.
All 278 tests passing with 76.64% coverage.
Add thread-safe context variable for API key propagation: - api_key_context.py: New module with set/get/clear_api_key functions - lb_handler.py: Extract Bearer token from Authorization header via middleware - remote_executor.py: Check context API key before env var fallback - manifest_reconciliation.py: Skip State Manager queries in preview mode Add comprehensive unit tests for context management and middleware
- Check FLASH_IS_MOTHERSHIP flag (set by provisioner) - Fallback to FLASH_MOTHERSHIP_ID for backwards compatibility - Eliminates timing issues with RUNPOD_ENDPOINT_ID availability Enables deployed mothership endpoints to correctly boot in mothership mode and load user code instead of defaulting to child endpoint mode.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Migrates API key handling from Authorization header to RUNPOD_API_KEY env var per PRD requirements. Previously attempted auth header extraction (OLD IDEA) but PRD specifies env var approach (WORKING IDEA) for cleaner separation between user auth and worker-to-worker communication.
Changes
src/lb_handler.py,src/remote_executor.py) - Remove auth header extraction, use RUNPOD_API_KEY env var for remote calls per PRDsrc/rp_logger_adapter.py,src/log_streamer.py) - Standardize logging format across worker lifecyclesrc/lb_handler.py) - Clean up request processing and error boundariessrc/manifest_reconciliation.py) - Structured error handling for file sync operationssrc/unpack_volume.py) - Expose retry backoff timing for debuggingWhy Env Var Instead of Auth Header
Per PRD: "The API key in the auth header is required to contact a mothership. That API key will not be passed into the LB endpoint hosting the mothership." Workers use RUNPOD_API_KEY env var (injected at deployment) for:
Test Plan
tests/unit/test_remote_executor.pytests/unit/test_rp_logger_adapter.py