Skip to content

fix: resolve .env from CWD and stop _do_deploy env mutation#236

Open
deanq wants to merge 6 commits intomainfrom
deanq/ae-1549-env-vars-from-cwd-first
Open

fix: resolve .env from CWD and stop _do_deploy env mutation#236
deanq wants to merge 6 commits intomainfrom
deanq/ae-1549-env-vars-from-cwd-first

Conversation

@deanq
Copy link
Member

@deanq deanq commented Feb 28, 2026

Summary

  • Fix .env resolution: dotenv_values() and load_dotenv() now use find_dotenv(usecwd=True) to walk up from CWD (user's project) instead of from the package source location. Editable installs were loading the library's dev .env (PYTHONPATH=src, FLASH_HOST=localhost) instead of the user's project .env (HF_TOKEN, RUNPOD_API_KEY, etc).
  • Fix config drift from _do_deploy: Runtime env var injections (RUNPOD_API_KEY, FLASH_MODULE_PATH, FLASH_ENDPOINT_TYPE) now go into self.template.env via a new _inject_template_env() helper instead of mutating self.env. Since env is a hashed field, the old mutation caused false config drift on every subsequent deploy, triggering unnecessary rolling releases.

Test plan

  • 7 new tests for _inject_template_env (adds KV pair, idempotent, no-op with None template, initializes empty env list, non-mutation of self.env, API key injection into template, LB module path + endpoint type injection)
  • Updated 2 existing dotenv tests to match new import pattern
  • All 1266 tests pass via make quality-check
  • Coverage at 73.95% (above 65% threshold)
  • Manual: flash run --auto-provision on flash-examples/03_mixed_workers provisions with correct env vars, no rolling release on first request

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes environment variable loading and deployment-time env injection to ensure .env files are resolved from the user’s working directory and to prevent config drift caused by mutating hashed configuration fields during deploy.

Changes:

  • Resolve .env from CWD via find_dotenv(usecwd=True) for both load_dotenv() and dotenv_values().
  • Stop mutating self.env during _do_deploy; inject runtime env vars into self.template.env via a new _inject_template_env() helper.
  • Add/adjust unit tests to cover the new injection behavior and updated dotenv import/call patterns.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/runpod_flash/__init__.py Loads .env via find_dotenv(usecwd=True) before other imports.
src/runpod_flash/core/resources/environment.py Reads .env values from a CWD-upwards search instead of package-relative resolution.
src/runpod_flash/core/resources/serverless.py Adds _inject_template_env() and switches deploy-time injections to template env to avoid env hash drift.
src/runpod_flash/core/resources/load_balancer_sls_resource.py Injects FLASH_ENDPOINT_TYPE into template env instead of mutating self.env.
tests/unit/test_dotenv_loading.py Updates assertions to match new dotenv import/call pattern.
tests/unit/resources/test_serverless.py Adds tests validating template env injection and non-mutation of self.env during deploy.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +784 to +785
has_explicit_template_env = (
not new_config.env and new_config.template.env is not None
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_explicit_template_env is computed as not new_config.env and new_config.template.env is not None, but PodTemplate.env defaults to an empty list (not None). That makes has_explicit_template_env effectively always true whenever new_config.env is {}, forcing skip_env=False and causing update_template() to send an empty env list that can wipe platform-injected vars (e.g. PORT/PORT_HEALTH) and trigger rolling releases. Consider detecting whether env was explicitly set on the template (e.g. via Pydantic’s model_fields_set / __pydantic_fields_set__) instead of checking is not None.

Suggested change
has_explicit_template_env = (
not new_config.env and new_config.template.env is not None
template_fields_set = getattr(
new_config.template, "__pydantic_fields_set__", set()
)
has_explicit_template_env = (
not new_config.env and "env" in template_fields_set

Copilot uses AI. Check for mistakes.
@deanq deanq force-pushed the deanq/ae-1549-env-vars-from-cwd-first branch from c5916e6 to 405f046 Compare March 3, 2026 07:24
deanq added 6 commits March 4, 2026 23:09
dotenv_values() and load_dotenv() without arguments use find_dotenv()
which walks up from the calling file's directory. For editable installs,
this resolves the library's dev .env (PYTHONPATH=src, FLASH_HOST=localhost)
instead of the user's project .env (HF_TOKEN, RUNPOD_API_KEY, etc).

Pass find_dotenv(usecwd=True) so the search starts from CWD (the user's
project directory) in both __init__.py and environment.py.
…rift

_do_deploy injected runtime env vars (RUNPOD_API_KEY, FLASH_MODULE_PATH,
FLASH_ENDPOINT_TYPE) directly into self.env, which is a hashed field.
This caused config drift detection on every subsequent deploy, triggering
unnecessary rolling releases.

Add _inject_template_env() helper that appends KeyValuePairs to
self.template.env instead. Runtime injections now go into the template
(which is excluded from hashing) while self.env stays clean for drift
detection.
The old self.env["FLASH_ENDPOINT_TYPE"] = "lb" was dead code — env is
in _input_only, so model_dump excluded it from the API payload. The
refactor to _inject_template_env made it actually reach the worker,
which triggered is_flash_deployment() -> maybe_unpack() -> artifact
not found error for flash run (live serverless) endpoints.

For flash deploy, the runtime resource_provisioner already sets
FLASH_ENDPOINT_TYPE=lb. This injection point is not needed.
When updating an endpoint, the saveTemplate mutation previously always
sent the user's env vars, which overwrote platform-injected vars like
PORT and PORT_HEALTH on LB endpoints. This triggered unnecessary
rolling releases.

Now _build_template_update_payload accepts skip_env; update() compares
old vs new env and omits env from the template payload when unchanged,
letting the platform preserve its injected vars.
The comment said env vars were excluded from the hash, but they are
included. The exclude set only contains RUNTIME_FIELDS and
EXCLUDED_HASH_FIELDS (id), not env.
…v overwrite

Extract _inject_runtime_template_vars() from _do_deploy so both initial
deploy and update() paths inject RUNPOD_API_KEY and FLASH_MODULE_PATH
into template.env. Without this, runtime vars set during _do_deploy
were silently dropped when update() overwrote the template env on
config drift. Also preserve explicit template.env entries when env
dict is empty on both sides.
@deanq deanq force-pushed the deanq/ae-1549-env-vars-from-cwd-first branch from d8c66e9 to 081d553 Compare March 5, 2026 07:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants