
fix(notif): online checker false negatives#4075

Merged
seanaye merged 7 commits into main from seanaye/fix/offline-checker
Jun 15, 2026

Conversation

@seanaye

@seanaye seanaye commented Jun 15, 2026

Contributor

The offline checker was reporting false negatives for last-online time because it was reading from the wrong Redis URL.

This PR points the Redis URL at the same instance that is being written to, and moves notification_service to the macro_config paradigm.

  • Route notification last-online checks to connection gateway Redis
  • Migrate notification service config to macro_config
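To illustrate the failure mode described above, here is a minimal sketch that uses two in-memory maps as stand-ins for the two Redis instances. Everything in it is hypothetical (the `FakeRedis` type, the `is_online` helper, the user id, and the 60-second window); it is not the service's code, only a model of why reading an instance that never receives the writes reports an online user as offline.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for one Redis instance:
/// user id -> last-online time in Unix seconds.
type FakeRedis = HashMap<String, u64>;

/// Returns true if `user` has a last-online timestamp within
/// `window_secs` of `now`. A missing key counts as offline.
fn is_online(store: &FakeRedis, user: &str, now: u64, window_secs: u64) -> bool {
    store
        .get(user)
        .map_or(false, |&seen| now.saturating_sub(seen) <= window_secs)
}

fn main() {
    let now = 1_000;
    let window_secs = 60; // assumed window, chosen for illustration

    // Assumption: presence is written to the connection gateway's instance.
    let mut gateway_redis = FakeRedis::new();
    gateway_redis.insert("user-1".to_string(), now - 5);

    // The other instance never receives that write.
    let notification_redis = FakeRedis::new();

    // Reading the wrong instance: the user looks offline (false negative).
    println!(
        "wrong instance: {}",
        is_online(&notification_redis, "user-1", now, window_secs)
    );
    // Reading the instance that is written to: the user is online.
    println!(
        "gateway instance: {}",
        is_online(&gateway_redis, "user-1", now, window_secs)
    );
}
```

The checker logic is identical in both calls; only the store differs, which is why the fix is a config change rather than a logic change.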

@seanaye seanaye requested a review from a team as a code owner June 15, 2026 16:28
@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Contributor


Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 031030eb-2cac-45a4-92e4-8db5f24cc974

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The pull request introduces a dedicated Redis connection for last-online tracking in the notification service. On the infrastructure side, a new Pulumi StackReference to connection-gateway-stack is added, and its connectionGatewayRedisUrl output is injected into the notification service container as LAST_ONLINE_REDIS_URI. On the Rust service side, database_env_vars and macro_config crate dependencies are added, and Config is rewritten from hand-parsed environment variables to a #[derive(macro_config::MacroConfig)] typed struct with new fields including last_online_redis_uri, url_signing_hmac, typed queue wrappers, and platform ARN fields. Config::from_env() is replaced by ConfigLoader::load, and two new accessors (last_online_redis_uri(), sns_apns_voip_platform_arn()) are added. In main.rs, all vars-based field accesses are replaced with typed config values, and a separate last_online_redis_conn is constructed from config.last_online_redis_uri() and passed into RedisLastOnlineRepo.
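As a rough sketch of the accessor behaviour summarized above, the following uses only the standard library in place of `macro_config`, whose API is not shown in this PR. The `from_vars` constructor, the `REDIS_URI` variable name, and the example URIs are assumptions made for illustration; only the `redis_uri` and `last_online_redis_uri` names and the `LAST_ONLINE_REDIS_URI` variable come from the conversation.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical stand-in for the service's `Config`.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    redis_uri: String,
    /// Optional dedicated URI for last-online lookups.
    last_online_redis_uri: Option<String>,
}

impl Config {
    /// Builds the config from a key/value map instead of the process
    /// environment so the sketch stays deterministic.
    fn from_vars(vars: &HashMap<String, String>) -> Result<Self, String> {
        let redis_uri = vars
            .get("REDIS_URI") // assumed variable name
            .cloned()
            .ok_or_else(|| "REDIS_URI must be provided".to_string())?;
        let last_online_redis_uri = vars.get("LAST_ONLINE_REDIS_URI").cloned();
        Ok(Self {
            redis_uri,
            last_online_redis_uri,
        })
    }

    /// Returns the dedicated URI when set, otherwise falls back to `redis_uri`.
    fn last_online_redis_uri(&self) -> &str {
        self.last_online_redis_uri
            .as_deref()
            .unwrap_or(self.redis_uri.as_str())
    }
}

fn main() {
    let mut vars = HashMap::new();
    vars.insert("REDIS_URI".to_string(), "redis://notification:6379".to_string());

    // Without the dedicated variable, the accessor falls back to redis_uri.
    let cfg = Config::from_vars(&vars).expect("config should load");
    println!("fallback: {}", cfg.last_online_redis_uri());

    // With it set, last-online lookups use the dedicated instance.
    vars.insert(
        "LAST_ONLINE_REDIS_URI".to_string(),
        "redis://connection-gateway:6379".to_string(),
    );
    let cfg = Config::from_vars(&vars).expect("config should load");
    println!("dedicated: {}", cfg.last_online_redis_uri());
}
```

Keeping the fallback inside one accessor means callers such as `main.rs` never choose between the two URIs themselves, at the cost of the fallback being silent.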

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title follows conventional commits format with 'fix:' prefix and clearly addresses the main issue of false negatives in the offline checker due to incorrect Redis URL. |
| Description check | ✅ Passed | The description comprehensively explains the root cause (reading from wrong Redis instance) and outlines both main changes (routing to connection gateway Redis and migrating to macro_config). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rust/cloud-storage/notification_service/src/config.rs`:
- Around line 121-125: Add validation to require the LAST_ONLINE_REDIS_URI
configuration to be explicitly set in non-local environments, similar to the
existing SNS_APNS_VOIP_PLATFORM_ARN validation pattern shown in the code block.
Check that last_online_redis_uri (or its configuration option) is set when the
environment is not Local, and bail with an appropriate error message if it is
missing. This prevents the silent fallback to redis_uri that could cause
unintended Redis drift in non-local deployments. Apply this same validation
check at both the location shown (lines 121-125) and at the related location
mentioned (lines 130-134).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3199a083-667f-4678-a935-f995f5b7b902

📥 Commits

Reviewing files that changed from the base of the PR and between 85be656 and 6a60f4c.

⛔ Files ignored due to path filters (1)
  • rust/cloud-storage/Cargo.lock is excluded by !**/*.lock, !**/Cargo.lock
📒 Files selected for processing (5)
  • infra/stacks/notification-service/index.ts
  • rust/cloud-storage/notification_service/Cargo.toml
  • rust/cloud-storage/notification_service/src/config.rs
  • rust/cloud-storage/notification_service/src/env.rs
  • rust/cloud-storage/notification_service/src/main.rs

Comment on lines +121 to +125
        if !matches!(config.environment, Environment::Local)
            && !config.sns_apns_voip_platform_arn.is_set()
        {
            anyhow::bail!("SNS_APNS_VOIP_PLATFORM_ARN must be provided");
        }

Contributor


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require LAST_ONLINE_REDIS_URI outside local to prevent silent Redis drift.

Right now, non-local startup validates SNS_APNS_VOIP_PLATFORM_ARN but not LAST_ONLINE_REDIS_URI, and last_online_redis_uri() silently falls back to redis_uri. If that env var is missing in a non-local deploy, last-online reads revert to the wrong Redis and the false-negative behavior can return.

Suggested fix
 impl Config {
     pub fn from_env() -> anyhow::Result<Self> {
         let config = macro_config::ConfigLoader::load::<Config>()
             .context("failed to load notification service config")?;

-        if !matches!(config.environment, Environment::Local)
-            && !config.sns_apns_voip_platform_arn.is_set()
-        {
-            anyhow::bail!("SNS_APNS_VOIP_PLATFORM_ARN must be provided");
+        if !matches!(config.environment, Environment::Local) {
+            if !config.sns_apns_voip_platform_arn.is_set() {
+                anyhow::bail!("SNS_APNS_VOIP_PLATFORM_ARN must be provided");
+            }
+            if !config.last_online_redis_uri.is_set() {
+                anyhow::bail!("LAST_ONLINE_REDIS_URI must be provided");
+            }
         }

         Ok(config)
     }

Also applies to: 130-134
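For readers who want to try the proposed check in isolation, here is a self-contained sketch of the same validation. It is an approximation: `Option<String>` stands in for whatever type provides `.is_set()`, a plain `String` error replaces `anyhow`, and the `Production` variant and the `validate` function name are hypothetical.

```rust
/// Hypothetical subset of the service's environments; only `Local`
/// appears in the PR, `Production` is assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Environment {
    Local,
    Production,
}

/// Simplified stand-in for the fields the check inspects.
#[derive(Debug)]
struct Config {
    environment: Environment,
    sns_apns_voip_platform_arn: Option<String>,
    last_online_redis_uri: Option<String>,
}

/// Rejects non-local configs that are missing either required value,
/// so a missing LAST_ONLINE_REDIS_URI fails at startup instead of
/// silently falling back to the general Redis URI.
fn validate(config: &Config) -> Result<(), String> {
    if !matches!(config.environment, Environment::Local) {
        if config.sns_apns_voip_platform_arn.is_none() {
            return Err("SNS_APNS_VOIP_PLATFORM_ARN must be provided".to_string());
        }
        if config.last_online_redis_uri.is_none() {
            return Err("LAST_ONLINE_REDIS_URI must be provided".to_string());
        }
    }
    Ok(())
}

fn main() {
    // Local: both values may be omitted.
    let local = Config {
        environment: Environment::Local,
        sns_apns_voip_platform_arn: None,
        last_online_redis_uri: None,
    };
    println!("local: {:?}", validate(&local));

    // Non-local with the Redis URI missing: startup should fail.
    let prod = Config {
        environment: Environment::Production,
        sns_apns_voip_platform_arn: Some("arn:example".to_string()),
        last_online_redis_uri: None,
    };
    println!("production: {:?}", validate(&prod));
}
```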


@seanaye seanaye force-pushed the seanaye/fix/offline-checker branch from 6a60f4c to 83d5782 Compare June 15, 2026 17:20
@seanaye seanaye force-pushed the seanaye/fix/offline-checker branch from 6831cf4 to 47100e7 Compare June 15, 2026 17:35
@seanaye seanaye requested a review from whutchinson98 June 15, 2026 17:45
@seanaye seanaye merged commit cd55f9e into main Jun 15, 2026
28 checks passed
@seanaye seanaye deleted the seanaye/fix/offline-checker branch June 15, 2026 18:24

2 participants