Skip to content

feat(accounts): surface email-duplicate pairs in /api/accounts#829

Open
ozpool wants to merge 2 commits into
Soju06:mainfrom
ozpool:feat/accounts-duplicate-indicator-787b
Open

feat(accounts): surface email-duplicate pairs in /api/accounts#829
ozpool wants to merge 2 commits into
Soju06:mainfrom
ozpool:feat/accounts-duplicate-indicator-787b

Conversation

@ozpool
Copy link
Copy Markdown
Contributor

@ozpool ozpool commented May 27, 2026

Summary

Addresses #787 (B).

After an OAuth token-invalidation cascade, the dashboard re-add flow can leave the pool with two account rows that share the same email: one carries a stale refresh token that upstream now rejects, the other is the fresh re-import. Both show status=active in the local view, the older one keeps generating 401s through the proxy, and the operator has no way to spot the pair without grouping the /api/accounts payload by email themselves.

This PR adds is_email_duplicate to AccountSummary, defaulted False, set True when another account row in the same response carries the same email. Computed once per request from the loaded account list (no extra queries). Blank/missing emails and the legacy DEFAULT_EMAIL placeholder are excluded from duplicate detection.

The frontend zod schema mirrors the new field as optional so existing test fixtures and direct-literal AccountSummary constructions don't need churn updates; the dashboard can pick this up in a follow-up to render the indicator on the /accounts list / dashboard tile.

Behavior

  • Two or more rows with the same non-empty, non-placeholder email -> all such rows return isEmailDuplicate: true.
  • A unique email -> isEmailDuplicate: false.
  • Blank/None emails -> excluded from the count, never flagged.
  • The legacy DEFAULT_EMAIL placeholder -> excluded from the count, never flagged.
  • Comparison is case-sensitive (matches the storage normalization at OAuth-import time).

Tests

tests/integration/test_accounts_api.py::test_list_accounts_flags_email_duplicates:

  • inserts dup-stale + dup-fresh sharing dup@example.com and solo on solo@example.com
  • inserts two legacy placeholder-email rows using DEFAULT_EMAIL
  • calls GET /api/accounts
  • pins isEmailDuplicate=true on both dup-* rows and false on solo plus both placeholder rows

Existing accounts integration + repo + unit suites stay green (53 passed). Frontend bun run typecheck clean and full bun run test clean (446 passed).

Addresses Soju06#787 (B).

After an OAuth token-invalidation cascade (Codex/ChatGPT 'Revoke all
sessions' or upstream-side mass invalidation), the dashboard re-add
flow can leave the load-balancer pool with two account rows that share
the same email: one carries a stale refresh token that upstream now
rejects, the other is the fresh re-import. Both have status=active in
the local view, the older one keeps generating 401s through the proxy,
and the operator has no way to spot the pair without grouping the
/api/accounts payload by email themselves.

Add 'is_email_duplicate' to AccountSummary, defaulted False, set True
when another account row in the same response carries the same email.
Computed once per request from the loaded account list, no extra
queries. Empty emails are excluded so the legacy DEFAULT_EMAIL
placeholder used by malformed imports doesn't get flagged.

Frontend zod schema mirrors the new field as optional so existing test
fixtures and direct-literal AccountSummary constructions don't need
churn updates; the dashboard can pick this up in a follow-up to render
the indicator on the /accounts list.

Tests:
- tests/integration/test_accounts_api.py::test_list_accounts_flags_email_duplicates
  pins True on both rows of a shared-email pair and False on a solo row.
@Soju06
Copy link
Copy Markdown
Owner

Soju06 commented May 27, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

@Soju06
Copy link
Copy Markdown
Owner

Soju06 commented May 27, 2026

Merge hold from owner review:

  1. The PR body/comment says the legacy DEFAULT_EMAIL placeholder is excluded from duplicate detection, but the implementation only excludes falsy emails. DEFAULT_EMAIL is unknown@example.com, so duplicate placeholder rows would still be flagged. Please either exclude DEFAULT_EMAIL and add a regression test, or update the PR text/comment to state that only blank/missing emails are excluded.
  2. This adds an observable /api/accounts schema field but has no active OpenSpec change delta.
  3. Fresh Codex review is missing because the trigger hit usage limits, so there is no 🤖 codex: ok label.

Core duplicate detection for real non-empty emails looks fine; focused tests/typecheck/CI are green.

@Soju06 Soju06 added the hermes: needs-followup Hermes left a blocker/comment that needs follow-up observation label May 27, 2026
@Soju06
Copy link
Copy Markdown
Owner

Soju06 commented May 27, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

@KakatkarAkshay
Copy link
Copy Markdown

One concern: one of my use cases is the same email being used across multiple workspaces/orgs as separate valid accounts. Since this PR appears to flag duplicates by email alone within the /api/accounts response, those rows would be marked as isEmailDuplicate: true even though they are intentional and not stale/fresh duplicates.

Should duplicate detection include workspace/org context, if available, or otherwise avoid treating same-email-across-workspaces as a stale duplicate pair?

@ozpool
Copy link
Copy Markdown
Contributor Author

ozpool commented May 29, 2026

Following up on the merge-hold points:

  1. Placeholder exclusion is now handled. 3efed34 adds _is_duplicate_detection_email, which excludes both falsy emails and DEFAULT_EMAIL (unknown@example.com), and _emails_appearing_more_than_once only counts emails that pass that check. The PR text was the misleading part, so I also tightened it. Regression coverage is in test_list_accounts_flags_email_duplicates: two DEFAULT_EMAIL placeholder rows are inserted alongside a real stale/fresh pair, and both placeholders assert isEmailDuplicate is False while the real pair asserts True.

  2. The OpenSpec delta is in the same commit: openspec/changes/surface-account-email-duplicates/ with proposal.md, tasks.md, and a frontend-architecture spec delta. The spec adds an is_email_duplicate requirement on AccountSummary with three scenarios (duplicate real emails flagged, placeholder emails ignored, unique emails not flagged).

  3. The missing 🤖 codex: ok is the usage-limit gate on the review trigger, not something I can drive from my side — happy to re-trigger whenever the quota is back.

Focused tests/typecheck/CI are green on the current head. Let me know if anything else needs adjusting.

@ozpool
Copy link
Copy Markdown
Contributor Author

ozpool commented May 29, 2026

@KakatkarAkshay good catch, that is a real gap. As it stands the indicator groups by email alone, so two intentionally separate accounts that happen to share an email across different workspaces would both get isEmailDuplicate: true, which is not what this is meant to signal.

Worth noting the field is advisory only — it does not delete or deduplicate anything, it just lets the dashboard surface "these rows share an email" so an operator can decide. So the downside today is a false-positive hint rather than data loss. But the hint should still be correct.

There is a natural key for the fix already in the model: each account carries a chatgpt_account_id (via account_id), and generate_unique_account_id derives the row id from account_id + email hash. The stale/fresh cascade this PR targets (#787 B) is the same ChatGPT account re-imported, so those rows share both email and account_id. Your multi-workspace case is the opposite: same email, different account_id. So restricting the duplicate check to rows that share email and account_id would both drop the false positive for distinct-workspace accounts and tighten the indicator to the actual re-add cascade.

I'd rather not change the semantics unilaterally since it is the maintainer's call on what the field should mean. @Soju06 if you'd prefer the stricter email+account_id grouping, I'm happy to fold it into this PR (with a regression test for the same-email/different-account_id case) or split it into a follow-up.

@Komzpa
Copy link
Copy Markdown
Collaborator

Komzpa commented May 31, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3efed3407d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@Komzpa Komzpa force-pushed the feat/accounts-duplicate-indicator-787b branch from 3efed34 to 12ad36a Compare May 31, 2026 21:20
@Soju06 Soju06 added the 🤖 codex: needs work [@codex review] raised an issue label May 31, 2026
@Soju06
Copy link
Copy Markdown
Owner

Soju06 commented May 31, 2026

@codex review

@Komzpa Komzpa force-pushed the feat/accounts-duplicate-indicator-787b branch from 12ad36a to 67d0777 Compare May 31, 2026 21:36
@Soju06 Soju06 removed the 🤖 codex: needs work [@codex review] raised an issue label May 31, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12ad36abc3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread app/modules/accounts/mappers.py Outdated
@Soju06 Soju06 added the 🤖 codex: needs work [@codex review] raised an issue label May 31, 2026
@Komzpa
Copy link
Copy Markdown
Collaborator

Komzpa commented May 31, 2026

@codex review

@Komzpa Komzpa force-pushed the feat/accounts-duplicate-indicator-787b branch from 67d0777 to fc9299a Compare May 31, 2026 21:50
@Soju06 Soju06 removed the 🤖 codex: needs work [@codex review] raised an issue label May 31, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 67d0777339

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread app/modules/accounts/mappers.py Outdated
@Komzpa Komzpa added the 🤖 codex: needs work [@codex review] raised an issue label May 31, 2026
@Soju06 Soju06 removed the 🤖 codex: needs work [@codex review] raised an issue label May 31, 2026
@Komzpa
Copy link
Copy Markdown
Collaborator

Komzpa commented May 31, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Can't wait for the next one!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@Soju06 Soju06 added the 🤖 codex: ok [@codex review] says no issues found. label May 31, 2026
@Komzpa Komzpa removed the hermes: needs-followup Hermes left a blocker/comment that needs follow-up observation label Jun 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🤖 codex: ok [@codex review] says no issues found.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants