[WIP] Fix fail-closed logic in isDisposable function #66
Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.
Original prompt
Improvement 2: The "Fail-Closed" Logic
The Flaw: In your D1 isDisposable function, you have if (!activeTableResult) { return false; }. This is a "fail open" scenario: if the read of your state table fails or returns nothing, you return false (meaning "not disposable"), and a disposable email is marked as isValid: true. This is a critical bug.
The "Pro" Fix: We must "fail closed". If anything goes wrong in the check, we must assume the email is disposable to protect your customer.
Action: My new isDisposable function from Improvement 1 already fixes this by having a safe default. But to make your original D1 code safer, the fix would be:
TypeScript
// In your original D1-only code...
if (!activeTableResult) {
  console.error("CRITICAL: active_table not found or uninitialized.");
  return true; // <-- FAIL CLOSED. Assume disposable if state is broken.
}
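The guard above only covers the missing-row case; an exception thrown by D1 itself (a timeout, a schema mismatch) would still bubble up. Below is a minimal fail-closed sketch of the D1-only version; the query text and table names are assumptions based on the schema discussed later, not code copied from your file.
TypeScript
// Sketch only: a fully fail-closed version of the original D1-only check.
// The query text and table names here are assumptions, not copied from the repo.
async function isDisposable(domain: string, env: { DB: D1Database }): Promise<boolean> {
  try {
    // Read the blue/green pointer from D1 (the "state").
    const activeTableResult = await env.DB
      .prepare('SELECT live_table FROM active_table LIMIT 1')
      .first<{ live_table: string }>();

    if (!activeTableResult) {
      console.error('CRITICAL: active_table not found or uninitialized.');
      return true; // <-- FAIL CLOSED. Assume disposable if state is broken.
    }

    // Look up the domain in whichever table is currently live.
    const tableName = `disposable_domains_${activeTableResult.live_table}`;
    const hit = await env.DB
      .prepare(`SELECT domain FROM ${tableName} WHERE domain = ?`)
      .bind(domain)
      .first();

    return hit !== null; // Found in the list => disposable.
  } catch (err) {
    console.error('isDisposable threw, failing closed:', err);
    return true; // <-- FAIL CLOSED on any unexpected error.
  }
}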
Improvement 3: The "Bulletproof" Cron Safety Check
The Flaw: Your safety check in handleScheduled is if (domains.length < 100000). This is good, and I am glad you have it, but it is brittle: if the upstream list legitimately shrinks to 99,000 domains, every cron run will abort and your tables will never be updated again.
The "Pro" Fix: Use a lower threshold that only guards against catastrophic failure (e.g., the source URL returning an empty or drastically truncated list).
Action:
TypeScript
// myprojectsravi/email_validation_api/myProjectsRavi-email_validation_api-990fb5cbb616a226739b939d82af1c8e240d200a/src/index.ts
// ... inside handleScheduled ...
if (domains.length < 50000) { // Safety check
  console.error(
    `Fetched list is too small (${domains.length}). Aborting update to prevent data loss.`
  );
  return;
}
The Flaw: Your API (...990fb5c) makes two D1 reads for every single validation request. Look at your isDisposable function:
D1 Read 1: SELECT live_table FROM active_table... (To find out if we should use "blue" or "green")
D1 Read 2: SELECT domain FROM disposable_domains_blue... (To actually check the domain)
This is an unnecessary double-dip. We are storing the state (the pointer) in the same place as the data (the domains). We can make this faster.
The "God Mode" Solution: The data (millions of domains) belongs in D1. The state (a single string: "blue" or "green") belongs in KV.
By combining D1 and KV, we use each for its perfect job. This removes a D1 query from every API call, making your API even faster and more efficient.
Here are the exact changes to make your API truly "god-mode."
1. Updated wrangler.toml
We need to re-introduce a KV binding, but only for storing the state.
TOML
name = "email-validation-api"
main = "src/index.ts"
compatibility_date = "2025-10-31"

# The fetch and scheduled handlers are exported from src/index.ts,
# so no handler configuration is needed here.

# 1. The DATA (Millions of domains)
[[d1_databases]]
binding = "DB"
database_name = "email-validation-db"
database_id = "<YOUR_D1_DATABASE_ID_GOES_HERE>"

# 2. The STATE (A single key: 'live_table' = 'blue' or 'green')
[[kv_namespaces]]
binding = "API_STATE"
id = "<YOUR_NEW_KV_NAMESPACE_ID>"
preview_id = "<YOUR_NEW_KV_PREVIEW_ID>"

[triggers]
crons = ["0 0 * * 0"]
2. Updated schema.sql
We no longer need the active_table in D1. KV is handling that.
SQL
-- schema.sql
DROP TABLE IF EXISTS disposable_domains_blue;
CREATE TABLE disposable_domains_blue (
  domain TEXT PRIMARY KEY NOT NULL
);

DROP TABLE IF EXISTS disposable_domains_green;
CREATE TABLE disposable_domains_green (
  domain TEXT PRIMARY KEY NOT NULL
);

-- The 'active_table' is NO LONGER NEEDED.
-- We will store this pointer in a KV binding named 'API_STATE'.
After you run this, you must manually seed your KV namespace one time:
wrangler kv:key put --binding=API_STATE "live_table" "blue"
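To confirm the pointer was written, you can read it back with the same kv:key command family (newer Wrangler releases spell it kv key instead of kv:key):
wrangler kv:key get --binding=API_STATE "live_table"
This should print blue.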
3. Updated src/index.ts
This code is now faster, cleaner, and uses the best of both worlds.
TypeScript
// src/index.ts
import { Hono } from 'hono';
// === TYPE DEFINITIONS ===
export interface Env {
  DB: D1Database;
  API_STATE: KVNamespace; // <-- The state pointer
}

// (DnsQueryResponse and ValidationResult are unchanged)
interface DnsQueryResponse {
  Status: number;
  Answer?: { type: number; data: string }[];
}

interface ValidationResult {
  email: string;
  isValid: boolean;
  reason: 'valid' | 'invalid_syntax' | 'disposable' | 'no_mx_record';
  checks: {
    syntax: boolean;
    disposable: boolean;
    mx_record: boolean;
  };
}
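For illustration only, a rejected disposable address might produce a result shaped like this. The values are hypothetical, not output from the real handler, and the exact meaning of the checks flags depends on code not shown in this excerpt.
TypeScript
// Hypothetical example instance of ValidationResult (not from the real API).
const exampleResult: ValidationResult = {
  email: 'someone@mailinator.com', // hypothetical address
  isValid: false,
  reason: 'disposable',
  checks: {
    syntax: true,      // syntax looked fine
    disposable: true,  // read here as "domain is on the disposable list"
    mx_record: false,  // read here as "no MX record confirmed"
  },
};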
// === CONFIGURATION ===
const DISPOSABLE_LIST_URL = 'https://raw.githubusercontent.com/disposable-email-domains/disposable-email-domains/master/list.txt';
// === VALIDATION HELPERS ===
function isEmailSyntaxValid(email: string): boolean {
  const emailRegex = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/;
  return emailRegex.test(email);
}
/**
 * Checks whether a domain is disposable.
 * The state pointer ('blue' or 'green') is read from KV; the domain data lives in D1.
 * Fails closed: any error is treated as "disposable".
 */
async function isDisposable(domain: string, env: Env): Promise<boolean> {
  // 1. Get pointer from KV
  const live...
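The excerpt is cut off at this point. As a sketch of where it is heading, assuming the KV key 'live_table' seeded earlier and the blue/green D1 tables from schema.sql (not code copied from the original file):
TypeScript
// Sketch only: the original excerpt is truncated above. Assumes the KV key
// 'live_table' seeded earlier and the blue/green D1 tables from schema.sql.
async function isDisposable(domain: string, env: Env): Promise<boolean> {
  try {
    // 1. Get pointer from KV (replaces the first D1 read).
    const liveTable = await env.API_STATE.get('live_table');
    if (liveTable !== 'blue' && liveTable !== 'green') {
      console.error('CRITICAL: live_table pointer missing or invalid in KV.');
      return true; // FAIL CLOSED: assume disposable if state is broken.
    }

    // 2. Check the domain against the live D1 table (the only D1 read left).
    const tableName = `disposable_domains_${liveTable}`;
    const hit = await env.DB
      .prepare(`SELECT domain FROM ${tableName} WHERE domain = ?`)
      .bind(domain)
      .first();

    return hit !== null; // Found in the list => disposable.
  } catch (err) {
    console.error('isDisposable check failed, failing closed:', err);
    return true; // FAIL CLOSED on any unexpected error.
  }
}
Because the table name can only be disposable_domains_blue or disposable_domains_green after the check, interpolating it into the SQL string is safe; the domain itself still goes through a bound parameter.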