GeneStevens/domain-finder

domain-finder

domain-finder is a Go CLI project for scanning ICANN CZDS zone files and building toward high-scale domain availability checks.

Current status

The repository currently implements stem-based matching across loaded zones. The main components are:

  • a thin CLI entrypoint at cmd/domain-finder
  • internal/zonefile for opening files and detecting gzip by content
  • streaming line-by-line zone reading
  • internal/index for exact-match named-zone indexing and deterministic lookup
  • internal/backend for backend-neutral file and PostgreSQL exact-match lookups
  • internal/candidates for stem loading, normalization, merge, and dedupe
  • internal/config for YAML config loading with CLI/env/local/base/default precedence
  • internal/match for stable per-stem classification across loaded zones
  • internal/openai for batch stem generation through the OpenAI API
  • internal/report for filtering and summary statistics
  • internal/output for deterministic durable text and JSONL rendering
  • internal/termui for Bubble Tea–based interactive rendering on stderr
  • a CLI workflow that loads named zones, ingests stems from flags, files, and/or stdin, composes <stem>.<zone> internally, and reports per-zone presence or absence for each stem

Tests intentionally use small deterministic fixtures. They do not depend on full .com or .net CZDS zone files.

Quick examples

Interactive Bubble Tea search with strong hits shown durably:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Interactive search that also keeps partial hits visible:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -interactive \
  -interactive-show-partials \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Deterministic non-interactive text output:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -no-interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Prompt-contract dry run without any API call:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -generate "industrial infrastructure names" \
  -generate-dry-run \
  -generate-dry-run-format json \
  -generate-quality-profile industrial \
  -generate-avoid-prefixes "dev,neo" \
  -generate-avoid-suffixes "io,ia,ora,iva,ara"

Machine-readable run artifacts:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -no-interactive \
  -audit-log run.jsonl \
  -run-summary run-summary.json \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Candidate / search model

  • Candidate inputs are stems such as example, missing, or my-brand
  • Loaded zones determine which FQDNs are tested
  • For loaded zones com and net, candidate example checks:
    • example.com
    • example.net
  • Zone indexes still store normalized FQDN owner names from the zone files
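
The composition rule above can be sketched in Go. This is illustrative only; the real composition lives inside the CLI workflow, not in a function with this name:

```go
package main

import "fmt"

// composeFQDNs mirrors the documented model: for each loaded zone,
// the tested name is "<stem>.<zone>".
func composeFQDNs(stem string, zones []string) []string {
	out := make([]string, 0, len(zones))
	for _, z := range zones {
		out = append(out, stem+"."+z)
	}
	return out
}

func main() {
	// For loaded zones com and net, candidate "example" checks
	// example.com and example.net.
	fmt.Println(composeFQDNs("example", []string{"com", "net"}))
}
```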

Backends

  • -backend file keeps the existing zone-file lookup behavior
  • -backend postgres checks exact stem presence against PostgreSQL Domain Miner data
  • The result model stays stem-based and backend-neutral

Zone syntax by backend

  • File backend:
    • -zone com=path/to/com.zone
    • -zone net=path/to/net.zone
  • PostgreSQL backend:
    • -zone com
    • -zone net

PostgreSQL exact-match semantics

  • Query shape uses SELECT EXISTS (...)
  • Exact match keys:
    • zone_file = <zone>
    • name = <stem>
  • Assumed schema:
    • table dm.zone_records
    • columns zone_file, name
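
Under the assumed schema, the lookup reduces to a query of this shape. This is a sketch of the documented SELECT EXISTS pattern; the parameter placeholder style is a guess:

```sql
-- Exact-match presence check against Domain Miner data.
SELECT EXISTS (
  SELECT 1
  FROM dm.zone_records
  WHERE zone_file = $1  -- e.g. 'com'
    AND name      = $2  -- e.g. 'example'
);
```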

Candidate file and stdin format

  • Plain text, one stem per line
  • Blank lines are ignored
  • Lines beginning with # are ignored as comments
  • Remaining lines are treated as raw candidate stems

Candidate merge and dedupe behavior

  • Repeated -candidate flags are read first
  • -candidate-file entries are read second
  • -candidate-stdin entries are read third
  • Stems are normalized and deduplicated while preserving first-seen order
  • Invalid stems are rejected with a clear error
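
The merge order and first-seen dedupe can be sketched as follows. Lowercasing as the normalization step is an assumption; validation and error reporting are omitted:

```go
package main

import (
	"fmt"
	"strings"
)

// mergeStems reads -candidate flag stems first, -candidate-file stems
// second, and -candidate-stdin stems third, normalizing each and
// dropping duplicates while preserving first-seen order.
func mergeStems(flagStems, fileStems, stdinStems []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, group := range [][]string{flagStems, fileStems, stdinStems} {
		for _, s := range group {
			n := strings.ToLower(strings.TrimSpace(s))
			if n == "" || seen[n] {
				continue
			}
			seen[n] = true
			out = append(out, n)
		}
	}
	return out
}

func main() {
	fmt.Println(mergeStems(
		[]string{"Example", "missing"},
		[]string{"example", "my-brand"},
		[]string{"missing"},
	))
}
```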

YAML config and OpenAI generation

  • Optional config files:
    • domain-finder.yaml
    • domain-finder.local.yaml
  • Config precedence:
    1. CLI flags
    2. environment variables
    3. domain-finder.local.yaml
    4. domain-finder.yaml
    5. built-in defaults
  • OPENAI_API_KEY is the primary secret source
  • PG_DSN can provide the PostgreSQL connection string
  • domain-finder.local.yaml may contain a local fallback openai.api_key
  • domain-finder.yaml must not contain API keys
  • domain-finder.local.yaml is ignored by git
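
The precedence rule amounts to "first configured value wins," which can be sketched like this. The value strings below are placeholders, not real model names, and internal/config's actual API is not shown:

```go
package main

import "fmt"

// firstNonEmpty resolves one setting using the documented precedence:
// CLI flag, then environment, then domain-finder.local.yaml, then
// domain-finder.yaml, then the built-in default.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

func main() {
	model := firstNonEmpty(
		"",            // CLI flag not set
		"",            // environment variable not set
		"local-model", // domain-finder.local.yaml
		"base-model",  // domain-finder.yaml
		"default",     // built-in default
	)
	fmt.Println(model) // the local YAML value wins once flag and env are empty
}
```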

Committed example config lives at domain-finder.yaml.example.

Generation workflow

  • -generate "prompt text" requests OpenAI-generated stems
  • -generate-count sets the total requested stem count
  • -generate-batch-size sets the per-request batch size
  • -generate-adaptive-refill shrinks effective batch size after repeated underfilled batches
  • -generate-min-batch-size sets the minimum effective batch size for adaptive refill
  • -generate-model overrides the configured OpenAI model
  • -generate-style adds reusable style guidance such as invented SaaS or developer tool
  • -generate-quality-profile industrial|off applies a generated-only quality filter after validation and lexical bans
  • -generate-phonetic-quality normal|strict controls how aggressively the generated-only scoring stage screens stems
  • -generate-min-length requires generated stems to be at least N letters long
  • -generate-min-score requires generated stems to clear an internal score before lookup
  • -generate-max-length prefers stems with no more than N letters
  • -generate-max-syllables prefers shorter, simpler-sounding stems
  • -generate-prefix prefers stems that start with specific text
  • -generate-suffix prefers stems that end with specific text
  • -generate-avoid-substrings hard-bans low-value lexical families from generated stems
  • -generate-avoid-prefixes hard-bans generated stems that start with certain prefixes
  • -generate-avoid-suffixes hard-bans generated stems that end with certain suffixes
  • -generate-max-cost-usd stops generation once cumulative estimated spend reaches the configured USD cap
  • -generate-target-available-hits stops generation once enough candidates are available in at least one requested zone
  • -generate-target-strong-hits stops generation once enough all-zone strong hits have been found
  • -generate-max-stall-batches stops generation after too many consecutive no-progress batches
  • -generate-dry-run prints the fully resolved generation contract and exits before any OpenAI call
  • -generate-dry-run-format text|json chooses human-readable or machine-readable inspection output
  • -audit-log <path> writes one audit JSONL record per checked stem
  • -run-summary <path> writes one machine-readable JSON summary object for the run
  • generate.max_attempts bounds how many attempts each batch gets to satisfy its target
  • generate.retry_count bounds transient API retries inside one attempt
  • Generated values are treated as stems, not FQDNs
  • Generated batches are normalized and deduped through internal/candidates
  • Manual, file, stdin, and generated stems can all be used together
  • Matching still composes <stem>.<zone> internally
  • When the OpenAI response includes usage, text-mode generation runs show compact token and estimated-cost telemetry

Prompt constraints vs validation

  • Prompt constraints steer the OpenAI request; they do not guarantee compliance
  • internal/openai now owns a dedicated prompt builder for the generation contract
  • Generated values still pass through the normal stem validation and dedupe pipeline
  • Invalid outputs such as FQDNs, spaces, punctuation, duplicates, or empty strings are still rejected after generation
  • avoid_substrings is stronger than prompt guidance alone:
    • it is rendered into the prompt contract as an explicit negative rule
    • generated stems containing banned substrings are also hard-rejected after generation
  • avoid_prefixes and avoid_suffixes extend that same generated-only hard-rejection policy:
    • both are rendered into the prompt contract as explicit negative rules
    • generated stems starting with banned prefixes or ending with banned suffixes are rejected before lookup
  • min_length follows that same generated-only pattern:
    • it is rendered into the prompt contract as a minimum-length rule
    • generated stems shorter than the configured minimum are hard-rejected after normalization
    • run-level diagnostics report these rejections as too_short
  • generated runs now also pass through internal/namescore after normalization and before lexical bans:
    • phonetic scoring rejects names with too many syllables, oversized consonant clusters, or overly long vowel-free spans
    • structural scoring favors 6-10 character stems and rejects startup-style endings plus obvious low-value tech fragments
    • brand scoring rewards industrial or structural concepts such as vector, crux, forge, or atlas
    • stems below the configured minimum score are hard-rejected before lookup
    • -generate-phonetic-quality strict raises the effective score floor to at least 70
    • diagnostics now include score_rejected, phonetic_rejected, structural_rejected, and compact score-bucket counts
  • generate.quality_profile is a generated-only taste filter:
    • industrial now more aggressively favors stronger, harder-edged infrastructure-like name shapes
    • compact 5-7 letter forms, denser consonant structure, stronger consonant anchors, and harder endings score positively
    • soft pharma/startup-mush patterns such as soft open endings, mushy CV alternation, and weak consonant weight are rejected more aggressively
    • manual CLI, file, and stdin stems are not filtered by this profile
  • generated runs also use a lightweight family-diversity guard:
    • accepted stems are limited per crude family signature so one naming basin does not dominate the run
    • this is deterministic, explainable, and generated-only
    • family rejections appear in diagnostics as family_rejected
  • -generate-dry-run uses the same resolved config and prompt builder, but does not require an API key and does not touch the network
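
The generated-only hard rejections described above can be sketched as a single classifier. The returned strings reuse the documented diagnostic names; the ordering of the checks here is an assumption:

```go
package main

import (
	"fmt"
	"strings"
)

// rejectReason applies the documented generated-only hard rejections:
// minimum length, banned substrings, banned prefixes, banned suffixes.
// An empty return means the stem survives these checks.
func rejectReason(stem string, minLen int, substrings, prefixes, suffixes []string) string {
	if len(stem) < minLen {
		return "too_short"
	}
	for _, s := range substrings {
		if strings.Contains(stem, s) {
			return "banned_substring"
		}
	}
	for _, p := range prefixes {
		if strings.HasPrefix(stem, p) {
			return "banned_prefix"
		}
	}
	for _, s := range suffixes {
		if strings.HasSuffix(stem, s) {
			return "banned_suffix"
		}
	}
	return ""
}

func main() {
	// "neovia" trips the "neo" prefix ban before the "ia" suffix ban.
	fmt.Println(rejectReason("neovia", 5, nil, []string{"dev", "neo"}, []string{"io", "ia"}))
}
```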

Budget- and goal-driven stop conditions

  • Generation can now stop on any configured stop condition:
    • accepted-count target
    • estimated cost cap
    • available-hit target
    • strong-hit target
    • stall limit
  • The default fallback still uses -generate-count as the accepted-count stop condition
  • If additional stop controls are configured, the run ends when any configured condition is reached first
  • -generate-max-cost-usd uses cumulative estimated spend from OpenAI usage plus the repo pricing table
  • Cost-cap runs require known pricing for the selected model
    • if pricing is unavailable, the run fails clearly instead of silently ignoring the cap
  • -generate-target-available-hits uses a broader availability definition:
    • stems with result state partial or all
    • this means available in at least one requested zone
    • taken stems do not count
  • -generate-target-strong-hits tracks the strongest current result class:
    • stems absent across all requested zones
    • the same all ✓ semantics shown in the interactive table
  • -generate-max-stall-batches uses a simple, explicit stall definition:
    • consecutive batches with zero newly accepted generated stems
    • and zero increase in strong all-zone hits
  • Batch status lines now show compact progress such as:
    • available 37/100
    • strong 3/25
    • stall 2/8
    • cost $0.18/1.00
  • At the end of a generation run, text mode prints a compact generation stop block explaining which condition ended the run
  • The run-summary artifact also records:
    • configured stop-condition settings
    • actual stop reason at run end
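
The "any configured condition ends the run" rule can be sketched as below. The reason strings echo the README's examples; the evaluation order and the zero-means-unconfigured convention are assumptions:

```go
package main

import "fmt"

type progress struct {
	accepted, available, strong, stall int
	costUSD                            float64
}

type limits struct {
	acceptedTarget, availableTarget, strongTarget, maxStall int
	maxCostUSD                                              float64
}

// stopReason returns the first satisfied stop condition, or "" if the
// run should continue. A zero target or cap means "not configured".
func stopReason(p progress, l limits) string {
	switch {
	case l.acceptedTarget > 0 && p.accepted >= l.acceptedTarget:
		return "accepted-count target reached"
	case l.maxCostUSD > 0 && p.costUSD >= l.maxCostUSD:
		return "cost cap reached"
	case l.availableTarget > 0 && p.available >= l.availableTarget:
		return "available-hit target reached"
	case l.strongTarget > 0 && p.strong >= l.strongTarget:
		return "strong-hit target reached"
	case l.maxStall > 0 && p.stall >= l.maxStall:
		return "stall limit reached"
	}
	return ""
}

func main() {
	fmt.Println(stopReason(
		progress{strong: 25, costUSD: 0.18},
		limits{acceptedTarget: 200, strongTarget: 25, maxStall: 8, maxCostUSD: 1.00},
	))
}
```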

Adaptive refill policy

  • -generate-adaptive-refill is an opt-in sparse-search policy for longer generation runs
  • When enabled:
    • generation starts at the configured batch_size
    • repeated underfilled batches shrink the effective batch size for later batches
    • the first version uses a simple one-way shrink for the run
    • after 2 consecutive underfilled batches, the effective batch size is cut in half
    • it will not shrink below -generate-min-batch-size
  • Recovery is intentionally simple in v1:
    • batch size does not grow back during the same run
  • Adaptive refill only changes request sizing
    • it does not change acceptance logic
    • it does not change lookup semantics
    • it does not change stop-condition semantics
  • When enabled, status lines show the active effective batch size:
    • effective_batch 8
    • batch_size 8->4
  • The run summary records:
    • whether adaptive refill was enabled
    • the configured minimum batch size
    • the final effective batch size
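
The v1 shrink policy is simple enough to state in a few lines. This sketch assumes the underfill counter is tracked by the caller; whether it resets after a shrink is not specified here:

```go
package main

import "fmt"

// nextBatchSize applies the documented one-way policy: after 2
// consecutive underfilled batches the effective size is halved, it
// never drops below the configured minimum, and it never grows back
// within the same run.
func nextBatchSize(current, min, consecutiveUnderfilled int) int {
	if consecutiveUnderfilled < 2 {
		return current
	}
	half := current / 2
	if half < min {
		return min
	}
	return half
}

func main() {
	fmt.Println(nextBatchSize(8, 2, 1)) // one underfill is not enough to shrink
	fmt.Println(nextBatchSize(8, 2, 2)) // two in a row halve the batch
	fmt.Println(nextBatchSize(3, 2, 2)) // the result is clamped at the minimum
}
```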

Generation dry run

  • -generate-dry-run is an inspection mode for prompt tuning
  • It prints the resolved model, generation counts, retry policy, quality profile, theme, style, structural constraints, and the final prompt-builder output
  • It also prints the resolved generated-name scoring policy such as phonetic_quality and min_score
  • It also prints the resolved stop-condition policy such as cost cap, strong-hit target, and stall limit
  • It also prints adaptive-refill policy such as whether it is enabled and the minimum batch size
  • -generate-dry-run-format text keeps the current readable inspection block
  • -generate-dry-run-format json emits a stable JSON contract for diffing, archiving, and tooling
  • It exits before backend loading, OpenAI client creation, or any network call
  • It is intended for prompt-contract inspection, not candidate lookup

Hardened generation behavior

  • Each requested batch now has a bounded fulfillment policy:
    • request the batch target
    • normalize and dedupe through the existing stem pipeline
    • if too few usable new stems survive, try again for the remainder
    • if refill attempts are exhausted, keep any accepted stems, record the batch as underfilled, and continue to the next batch
  • Transient OpenAI failures such as rate limits or server errors are retried a bounded number of times
  • Poor model output such as duplicates, FQDNs, punctuation, empty values, or noisy text is treated as degraded batch quality rather than silently corrupting the candidate pipeline
  • Generated stems containing banned substrings are rejected before lookup and counted as unusable batch output
  • Generated stems hitting banned prefixes or banned suffixes are also rejected before lookup
  • Generated stems below the configured phonetic/structural/brand score are also rejected before lookup
  • Generated stems can also be rejected by the configured quality profile before lookup, and those rejections are counted separately in generation progress
  • Interactive and text-mode generation runs emit concise stderr status lines showing batch requests, accepted/rejected counts, retries, and completion/failure
  • Underfilled batches are now diagnostic, not fatal:
    • batch status lines can append underfilled N
    • the run keeps going until an actual stop condition ends it
  • At the end of a generation run, text-mode runs also print a compact generation diagnostics block summarizing dominant rejection categories across the whole run
  • Those same runs also print a compact generation underfill block when any batches finished short
  • The same text-mode runs now also print a compact generation usage block with:
    • model
    • input/output token totals
    • cached input token totals when available
    • estimated cost from the repo's built-in pricing table
  • Batch status lines include compact last-call and cumulative estimated cost when pricing is known
  • If the API omits usage, the run continues and the usage summary reports usage: unavailable
  • If the configured model is not in the built-in pricing table, token totals are still shown when available but cost is reported as pricing unavailable
  • JSONL mode stays machine-readable and does not emit live generation progress

Generation diagnostics summary

  • After a real generation run in text mode, domain-finder prints a compact run-level diagnostics block on stderr
  • This summary aggregates generated-stem rejection signals across the run, including:
    • too_short
    • score_rejected
    • phonetic_rejected
    • structural_rejected
    • banned_substring
    • banned_prefix
    • banned_suffix
    • quality.<reason>
    • family_rejected
    • invalid
    • duplicates
  • It also includes a compact score distribution such as score.30-49 or score.70-84
  • Quality reasons reuse the same explainable categories used by the generated quality filter, such as quality.pharma_like_suffix or quality.soft_open_ending
  • The goal is operator tuning:
    • identify which failure families are dominating
    • adjust prompts, lexical bans, or the active quality profile accordingly
  • This diagnostics summary is separate from:
    • deterministic text result output
    • JSONL result output
    • the audit log

Audit log

  • -audit-log <path> creates or truncates a JSONL file for the run
  • The audit log is separate from:
    • the interactive stderr tape
    • deterministic text output
    • JSONL result output
  • It records every checked stem, including stems that were:
    • filtered out of the interactive table
    • suppressed by -interactive-hide-taken
  • Each record includes:
    • stem
    • backend
    • requested_zones
    • per-zone available results
    • state (all, partial, taken)
    • report_emitted
    • interactive_emitted
  • This is the durable machine-readable truth of what was checked during the run
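
A single record might look like the line below. The field names follow the list above, but the per-zone map key ("zones") and the exact value shapes are guesses, not the tool's confirmed schema:

```json
{"stem":"example","backend":"file","requested_zones":["com","net"],"zones":{"com":false,"net":true},"state":"partial","report_emitted":true,"interactive_emitted":false}
```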

Run summary artifact

  • -run-summary <path> creates or truncates one JSON file for the run
  • The run summary is separate from:
    • the per-stem audit log
    • the interactive stderr tape
    • deterministic text or JSONL result output
  • It captures run-level context and outcomes, including:
    • backend
    • requested zones
    • filter mode
    • whether interactive mode was used
    • final checked/emitted/strong-hit counts
    • generation settings when generation was used
    • configured stop-condition settings and final stop reason when generation was used
    • adaptive-refill settings and final effective batch size when generation was used
    • generated scoring settings such as phonetic_quality and min_score
    • underfilled-batch totals when generation was used
    • generation token totals and estimated cost when usage data was available
    • aggregated generation diagnostics, rejection categories, and score-bucket distribution
  • Use it when you want one stable artifact per run for diffing, archiving, or comparing prompt/profile changes over time

Token and cost telemetry

  • Token telemetry is grounded in actual OpenAI API usage fields when the response includes them
  • The tool tracks:
    • input tokens
    • output tokens
    • cached input tokens from usage.prompt_tokens_details.cached_tokens when present
  • Cost is an estimate, not a bill:
    • it uses the repo's explicit model pricing table
    • unknown model pricing is not guessed
    • current pricing assumptions should be updated when the official pricing page changes
  • This telemetry appears in:
    • compact generation status lines during the run
    • the end-of-run generation usage block
    • the JSON run-summary artifact
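
The "estimate, not a bill" policy can be sketched as a table lookup that refuses to guess. The model name and per-million-token rates below are placeholders, not real prices:

```go
package main

import (
	"errors"
	"fmt"
)

type price struct{ inputPerM, outputPerM float64 }

// pricing stands in for the repo's explicit model pricing table.
var pricing = map[string]price{
	"example-model": {inputPerM: 0.50, outputPerM: 1.50},
}

// estimateCost returns an error for unknown models, matching the
// documented behavior of failing clearly instead of silently ignoring
// a configured cost cap.
func estimateCost(model string, inputTokens, outputTokens int) (float64, error) {
	p, ok := pricing[model]
	if !ok {
		return 0, errors.New("pricing unavailable for model " + model)
	}
	return float64(inputTokens)/1e6*p.inputPerM + float64(outputTokens)/1e6*p.outputPerM, nil
}

func main() {
	cost, err := estimateCost("example-model", 200000, 40000)
	fmt.Println(cost, err)
}
```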

Interactive vs fallback text mode

  • Interactive console is enabled only for text mode when stderr is a TTY
  • Real TTY interactive runs now use Bubble Tea for stable terminal rendering
  • -interactive forces the interactive console on
  • -no-interactive forces the deterministic fallback report path
  • -interactive-hide-taken suppresses durable taken rows in the interactive compact table only
  • -interactive-show-partials keeps partial available-zone hits as durable rows in interactive mode
  • -color forces ANSI styling in interactive mode
  • -no-color disables ANSI styling in interactive mode
  • JSONL mode never uses the interactive console

Interactive console behavior

  • Prints a small startup header showing loaded zones, candidate count, and filter
  • Uses a Bubble Tea–managed live status area while checking and generating
  • Shows one reusable live status line instead of ad hoc carriage-return composition
  • Uses that single live line for low-value batch chatter such as:
    • batch requests
    • accepted 0
    • duplicate-only or no-progress refill attempts
    • adaptive refill/effective batch-size updates
  • Prints durable rows only for meaningful candidate discoveries
  • Uses a compact table layout tuned for scannability
  • Header columns are:
    • stem: the stem being evaluated
    • available_zones: which requested zones are currently available for that stem
    • result: compact status semantics
  • Durable rows use explicit status language:
    • all ✓ means all requested zones are available for that stem
    • partial means only some requested zones are available
    • taken means none of the requested zones are available
  • By default, interactive mode keeps durable output focused on strongest all-zone hits
  • -interactive-show-partials adds partial hits back as durable rows
  • Strongest all-zone hits stay visually strongest with the success marker and optional ANSI styling
  • -interactive-hide-taken only suppresses durable taken rows on the interactive tape
  • It does not change matching, filtering, non-interactive text output, or JSONL output
  • End-of-run summaries remain durable:
    • generation diagnostics
    • generation usage
    • generation underfill
    • generation stop
    • final checked/emitted/strong line
  • Leaves the terminal in a clean state on normal completion without fragment collisions

stdout / stderr / file behavior

  • Interactive text mode:
    • compact streaming table on stderr
    • no detailed durable result blocks on stdout
    • if -out is used, the full deterministic report still goes to the file
    • if -audit-log is used, every checked stem is still recorded in JSONL even if the row is not shown on the interactive tape
  • Non-interactive text mode:
    • deterministic text report on stdout, or in -out
    • no interactive terminal rendering
  • JSONL mode:
    • deterministic JSON Lines on stdout, or in -out
    • no interactive terminal rendering

Manual examples

Stem-based CLI input:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -no-interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Stem-based candidate-file input:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -filter absent-in-all \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate-file testdata/small/candidates.txt

Stem-based stdin input:

printf 'missing\nexample\n' | \
env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -no-interactive \
  -candidate-stdin \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice

Interactive console with stems:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Interactive console with taken rows suppressed:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -interactive-hide-taken \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Interactive console with partials kept durably:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -interactive-show-partials \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Interactive mode with audit logging:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -interactive-hide-taken \
  -audit-log run.jsonl \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Manual run with a machine-readable run summary:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -no-interactive \
  -run-summary run-summary.json \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Non-interactive mode with audit logging:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -no-interactive \
  -audit-log run.jsonl \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate example \
  -candidate missing

Interactive console with strong-hit styling:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -color \
  -filter absent-in-all \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate missing

YAML-configured generation with manual stems:

cp domain-finder.yaml.example domain-finder.yaml
export OPENAI_API_KEY=your-key-here
env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -candidate missing \
  -generate "short invented SaaS brand stems" \
  -generate-count 6 \
  -generate-batch-size 3

Constrained generation with prompt builder guidance:

export OPENAI_API_KEY=your-key-here
env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -generate "short product name stems" \
  -generate-quality-profile industrial \
  -generate-style "developer tool" \
  -generate-count 8 \
  -generate-batch-size 4 \
  -generate-max-length 12 \
  -generate-max-syllables 3 \
  -generate-prefix dev \
  -generate-suffix io

Constrained generation with hard lexical bans:

export OPENAI_API_KEY=your-key-here
env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -generate "short product name stems" \
  -generate-quality-profile industrial \
  -generate-style "developer tool" \
  -generate-count 8 \
  -generate-batch-size 4 \
  -generate-max-length 12 \
  -generate-max-syllables 3 \
  -generate-suffix io \
  -generate-avoid-substrings "dev,code,stack,cloud,sync,ops,grid,craft,build,tool,lab,forge,flow" \
  -generate-avoid-prefixes "dev,neo" \
  -generate-avoid-suffixes "io,ia,ora,iva,ara"

Budget-shaped exploratory generation:

export OPENAI_API_KEY=your-key-here
env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -generate "industrial infrastructure names" \
  -generate-quality-profile industrial \
  -generate-count 200 \
  -generate-max-cost-usd 1.00 \
  -generate-target-strong-hits 25 \
  -generate-max-stall-batches 8

Adaptive refill for sparse late-run search:

export OPENAI_API_KEY=your-key-here
env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend file \
  -interactive \
  -zone com=testdata/small/com.zone \
  -zone net=testdata/small/net.zone.slice \
  -generate "industrial infrastructure names" \
  -generate-count 200 \
  -generate-adaptive-refill \
  -generate-min-batch-size 2 \
  -generate-max-stall-batches 8

Dry-run prompt inspection without spending API calls:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -generate "short product name stems" \
  -generate-dry-run \
  -generate-quality-profile industrial \
  -generate-style "developer tool" \
  -generate-max-length 12 \
  -generate-max-syllables 3 \
  -generate-prefix dev \
  -generate-suffix io \
  -generate-adaptive-refill \
  -generate-min-batch-size 2 \
  -generate-max-cost-usd 1.00 \
  -generate-target-strong-hits 25 \
  -generate-max-stall-batches 8

Machine-readable dry-run contract:

env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -generate "short product name stems" \
  -generate-dry-run \
  -generate-dry-run-format json \
  -generate-quality-profile industrial \
  -generate-style "developer tool" \
  -generate-max-length 12 \
  -generate-max-syllables 3 \
  -generate-prefix dev \
  -generate-suffix io \
  -generate-max-cost-usd 1.00 \
  -generate-target-strong-hits 25 \
  -generate-max-stall-batches 8

Typical operator feedback during generation:

  • generation: batch 1 attempt 1 requesting 3 stems
  • generation: batch 1 attempt 1 accepted 2, invalid 1, banned 0, quality_rejected 1, duplicates 0, need 1 more | strong 1/25 | stall 0/8 | cost $0.03/1.00
  • generation: batch 9 attempt 1 requesting 2 stems | batch_size 8->2
  • generation diagnostics
  • banned_prefix: 3
  • banned_suffix: 4
  • quality.pharma_like_suffix: 4
  • family_rejected: 2
  • duplicates: 2
  • generation: retrying batch 1 attempt 2 (1/2) after transient error
  • generation: complete, accepted 6 stems | total $0.18 | stop strong-hit target reached
  • generation stop
  • reason: strong-hit target reached
  • strong_hits: 25/25
  • stall_batches: 0/8
  • estimated_cost_usd: $0.18/1.00

Example generation tuning in YAML:

generate:
  count: 20
  batch_size: 10
  adaptive_refill: true
  min_batch_size: 2
  max_attempts: 3
  retry_count: 2
  max_cost_usd: 1.00
  target_strong_hits: 25
  max_stall_batches: 8
  quality_profile: industrial
  max_length: 10
  max_syllables: 3
  prefix: ""
  suffix: ""
  style: industrial infrastructure naming
  avoid_substrings: dev,code,stack,cloud,sync,ops,grid,craft,build,tool,lab,forge,flow
  avoid_prefixes: dev,neo
  avoid_suffixes: io,ia,ora,iva,ara

The run summary JSON complements the audit log:

  • audit log: one JSONL record per checked stem
  • run summary: one JSON object per run

PostgreSQL backend example:

export PG_DSN='postgres://user:pass@localhost:5432/domainminer'
env GOCACHE=/tmp/domain-finder-gocache \
go run ./cmd/domain-finder \
  -backend postgres \
  -no-interactive \
  -zone com \
  -zone net \
  -candidate example \
  -candidate missing

This still reports only exact presence or absence in the selected backend's zone data, whether zone files or PostgreSQL. It is not a registrar availability check. OpenAI generation produces candidate stems only; it does not check registrar or DNS availability.
