feat: add automatic wildcard detection (--auto-wildcard) #962
flaggdavid-source wants to merge 1 commit into projectdiscovery:dev
Adds --auto-wildcard / -aw flag that automatically detects and filters wildcard DNS domains across all input, similar to PureDNS.

How it works:
1. Before resolution, extracts unique root domains from all inputs
2. Probes each domain with a random subdomain (xid-generated)
3. Compares response IPs against root domain IPs
4. Domains returning the same IPs for random subdomains are marked as wildcard and their results are filtered from output

This eliminates the need to manually specify -wd for each domain, making wildcard filtering practical for large multi-domain scans.

Changes:
- internal/runner/options.go: Add AutoWildcard bool field and -aw flag, block in stream mode (consistent with existing -wd behavior)
- internal/runner/wildcard.go: Add auto-detection logic with thread-safe domain tracking (RWMutex), root domain extraction, and per-domain wildcard probing
- internal/runner/runner.go: Integrate auto-detection before workers start, add post-processing filter for detected wildcard domains

Fixes projectdiscovery#924
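The probing steps above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `lookupFunc`, `isWildcard`, and `fakeLookup` are hypothetical names, the resolver is injected as a plain function instead of the real DNS client, and the random label is passed in rather than generated with xid.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupFunc is a hypothetical stand-in for a DNS client: it returns the
// A-record IPs for a host, or an error if the host does not resolve.
type lookupFunc func(host string) ([]string, error)

// isWildcard reports whether a random probe under domain resolves to IPs
// that are all also returned for the domain itself.
func isWildcard(lookup lookupFunc, domain, randomLabel string) bool {
	probeIPs, err := lookup(randomLabel + "." + domain)
	if err != nil || len(probeIPs) == 0 {
		return false // the random subdomain does not resolve: no wildcard
	}
	rootIPs, err := lookup(domain)
	if err != nil {
		return true // the probe resolves but the root does not: likely wildcard
	}
	known := make(map[string]struct{}, len(rootIPs))
	for _, ip := range rootIPs {
		known[ip] = struct{}{}
	}
	for _, ip := range probeIPs {
		if _, ok := known[ip]; !ok {
			return false // the probe returned an IP the root does not have
		}
	}
	return true
}

// fakeLookup simulates a zone where *.wild.example answers with one IP
// and plain.example has no wildcard record.
func fakeLookup(host string) ([]string, error) {
	switch {
	case host == "wild.example" || strings.HasSuffix(host, ".wild.example"):
		return []string{"192.0.2.1"}, nil
	case host == "plain.example":
		return []string{"192.0.2.2"}, nil
	}
	return nil, fmt.Errorf("no such host: %s", host)
}

func main() {
	fmt.Println(isWildcard(fakeLookup, "wild.example", "c9x2k"))  // true
	fmt.Println(isWildcard(fakeLookup, "plain.example", "c9x2k")) // false
}
```

Injecting the lookup function keeps the comparison logic testable without network access; a real run would pass a function backed by the configured resolvers.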
Neo - PR Security Review: No security issues found
Walkthrough
The changes introduce automatic wildcard DNS detection across multiple domains.
Sequence Diagram
sequenceDiagram
participant Client
participant Runner
participant WildcardDetector as Wildcard<br/>Detector
participant WildcardRegistry as Registry<br/>(global)
participant Output
Client->>Runner: Start with --auto-wildcard flag
activate Runner
alt AutoWildcard enabled
Runner->>WildcardDetector: AutoDetectWildcards()
activate WildcardDetector
WildcardDetector->>WildcardDetector: Scan input domains
WildcardDetector->>WildcardRegistry: Register detected wildcard domains
activate WildcardRegistry
WildcardRegistry->>WildcardRegistry: Store in global registry
deactivate WildcardRegistry
WildcardDetector-->>Runner: Return detection results
deactivate WildcardDetector
end
Runner->>Runner: Process DNS resolutions<br/>(normal flow)
alt AutoWildcard enabled & Wildcards detected
Runner->>Output: Restart output worker
activate Output
Runner->>WildcardRegistry: Query IsAutoWildcardDomain()
WildcardRegistry-->>Runner: Domain wildcard status
Runner->>Runner: Filter & skip<br/>wildcard domains
Runner->>Output: Write non-wildcard hosts
Runner->>Output: Close channel
Output-->>Runner: Worker done
deactivate Output
end
Runner-->>Client: Results (filtered)
deactivate Runner
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Actionable comments posted: 3
🧹 Nitpick comments (1)
internal/runner/wildcard.go (1)
11-13: Scope detected wildcard domains to Runner state.
This registry is package-global and survives for the lifetime of the process, so a second Runner in the same process inherits detections from the previous run. Keeping it on Runner (like wildcards) or clearing it at the start of AutoDetectWildcards() would avoid cross-run leakage and make tests more predictable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runner/wildcard.go` around lines 11 - 13, The global registry autoWildcardDomains and its mutex autoWildcardDomainsMutex leak state across Runner instances; move this state into the Runner struct (e.g., add a wildcards/autoWildcardDomains field and its mutex) and update AutoDetectWildcards(), any callers, and checks to use r.autoWildcardDomains / r.autoWildcardDomainsMutex (or clear autoWildcardDomains at the start of AutoDetectWildcards() if moving is impractical) so each Runner has its own scoped wildcard registry and tests/runs no longer inherit prior detections.
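One way to scope the registry per run, as the nitpick suggests, is sketched below. The types `wildcardRegistry` and `scanRunner` are hypothetical simplifications for illustration; the real Runner struct has many more fields and the PR's actual names may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// wildcardRegistry is a hypothetical per-run store of detected wildcard
// root domains, guarded by an RWMutex for concurrent workers.
type wildcardRegistry struct {
	mu      sync.RWMutex
	domains map[string]struct{}
}

func newWildcardRegistry() *wildcardRegistry {
	return &wildcardRegistry{domains: make(map[string]struct{})}
}

// Add records a domain as wildcard.
func (w *wildcardRegistry) Add(domain string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.domains[domain] = struct{}{}
}

// Has reports whether a domain was recorded as wildcard.
func (w *wildcardRegistry) Has(domain string) bool {
	w.mu.RLock()
	defer w.mu.RUnlock()
	_, ok := w.domains[domain]
	return ok
}

// scanRunner is a simplified stand-in for the real Runner type; each
// instance owns its registry, so nothing is shared between runs.
type scanRunner struct {
	autoWildcards *wildcardRegistry
}

func newScanRunner() *scanRunner {
	return &scanRunner{autoWildcards: newWildcardRegistry()}
}

func main() {
	first := newScanRunner()
	first.autoWildcards.Add("wild.example")

	second := newScanRunner() // starts with an empty registry
	fmt.Println(first.autoWildcards.Has("wild.example"), second.autoWildcards.Has("wild.example"))
}
```

Because the map lives on the runner instance, a test can create a fresh runner per case without resetting package-level state.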
ℹ️ Review info
📒 Files selected for processing (3)
internal/runner/options.go
internal/runner/runner.go
internal/runner/wildcard.go
if options.AutoWildcard {
	gologger.Fatal().Msgf("auto-wildcard not supported in stream mode")
}
Reject --auto-wildcard together with --wildcard-domain.
While you're validating unsupported modes here, please also make those two flags mutually exclusive. Runner.run() executes the manual wildcard post-processing block and the auto-wildcard block independently, so enabling both re-emits surviving hosts a second time and produces inconsistent output.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/runner/options.go` around lines 312 - 314, Add a validation to
reject using --auto-wildcard together with --wildcard-domain: in the options
validation block where options.AutoWildcard is checked, also check
options.WildcardDomain and call gologger.Fatal().Msgf(...) if both are true,
with a clear message like "auto-wildcard and wildcard-domain are mutually
exclusive"; reference the options.AutoWildcard and options.WildcardDomain flags
so the runner never allows both (Runner.run executes manual wildcard
post-processing and auto-wildcard independently and will otherwise emit
duplicate/inconsistent host output).
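The requested check could look roughly like the sketch below. It assumes `WildcardDomain` is a string option (the review elsewhere compares it against `""`), uses a hypothetical `scanOptions` struct with only the relevant fields, and returns an error instead of calling `gologger.Fatal()` so the logic can be exercised in isolation.

```go
package main

import (
	"errors"
	"fmt"
)

// scanOptions mirrors only the fields relevant to this check; it is a
// hypothetical subset of the real Options struct.
type scanOptions struct {
	Stream         bool
	AutoWildcard   bool
	WildcardDomain string
}

// validateWildcardOptions rejects flag combinations that would produce
// unsupported or inconsistent output.
func validateWildcardOptions(o scanOptions) error {
	if o.Stream && o.AutoWildcard {
		return errors.New("auto-wildcard not supported in stream mode")
	}
	if o.AutoWildcard && o.WildcardDomain != "" {
		return errors.New("auto-wildcard and wildcard-domain are mutually exclusive")
	}
	return nil
}

func main() {
	fmt.Println(validateWildcardOptions(scanOptions{AutoWildcard: true, WildcardDomain: "example.com"}))
	fmt.Println(validateWildcardOptions(scanOptions{AutoWildcard: true}))
}
```

In the actual options validation, the returned error would presumably be surfaced through the existing fatal-logging path.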
// Auto wildcard filtering - filter results from detected wildcard domains
if r.options.AutoWildcard && len(autoWildcardDomains) > 0 {
	gologger.Print().Msgf("Starting to filter auto-detected wildcard domains\n")

	// we need to restart output
	r.startOutputWorker()

	seen := make(map[string]struct{})
	numRemovedSubdomains := 0

	r.hm.Scan(func(k, v []byte) error {
		host := string(k)
		rootDomain := getRootDomain(host)

		// Skip if this domain was detected as wildcard
		if IsAutoWildcardDomain(rootDomain) {
			if _, ok := seen[host]; !ok {
				numRemovedSubdomains++
				seen[host] = struct{}{}
			}
			return nil
		}

		// Output non-wildcard results
		if _, ok := seen[host]; !ok {
			seen[host] = struct{}{}
			_ = r.lookupAndOutput(host)
		}
		return nil
	})

	close(r.outputchan)
	// waiting output worker
	r.wgoutputworker.Wait()
	gologger.Print().Msgf("%d wildcard subdomains removed\n", numRemovedSubdomains)
}
--auto-wildcard filters too late to affect the real output.
This block runs after Lines 475-476 have already closed the original output worker, so wildcard matches have already been printed/written once. Unlike the --wildcard-domain flow, worker() only buffers results when options.WildcardDomain != "", so --auto-wildcard never suppresses the first pass. The current behavior is an extra filtered pass appended to the unfiltered output, and lookupAndOutput() cannot reconstruct JSON/response-mode results because nothing was stored for this path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/runner/runner.go` around lines 559 - 594, The auto-wildcard
filtering runs too late and only post-processes already-emitted results; to fix,
change the worker flow to buffer outputs when r.options.AutoWildcard is true
(same approach used for options.WildcardDomain) so wildcard detection happens
before any results are written. Specifically, in worker() ensure
startOutputWorker()/outputchan buffering is enabled earlier when
r.options.AutoWildcard is set, make the auto-wildcard scan occur before closing
outputchan, and have lookupAndOutput() write into the buffered store (or reuse
the same buffering mechanism) so JSON/response-mode entries can be reconstructed
and suppressed correctly; update any related wait/close logic (wgoutputworker,
close(outputchan)) to match the wildcard-domain path.
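The buffer-then-filter flow described in this comment can be illustrated with a small standalone sketch. It does not use the runner's real storage or output channel: `flushFiltered` and `underWildExample` are hypothetical names, and a plain map stands in for whatever buffering the wildcard-domain path uses.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// flushFiltered receives results buffered during resolution (host mapped to
// its rendered output line) and returns the lines to write plus the number
// of hosts dropped. isWildcardHost is a hypothetical predicate.
func flushFiltered(buffered map[string]string, isWildcardHost func(string) bool) ([]string, int) {
	hosts := make([]string, 0, len(buffered))
	for host := range buffered {
		hosts = append(hosts, host)
	}
	sort.Strings(hosts) // deterministic order for the example

	var lines []string
	removed := 0
	for _, host := range hosts {
		if isWildcardHost(host) {
			removed++
			continue
		}
		lines = append(lines, buffered[host])
	}
	return lines, removed
}

// underWildExample treats wild.example and everything below it as wildcard.
func underWildExample(host string) bool {
	return host == "wild.example" || strings.HasSuffix(host, ".wild.example")
}

func main() {
	buffered := map[string]string{
		"a.wild.example":    "a.wild.example [192.0.2.1]",
		"b.wild.example":    "b.wild.example [192.0.2.1]",
		"www.plain.example": "www.plain.example [192.0.2.2]",
	}
	lines, removed := flushFiltered(buffered, underWildExample)
	fmt.Println(lines, removed)
}
```

The point of the sketch is ordering: nothing is written until the filter has run, so each surviving host is emitted exactly once and its stored line (JSON or otherwise) is reused rather than rebuilt.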
in, err := r.dnsx.QueryOne(testHost)
if err != nil || in == nil || len(in.A) == 0 {
	return false
}

// If we got a response, query the root domain
rootResult, err := r.dnsx.QueryOne(domain)
if err != nil || rootResult == nil {
	// Root domain doesn't resolve but subdomain does - likely wildcard
	return true
}

// Check if the same IPs are returned (indicating wildcard)
rootIPs := make(map[string]struct{})
for _, a := range rootResult.A {
	rootIPs[a] = struct{}{}
}

for _, a := range in.A {
	if _, ok := rootIPs[a]; !ok {
		// Different IP for random subdomain - not a wildcard at root level
		return false
	}
}

// Same IP returned - likely a wildcard
Non-A query types silently fail wildcard detection.
When --auto-wildcard is combined with non-A query type flags (-aaaa, -cname, -mx, etc.), detectWildcardForDomain() always checks only DNSData.A records (lines 99, 112-121), causing false negatives if the domain resolves for the specified record type but not for A records. The wildcard detection should either enforce A-record queries explicitly during auto-detection, or adapt to inspect the appropriate record type being queried.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/runner/wildcard.go` around lines 98 - 123, The wildcard detection
currently only inspects DNSData.A (in detectWildcardForDomain via
r.dnsx.QueryOne and DNSData.A), causing failures for non-A query types; update
detectWildcardForDomain so that when auto-wildcard is enabled it either (a)
explicitly performs A-record lookups for the random test subdomain and the root
domain regardless of the user query type, or (b) dynamically inspects the
response field matching the requested record type (e.g., DNSData.AAAA,
DNSData.CNAME, DNSData.MX, etc.) instead of only DNSData.A; change the
comparisons and map construction (rootIPs and in.A iteration) to use the
selected record slice based on the active query type to correctly detect
wildcards for non-A queries.
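Option (b) from this comment, inspecting the record slice that matches the requested type, might be sketched as follows. The `dnsAnswer` struct is a hypothetical, trimmed-down response type and the query type is passed as a plain string; the real code would work with the DNS client's own response type and type constants.

```go
package main

import "fmt"

// dnsAnswer is a hypothetical, trimmed-down response type holding one
// string slice per record type.
type dnsAnswer struct {
	A     []string
	AAAA  []string
	CNAME []string
}

// recordsFor selects the slice matching the requested record type,
// defaulting to A records.
func recordsFor(ans dnsAnswer, qtype string) []string {
	switch qtype {
	case "AAAA":
		return ans.AAAA
	case "CNAME":
		return ans.CNAME
	default:
		return ans.A
	}
}

// looksWildcard reports whether every record returned for the random probe
// also appears in the root domain's answer for the same record type.
func looksWildcard(probe, root dnsAnswer, qtype string) bool {
	probeRecs := recordsFor(probe, qtype)
	if len(probeRecs) == 0 {
		return false
	}
	rootSet := make(map[string]struct{})
	for _, rec := range recordsFor(root, qtype) {
		rootSet[rec] = struct{}{}
	}
	for _, rec := range probeRecs {
		if _, ok := rootSet[rec]; !ok {
			return false
		}
	}
	return true
}

func main() {
	root := dnsAnswer{AAAA: []string{"2001:db8::1"}}
	probe := dnsAnswer{AAAA: []string{"2001:db8::1"}}
	fmt.Println(looksWildcard(probe, root, "A"))    // false: no A records at all
	fmt.Println(looksWildcard(probe, root, "AAAA")) // true: same AAAA set
}
```

The example shows the false negative the comment describes: an A-only comparison reports no wildcard for a zone that answers only with AAAA records.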
Implemented randomized wildcard probing to improve the accuracy of DNS wildcard detection. By using a unique prefix for each probe, we can bypass potential DNS caching issues and more reliably identify wildcard records.

Thanks for the suggestion! We're using
Summary
Fixes #924
Adds --auto-wildcard / -aw flag that automatically detects and filters wildcard DNS domains across all input, similar to how PureDNS handles wildcard detection.

How It Works
Each domain is probed with a random subdomain (xid-generated, consistent with existing wildcard code).
This eliminates the need to manually specify -wd for each domain, making wildcard filtering practical for large multi-domain scans.

Changes
- internal/runner/options.go: AutoWildcard field, -aw flag, stream mode validation
- internal/runner/wildcard.go: AutoDetectWildcards(), detectWildcardForDomain(), getRootDomain(), IsAutoWildcardDomain() with thread-safe RWMutex
- internal/runner/runner.go

Usage

Testing
- go build ./... passes clean
- -h output under Configurations group

Known Limitations
- getRootDomain() uses a simple two-label heuristic: it works for .com, .org, etc. but not for multi-part TLDs like .co.uk. Documented in code comments. A public suffix list library could be integrated in a follow-up if needed.

Summary by CodeRabbit
- --auto-wildcard (-aw) flag to automatically detect and filter wildcard subdomains from enumeration results.
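The two-label limitation noted under Known Limitations is easy to see in a small sketch. `twoLabelRoot` below is a hypothetical approximation of the described heuristic, not the PR's actual getRootDomain().

```go
package main

import (
	"fmt"
	"strings"
)

// twoLabelRoot keeps the last two labels of a hostname. It approximates the
// heuristic described above and is not the PR's actual implementation.
func twoLabelRoot(host string) string {
	labels := strings.Split(strings.TrimSuffix(host, "."), ".")
	if len(labels) <= 2 {
		return strings.Join(labels, ".")
	}
	return strings.Join(labels[len(labels)-2:], ".")
}

func main() {
	fmt.Println(twoLabelRoot("api.dev.example.com")) // example.com
	fmt.Println(twoLabelRoot("www.example.co.uk"))   // co.uk (a public suffix, not a registrable domain)
}
```

For www.example.co.uk the heuristic yields co.uk, so every .co.uk input would share one "root". A public suffix list lookup, for instance the EffectiveTLDPlusOne function in golang.org/x/net/publicsuffix, is the kind of follow-up the description alludes to.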