feat(search): allow custom DuckDuckGo endpoint #2510

Draft: cyq1017 wants to merge 3 commits into Hmbown:main from cyq1017:codex/2436-custom-search-url

cyq1017 commented Jun 1, 2026

Refs #2436

Problem

  • Private search services that expose DuckDuckGo-compatible HTML results cannot be wired in today because the DuckDuckGo provider always uses the public default endpoint.

Change

  • Add optional [search].base_url / DEEPSEEK_SEARCH_BASE_URL for DuckDuckGo-compatible endpoints.
  • Apply network policy to the configured endpoint host.
  • Disable public Bing fallback when a custom endpoint is configured, so private search does not unexpectedly leak to a public provider.
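As a rough illustration of the first bullet, the sketch below shows one way a search URL could be derived from an optional base URL. It is not the PR's code: `search_url`, `encode_query_value`, and `DEFAULT_DDG_HTML_URL` are hypothetical names, the default endpoint value is an assumption, and the real helper may use a URL-parsing crate instead of string formatting.

```rust
/// Public endpoint used when no custom base URL is configured.
/// Assumption for this sketch; the repository's actual default may differ.
const DEFAULT_DDG_HTML_URL: &str = "https://html.duckduckgo.com/html/";

/// Percent-encode a query value, keeping only RFC 3986 unreserved characters.
fn encode_query_value(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for byte in raw.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Build the search URL (hypothetical helper).
/// A missing or blank `base_url` falls back to the public default.
fn search_url(base_url: Option<&str>, query: &str) -> String {
    let base = base_url
        .map(str::trim)
        .filter(|value| !value.is_empty())
        .unwrap_or(DEFAULT_DDG_HTML_URL);
    // Append with `&` if the base already carries a query string.
    let separator = if base.contains('?') { '&' } else { '?' };
    format!("{base}{separator}q={}", encode_query_value(query))
}
```

Treating a blank value as "not configured" mirrors the whitespace check used elsewhere in this PR, so an empty environment variable would not silently redirect searches.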

Verification

  • cargo test -p codewhale-tui custom_duckduckgo --all-features --locked -- --nocapture
  • cargo test -p codewhale-tui search_provider --all-features --locked -- --nocapture
  • cargo check -p codewhale-tui --all-features --locked
  • cargo fmt --all -- --check
  • cargo clippy -p codewhale-tui --all-features --locked -- -D warnings
  • git diff --check origin/main..HEAD

Greptile Summary

This PR adds an optional [search].base_url / DEEPSEEK_SEARCH_BASE_URL setting that lets users point the DuckDuckGo provider at any DuckDuckGo-compatible HTML search endpoint (e.g. a private SearXNG instance), applies network policy to the configured host, and disables the public Bing fallback when a custom endpoint is active.

  • web_search.rs: extracts duckduckgo_search_url, configured_search_base_url, and duckduckgo_allows_bing_fallback helpers; bot-challenge detection now runs unconditionally and returns an actionable error for custom endpoints; non-DDG providers paired with base_url get an explicit invalid_input error instead of silent ignore.
  • Config plumbing: base_url: Option<String> added to SearchConfig, EngineConfig, and ToolContext; propagated through all four EngineConfig construction sites (main.rs, runtime_threads.rs, tui/ui.rs, and engine.rs).
  • Tests: four new tests cover URL construction, Bing-fallback flag, bot-challenge error on custom endpoints, and the incompatible-provider guard.
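The incompatible-provider guard mentioned above could look roughly like the following. This is a minimal sketch with assumed names: the `SearchProvider` enum is trimmed to three variants, `validate_base_url` is hypothetical, and a plain `String` error stands in for the PR's `ToolError::invalid_input`.

```rust
/// Trimmed-down provider list for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SearchProvider {
    DuckDuckGo,
    Bing,
    Tavily,
}

/// Reject a configured base URL when the provider cannot honor it
/// (hypothetical helper; runs before any network activity).
fn validate_base_url(provider: SearchProvider, base_url: Option<&str>) -> Result<(), String> {
    let has_base_url = base_url.is_some_and(|value| !value.trim().is_empty());
    if has_base_url && provider != SearchProvider::DuckDuckGo {
        return Err(format!(
            "[search].base_url is only supported with the DuckDuckGo provider, \
             but the configured provider is {provider:?}"
        ));
    }
    Ok(())
}
```

Failing early keeps the misconfiguration visible to the user instead of quietly sending queries to a provider that ignores the private endpoint.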

Confidence Score: 5/5

Safe to merge. The new custom-endpoint path is well-isolated, network policy is correctly applied to the derived host, and the Bing-fallback suppression prevents data leaking to public search when a private endpoint is configured.

All changed code is additive and guarded: the incompatible-provider guard fires before any network activity, URL parsing errors surface as actionable tool errors, and the bot-challenge detection now runs unconditionally regardless of fallback mode. The four call sites that construct EngineConfig are consistently updated. Tests cover every new code path.

No files require special attention. The only note is a cosmetic one in web_search.rs where the source field in the JSON response stays duckduckgo for custom endpoints.

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/tui/src/tools/web_search.rs | Core change: adds duckduckgo_search_url, configured_search_base_url, and duckduckgo_allows_bing_fallback helpers; wires search_base_url through execute; correctly gates network policy against the configured host; surfaces bot-challenge errors for custom endpoints; hard-errors on base_url + non-DDG provider. Well-tested. |
| crates/tui/src/config.rs | Adds base_url: Option<String> to SearchConfig and DEEPSEEK_SEARCH_BASE_URL env-var override. Two new unit tests validate TOML deserialization and env-var path. No issues found. |
| crates/tui/src/core/engine.rs | Threads search_base_url through EngineConfig and populates it in Engine::run. Straightforward plumbing, no issues. |
| crates/tui/src/tools/spec.rs | Adds search_base_url: Option<String> to ToolContext and initializes it to None in all three constructor stubs. Consistent, no issues. |
| crates/tui/src/main.rs | Propagates search_base_url into EngineConfig in run_exec_agent; adds base_url: None to two existing test structs so they compile cleanly. |
| crates/tui/src/runtime_threads.rs | Propagates search_base_url into EngineConfig in RuntimeThreadManager. Mirrors the main.rs change correctly. |
| crates/tui/src/tui/ui.rs | Propagates search_base_url into EngineConfig in build_engine_config. Consistent with other call sites. |
| docs/CONFIGURATION.md | Documents the new base_url field and its interaction with private endpoints and Bing fallback. Accurate and concise. |
| config.example.toml | Adds base_url example comment and documents the DEEPSEEK_SEARCH_BASE_URL env-var override. No issues. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[web_search execute] --> B{base_url set\n+ provider ≠ DuckDuckGo?}
    B -->|yes| ERR0[ToolError::invalid_input]
    B -->|no| C{provider?}
    C -->|Tavily/Bocha/\nMetaso/Baidu/\nVolcengine| D[API-backed search\nreturn early]
    C -->|Bing| E[run_bing_search]
    E -->|results| RET1[return bing results]
    E -->|empty| F[bing_was_empty=true\nfall through]
    C -->|DuckDuckGo| F
    F --> G[duckduckgo_search_url\nbase_url OR default DDG]
    G --> H[check_policy on host]
    H --> I[HTTP GET]
    I --> J[parse_duckduckgo_results]
    J --> K{results empty?}
    K -->|no| RET2[return results]
    K -->|yes| L{bot challenge?}
    L -->|yes + no Bing fallback| ERR1[ToolError: bot challenge\nat custom endpoint]
    L -->|yes/no + allow Bing fallback| M[run_bing_search fallback]
    M -->|results| RET3[return bing fallback results]
    M -->|empty + challenged| ERR2[ToolError: DDG challenged\nBing also empty]
    M -->|empty unchallenged| RET4[No results found]
    L -->|no + no Bing fallback| RET4
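The lower half of the flowchart (what happens once the DuckDuckGo-style endpoint returns no results) reduces to a two-input decision. The sketch below restates it as a pure function; `EmptyResultAction` and `on_empty_results` are hypothetical names, not identifiers from the PR.

```rust
/// What to do after the DuckDuckGo-compatible endpoint returned no results.
#[derive(Debug, PartialEq, Eq)]
enum EmptyResultAction {
    /// Custom endpoint served a challenge page: report an actionable error.
    ChallengeError,
    /// Default endpoint: try Bing, whether or not a challenge was seen.
    TryBingFallback,
    /// Custom endpoint, no challenge: a genuine "No results found".
    NoResults,
}

/// Map the two flags from the flowchart onto an action (hypothetical helper).
fn on_empty_results(challenged: bool, allow_bing_fallback: bool) -> EmptyResultAction {
    match (allow_bing_fallback, challenged) {
        (true, _) => EmptyResultAction::TryBingFallback,
        (false, true) => EmptyResultAction::ChallengeError,
        (false, false) => EmptyResultAction::NoResults,
    }
}
```

In this model the challenge flag is computed before the fallback decision, which is what lets the custom-endpoint path report a challenge instead of an empty result.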

Comments Outside Diff (1)

  1. crates/tui/src/tools/web_search.rs, lines 309-336

    P2 Bot-challenge detection bypassed for custom endpoints

    is_duckduckgo_challenge(&body) is only evaluated inside the allow_bing_fallback branch. When a custom base_url is configured and the private endpoint returns a challenge/captcha page, allow_bing_fallback is false, so the check is never reached — the caller gets a generic "No results found" instead of an actionable error. Extracting is_duckduckgo_challenge before the guard, and returning a descriptive error when the custom endpoint triggers it, would make misconfigured or blocked private search endpoints much easier to diagnose.

Reviews (3): Last reviewed commit: "fix(search): surface custom endpoint con..."

gemini-code-assist (bot) left a comment

Code Review

This pull request adds support for a custom, DuckDuckGo-compatible HTML search endpoint via a new base_url configuration option and DEEPSEEK_SEARCH_BASE_URL environment variable. This allows routing web searches to private or internal search services, disabling the public Bing fallback when a custom endpoint is specified. The review feedback highlights two compatibility issues: the use of unstable let-chains in crates/tui/src/config.rs which will fail on stable Rust, and the use of Option::is_none_or in crates/tui/src/tools/web_search.rs which requires Rust 1.82.0 and may violate the project's Minimum Supported Rust Version (MSRV).

Comment thread crates/tui/src/config.rs Outdated
Comment on lines +3348 to +3355
    if let Ok(value) = std::env::var("DEEPSEEK_SEARCH_BASE_URL")
        && !value.trim().is_empty()
    {
        config
            .search
            .get_or_insert_with(SearchConfig::default)
            .base_url = Some(value);
    }

Severity: high

The use of let-chains (if let ... && ...) is currently an unstable feature in Rust (RFC 2497) and will fail to compile on stable Rust compilers. To ensure compatibility with stable Rust, please rewrite this using nested if statements or combinators.

    if let Ok(value) = std::env::var("DEEPSEEK_SEARCH_BASE_URL") {
        if !value.trim().is_empty() {
            config
                .search
                .get_or_insert_with(SearchConfig::default)
                .base_url = Some(value);
        }
    }
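The comment also mentions combinators as an alternative to nesting. A minimal sketch of that form follows, assuming simplified stand-ins for the real `Config` and `SearchConfig` types and a hypothetical `apply_search_base_url` helper that takes the environment value as a parameter so it can be exercised without touching the process environment.

```rust
/// Simplified stand-ins for the real config types (assumptions for this sketch).
#[derive(Debug, Default)]
struct SearchConfig {
    base_url: Option<String>,
}

#[derive(Debug, Default)]
struct Config {
    search: Option<SearchConfig>,
}

/// Apply an environment override, ignoring unset or blank values.
/// `Option::filter` replaces the inner `if` from the nested version.
fn apply_search_base_url(config: &mut Config, env_value: Option<String>) {
    if let Some(value) = env_value.filter(|value| !value.trim().is_empty()) {
        config
            .search
            .get_or_insert_with(SearchConfig::default)
            .base_url = Some(value);
    }
}
```

A caller would pass `std::env::var("DEEPSEEK_SEARCH_BASE_URL").ok()` as `env_value`. Behavior matches the nested form: a blank variable leaves any existing `[search]` section untouched.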

Comment on lines +1359 to +1361
fn duckduckgo_allows_bing_fallback(base_url: Option<&str>) -> bool {
    base_url.is_none_or(|value| value.trim().is_empty())
}

Severity: medium

The Option::is_none_or method was stabilized in Rust 1.82.0. If the project targets or supports older Rust versions (MSRV < 1.82.0), this will cause compilation errors. Using map_or is fully compatible with older Rust versions and is equally idiomatic.

fn duckduckgo_allows_bing_fallback(base_url: Option<&str>) -> bool {
    base_url.map_or(true, |value| value.trim().is_empty())
}
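To make the intended semantics concrete, the sketch below spells out the predicate as an explicit `match` next to the `map_or` form suggested above. Both function names are hypothetical. The `is_none_or` version in the PR is expected to return the same values, since it yields `true` for `None` and otherwise applies the closure.

```rust
/// Explicit form: fallback is allowed when no base URL is set or it is blank.
fn allows_fallback_match(base_url: Option<&str>) -> bool {
    match base_url {
        None => true,
        Some(value) => value.trim().is_empty(),
    }
}

/// `map_or` form from the review suggestion; works on older toolchains.
fn allows_fallback_map_or(base_url: Option<&str>) -> bool {
    base_url.map_or(true, |value| value.trim().is_empty())
}
```

Whichever spelling is chosen, the behavior to preserve is that any non-blank base URL disables the public Bing fallback.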

Comment thread crates/tui/src/tools/web_search.rs

Hmbown commented Jun 1, 2026

Thanks @cyq1017. I did a quick release-triage read. This is a good match for #2436, and I like the privacy-preserving choice to disable public Bing fallback when base_url is set.

I am not harvesting this into #2504 because it is still draft and there are a couple of correctness/diagnostic issues to close first:

  • Please fix the custom-endpoint challenge path Greptile flagged. If the configured private/DDG-compatible endpoint returns a challenge or bot-block page, the user should get an actionable error, not No results found.
  • Please make base_url with a non-DuckDuckGo provider explicit. Either reject it during resolution or surface a warning/error; silently ignoring it with provider = "tavily"/bocha/metaso/baidu/volcengine will be confusing.
  • The Gemini MSRV warning looks stale for this repo: root Cargo.toml currently has rust-version = "1.88", so let-chains and Option::is_none_or are not blockers here. No need to contort the patch for an older MSRV unless the repo policy changes.

Once the first two are fixed, this looks like a strong follow-up for #2436.
