feat(search): allow custom DuckDuckGo endpoint #2510
Code Review
This pull request adds support for a custom, DuckDuckGo-compatible HTML search endpoint via a new base_url configuration option and the DEEPSEEK_SEARCH_BASE_URL environment variable. This allows web searches to be routed to private or internal search services, and the public Bing fallback is disabled when a custom endpoint is specified. The review feedback highlights two compatibility issues: let-chains in crates/tui/src/config.rs, which compile only on Rust 1.88.0 or newer with the 2024 edition, and Option::is_none_or in crates/tui/src/tools/web_search.rs, which requires Rust 1.82.0 and may violate the project's Minimum Supported Rust Version (MSRV).
if let Ok(value) = std::env::var("DEEPSEEK_SEARCH_BASE_URL")
    && !value.trim().is_empty()
{
    config
        .search
        .get_or_insert_with(SearchConfig::default)
        .base_url = Some(value);
}
Let-chains (if let ... && ...) were stabilized only in Rust 1.88.0, and only for the 2024 edition (RFC 2497); on an older toolchain or an earlier edition this code fails to compile. If the project needs to build in either of those configurations, please rewrite this using nested if statements or combinators.
if let Ok(value) = std::env::var("DEEPSEEK_SEARCH_BASE_URL") {
    if !value.trim().is_empty() {
        config
            .search
            .get_or_insert_with(SearchConfig::default)
            .base_url = Some(value);
    }
}

fn duckduckgo_allows_bing_fallback(base_url: Option<&str>) -> bool {
    base_url.is_none_or(|value| value.trim().is_empty())
}
The Option::is_none_or method was stabilized in Rust 1.82.0. If the project targets or supports older Rust versions (MSRV < 1.82.0), this will cause compilation errors. Using map_or is fully compatible with older Rust versions and is equally idiomatic.
fn duckduckgo_allows_bing_fallback(base_url: Option<&str>) -> bool {
    base_url.map_or(true, |value| value.trim().is_empty())
}
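For reference, both suggestions can be checked in isolation. The sketch below is not the PR's code: normalize_base_url and base_url_from_env are hypothetical helper names, and the example URL is a placeholder. It shows the combinator form of the environment override (Option::filter instead of a let-chain) next to the map_or form of the fallback check.

```rust
/// Hypothetical helper (not in the PR): treat an unset or whitespace-only
/// value as "not configured".
fn normalize_base_url(raw: Option<String>) -> Option<String> {
    raw.filter(|value| !value.trim().is_empty())
}

/// Combinator form of the environment override; no let-chain required.
/// The variable name comes from the PR, the function name is an assumption.
fn base_url_from_env() -> Option<String> {
    normalize_base_url(std::env::var("DEEPSEEK_SEARCH_BASE_URL").ok())
}

/// The `map_or(true, ..)` form suggested in the review: the Bing fallback
/// stays enabled only when no custom endpoint is configured.
fn duckduckgo_allows_bing_fallback(base_url: Option<&str>) -> bool {
    base_url.map_or(true, |value| value.trim().is_empty())
}
```

Both forms rely only on long-stable Option methods, so they should build regardless of edition; whether that matters depends on the MSRV the project actually declares.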
Thanks @cyq1017. I did a quick release-triage read. This is a good match for #2436, and I like the privacy-preserving choice to disable public Bing fallback when a custom endpoint is configured.

I am not harvesting this into #2504 because it is still draft and there are a couple of correctness/diagnostic issues to close first:
Once the first two are fixed, this looks like a strong follow-up for #2436.
Refs #2436
Problem
Change
[search].base_url / DEEPSEEK_SEARCH_BASE_URL for DuckDuckGo-compatible endpoints.

Verification
cargo test -p codewhale-tui custom_duckduckgo --all-features --locked -- --nocapture
cargo test -p codewhale-tui search_provider --all-features --locked -- --nocapture
cargo check -p codewhale-tui --all-features --locked
cargo fmt --all -- --check
cargo clippy -p codewhale-tui --all-features --locked -- -D warnings
git diff --check origin/main..HEAD

Greptile Summary
This PR adds an optional [search].base_url / DEEPSEEK_SEARCH_BASE_URL setting that lets users point the DuckDuckGo provider at any DuckDuckGo-compatible HTML search endpoint (e.g. a private SearXNG instance), applies network policy to the configured host, and disables the public Bing fallback when a custom endpoint is active.

web_search.rs: extracts duckduckgo_search_url, configured_search_base_url, and duckduckgo_allows_bing_fallback helpers; bot-challenge detection now runs unconditionally and returns an actionable error for custom endpoints; non-DDG providers paired with base_url get an explicit invalid_input error instead of being silently ignored.
base_url: Option<String> added to SearchConfig, EngineConfig, and ToolContext; propagated through all four EngineConfig construction sites (main.rs, runtime_threads.rs, tui/ui.rs, and engine.rs).

Confidence Score: 5/5
Safe to merge. The new custom-endpoint path is well-isolated, network policy is correctly applied to the derived host, and the Bing-fallback suppression prevents data leaking to public search when a private endpoint is configured.
All changed code is additive and guarded: the incompatible-provider guard fires before any network activity, URL parsing errors surface as actionable tool errors, and the bot-challenge detection now runs unconditionally regardless of fallback mode. The four call sites that construct EngineConfig are consistently updated. Tests cover every new code path.
No files require special attention. The only note is a cosmetic one in web_search.rs where the source field in the JSON response stays duckduckgo for custom endpoints.
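The incompatible-provider guard and the endpoint selection summarized above can be pictured roughly as follows. This is a minimal sketch rather than the code in web_search.rs: the SearchProvider enum, check_provider, search_endpoint, the error string, and the default endpoint constant are all assumptions for illustration, and the body of configured_search_base_url is a guess at what a helper with that name might do.

```rust
/// Assumed subset of the providers named in the summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SearchProvider {
    DuckDuckGo,
    Bing,
    Tavily,
}

/// Assumed default; the PR's actual constant may differ.
const DEFAULT_DDG_ENDPOINT: &str = "https://html.duckduckgo.com/html/";

/// Treat `None` and whitespace-only values as "no custom endpoint".
fn configured_search_base_url(base_url: Option<&str>) -> Option<&str> {
    base_url.map(str::trim).filter(|value| !value.is_empty())
}

/// Reject a custom endpoint paired with a non-DuckDuckGo provider before
/// any network activity (hypothetical error type and message).
fn check_provider(provider: SearchProvider, base_url: Option<&str>) -> Result<(), String> {
    match configured_search_base_url(base_url) {
        Some(url) if provider != SearchProvider::DuckDuckGo => Err(format!(
            "base_url ({url}) is only supported with the DuckDuckGo provider, got {provider:?}"
        )),
        _ => Ok(()),
    }
}

/// Use the custom endpoint when one is configured, else the default.
fn search_endpoint(base_url: Option<&str>) -> &str {
    configured_search_base_url(base_url).unwrap_or(DEFAULT_DDG_ENDPOINT)
}
```

Running the guard first means a misconfiguration fails with an explicit error instead of quietly sending the query to a public provider.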
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[web_search execute] --> B{base_url set\n+ provider ≠ DuckDuckGo?}
    B -->|yes| ERR0[ToolError::invalid_input]
    B -->|no| C{provider?}
    C -->|Tavily/Bocha/\nMetaso/Baidu/\nVolcengine| D[API-backed search\nreturn early]
    C -->|Bing| E[run_bing_search]
    E -->|results| RET1[return bing results]
    E -->|empty| F[bing_was_empty=true\nfall through]
    C -->|DuckDuckGo| F
    F --> G[duckduckgo_search_url\nbase_url OR default DDG]
    G --> H[check_policy on host]
    H --> I[HTTP GET]
    I --> J[parse_duckduckgo_results]
    J --> K{results empty?}
    K -->|no| RET2[return results]
    K -->|yes| L{bot challenge?}
    L -->|yes + no Bing fallback| ERR1[ToolError: bot challenge\nat custom endpoint]
    L -->|yes/no + allow Bing fallback| M[run_bing_search fallback]
    M -->|results| RET3[return bing fallback results]
    M -->|empty + challenged| ERR2[ToolError: DDG challenged\nBing also empty]
    M -->|empty unchallenged| RET4[No results found]
    L -->|no + no Bing fallback| RET4

Comments Outside Diff (1)
crates/tui/src/tools/web_search.rs, lines 309-336
is_duckduckgo_challenge(&body) is only evaluated inside the allow_bing_fallback branch. When a custom base_url is configured and the private endpoint returns a challenge/captcha page, allow_bing_fallback is false, so the check is never reached; the caller gets a generic "No results found" instead of an actionable error. Extracting is_duckduckgo_challenge before the guard, and returning a descriptive error when the custom endpoint triggers it, would make misconfigured or blocked private search endpoints much easier to diagnose.

Reviews (3): Last reviewed commit: "fix(search): surface custom endpoint con..."
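The ordering requested in the outside-diff comment (evaluate the challenge check before the fallback guard) can be expressed as a small decision function. The sketch below follows the branches of the flowchart only; NextStep and next_step are hypothetical names, and the real code returns tool results and ToolError values rather than this enum.

```rust
/// Hypothetical outcome of the DuckDuckGo-compatible request.
#[derive(Debug, PartialEq, Eq)]
enum NextStep {
    /// Parsed results are available; return them.
    ReturnResults,
    /// Custom endpoint served a challenge page and no fallback is allowed.
    ChallengeError,
    /// No custom endpoint, so the public Bing fallback may run.
    TryBingFallback,
    /// Nothing found and nothing suspicious in the response body.
    NoResults,
}

/// `challenged` is computed up front, so the custom-endpoint path can
/// report it even though the Bing fallback is disabled there.
fn next_step(results_empty: bool, challenged: bool, allow_bing_fallback: bool) -> NextStep {
    if !results_empty {
        return NextStep::ReturnResults;
    }
    match (challenged, allow_bing_fallback) {
        (_, true) => NextStep::TryBingFallback,
        (true, false) => NextStep::ChallengeError,
        (false, false) => NextStep::NoResults,
    }
}
```

Keeping the decision separate from the HTTP call makes each branch of the flowchart straightforward to unit-test without a network.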