feat: add proxy discovery foundation#2

Merged
LING71671 merged 1 commit into main from codex/discovery-architecture-cli
May 5, 2026

LING71671 (Owner) commented May 5, 2026

Summary

  • initialize the Go CLI and lightweight HTTP API foundation
  • add proxy discovery models, GitHub/Raw URL discovery, AI provider abstraction, validation workers, and tests
  • add Chinese-first docs, architecture diagram, roadmap, source inventory, CI, and agent conventions

Validation

  • go test ./...
  • go build -o bin\plugproxy.exe ./cmd/plugproxy

Refs #1

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced CLI tool for proxy management with fetch, check, list, get, run, and discover subcommands
    • Added HTTP API server with /health, /proxies, and /proxy endpoints for proxy access
    • Implemented proxy validation and health checking with configurable targets and timeouts
    • Added automated proxy source discovery system supporting multiple formats and optional AI-powered search
  • Documentation

    • Comprehensive project roadmap and contribution guidelines
    • API endpoint documentation
    • Architecture overview with component descriptions
    • Source discovery strategy and proxy source catalog
  • Chores

    • Go module initialization
    • CI/CD workflow configuration for automated testing
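
The HTTP API named in the release notes above can be pictured with a small handler. This is only a sketch: the `protocol` and `healthy` query parameter names, the JSON field names, and the `filterProxies`, `newHandler`, and `demo` helpers are assumptions made for illustration, not the project's actual code. The `/proxy` endpoint is left out to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Proxy is a hypothetical, trimmed-down record; field names are assumptions.
type Proxy struct {
	Protocol string `json:"protocol"`
	Address  string `json:"address"`
	Healthy  bool   `json:"healthy"`
}

// filterProxies keeps proxies that match protocol (empty means any) and,
// when healthyOnly is set, only the healthy ones.
func filterProxies(all []Proxy, protocol string, healthyOnly bool) []Proxy {
	out := []Proxy{}
	for _, p := range all {
		if protocol != "" && p.Protocol != protocol {
			continue
		}
		if healthyOnly && !p.Healthy {
			continue
		}
		out = append(out, p)
	}
	return out
}

// newHandler wires /health and /proxies onto a ServeMux.
func newHandler(all []Proxy) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
	})
	mux.HandleFunc("/proxies", func(w http.ResponseWriter, r *http.Request) {
		q := r.URL.Query()
		list := filterProxies(all, q.Get("protocol"), q.Get("healthy") == "true")
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(list)
	})
	return mux
}

// demo issues one in-process request and returns the status code and body.
func demo(path string) (int, string) {
	h := newHandler([]Proxy{
		{Protocol: "http", Address: "10.0.0.1:8080", Healthy: true},
		{Protocol: "socks5", Address: "10.0.0.2:1080", Healthy: true},
		{Protocol: "socks5", Address: "10.0.0.3:1080", Healthy: false},
	})
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := demo("/proxies?protocol=socks5&healthy=true")
	fmt.Println(code, body)
}
```

Keeping the filter as a pure function separate from the handler makes it testable without starting a server.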

coderabbitai (Bot) commented May 5, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR establishes the initial architecture for plugproxy, a Go-based proxy source discovery, collection, and validation system. It introduces a proxy model, in-memory pool management, source abstraction with fetching, proxy validation via HTTP checking, HTTP API exposure, a CLI entry point with subcommands, and a comprehensive source discovery subsystem with GitHub/HTTP/OpenAI integrations. Supporting infrastructure includes CI/CD workflows, documentation, and project conventions.
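
To make the proxy model mentioned in the walkthrough concrete, here is a minimal sketch. The four protocol values and the `Healthy()` / `URL()` method names come from the review summary; the field names and the health rule (more successes than failures) are assumptions, since the actual `pkg/model/proxy.go` is not shown here.

```go
package main

import (
	"fmt"
	"time"
)

// Protocol names the proxy scheme.
type Protocol string

const (
	HTTP   Protocol = "http"
	HTTPS  Protocol = "https"
	SOCKS4 Protocol = "socks4"
	SOCKS5 Protocol = "socks5"
)

// Proxy is a hypothetical shape; every field name here is an assumption.
type Proxy struct {
	Protocol    Protocol
	Address     string        // host:port
	Latency     time.Duration // last measured round trip
	Successes   int
	Failures    int
	LastChecked time.Time
}

// Healthy applies an assumed rule: strictly more successes than failures.
// A proxy that was never checked (0 vs 0) is therefore not healthy.
func (p Proxy) Healthy() bool {
	return p.Successes > p.Failures
}

// URL renders the proxy as scheme://host:port.
func (p Proxy) URL() string {
	return string(p.Protocol) + "://" + p.Address
}

func main() {
	p := Proxy{Protocol: SOCKS5, Address: "127.0.0.1:1080", Successes: 3, Failures: 1}
	fmt.Println(p.URL(), p.Healthy())
}
```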

Changes

Proxy Management Pipeline

| Layer | File(s) | Summary |
|---|---|---|
| Data Model | `pkg/model/proxy.go` | Defines Protocol enum (http/https/socks4/socks5) and Proxy struct with address, latency, success/failure counters, timestamps, and Healthy() / URL() methods. |
| Pool Abstraction & Implementation | `internal/pool/pool.go`, `internal/pool/memory.go` | Introduces Pool interface with Add, Get(strategy, filter), and List(filter) methods; implements MemoryPool backed by RWMutex-protected map with support for StrategyAny and StrategyFastest retrieval. |
| Source Interface & Static Implementation | `internal/source/source.go`, `internal/source/static.go` | Defines Source interface with Name() and Fetch(ctx) methods; implements StaticSource for in-memory proxy lists. |
| Proxy Validation | `internal/checker/checker.go`, `internal/checker/http.go` | Introduces Checker interface with Check(ctx, proxy) Result method; implements HTTPChecker that validates proxies via configurable HTTP GET with latency measurement and status code evaluation. |
| Data Collection | `internal/fetcher/fetcher.go` | Implements FetchAll to concurrently fetch proxies from multiple sources and return results with per-source error tracking. |
| HTTP API Server | `internal/server/server.go` | Exposes /health, /proxies, and /proxy endpoints; supports query-based filtering by protocol and health status, with strategy parameter for proxy selection. |
| Application Wiring | `internal/app/app.go` | Coordinates pool, sources, fetcher, and checker; implements Fetch (adds proxies to pool), Check (validates via worker pool, updates metrics), and Serve (starts HTTP API). |
| CLI Entry Point & Subcommands | `cmd/plugproxy/main.go` | Dispatches version, fetch, check, list, get, run (optional startup check + server), and discover subcommands; includes flag parsing, context management, and discovery workflow coordination. |
| Configuration & Documentation | `.github/workflows/ci.yml`, `.gitignore`, `go.mod`, `README.md`, `AGENTS.md`, `docs/ci-cd.md`, `docs/project-conventions.md`, `docs/roadmap.md`, `docs/proxy-sources.md` | Establishes Go 1.25.0 module, GitHub Actions CI (gofmt/test/build checks), project conventions (Google Go style, GitHub CLI usage, PR linking), architecture roadmap, and curated proxy source list with ingestion strategy. |
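
The pool row above describes a map guarded by an RWMutex with two retrieval strategies. A rough sketch of that idea follows, assuming a simplified `Proxy` type, a function-typed `Filter`, and a `(Proxy, bool)` return from `Get`; the real signatures in `internal/pool` may differ.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Proxy is a simplified, hypothetical record used only for this sketch.
type Proxy struct {
	Address string
	Latency time.Duration
	Healthy bool
}

// Strategy selects how Get picks among eligible proxies.
type Strategy int

const (
	StrategyAny Strategy = iota
	StrategyFastest
)

// Filter reports whether a proxy is eligible; nil accepts everything.
type Filter func(Proxy) bool

// MemoryPool stores proxies in a map keyed by address (an assumed key),
// so adding the same address again updates the entry in place.
type MemoryPool struct {
	mu      sync.RWMutex
	proxies map[string]Proxy
}

func NewMemoryPool() *MemoryPool {
	return &MemoryPool{proxies: make(map[string]Proxy)}
}

// Add takes the write lock because it mutates the map.
func (m *MemoryPool) Add(p Proxy) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.proxies[p.Address] = p
}

// List takes only the read lock, so concurrent readers do not block each other.
func (m *MemoryPool) List(f Filter) []Proxy {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make([]Proxy, 0, len(m.proxies))
	for _, p := range m.proxies {
		if f == nil || f(p) {
			out = append(out, p)
		}
	}
	return out
}

// Get returns one eligible proxy; StrategyFastest picks the lowest latency.
func (m *MemoryPool) Get(s Strategy, f Filter) (Proxy, bool) {
	candidates := m.List(f)
	if len(candidates) == 0 {
		return Proxy{}, false
	}
	best := candidates[0]
	if s == StrategyFastest {
		for _, p := range candidates[1:] {
			if p.Latency < best.Latency {
				best = p
			}
		}
	}
	return best, true
}

func main() {
	pool := NewMemoryPool()
	pool.Add(Proxy{Address: "10.0.0.1:8080", Latency: 300 * time.Millisecond, Healthy: true})
	pool.Add(Proxy{Address: "10.0.0.2:8080", Latency: 90 * time.Millisecond, Healthy: true})
	p, ok := pool.Get(StrategyFastest, func(p Proxy) bool { return p.Healthy })
	fmt.Println(p.Address, ok)
}
```

With `StrategyAny` the result depends on Go's map iteration order, which is intentionally unspecified; that is acceptable when any eligible proxy will do.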

Source Discovery Subsystem

| Layer | File(s) | Summary |
|---|---|---|
| Discovery Domain Types | `internal/discover/types.go` | Defines SourceFormat (text/json/html), SourceKind (raw_text/api/json/html_table/crawler_code_reference/source_list), CandidateStatus, CandidateSource, SourceRecipe, and DiscoveryReport. |
| URL & Content Analysis | `internal/discover/extract.go`, `internal/discover/extract_test.go` | Extracts and normalizes URLs from text; infers format/kind/protocol from content; provides heuristics for proxy lists, source lists, crawler code, and proxy-source URL likelihood with blocking rules for non-proxy hosts. |
| Candidate Analysis & Deduplication | `internal/discover/analyzer.go` | Analyzes URL/content pairs and produces scored CandidateSource entries; detects adapter requirements for HTML tables and crawler code; deduplicates by keeping highest confidence per URL. |
| GitHub Source Discovery | `internal/discover/github.go` | Discovers candidate sources by scanning repository README/source/proxy directories and GitHub search API; concurrently fetches and analyzes files via worker pool with auth token support. |
| HTTP Sample Fetching | `internal/discover/http.go` | Fetches bounded samples (default 128 KB) from URLs with configurable timeout and Range header support for efficient sampling. |
| AI-Backed Search | `internal/discover/openai.go`, `internal/discover/openai_test.go` | Queries OpenAI API with web_search tool to discover proxy sources; extracts JSON candidates from response, sanitizes confidence/format/kind, and handles multiple provider configurations (Responses-compatible support). |
| Candidate Validation | `internal/discover/validate.go` | Concurrently validates candidates by fetching sample content, analyzing format/kind/protocol, and marking status as valid/invalid with error tracking; includes confidence boost on successful validation. |
| Discovery CLI Integration | `cmd/plugproxy/main.go` (discover subcommands) | Implements discover repo, discover url, discover validate (loads and enriches prior results), and discover search (GitHub + optional AI); includes helper readDiscoveryInput and reorderFlagArgs for CLI parsing. |
| Documentation | `docs/source-discovery.md` | Defines discovery goals, input/output constraints, strategy layers (search→candidate extraction→sampling validation), security boundaries, and AI provider role; includes example commands and configuration guidance. |
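
The deduplication step described in the analyzer row (keep the highest-confidence entry per URL) might look roughly like this. The two-field `CandidateSource` and the `dedupe` helper are simplified assumptions for illustration; the real type carries more fields, and the final sort is added here only to make the output order deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// CandidateSource is a hypothetical, reduced version of the discovery type.
type CandidateSource struct {
	URL        string
	Confidence float64
}

// dedupe keeps one candidate per URL, preferring the highest confidence.
func dedupe(cands []CandidateSource) []CandidateSource {
	best := make(map[string]CandidateSource, len(cands))
	for _, c := range cands {
		if prev, ok := best[c.URL]; !ok || c.Confidence > prev.Confidence {
			best[c.URL] = c
		}
	}
	out := make([]CandidateSource, 0, len(best))
	for _, c := range best {
		out = append(out, c)
	}
	// Order by confidence (descending), then URL, so results are reproducible.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Confidence != out[j].Confidence {
			return out[i].Confidence > out[j].Confidence
		}
		return out[i].URL < out[j].URL
	})
	return out
}

func main() {
	got := dedupe([]CandidateSource{
		{URL: "https://example.com/a.txt", Confidence: 0.4},
		{URL: "https://example.com/b.txt", Confidence: 0.7},
		{URL: "https://example.com/a.txt", Confidence: 0.9},
	})
	for _, c := range got {
		fmt.Println(c.URL, c.Confidence)
	}
}
```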

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as cmd/plugproxy
    participant App as internal/app
    participant Fetcher as internal/fetcher
    participant Source as internal/source
    participant Pool as internal/pool
    participant Checker as internal/checker
    participant Server as internal/server
    
    User->>CLI: plugproxy fetch
    CLI->>App: app.Fetch(ctx)
    App->>Fetcher: FetchAll(ctx, sources)
    loop Per source
        Fetcher->>Source: src.Fetch(ctx)
        Source-->>Fetcher: proxies, error
    end
    Fetcher-->>App: []Result
    App->>Pool: pool.Add(proxy)
    App-->>User: count added
    
    User->>CLI: plugproxy check --workers=8 --target=http://example.com
    CLI->>App: app.Check(ctx, workers, targetURL, timeout)
    App->>Pool: pool.List(filter)
    App->>Checker: check via worker pool
    loop Per proxy
        Checker->>Checker: HTTP GET via proxy
        Checker-->>App: Result{OK, Latency}
    end
    App->>Pool: pool.Add(updated_proxy)
    App-->>User: count healthy
    
    User->>CLI: plugproxy run
    CLI->>App: app.Serve(addr)
    App->>Server: server.Handler()
    Server-->>User: HTTP API listening
    
    loop Client requests
        User->>Server: GET /proxies
        Server->>Pool: pool.List(filter)
        Pool-->>Server: proxies
        Server-->>User: JSON response
    end
    
    User->>CLI: plugproxy discover repo owner/repo
    CLI->>CLI: runDiscover(discover repo)
    CLI->>Discover: GitHubClient.DiscoverRepo(ctx, repo)
    Discover->>GitHub: scan files (README, sources/*, proxies/*)
    GitHub-->>Discover: file contents
    Discover->>Discover: AnalyzeURLContent per file
    Discover-->>CLI: DiscoveryReport{Candidates}
    CLI-->>User: JSON candidates
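
The "check via worker pool" step in the diagram above can be sketched as a fixed number of goroutines draining a job channel. This is an illustration under assumptions: `CheckFunc`, `checkAll`, `demo`, and the `Address` field are hypothetical names, and a fake checker replaces the real HTTP GET so the example runs without network access.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Result mirrors the Result{OK, Latency} shape from the diagram;
// the Address field is an assumption added for illustration.
type Result struct {
	Address string
	OK      bool
	Latency time.Duration
}

// CheckFunc is a hypothetical stand-in for Checker.Check.
type CheckFunc func(ctx context.Context, address string) Result

// checkAll runs check over addrs using a fixed number of workers.
// Each worker writes only to its own index, so results needs no lock.
func checkAll(ctx context.Context, addrs []string, workers int, check CheckFunc) []Result {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int)
	results := make([]Result, len(addrs))
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results[i] = check(ctx, addrs[i])
			}
		}()
	}
feed:
	for i := range addrs {
		select {
		case jobs <- i:
		case <-ctx.Done():
			break feed // stop handing out work; unchecked entries stay zero-valued
		}
	}
	close(jobs)
	wg.Wait()
	return results
}

// demo runs checkAll with a fake checker that fails one fixed address.
func demo() []Result {
	fake := func(_ context.Context, addr string) Result {
		return Result{Address: addr, OK: addr != "10.0.0.3:8080", Latency: 50 * time.Millisecond}
	}
	addrs := []string{"10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080", "10.0.0.4:8080"}
	return checkAll(context.Background(), addrs, 2, fake)
}

func main() {
	healthy := 0
	for _, r := range demo() {
		if r.OK {
			healthy++
		}
	}
	fmt.Println("healthy:", healthy) // prints "healthy: 3"
}
```

Indexing results by position keeps the output in input order regardless of which worker finishes first.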

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

  • Collect and integrate the first batch of free proxy sources (收集并接入第一批免费代理源) #1: The PR implements the source interfaces, the Fetch mechanism, the StaticSource implementation, and docs/proxy-sources.md with the proxy source ingestion strategy, which provides the plumbing needed to integrate the raw TXT/API proxy sources described in that issue.

🐰 Hops with glee across freshly seeded ground—
A proxy pool sprouted, in memory now bound,
GitHub sources whisper, AI hums its song,
The checking chain strengthens, the pipeline grows strong!
From CLI to server, one binary runs true,
Early design bears fruit—plugproxy v0 debut! 🌱✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add proxy discovery foundation' clearly summarizes the main change: adding core discovery infrastructure, models, and CLI/API foundation for the plugproxy project. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


LING71671 merged commit 215e809 into main May 5, 2026
1 of 2 checks passed
LING71671 deleted the codex/discovery-architecture-cli branch May 5, 2026 16:01