Auth page cli#371

Closed
bfzha wants to merge 4 commits into spider-rs:main from bfzha:auth-page-cli

Conversation


bfzha commented Mar 15, 2026

No description provided.

j-mendez and others added 4 commits March 14, 2026 08:16
… to v2.47.22

Replace tokio-uring (pre-1.0, forces separate runtime) with the raw
io-uring 0.7 crate. Adds kernel probe at init — fails gracefully on
AWS Amazon Linux, ECS, Lambda, seccomp-filtered containers, and
kernels < 5.1. Worker thread runs a synchronous io_uring submit/reap
loop with no async runtime, no mutexes, and no deadlock paths.

- Swap dep: tokio-uring 0.5 → io-uring 0.7
- Rewrite uring_fs.rs io_uring inner: raw SQE/CQE ops (OpenAt, Write,
  Read, Close) with short-write/short-read loops and guaranteed fd cleanup
- Unify connect.rs init_background_runtime into single impl (both paths
  use standard tokio runtime; io_uring only for file I/O)
- Remove .expect() panic site in send_to_background_runtime
- StreamingWriter always uses tokio::fs fallback (io_uring adds no value
  for sequential streaming)
- 18 tests: 13 cross-platform fallback + 5 Linux-only io_uring path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
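
The "short-write loop" named in the commit message above can be sketched roughly as follows. This is a minimal illustration of the general technique, not the code from uring_fs.rs: `write_all_at`, `ChunkedSink`, and the closure signature are hypothetical names introduced here, and the closure stands in for whatever submits a single positioned write and reports how many bytes were accepted.

```rust
use std::io;

/// Writes all of `buf` starting at `offset`, retrying after short writes.
/// `write_at` is a hypothetical stand-in for one positioned write operation
/// that returns the number of bytes it actually accepted.
pub fn write_all_at<F>(mut write_at: F, mut buf: &[u8], mut offset: u64) -> io::Result<()>
where
    F: FnMut(&[u8], u64) -> io::Result<usize>,
{
    while !buf.is_empty() {
        match write_at(buf, offset) {
            // Zero bytes accepted means no progress is possible; bail out
            // instead of spinning forever.
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "write returned 0 bytes",
                ))
            }
            Ok(n) => {
                // Clamp defensively, then advance past the bytes written.
                let n = n.min(buf.len());
                buf = &buf[n..];
                offset += n as u64;
            }
            // Interrupted writes are retried with the same arguments.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

/// Hypothetical in-memory sink that accepts at most `max` bytes per call,
/// used only to exercise the short-write path.
pub struct ChunkedSink {
    pub data: Vec<u8>,
    pub max: usize,
    pub calls: usize,
}

impl ChunkedSink {
    pub fn write_at(&mut self, buf: &[u8], offset: u64) -> io::Result<usize> {
        let n = buf.len().min(self.max);
        let start = offset as usize;
        if self.data.len() < start + n {
            self.data.resize(start + n, 0);
        }
        self.data[start..start + n].copy_from_slice(&buf[..n]);
        self.calls += 1;
        Ok(n)
    }
}
```

With a sink capped at 3 bytes per call, writing an 11-byte buffer takes four calls. The same loop shape applies to short reads, with the roles of source and destination swapped.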
@j-mendez
Member

Hi, the PR has some conflicts, and it looks like it contains domain-specific logic rather than being generic.

@j-mendez
Member

Closing this PR — we cherry-picked the generic CLI improvements into main (v2.47.47):

Pulled in:

  • --chrome-connection-url flag (connect to remote CDP endpoints)
  • --cookie flag (inject session cookies)
  • --stealth flag (bot-detection evasion)
  • cookies feature enabled on spider dep in CLI

Not pulled in:

  • The authenticated-page subcommand — too specialized for the generic crawler CLI (hardcoded locale defaults, local Chrome process management, image downloading, text extraction). This would be better as a separate tool or example.
  • zhihu_spider workspace member and Chinese-language report file
  • Making --url optional (breaking change for existing commands)
  • io-uring changes (already merged in earlier releases)

Thank you for the contribution! The core idea of exposing spider's chrome connection and cookie configs to the CLI was the right call.

j-mendez closed this Mar 19, 2026
3 participants