fix(zhihu/download): author extraction, zhida links, redirect detection by yrom · Pull Request #54 · nashsu/AutoCLI

yrom · 2026-05-22T15:24:46Z

Summary

Fix multiple issues in the zhihu download adapter.

Fixes #40 — Author extraction returns "unknown"

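Root cause: querySelector with the combined selector .AuthorInfo-name, .UserLink-link returns the first matching element in document order, not in selector order. When a .UserLink-link element with empty text appears before .AuthorInfo-name in the DOM, that empty element is the one returned, so the real author name is never read and the result falls through to "unknown".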

Fix: Split into separate querySelector calls with proper fallback order:

.AuthorInfo-name (primary)
.UserLink-link (secondary)
meta[itemprop="author"] (stable meta tag fallback)
meta[name="author"]
js-initialData JSON parsing (last resort)
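The fallback order above could be sketched roughly as follows. This is an illustration, not the adapter's actual code: extractAuthor and findAuthorName are hypothetical names, reading the meta tags through their content attribute is an assumption, and so are the #js-initialData selector and the idea of searching the parsed JSON for the first author.name string.

```javascript
// Hypothetical helper: depth-first search for the first `author.name` string.
// The real layout of the js-initialData JSON is not given in this PR.
function findAuthorName(node) {
  if (node === null || typeof node !== 'object') return '';
  if (node.author && typeof node.author.name === 'string' && node.author.name.trim()) {
    return node.author.name.trim();
  }
  for (const value of Object.values(node)) {
    const found = findAuthorName(value);
    if (found) return found;
  }
  return '';
}

// `doc` is anything that exposes querySelector, e.g. the page's `document`.
function extractAuthor(doc) {
  // Trimmed text of the first element matching one selector, or ''.
  const text = (selector) => {
    const el = doc.querySelector(selector);
    return el && el.textContent ? el.textContent.trim() : '';
  };
  // Trimmed `content` attribute of a meta tag, or '' (assumed attribute).
  const meta = (selector) => {
    const el = doc.querySelector(selector);
    const value = el && el.getAttribute ? el.getAttribute('content') : '';
    return value ? value.trim() : '';
  };
  // Last resort: parse the embedded JSON (selector is an assumption).
  const fromInitialData = () => {
    const el = doc.querySelector('#js-initialData');
    if (!el || !el.textContent) return '';
    try {
      return findAuthorName(JSON.parse(el.textContent));
    } catch {
      return '';
    }
  };

  // One querySelector per source, tried strictly in priority order.
  const candidates = [
    () => text('.AuthorInfo-name'),
    () => text('.UserLink-link'),
    () => meta('meta[itemprop="author"]'),
    () => meta('meta[name="author"]'),
    fromInitialData,
  ];
  for (const candidate of candidates) {
    const value = candidate();
    if (value) return value;
  }
  return 'unknown';
}
```

Because each selector is queried separately, an empty .UserLink-link can no longer shadow a populated .AuthorInfo-name, regardless of where either sits in the DOM.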

Fixes #42 — Template variable not filled + wrong article

Root cause 1: On failure (content element not found), the evaluate step returned an array [{...}] instead of a plain object {...}. The download step's ${{ data.title }} template couldn't resolve, leaving the literal string in the output.

Fix: Return a plain object with all required fields (filename, imageUrls, content, output) on error, matching the success case structure.
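To illustrate why the shape matters, here is a toy sketch. renderTemplate is a simplified stand-in for the template step, not AutoCLI's real engine, and errorResult is a hypothetical name. The field names come from the fix description, while the empty default values and placing the message in output are assumptions.

```javascript
// Toy stand-in for the download step's template resolution (assumption):
// replaces ${{ data.KEY }} with data[KEY], or leaves the literal if undefined.
function renderTemplate(template, data) {
  return template.replace(/\$\{\{\s*data\.(\w+)\s*\}\}/g, (literal, key) => {
    const value = data == null ? undefined : data[key];
    return value === undefined ? literal : String(value);
  });
}

// Hypothetical error result with the same fields as the success case.
function errorResult(message) {
  return { filename: '', imageUrls: [], content: '', output: message };
}

// Wrapped in an array, `data.output` is undefined, so the literal survives.
const fromArray = renderTemplate('${{ data.output }}', [errorResult('Article not found')]);

// As a plain object, the same template resolves.
const fromObject = renderTemplate('${{ data.output }}', errorResult('Article not found'));
```

In this sketch fromArray keeps the unresolved placeholder, which mirrors the symptom reported in #42, while fromObject yields the error message.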

Root cause 2: When an article URL is removed/not found, Zhihu redirects to the homepage. The adapter would then scrape the homepage content instead of reporting an error.

Fix: Detect redirect by checking location.hostname and location.pathname after navigation. Return a clear "Article not found" error.
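A minimal sketch of such a check, assuming the adapter still has the originally requested URL available. isRedirectedAway is a hypothetical name, and comparing against the requested URL (rather than, say, testing for the homepage path directly) is one possible reading of the fix, not necessarily what the PR does.

```javascript
// Drop a single trailing slash so "/p/1" and "/p/1/" compare equal.
function normalizePath(pathname) {
  return pathname.length > 1 && pathname.endsWith('/') ? pathname.slice(0, -1) : pathname;
}

// `currentLocation` is anything with hostname and pathname, e.g. window.location.
function isRedirectedAway(requestedUrl, currentLocation) {
  const requested = new URL(requestedUrl);
  return (
    requested.hostname !== currentLocation.hostname ||
    normalizePath(requested.pathname) !== normalizePath(currentLocation.pathname)
  );
}

// Hypothetical use inside the evaluate step:
// if (isRedirectedAway(url, location)) return { ...emptyFields, output: 'Article not found' };
```

A strict hostname comparison would also flag any legitimate cross-host redirect, so a real implementation may need an allowlist or a narrower test.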

Additional improvements

Strip zhida.zhihu.com links: These are Zhihu's internal AI search links that add noise to the markdown output. Now only the display text is kept.
Increase settleMs from 3000 to 5000 for more reliable page loading.
Add path column to output table so users can see the downloaded file path.
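The zhida link stripping could look something like this when applied to the generated markdown. stripZhidaLinks is a hypothetical name, and doing this on the markdown string (rather than on anchor elements before conversion) is an assumption about where the adapter performs the step.

```javascript
// Replace [text](https://zhida.zhihu.com/...) with just `text`.
// The (?<!!) lookbehind leaves markdown images untouched.
function stripZhidaLinks(markdown) {
  const zhidaLink = /(?<!!)\[([^\]]*)\]\(https?:\/\/zhida\.zhihu\.com\/[^)\s]*\)/g;
  return markdown.replace(zhidaLink, '$1');
}
```

Links to other hosts pass through unchanged, so only the AI search links are reduced to their display text.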

Test plan

autocli zhihu download "https://www.zhihu.com/question/351504112/answer/2027391723035275294" — author correctly shows "NGINX洪志道" (was "unknown")
autocli zhihu download "https://zhuanlan.zhihu.com/p/60954299" — shows "Article not found" error (was ${{ data.title }} literal)
autocli zhihu download "https://zhuanlan.zhihu.com/p/61154299" — correct article fetched
autocli zhihu download "https://www.zhihu.com/question/2040832649963508039/answer/2041160812719498582" — new URL format works

Fixes nashsu#40, Fixes nashsu#42

yrom added 2 commits May 22, 2026 23:24

fix(zhihu): fix author extraction, strip zhida links, detect redirect

51fcaf6

Fixes nashsu#40, Fixes nashsu#42

fix: restore license attribution header

14af196

yrom wants to merge 2 commits into nashsu:main from yrom:fix/zhihu-download
