BookWeaver is a document translation pipeline for long books (.epub, .pdf, .docx) using Gemini CLI or Gemini API, with EPUB-first output and bilingual merge support. The codebase has been refactored into a hexagonal architecture: ai.core (domain), ai.ports (interfaces), and ai.adapters (providers/sources). The main runtime entrypoint is ai.cli (python -m ai.cli); translatebook.sh delegates to ai.cli and provides convenient wrappers.
- Converts input files to markdown chunks
- Translates chunks with Gemini models
- Merges source + translation into bilingual markdown
- Renders HTML and exports final formats (EPUB/DOCX/PDF or HTML-only)
| Input type | Goal | Command |
|---|---|---|
| EPUB | Preserve package structure/navigation | python -m ai.cli book.epub --output out/translated.epub |
| PDF/DOCX | Convert then translate (via shell wrapper) | ./translatebook.sh --workflow markdown /path/to/book.pdf |
Default: EPUB input auto-detects to EPUB workflow; non-EPUB defaults to markdown workflow.
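The default rule above can be pictured as a simple extension check. This is only a sketch: `default_workflow` is a hypothetical helper name, not a function confirmed to exist in ai.cli, and the real detection may inspect more than the file suffix.

```python
from pathlib import Path


def default_workflow(input_path: str) -> str:
    """Return 'epub' for .epub inputs and 'markdown' for everything else.

    Hypothetical helper: mirrors the documented default only.
    """
    suffix = Path(input_path).suffix.lower()
    return "epub" if suffix == ".epub" else "markdown"
```

An explicit --workflow flag would take precedence over this default.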
Choose one translation provider:
Option A: Gemini CLI (recommended, default)
- gemini CLI (authenticated)
- ebook-convert (Calibre)
- pandoc
which gemini
which ebook-convert
which pandoc
Option B: Gemini API (alternative, experimental)
- Google AI API key (set via env var, config file, or CLI parameter)
- ebook-convert (Calibre)
- pandoc
# Option B1: Use environment variable
export GEMINI_API_KEY="your-api-key-here"
# Option B2: Use config file (recommended for persistent setup)
# Edit config/config.json:
# {
# "gemini_api": {
# "enabled": true,
# "api_key": "your-api-key-here",
# "model": "gemini-2.5-flash"
# }
# }
which ebook-convert
which pandoc
uv sync
Direct with uv run (recommended; no venv activation needed):
# Basic EPUB translation → Chinese
uv run bookweaver book.epub --output book_translated.epub
# Disable default resume for this run
uv run bookweaver book.epub --output book_translated.epub --no-resume
# With automatic glossary extraction (uses Pro model for extraction)
uv run bookweaver book.epub --output book_translated.epub --extract-glossary --model pro
# With pre-extracted glossary and priority filtering
uv run bookweaver book.epub --output book_translated.epub \
--glossary glossary.json --glossary-min-priority high
# Using flash model (faster, lower quality)
uv run bookweaver book.epub --output book_translated.epub --model flash
# Using Gemini API instead of CLI
uv run bookweaver book.epub --output book_translated.epub --provider api
Output is written to the path you specify with --output.
Via shell wrapper (handles venv, needed for PDF/DOCX):
# EPUB — wrapper constructs output path as <basename>_temp/translated_roundtrip.epub
./translatebook.sh book.epub
# With glossary extraction + Pro model
./translatebook.sh --extract-glossary --model pro book.epub
# Dry-run: show config without executing
./translatebook.sh --dry-run book.epub
# EPUB baseline roundtrip (no translation, zero text mutation)
./translatebook.sh --epub-baseline book.epub
Shell wrapper output: <input_basename>_temp/translated_roundtrip.epub.
EPUB workflow now enables checkpoint resume by default; pass --no-resume to opt out.
EPUB workflow (default for .epub input):
ai.cli handles the full pipeline — reads EPUB, translates via engine, writes bilingual EPUB directly:
python -m ai.cli book.epub --output translated.epub [flags]
Markdown workflow (PDF/DOCX input, via shell only for now):
The shell orchestrates multiple steps:
- Steps 1-2: Convert PDF/DOCX → markdown chunks (Calibre)
- Step 3: python -m ai.cli translates chunks
- Steps 5-7: Render HTML, add TOC, export final format
Note (SPEC-013): Steps 5-7 are not yet absorbed into ai.cli. PDF/DOCX-to-EPUB currently requires translatebook.sh. See docs/architecture/specs/SPEC-013-pipeline-completion-shell-replacement.md.
ai.cli emits concise structured progress lines:
- [progress:model]: model resolution result (requested, resolved, tier, explicit)
- [progress:input]: resolved format and IO paths
- [progress:resume]: checkpoint context (restored_segments, checkpoint path, force mode)
- [progress:translate]: translation stage start
- [progress:source]: source load summary (segments, resumed, pending, batches)
- [progress:batch]: per-batch progress (index, translated, batch_segments)
  - EPUB includes docs=... when doc identity is available
- [progress:batch_sample]: heartbeat sample after each batch (batch=N/Total, doc=, src=, tgt=); disabled by --no-sanity-probe
- [progress:save]: save stage before writing output
- [progress:done]: completion summary
- [progress:error]: failure localization with stage + error type/message (stderr)
These logs are intentionally operational (not verbose) and designed for quick diagnosis.
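For quick diagnosis it can help to group these lines by stage when scanning a saved log. The sketch below assumes only the `[progress:<stage>]` prefix described above. The payload after the tag is treated as opaque text because its exact layout is not specified here, and `parse_progress` is a hypothetical helper name.

```python
import re

# Matches the documented "[progress:<stage>]" prefix; the payload format
# after the tag is an assumption and is returned unparsed.
PROGRESS_RE = re.compile(r"^\[progress:(?P<stage>[a-z_]+)\]\s*(?P<payload>.*)$")


def parse_progress(line: str):
    """Return (stage, payload) for a progress line, or None for other output."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        return None
    return match.group("stage"), match.group("payload")
```

Filtering a log for the `error` stage is then a one-line comprehension over the parsed tuples.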
If you encounter persistent AbortError or capacity issues with Gemini CLI, you can use the direct Gemini API as an alternative:
# Set your API key
export GEMINI_API_KEY="your-api-key-here"
# Use API provider instead of CLI
uv run bookweaver book.epub --output book_translated.epub --provider api
# Keep CLI as primary but allow fallback to API on CLI failures
uv run bookweaver book.epub --output book_translated.epub --provider cli --cli-api-fallback
Advantages of API provider:
- More stable and predictable error handling
- Better rate limit recovery with explicit retry delays
- Programmatic control over timeouts and retries
- Clear distinction between temporary (429) and permanent quota errors
Current status:
- Gemini API provider is now available via --provider api
- CLI provider remains the default (--provider cli)
- API provider requires GEMINI_API_KEY or gemini_api.api_key in config
- Optional fallback is available via --cli-api-fallback (only when --provider cli)
  - Triggered on transient/transport CLI translation failures (for example AbortError/timeouts)
  - Not used for rate-limit failures (those stay on CLI retry logic)
  - Requires GEMINI_API_KEY or gemini_api.api_key for the API provider
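The fallback rule in the status list can be sketched as follows. The exception classes and the `translate_with_fallback` function are hypothetical names chosen for illustration; BookWeaver's actual error types and control flow may differ.

```python
class RateLimitError(Exception):
    """Hypothetical marker for rate-limit failures (handled by CLI retry logic)."""


class TransientCLIError(Exception):
    """Hypothetical marker for transient/transport failures such as AbortError or timeouts."""


def translate_with_fallback(text, cli_translate, api_translate, fallback_enabled):
    """Try the CLI provider first; switch to the API only on transient CLI failures."""
    try:
        return cli_translate(text)
    except RateLimitError:
        # Rate limits never trigger the fallback; the caller's retry logic handles them.
        raise
    except TransientCLIError:
        if not fallback_enabled:
            raise
        return api_translate(text)
```

Keeping rate limits out of the fallback path avoids silently moving quota pressure from one provider to the other.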
Common commands:
# Continue from translation to render/export
./translatebook.sh --start-step 3 --output-format epub /path/to/book.epub
# Continue from Step 4 bridge (no-op) to render/export if output.md already exists
./translatebook.sh --start-step 4 --output-format epub /path/to/book.epub
# Re-run translation step with explicit model/prompt overrides
./translatebook.sh --start-step 3 --output-format epub /path/to/book.epub
Before expensive translation runs, check EPUB quality first:
# If epubcheck is installed
epubcheck /path/to/book.epub
If preflight reports structural/link issues (for example, a broken href#fragment), clean the book manually in tools like Sigil/Calibre first, then run BookWeaver.
Note: EPUB workflow now tolerates pre-existing source broken fragments (it only blocks newly introduced broken links), but source-quality cleanup is still recommended for better reader compatibility.
python3 01_convert_to_htmlz.py /path/to/book.epub
# prepare a sample temp dir with page0001~page0003.md
python3 -u -m ai.cli <sample_temp_dir> --input-format markdown --output <sample_temp_dir>/output.md --model gemini-2.5-flash --output-lang zh
- 01_convert_to_htmlz.py (normalize and split)
- ai.cli translation step (invoked by translatebook.sh step 3)
- Step 4 bridge (no-op; merged output.md already produced by ai.cli)
- 05_md_to_html.py (HTML rendering)
- 06_add_toc.py (TOC)
- 07_generate_formats.py (EPUB/DOCX/PDF generation)
Run the same checks locally before pushing:
uv run ruff check .
uv run ruff format --check .
uv run tach check
uv run pytest -q
CI uses workflow lint-and-test with a strict order:
- lint (ruff check + ruff format --check)
- test (pytest) after lint passes
If lint or tests fail, the CI gate is blocking and the change is not merge-ready.
See docs/architecture/specs/SPEC-003-lint-quality-gates.md for the formal policy.
If you encounter persistent AbortError: The user aborted a request or similar capacity errors:
Immediate solutions:
- Wait and retry - Gemini CLI has traffic prioritization limits that vary by time of day
- Use checkpoint resume for EPUB workflow - continue interrupted runs safely:
  # Resume with compatibility checks (recommended)
  uv run bookweaver book.epub --output book_translated.epub --resume
  # Force resume when model/config changed
  uv run bookweaver book.epub --output book_translated.epub --force-resume
- Switch model and retry:
  # Initial run with flash
  ./translatebook.sh --workflow epub --model flash book.epub
  # If flash fails, retry with pro
  ./translatebook.sh --workflow epub --model pro book.epub
- Use Gemini API (experimental) - More stable alternative to CLI:
  export GEMINI_API_KEY="your-api-key"
  ./translatebook.sh --workflow epub --provider api book.epub
- Use CLI + API fallback (opt-in) - Keep CLI first, switch to API on transient CLI failures:
  export GEMINI_API_KEY="your-api-key"
  ./translatebook.sh --workflow epub --provider cli --cli-api-fallback book.epub
Root cause: Gemini CLI 0.35.0+ has strict traffic prioritization and internal loop recovery logic that aborts requests if they exceed internal timeout thresholds. This is a known limitation documented in Gemini CLI updates.
Workarounds:
- Avoid peak traffic hours (9 AM - 6 PM Pacific time, when capacity limits are usually tightest)
- Try early morning or late night runs
- Use --model pro if you have higher tier access
- Consider using Gemini API provider for production workloads
You can tune retry/split/circuit-breaker behavior in config/config.json:
{
"epub_resilience": {
"rate_limit_backoff_seconds": [60, 120],
"timeout_backoff_seconds": [60],
"transient_backoff_seconds": [45],
"max_split_depth": null,
"doc_failure_budget": null,
"failed_docs_path": null,
"cli_api_fallback_enabled": false,
"pro_timeout_seconds": 300,
"non_pro_timeout_seconds": 180
}
}
Notes:
- Defaults preserve current behavior.
- doc_failure_budget enables a document-level circuit breaker.
- failed_docs_path writes failed-document diagnostics as JSON.
- CLI argument --cli-api-fallback enables runtime CLI→API transport fallback for that run. Requires API credentials (GEMINI_API_KEY or gemini_api.api_key).
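One plausible reading of the `*_backoff_seconds` lists is that each entry is the wait before the next retry, and that retries stop once the list is exhausted. That interpretation is an assumption, not confirmed behavior, and `backoff_schedule` and `retry_with_backoff` are hypothetical helper names. A minimal sketch under that assumption:

```python
import time


def backoff_schedule(config: dict, failure_kind: str) -> list:
    """Read e.g. 'rate_limit_backoff_seconds' from the epub_resilience section."""
    section = config.get("epub_resilience", {})
    return list(section.get(f"{failure_kind}_backoff_seconds", []))


def retry_with_backoff(operation, delays, sleep=time.sleep):
    """Run operation once, then once more after each delay, until it succeeds."""
    last_error = None
    for delay in [0, *delays]:
        if delay:
            sleep(delay)
        try:
            return operation()
        except Exception as exc:  # sketch only: real code would catch specific errors
            last_error = exc
    raise last_error
```

With `"rate_limit_backoff_seconds": [60, 120]` this would make up to three attempts, waiting 60 and then 120 seconds between them.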
Prompt rendering is profile-driven:
- config/prompts/default_prompt.txt
- config/prompts/ebook_prompt.txt
- selected via config/config.json.example (prompt_profile, prompt_templates)
- {GLOSSARY_BLOCK} placeholder injected with terminology constraints (see glossary section below)
You can append extra instructions with:
./translatebook.sh -p "Your custom translation constraints" /path/to/book.epub
BookWeaver can automatically extract terminology from EPUB index/TOC and inject it as translation constraints via prompt injection.
Automatic extraction + injection (single command):
./translatebook.sh --workflow epub --extract-glossary --output-format epub /path/to/book.epub
Note: --extract-glossary is supported for EPUB workflows and is available when invoking ai.cli directly (run python -m ai.cli <book.epub> --extract-glossary --output <out_dir>). The extractor writes the glossary to <input_basename>_temp/extracted_glossary.json and the CLI injects it into the translation prompt. Use --glossary <file> to supply a pre-generated glossary JSON file if preferred.
Manual extraction (pre-run glossary preparation):
# Extract glossary JSON from EPUB
uv run python3 00_extract_glossary.py /path/to/book.epub --output glossary.json --model pro
# View extracted terms
cat glossary.json | jq '.[0:3]' # First 3 terms
# Translate with glossary
./translatebook.sh --workflow epub --glossary glossary.json --output-format epub /path/to/book.epub
By default, glossary injection includes all terms. Use --glossary-min-priority to reduce prompt bloat:
# Only inject critical + high priority terms (excludes medium, reduces prompt by ~60%)
./translatebook.sh --workflow epub --glossary glossary.json --glossary-min-priority high --output-format epub /path/to/book.epub
# Only inject critical terms (most aggressive filtering, ~85% reduction)
./translatebook.sh --workflow epub --glossary glossary.json --glossary-min-priority critical --output-format epub /path/to/book.epub
Priority levels (extracted by AI analysis):
- critical: Core domain concepts essential for accurate translation
- high: Important terms that appear frequently
- medium: Supporting vocabulary with lower frequency (default inclusion, can be filtered)
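The effect of --glossary-min-priority can be illustrated with a small filter. The sketch assumes the glossary is a JSON list of objects that carry a `priority` field. That field name, the `term` key used in the usage note, and the `filter_glossary` helper are assumptions, since the glossary schema is not shown here.

```python
# Lower rank means higher priority; ordering follows the documented levels.
PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2}


def filter_glossary(terms, min_priority="medium"):
    """Keep terms whose priority is at least min_priority.

    Terms with a missing or unknown priority are treated as 'medium' (assumption).
    """
    cutoff = PRIORITY_RANK[min_priority]
    lowest = PRIORITY_RANK["medium"]
    return [
        term for term in terms
        if PRIORITY_RANK.get(term.get("priority"), lowest) <= cutoff
    ]
```

For example, `filter_glossary(terms, "high")` would keep critical and high entries and drop medium ones.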
BookWeaver supports multiple glossary extraction strategies via --glossary-mode:
auto (default with --extract-glossary):
- Tier 1: If strong index signals detected → extract from index/TOC directly
- Tier 2: If no strong index → build local terminology shortlist, then refine with AI
- Automatically adapts to EPUB structure (technical books with indexes vs. fiction)
- Balance of quality and token cost
deep-scan (whole-book AI extraction):
- Explicit whole-book terminology extraction using AI analysis
- Higher token cost, comprehensive coverage
- Never automatic; requires explicit --glossary-mode deep-scan
Manual glossary (skip extraction):
- Use --glossary <path> to provide pre-built glossary JSON
- Skips all extraction, directly injects from file
Backward compatibility:
- --extract-glossary is a backward-compatible alias for --glossary-mode auto
- Existing workflows continue working unchanged
Examples:
# Auto mode (adaptive tier-based extraction)
uv run bookweaver book.epub --output out.epub --extract-glossary
uv run bookweaver book.epub --output out.epub --glossary-mode auto
# Deep-scan mode (whole-book AI extraction, higher cost)
uv run bookweaver book.epub --output out.epub --glossary-mode deep-scan
# Manual glossary (no extraction)
uv run bookweaver book.epub --output out.epub --glossary glossary.json
- Index detection: Two-pass approach (filename hints first, then content heuristics for Kindle-format EPUBs)
- Default model: Pro model (slower but more reliable terminology selection)
- CLI→API fallback: If Gemini CLI is unavailable, extraction automatically falls back to Gemini API (requires GEMINI_API_KEY or config)
- API-only extraction: Use --api-key to force direct API extraction:
  export GEMINI_API_KEY="your-api-key"
  uv run python3 00_extract_glossary.py /path/to/book.epub --output glossary.json --api-key "$GEMINI_API_KEY"
To test glossary on specific chapters:
# Extract glossary
uv run python3 00_extract_glossary.py book.epub -o glossary.json
# Translate with glossary
./translatebook.sh --workflow epub --glossary glossary.json --output-format epub book.epub
- --epub-baseline runs a dedicated roundtrip path and exits early from translation/rendering steps.
- Baseline output is <temp_dir>/baseline_roundtrip.epub and preserves source package content (no text mutation).
- Baseline parser extracts OPF path, cover metadata pointer, and spine order for structural validation.
- --workflow epub runs the EPUB package-preserving translation workflow and exits early from the legacy markdown pipeline.
- --epub-translate-roundtrip is a deprecated alias for --workflow epub.
- EPUB workflow output is <temp_dir>/translated_roundtrip.epub.
- EPUB workflow currently supports only alternating bilingual output and enforces strict integrity checks (fail-fast on errors).
- EPUB workflow uses per-document batch translation (%% segment separator) to reduce API call count versus per-segment calls.
- If batch output segment count mismatches, it automatically falls back to binary split retry for that document.
- In EPUB workflow, table cells are source-only (no th/td bilingual injection) for layout stability.
- Model fallback chain is intentionally out of scope for this phase.
- In markdown workflow, Step 3 (ai.cli) writes final bilingual output.md directly.
- Step 4 in markdown workflow is a no-op bridge for legacy step numbering.
- Legacy markdown pipeline scripts (01_convert_to_htmlz.py, 05_md_to_html.py, 06_add_toc.py, 07_generate_formats.py) are still required for PDF/DOCX workflow and are not safe to remove yet.
- Legacy pipeline removal is deferred until SPEC-013 parity is complete (ai.cli-owned output rendering for non-EPUB workflows).
- Step 5 renders markdown image syntax (`![alt](src)`) into `<img>` and keeps source-side # headings as real document headings.
- --bilingual-style currently supports only alternating.
- Step 6 can build TOC from markdown-style heading lines in HTML paragraphs, auto-creates a TOC container when missing, and defaults TOC entries to chapter-level (h1) headings.
- Step 7 resolves HTML input in order: book_doc.html -> book.html -> newest *.html in temp dir.
- Step 7 format conversion uses Calibre ebook-convert directly (no external publish script required).
- Model selection in Step 3 uses: requested model (or alias) -> fallback_chain -> probe availability.
- Gemini provider no longer uses a hardcoded static allow-list.
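The Step 3 model selection order noted above (requested model or alias, then fallback_chain, with an availability probe) might look roughly like this. `resolve_model` and the `is_available` callback are hypothetical; the real resolver also uses a probe cache that this sketch leaves out.

```python
def resolve_model(requested, aliases, fallback_chain, is_available):
    """Return the first available model: the requested one, then the fallback chain.

    aliases maps short names (for example 'pro', 'flash', 'lite') to model ids.
    is_available stands in for the availability probe.
    """
    first = aliases.get(requested, requested)
    candidates = [first] + [model for model in fallback_chain if model != first]
    for model in candidates:
        if is_available(model):
            return model
    raise RuntimeError(f"no available model among: {candidates}")
```

Probing at resolution time, rather than checking a static allow-list, lets newly released models work without a code change.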
- Cover metadata pointer (meta name="cover") remains resolvable after roundtrip.
- TOC/nav links stay clickable (no broken fragment links in validated docs).
- Manifest asset paths remain present (no missing image/css/font files).
- OPF spine order remains unchanged.
See docs/architecture/specs/SPEC-005-epub-roundtrip-baseline.md for baseline policy and limits.
- Generated EPUB exists at <temp_dir>/translated_roundtrip.epub.
- Spine XHTML content contains both source and translated text in alternating order.
- TOC/nav resource files remain unmodified in package-aware translation mode.
- Cover/toc/spine pointers remain resolvable.
- Fragment links and manifest asset references pass strict validation.
See docs/architecture/specs/SPEC-006-epub-translate-roundtrip.md for EPUB workflow scope and limits.
This mode is kept as a separate feature path and currently translates spine XHTML with per-document batching (%% separators), while preserving segment alignment.
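A minimal sketch of per-document batching with a `%%` separator and a binary split retry on segment-count mismatch is shown below. The exact separator layout (here `%%` on its own line) and the `translate_batch` helper are assumptions; `translate` stands in for whichever provider call is in use.

```python
SEPARATOR = "\n%%\n"  # assumed layout: "%%" on its own line between segments


def translate_batch(segments, translate):
    """Translate segments in one call; split the batch in half when counts mismatch."""
    if not segments:
        return []
    if len(segments) == 1:
        # A single segment cannot be misaligned, so take the output as-is.
        return [translate(segments[0]).strip()]
    output = translate(SEPARATOR.join(segments))
    parts = [part.strip() for part in output.split("%%")]
    if len(parts) == len(segments):
        return parts
    # Alignment was lost: retry each half separately (binary split).
    mid = len(segments) // 2
    return translate_batch(segments[:mid], translate) + translate_batch(segments[mid:], translate)
```

In the worst case the recursion bottoms out at one call per segment, so alignment is always recoverable at the cost of more calls. A production version would also need to handle source text that itself contains `%%`.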
Design reference for future prompt/segmentation optimization:
- Immersive Translate 1.26.6 (paragraph-structure-preserving prompt style).
- Runtime config template: config/config.json.example
- Model aliases: model_aliases (pro|flash|lite)
- Probe cache config: model_probe.cache_path / model_probe.cache_ttl_seconds
- Ordered fallback: fallback_chain
- Quota DB: ~/.config/translatebook/quota.db
- Batch sanity probe (sanity_probe): runs after every batch; halts on empty output, runaway length ratio, or wrong-language (non-CJK) output:
  "sanity_probe": {
    "enabled": true,
    "max_length_ratio": 2.0,
    "min_length_ratio": 0.15,
    "min_source_length": 10,
    "min_cjk_density": 0.30,
    "heartbeat_chars": 60
  }
  Disable via CLI: --no-sanity-probe
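The three halt conditions of the sanity probe can be sketched as below. How each threshold is applied is an assumption: here the length ratio is target length over source length, ratio checks are skipped for sources shorter than min_source_length, and CJK density counts only the CJK Unified Ideographs block. `probe_batch` and `cjk_density` are hypothetical names.

```python
from typing import Optional


def cjk_density(text: str) -> float:
    """Fraction of non-whitespace characters in the CJK Unified Ideographs block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


def probe_batch(source: str, target: str, cfg: dict) -> Optional[str]:
    """Return a failure reason, or None when the output looks sane."""
    if not target.strip():
        return "empty output"
    if len(source) >= cfg["min_source_length"]:
        ratio = len(target) / len(source)
        if ratio > cfg["max_length_ratio"]:
            return "runaway length"
        if ratio < cfg["min_length_ratio"]:
            return "output too short"
    if cjk_density(target) < cfg["min_cjk_density"]:
        return "wrong language"
    return None
```

A caller would stop the run on any non-None result, which matches the halting behavior described for the probe.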
For using Gemini API provider instead of CLI, configure in config/config.json:
{
"gemini_api": {
"enabled": true,
"api_key": "your-google-ai-api-key-here",
"model": "gemini-2.5-flash"
}
}
API Key Priority (highest to lowest):
- api_key parameter (if passed to GeminiAPIProvider constructor)
- GEMINI_API_KEY environment variable
- config['gemini_api']['api_key'] in config.json
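The priority order above amounts to a first-non-empty lookup. In this sketch `resolve_api_key` is a hypothetical standalone function, not the actual GeminiAPIProvider code, and the environment is passed in so the example can be exercised without touching real variables.

```python
import os


def resolve_api_key(explicit=None, env=None, config=None):
    """Return the API key from the highest-priority source that provides one.

    Order: explicit parameter, GEMINI_API_KEY env var, config['gemini_api']['api_key'].
    """
    env = os.environ if env is None else env
    config = config or {}
    return (
        explicit
        or env.get("GEMINI_API_KEY")
        or config.get("gemini_api", {}).get("api_key")
    )
```

A `None` result means no source supplied a key, which the caller should treat as a configuration error.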
Recommended approach:
- Development: Use GEMINI_API_KEY env var for quick testing
- Production: Use config file to persist settings
- CI/CD: Use env var to keep secrets out of version control
This project was forked and reworked from: https://github.com/wizlijun/claude_translater
Thanks to the original author and contributors for the foundation.
MIT (see LICENSE)