feat: docsfy - AI-powered documentation generator #2
📝 Walkthrough

This pull request introduces docsfy, a complete new AI-powered documentation generator project. It establishes infrastructure (Docker, configuration tools), a FastAPI backend with database storage, AI provider integration, repository handling, markdown-to-HTML rendering with static assets, Jinja templating, and comprehensive test coverage across all modules.
Sequence Diagram

sequenceDiagram
participant Client
participant FastAPI
participant Repository
participant Generator
participant AI_CLI
participant Renderer
participant Storage
participant Database
Client->>FastAPI: POST /api/generate<br/>(repo_url or repo_path)
FastAPI->>Storage: save_project(name, status="generating")
Storage->>Database: INSERT project
Database-->>Storage: ✓
Storage-->>FastAPI: ✓
FastAPI->>Repository: clone_repo or get_local_repo_info
Repository-->>FastAPI: (repo_path, commit_sha)
FastAPI->>Generator: run_planner(repo_path, ...)
Generator->>AI_CLI: call_ai_cli(planner_prompt)
AI_CLI-->>Generator: JSON plan response
Generator->>Generator: parse_json_response
Generator-->>FastAPI: plan dict
FastAPI->>Generator: generate_all_pages(repo_path, plan, ...)
Generator->>AI_CLI: call_ai_cli(page_prompt) [concurrent]
AI_CLI-->>Generator: markdown content
Generator->>Generator: _strip_ai_preamble
Generator-->>FastAPI: {slug: markdown}
FastAPI->>Renderer: render_site(plan, pages, output_dir)
Renderer->>Renderer: render_index, render_page
Renderer->>Renderer: build_search_index, build_llms_txt
Renderer-->>FastAPI: ✓ (HTML files, CSS, JS)
FastAPI->>Storage: update_project_status(status="ready", page_count, ...)
Storage->>Database: UPDATE project
Database-->>Storage: ✓
Storage-->>FastAPI: ✓
Client->>FastAPI: GET /docs/project/index.html
FastAPI-->>Client: rendered HTML with navigation & search
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 18
Note
Due to the large number of review comments, comments of Critical and Major severity were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/plans/2026-03-04-docsfy-design.md (1)
112-116: ⚠️ Potential issue | 🟡 Minor
Same terminal escape code artifact in AI_MODEL default.

The `claude-opus-4-6[1m]` value contains what appears to be a terminal escape sequence. This should be corrected to match the intended model name.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/2026-03-04-docsfy-design.md` around lines 112 - 116, The AI_MODEL default contains a stray terminal escape artifact "claude-opus-4-6[1m]"; update the default value for AI_MODEL to the correct model identifier (e.g., "claude-opus-4-6") wherever it's defined in this document so the table entry reads `AI_MODEL | claude-opus-4-6` and remove the "[1m]" sequence.
🟡 Minor comments (8)
src/docsfy/static/codelabels.js (1)

5-5: ⚠️ Potential issue | 🟡 Minor
Language regex is too restrictive for real-world class names.

`/language-(\w+)/` misses identifiers containing `-`, `+`, or `#`, so some code blocks won't get labels.

Suggested fix

- var match = classes.match(/language-(\w+)/);
+ var match = classes.match(/language-([a-z0-9#+-]+)/i);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/docsfy/static/codelabels.js` at line 5, the regex in the classes.match call is too narrow and misses language identifiers with characters like '-', '+', or '#'; update the pattern used in the classes.match invocation (the variable match assignment in src/docsfy/static/codelabels.js) to allow those characters (e.g., include - + # alongside word chars in the capture group) so code blocks with names containing hyphens or symbols are correctly detected and labeled.

README.md (1)
17-25: ⚠️ Potential issue | 🟡 Minor
Quick Start is platform-specific and slightly misleading.

`docker compose up` runs in the foreground by default, and `open` is macOS-only. Consider `docker compose up -d` plus cross-platform browser guidance.

Suggested doc tweak

-# Run
-docker compose up
+# Run
+docker compose up -d
@@
-# Browse docs
-open http://localhost:8000/docs/repo/
+# Browse docs
+# macOS: open, Linux: xdg-open, Windows: start
+http://localhost:8000/docs/repo/

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 17 - 25, update the Quick Start commands to avoid platform-specific/misleading instructions: change the instructions that currently use "docker compose up" to recommend "docker compose up -d" (or note foreground behavior) and replace the macOS-only "open http://localhost:8000/docs/repo/" with cross-platform guidance (e.g., mention using the URL directly or platform commands like "xdg-open" on Linux and "start" on Windows), and keep the existing curl POST example as-is for generating docs.

src/docsfy/static/search.js (1)
74-76: ⚠️ Potential issue | 🟡 Minor
Guard search against malformed index entries.

If an entry is missing `title` or `content`, `toLowerCase()` throws and search stops rendering.

💡 Proposed fix

- var matches = index.filter(function(item) {
-   return item.title.toLowerCase().includes(q) || item.content.toLowerCase().includes(q);
- }).slice(0, 10);
+ var matches = index.filter(function(item) {
+   var titleText = item && typeof item.title === 'string' ? item.title.toLowerCase() : '';
+   var contentText = item && typeof item.content === 'string' ? item.content.toLowerCase() : '';
+   return titleText.includes(q) || contentText.includes(q);
+ }).slice(0, 10);
@@
- var contentIdx = m.content.toLowerCase().indexOf(q);
+ var rawContent = typeof m.content === 'string' ? m.content : '';
+ var contentIdx = rawContent.toLowerCase().indexOf(q);
  if (contentIdx >= 0) {
    var start = Math.max(0, contentIdx - 40);
-   var end = Math.min(m.content.length, contentIdx + q.length + 60);
-   var snippet = (start > 0 ? '...' : '') + m.content.substring(start, end) + (end < m.content.length ? '...' : '');
+   var end = Math.min(rawContent.length, contentIdx + q.length + 60);
+   var snippet = (start > 0 ? '...' : '') + rawContent.substring(start, end) + (end < rawContent.length ? '...' : '');
    preview.textContent = snippet;
  }

Also applies to: 91-95

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/docsfy/static/search.js` around lines 74 - 76, the current index.filter callback uses item.title.toLowerCase() and item.content.toLowerCase() which will throw if title or content are missing; update the filter in the search logic (the index.filter callback that produces matches) to safely coerce title/content to strings (e.g., let title = (item.title || '').toLowerCase(); let content = (item.content || '').toLowerCase()) and then use title.includes(q) || content.includes(q); apply the same defensive check to the other search occurrence around the code that performs the second filter (the similar block at lines 91-95) so all searches tolerate malformed entries.

src/docsfy/static/github.js (1)
9-13: ⚠️ Potential issue | 🟡 Minor
Regex truncates valid GitHub repo names with dots.

Line 9 captures the repo as `([^/.]+)`, so `owner/my.repo` becomes `my`. That breaks the API request for a valid repo name.

💡 Proposed fix

- var match = repoUrl.match(/github\.com[/:]([^/]+)\/([^/.]+)/);
+ var match = repoUrl.match(/github\.com[/:]([^/]+)\/([^/?#]+?)(?:\.git)?(?:[/?#]|$)/i);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/docsfy/static/github.js` around lines 9 - 13, the current regex in the repoUrl.match call uses ([^/.]+) which stops at a dot and truncates valid repo names; update the pattern used in the github.com match (the repoUrl.match invocation and its resulting match handling for owner and repo) to allow dots in repo names and optionally strip a trailing .git (e.g. match the owner with ([^/]+) and the repo with ([^/]+)(?:\.git)?), then assign owner = match[1] and repo = match[2] as before so full repo names like owner/my.repo are preserved.

tests/test_config.py (1)
42-42:⚠️ Potential issue | 🟡 MinorUse a specific exception type in
pytest.raises.Line 42 should assert
ValidationErrorinstead ofExceptionto avoid hiding unrelated failures.💡 Proposed fix
import pytest +from pydantic import ValidationError @@ - with pytest.raises(Exception): + with pytest.raises(ValidationError): Settings()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_config.py` at line 42, Replace the generic exception assertion in the test (the "with pytest.raises(Exception):" block) with a specific ValidationError by changing it to "with pytest.raises(ValidationError):" and ensure the test file imports ValidationError (e.g., "from pydantic import ValidationError" or the project's ValidationError class) so the test only catches the intended validation failure.docs/plans/2026-03-04-docsfy-implementation-plan.md-18-21 (1)
18-21: ⚠️ Potential issue | 🟡 Minor
Local filesystem paths won't work for other developers.

Lines 18-21 and 79-81 reference paths like `/home/myakove/git/pr-test-oracle/...` which are specific to one developer's machine. Consider either embedding the actual content or referencing a public repository/URL.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/2026-03-04-docsfy-implementation-plan.md` around lines 18 - 21, the plan references local filesystem paths for artifacts (.pre-commit-config.yaml, .flake8, tox.toml, .gitleaks.toml) which won't resolve for other devs; replace those local path references in the docs/plans entry with either the actual content (inline the files or paste their contents into the repo under the same names) or point to a stable public location (a project repo URL or gist) where the files can be fetched, and update the bullet lines to reference the new repository/URLs or the relative paths within this repo instead of /home/... paths.

docs/plans/2026-03-04-docsfy-implementation-plan.md (1)
131-132: ⚠️ Potential issue | 🟡 Minor
Terminal escape code artifact in AI_MODEL default value.

The value `claude-opus-4-6[1m]` appears to contain a terminal escape code artifact (`[1m` is ANSI bold). This appears in multiple places in the plan (lines 131, 265, 322) and will cause the AI CLI to use an invalid model name.

🐛 Proposed fix

-AI_MODEL=claude-opus-4-6[1m]
+AI_MODEL=claude-opus-4-20250514

Or use the intended model name without the escape sequence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/2026-03-04-docsfy-implementation-plan.md` around lines 131 - 132, the AI_MODEL default contains a terminal escape artifact ("claude-opus-4-6[1m]"); remove the ANSI fragment and replace all occurrences with the intended model name (e.g., "claude-opus-4-6") wherever AI_MODEL is defined or referenced in this document (notably the instances matching the shown diff), ensuring other related variables like AI_CLI_TIMEOUT remain unchanged; search for "AI_MODEL" and replace any value containing "[1m]" with the clean model string.

tests/test_models.py (1)
35-36: ⚠️ Potential issue | 🟡 Minor
Use `ValidationError` instead of broad `Exception` assertions in validation tests.

`pytest.raises(Exception)` can pass on unrelated errors and hide regressions. The `GenerateRequest` model uses Pydantic v2 validators that raise `ValueError`, which Pydantic wraps in `ValidationError` before propagating to the caller. Use `ValidationError` specifically for these assertions.

Suggested tightening

+ from pydantic import ValidationError
  ...
- with pytest.raises(Exception):
+ with pytest.raises(ValidationError):
      GenerateRequest(repo_url="not-a-url")
  ...
- with pytest.raises(Exception):
+ with pytest.raises(ValidationError):
      GenerateRequest()
  ...
- with pytest.raises(Exception):
+ with pytest.raises(ValidationError):
      GenerateRequest(
          repo_url="https://github.com/org/repo.git",
          repo_path="/some/path"
      )

Also applies to: 78-79, 85-88
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_models.py` around lines 35 - 36, Replace broad Exception assertions in the tests with Pydantic's ValidationError: change pytest.raises(Exception) to pytest.raises(ValidationError) when instantiating GenerateRequest (and the other failing cases noted for the same model), and import ValidationError from pydantic at the top of tests/test_models.py so the tests assert the specific validation error raised by GenerateRequest's Pydantic validators.
🧹 Nitpick comments (8)
tox.toml (1)
13-26: Avoid nesting `uv run` inside tox unless the isolation tradeoff is intentional.

The use of `uv run --extra dev` at lines 17-24 shifts environment management to uv rather than tox. While `pytest-xdist` is confirmed present in the dev extra (supporting `-n auto`), this architecture creates a dependency on project extras rather than explicit tox-managed deps, which can complicate environment reproducibility. Consider a tox-native approach:

Suggested tox-native approach

 [env.unittests]
-deps = ["uv"]
+deps = [".[dev]"]
 commands = [
   [
-    "uv",
-    "run",
-    "--extra",
-    "dev",
     "pytest",
     "-n",
     "auto",
     "tests",
   ],
 ]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tox.toml` around lines 13 - 26, the tox env 'env.unittests' currently shells out to "uv run --extra dev pytest -n auto" which delegates dependency management to uv; change it to a tox-native setup by removing the "uv run" invocation and instead declare explicit deps like "pytest" and "pytest-xdist" in the env.unittests deps list and run "pytest -n auto tests" directly in the commands array; update the env.unittests block (look for the deps/commands entries) so tox installs and controls test runner dependencies rather than relying on the project's dev extra.

tests/test_repository.py (1)
23-27: Strengthen mock assertions for subprocess call arguments.

The current success test checks return values only. Consider asserting the expected subprocess args (including the timeout and the `--` separator once added) to lock in safety behavior.

💡 Suggested assertion pattern

-from unittest.mock import MagicMock, patch
+from unittest.mock import ANY, MagicMock, call, patch
@@
 with patch("docsfy.repository.subprocess.run") as mock_run:
@@
     repo_path, sha = clone_repo("https://github.com/org/repo.git", tmp_path)
+    assert mock_run.call_args_list[0] == call(
+        ["git", "clone", "--depth", "1", "--", "https://github.com/org/repo.git", str(tmp_path / "repo")],
+        capture_output=True,
+        text=True,
+        timeout=ANY,
+    )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_repository.py` around lines 23 - 27, add assertions that the patched subprocess.run was called with the exact expected arguments to lock behavior: after the test exercise of docsfy.repository functions that invokes subprocess.run, assert mock_run.assert_any_call(...) (or inspect mock_run.call_args_list) includes the expected argv list containing the command elements and the '--' separator and that the timeout kwarg is present with the expected value; reference the patched symbol mock_run (from patch("docsfy.repository.subprocess.run")) and the subprocess invocation in the repository code to check both positional argv contents and keyword args (timeout) rather than only return values.

.pre-commit-config.yaml (1)
34-34: Pin the VCS dependency to an immutable ref.
`git+https://github.com/RedHatQE/flake8-plugins.git` tracks a moving target on the default branch. Pin a commit SHA or tag to ensure reproducible builds and reduce supply-chain risk.

💡 Proposed fix

- [git+https://github.com/RedHatQE/flake8-plugins.git, flake8-mutable]
+ [git+https://github.com/RedHatQE/flake8-plugins.git@<commit-sha>#egg=flake8-plugins, flake8-mutable]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.pre-commit-config.yaml at line 34, the listed VCS dependency git+https://github.com/RedHatQE/flake8-plugins.git for the hook flake8-mutable is unpinned and should be fixed to an immutable ref; update the entry in .pre-commit-config.yaml that contains the string "git+https://github.com/RedHatQE/flake8-plugins.git" (and the hook name "flake8-mutable") to reference a specific tag or commit SHA (e.g., append @<tag-or-sha> or set rev: to a SHA) so the pre-commit hook is pinned and reproducible.

src/docsfy/storage.py (2)
49-78: SQL query construction is safe, but the pattern could be improved.

Ruff flags potential SQL injection (S608), but this is a false positive since `fields` only contains hardcoded column names. However, the dynamic SQL pattern could be clearer. Consider adding a comment to document this safety guarantee for future maintainers.

💡 Optional: Add safety comment

 async def update_project_status(
     name: str,
     status: str,
     last_commit_sha: str | None = None,
     page_count: int | None = None,
     error_message: str | None = None,
     plan_json: str | None = None,
 ) -> None:
     async with aiosqlite.connect(DB_PATH) as db:
+        # Fields list contains only hardcoded column names - safe from SQL injection
+        # All user values are parameterized via the values list
         fields = ["status = ?", "updated_at = CURRENT_TIMESTAMP"]
         values: list[str | int | None] = [status]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/docsfy/storage.py` around lines 49 - 78, The dynamic SQL in update_project_status builds the fields list from hardcoded column names which is safe but trigged an S608 false positive; add a brief comment above the fields/values construction (near the symbols fields, values and DB_PATH in update_project_status) stating that fields are only populated with predetermined column names (no user-controlled input) and therefore safe from SQL injection, or alternatively replace the implicit appends with an explicit allowed_columns/column-to-placeholder mapping to make the guarantee obvious to future maintainers and linters.
8-10: Module-level variables may cause issues in tests.
`DB_PATH`, `DATA_DIR`, and `PROJECTS_DIR` are computed at module import time from environment variables. The test fixture directly mutates these module attributes, which works but is fragile. This is acceptable for the current test setup, but be aware of potential issues if tests run in parallel or if the module is reimported.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/docsfy/storage.py` around lines 8 - 10, DB_PATH, DATA_DIR, and PROJECTS_DIR are computed at import time which makes tests that mutate module attributes fragile; change them to be computed lazily by replacing the module-level constants with accessor functions (e.g., get_db_path(), get_data_dir(), get_projects_dir()) or properties that read os.getenv() on each call, update all call sites and tests to use these accessors, and ensure tests set environment variables (or monkeypatch the accessors) before calling the accessors so parallel/reimport scenarios no longer rely on mutable module state.src/docsfy/static/style.css (1)
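The lazy-accessor refactor suggested above can be sketched as follows. The accessor names and the DOCSFY_DATA_DIR environment variable are illustrative assumptions, not the module's actual symbols; the point is that the environment is read on every call rather than once at import time:

```python
import os
from pathlib import Path

# Hypothetical accessors: reading os.getenv on each call means tests that
# change the environment variable see the new paths without reimporting
# the module or mutating module attributes.
def get_data_dir() -> Path:
    return Path(os.getenv("DOCSFY_DATA_DIR", "data"))

def get_db_path() -> Path:
    return get_data_dir() / "docsfy.db"

def get_projects_dir() -> Path:
    return get_data_dir() / "projects"
```

Call sites and tests would then use `get_db_path()` instead of a `DB_PATH` constant, and a test can simply monkeypatch the environment variable before each call.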
628-633: Consider using the complex `:not()` pseudo-class notation for Stylelint compliance.

Stylelint flags the chained simple `:not()` selectors. The complex notation is more readable and future-proof.

♻️ Proposed refactor

-blockquote:not(.callout-note):not(.callout-warning):not(.callout-tip) {
+blockquote:not(.callout-note, .callout-warning, .callout-tip) {
   border-left: 4px solid var(--border-primary);
   padding: 1rem 1.25rem;
   margin: 1.5rem 0;
   color: var(--text-secondary);
 }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/docsfy/static/style.css` around lines 628 - 633, The selector "blockquote:not(.callout-note):not(.callout-warning):not(.callout-tip)" is flagged by Stylelint for chained :not() usage; replace the chained simple :not() pseudo-classes with a single complex :not() containing the comma-separated list of the three callout classes so Stylelint passes and the rule remains equivalent, updating the block where the selector is defined in style.css (the blockquote selector) and ensuring spacing and variable usage (border-left, padding, margin, color) remain unchanged.tests/test_renderer.py (1)
56-73: Strengthen the `search-index.json` test beyond file existence.

Consider asserting valid JSON structure and at least one expected entry (`slug`, `title`, or searchable content) so regressions in index serialization are caught.

Suggested test hardening

+import json
 ...
 def test_search_index_generated(tmp_path: Path) -> None:
@@
     render_site(plan=plan, pages=pages, output_dir=output_dir)
-    assert (output_dir / "search-index.json").exists()
+    index_path = output_dir / "search-index.json"
+    assert index_path.exists()
+    payload = json.loads(index_path.read_text())
+    assert isinstance(payload, list)
+    assert any(item.get("slug") == "intro" for item in payload)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_renderer.py` around lines 56 - 73, Update the test_search_index_generated test to not only check existence of search-index.json but to open and json.load the file produced by render_site, assert it is valid JSON (list or dict as expected by your renderer), and assert at least one entry contains the expected page data (e.g., an entry with slug "intro" and/or title "Intro" or searchable content substring). Use the same output_dir / "search-index.json" path, call json.loads or json.load on that file, and add assertions on the structure and presence of the expected keys/values to catch serialization regressions.src/docsfy/templates/index.html (1)
43-44: Expose sidebar state to assistive technologies.

The toggle should maintain `aria-expanded` and reference the controlled element for better accessibility.

Suggested accessibility update

- <button class="sidebar-toggle" id="sidebar-toggle" aria-label="Toggle sidebar">
+ <button class="sidebar-toggle" id="sidebar-toggle" aria-label="Toggle sidebar" aria-controls="sidebar" aria-expanded="false">
@@
  toggle.addEventListener('click', function() {
-   sidebar.classList.toggle('open');
+   var isOpen = sidebar.classList.toggle('open');
+   toggle.setAttribute('aria-expanded', isOpen ? 'true' : 'false');
    if (overlay) overlay.classList.toggle('open');
  });

Also applies to: 123-126
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/docsfy/templates/index.html` around lines 43 - 44, The sidebar toggle button with id "sidebar-toggle" must expose its state to ATs: add an aria-controls attribute pointing to the controlled sidebar element id (e.g., "sidebar") and ensure the button maintains an accurate aria-expanded boolean that is updated when the toggle runs; locate the toggle element (id "sidebar-toggle") and the sidebar element (class or id "sidebar") and update the toggle handler (e.g., the click listener or function that shows/hides the sidebar) to set button.setAttribute('aria-expanded', String(isOpen)) and button.setAttribute('aria-controls', sidebarId) whenever the visibility changes so screen readers can detect the relationship and current state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.env.example:
- Line 3: The AI_MODEL value in .env.example is malformed
("claude-opus-4-6[1m"); replace it with a valid Claude model ID (for example
"claude-opus-4-20250514" or the newer "claude-opus-4-1-20250805") so the
AI_MODEL environment variable uses a correct, complete model identifier without
ANSI sequences.
In @.gitleaks.toml:
- Around line 4-7: The allowlist entry under the [allowlist] section currently
exempts all Python test files via the paths list ('''tests/.*\.py''') which is
too broad; instead, narrow the scope by removing the blanket paths pattern and
migrate exemptions to rule-level allowlists for specific known false positives
(or list explicit filenames rather than a directory-wide regex). Update the
config to remove or tighten the '''tests/.*\.py''' path entry and add
rule-specific allowlist entries referencing the offending rule IDs or exact test
fixture filenames so only known benign files are exempted.
In `@src/docsfy/config.py`:
- Line 17: The default ai_model string is invalid; remove the ANSI suffix and
replace the unsupported model id by setting the ai_model variable (ai_model:
str) to a supported Anthropic model identifier such as
"claude-sonnet-4-20250514" so API calls will succeed (i.e., change ai_model from
"claude-opus-4-6[1m]" to "claude-sonnet-4-20250514").
In `@src/docsfy/generator.py`:
- Around line 67-68: Reject or sanitize unsafe AI-controlled slugs before using
them to construct filesystem paths: validate the `slug` used when creating
`cache_file = cache_dir / f"{slug}.md"` (and the other occurrences around the
`use_cache` checks at the later blocks) to ensure it contains only allowed
characters (e.g., alphanumerics, hyphen, underscore), does not contain path
separators like "..", "/", or "\" and does not start with a dot; alternatively
resolve the resulting path and assert it is inside `cache_dir` (e.g., compare
resolved parents) before any open/write operations, and raise/return an error
for invalid slugs so no file can be written outside `cache_dir`.
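The slug validation described in that prompt can be sketched like this. The helper name, the exact allowed-character set, and the `.md` suffix are assumptions for illustration, not the generator's real API:

```python
import re
from pathlib import Path

# Strict by design: alphanumerics, hyphen, underscore; no dots, slashes,
# or leading '.', so "..", "a/b", and ".hidden" are all rejected outright.
_SLUG_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_-]*$")

def safe_cache_path(cache_dir: Path, slug: str) -> Path:
    """Hypothetical helper: validate an AI-supplied slug and return a
    cache file path guaranteed to stay inside cache_dir."""
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"unsafe slug: {slug!r}")
    target = cache_dir / f"{slug}.md"
    # Belt and braces: confirm the resolved path is still under cache_dir.
    if cache_dir.resolve() not in target.resolve().parents:
        raise ValueError(f"slug escapes cache dir: {slug!r}")
    return target
```

The regex check alone already blocks traversal; the resolve-and-compare step is a second line of defense in case the allowed set is ever loosened.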
In `@src/docsfy/main.py`:
- Around line 275-283: The current code builds the entire tar.gz in memory using
BytesIO and returns a StreamingResponse, which risks high memory usage; instead,
create the archive on disk (e.g., using a temporary Path) in a background thread
(use loop.run_in_executor or FastAPI's async_to_sync pattern) by calling
tarfile.open(out_path, mode="w:gz") and tar.add(site_dir, arcname=name), then
return a FileResponse pointing at that out_path with the same
Content-Disposition header; ensure you delete the temp file after transfer or
use a temp file that is cleaned up automatically.
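The disk-backed archive approach can be sketched as below. The helper name and temp-file handling are illustrative assumptions; the real route would wrap the returned path in a FileResponse with the same Content-Disposition header and delete the file after the transfer:

```python
import asyncio
import os
import tarfile
import tempfile
from pathlib import Path

async def build_archive(site_dir: Path, name: str) -> Path:
    """Hypothetical helper: pack site_dir into a .tar.gz on disk
    without buffering the whole archive in memory or blocking the
    event loop."""
    fd, tmp = tempfile.mkstemp(suffix=".tar.gz")
    os.close(fd)  # tarfile reopens the path itself
    out_path = Path(tmp)

    def _pack() -> None:
        with tarfile.open(out_path, mode="w:gz") as tar:
            tar.add(site_dir, arcname=name)

    # Run the blocking tarfile work in the default thread pool executor
    # so the FastAPI event loop stays responsive.
    await asyncio.get_running_loop().run_in_executor(None, _pack)
    return out_path
```

Compared with building a BytesIO in memory, peak memory stays flat regardless of site size, at the cost of temp-file cleanup responsibility.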
- Around line 75-108: The generate route is adding project_name into the
_generating set before validating it and not removing it if downstream
save_project or task creation fails; update the generate function to first
validate/normalize project_name (e.g., disallow spaces/special chars or run
existing route-name validation) and reject with HTTPException if invalid, then
add to _generating only after validation, and wrap the save_project call and
asyncio.create_task invocation in try/except/finally so that on any exception
you remove project_name from _generating and re-raise or return a
500/appropriate HTTPException; reference the generate function, the _generating
set, save_project, and _run_generation/asyncio.create_task when implementing
these changes.
In `@src/docsfy/renderer.py`:
- Around line 170-174: The render_site function currently uses
output_dir.mkdir(..., exist_ok=True) and assets_dir.mkdir(..., exist_ok=True)
which leaves previous build artifacts; modify render_site to remove any existing
output_dir contents before creating directories (e.g., check if
output_dir.exists() and call shutil.rmtree(output_dir) then recreate output_dir
and assets_dir), ensuring you import shutil at the top and keep the same
variable names (render_site, output_dir, assets_dir) so stale .html/.md/assets
are cleaned prior to rendering.
- Around line 188-208: The loop in render_page usage writes files using slug
directly (variables slug and output_dir) which permits path traversal; fix by
validating/sanitizing slug before any write: reject or normalize slugs that are
absolute, contain path separators or ".." (e.g. ensure slug == Path(slug).name
and matches a safe regex like r'^[A-Za-z0-9._-]+$'), then compute target =
(output_dir / safe_slug).resolve() and assert
str(target).startswith(str(output_dir.resolve())) before calling write_text for
both f"{slug}.html" and f"{slug}.md"; update the code around render_page and the
two write_text calls to use safe_slug/target and raise/log an error for invalid
slugs.
- Around line 31-42: The _md_to_html function returns HTML created by
python-markdown which can contain raw unsafe HTML; update _md_to_html to
sanitize both content_html and toc_html with Bleach before returning: add bleach
to dependencies, then call bleach.clean on content_html and toc_html using a
whitelist that preserves expected markdown HTML (allow tags needed for headings,
paragraphs, lists, links, images, code blocks, pre, span with classes for
codehilite) and allow attributes like href/src/alt/class/title and rel on links;
also use bleach.linkify or set rel="noopener noreferrer" on links if desired;
ensure the sanitized strings are what _md_to_html returns so the template's {{
content | safe }} no longer exposes stored XSS.
In `@src/docsfy/repository.py`:
- Line 23: The log currently prints the raw repo_url which can contain userinfo
(credentials) — update the logging in repository.py so logger.info does not
include sensitive userinfo from repo_url; parse repo_url (e.g., via
urllib.parse.urlparse) and redact or remove the username:password portion before
logging (keep repo_path in the message), then log the sanitized URL instead of
the raw repo_url in the logger.info call.
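The redaction step can be sketched with urllib.parse. The helper name is an assumption, and this sketch only covers URL-style remotes; SSH shorthand like git@host:org/repo has no netloc and passes through unchanged:

```python
from urllib.parse import urlsplit, urlunsplit

def redact_url(repo_url: str) -> str:
    """Drop any user:password@ userinfo from a URL before logging it."""
    parts = urlsplit(repo_url)
    if "@" not in parts.netloc:
        return repo_url  # nothing to redact
    # Keep only the host[:port] part after the last '@'.
    host = parts.netloc.rsplit("@", 1)[1]
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

The logger.info call would then log `redact_url(repo_url)` alongside repo_path instead of the raw URL.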
In `@src/docsfy/static/copy.js`:
- Around line 18-26: The Clipboard API call can throw synchronously when
unavailable; before calling navigator.clipboard.writeText(text) check that
navigator.clipboard exists and window.isSecureContext is true, and if not,
invoke the existing fallback copy routine instead of calling writeText;
otherwise proceed to call navigator.clipboard.writeText(text). Ensure you
reference the same button variable (btn) and preserve the success and error
handling (setting btn.textContent and btn.classList) in the Promise path while
routing to the fallback path immediately when the guard fails.
In `@src/docsfy/static/style.css`:
- Around line 1055-1070: The CSS rules for .copy-btn, pre:hover .copy-btn and
.copy-btn:hover reference undefined custom properties (--border-color and
--accent-color); update those rules to use the existing variables
(--border-primary and --accent) or add matching variable definitions. Locate the
selectors .copy-btn, pre:hover .copy-btn and .copy-btn:hover in
src/docsfy/static/style.css and either replace --border-color with
--border-primary and --accent-color with --accent, or add :root declarations for
--border-color and --accent-color mapping to the existing values so the button
borders, background and color resolve correctly.
- Around line 1075-1117: The CSS uses undefined variables (--border-color,
--accent-color, and --text-secondary) in selectors like .page-nav,
.page-nav-link, .page-nav-link:hover, .page-nav-label, and .page-nav-title; fix
by adding default fallbacks or defining those variables at the root (e.g.,
:root) so the styles render predictably — update the stylesheet to either
declare --border-color, --accent-color, and --text-secondary with appropriate
values or change usages to var(--border-color, <fallback>), var(--accent-color,
<fallback>), and var(--text-secondary, <fallback>) in the .page-nav and related
rules (page-nav, page-nav-link, page-nav-link:hover, page-nav-label,
page-nav-title).
In `@src/docsfy/static/theme.js`:
- Around line 3-15: Wrap all accesses to localStorage in try-catch to avoid
SecurityError/QuotaExceededError: when reading the initial theme, guard the call
to localStorage.getItem('theme') (the variable stored) with try-catch and treat
failures as "no stored theme" so the existing prefers-color-scheme fallback
runs; likewise, inside the toggle click handler, wrap
localStorage.setItem('theme', next) in try-catch so toggling still updates
data-theme even if storage write fails. Update the code around the stored
variable and the toggle.addEventListener callback to catch and ignore storage
exceptions (optionally log) without breaking theme application.
In `@src/docsfy/templates/page.html`:
- Line 77: The template currently renders AI-generated HTML with the Jinja2
|safe filter for variables content and toc, which bypasses autoescaping and
permits XSS; to fix, sanitize the markdown-generated HTML before passing it to
the template by updating the markdown-to-HTML pipeline (e.g., the function
_md_to_html or whichever converter is used in generate_page) to call a sanitizer
like bleach.clean on both the converted content_html and toc_html (specifying
allowed_tags/attributes and strip=True), then either remove |safe from the
template or rename the sanitized values to content_sanitized/toc_sanitized and
use those in page.html to ensure only cleaned HTML is rendered.
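For illustration, here is a stdlib-only toy of the allowlist approach. The review's actual suggestion is bleach.clean, which additionally handles attributes, URL protocols, and malformed markup; treat this only as a sketch of the idea (the tag set is invented for the example):

```python
import html
from html.parser import HTMLParser

# Illustrative allowlist; a real one would also permit safe attributes
# (e.g. href on <a>) with protocol checks, as bleach.clean does.
ALLOWED_TAGS = {
    "p", "a", "pre", "code", "em", "strong", "ul", "ol", "li",
    "h1", "h2", "h3", "h4", "blockquote", "table", "tr", "td", "th",
}


class _Sanitizer(HTMLParser):
    """Keep allowed tags (attributes dropped), escape all text content."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes deliberately dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data))


def sanitize(raw_html: str) -> str:
    parser = _Sanitizer()
    parser.feed(raw_html)
    parser.close()
    return "".join(parser.out)
```

Disallowed tags such as `<script>` are stripped while their inner text survives as escaped plain text, which matches bleach's `strip=True` behavior.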
In `@tests/test_config.py`:
- Around line 12-13: The tests are nondeterministic because Settings() still
reads .env files even when patch.dict(os.environ, {}, clear=True) is used;
update each test that currently calls Settings() (the instances created
alongside patch.dict(os.environ, ..., clear=True)) to instantiate Settings with
_env_file=None (e.g., Settings(_env_file=None)) so the Settings class won’t load
any .env file and the environment is fully controlled by the patched os.environ;
keep the patch.dict(...) usage but replace bare Settings() calls with
Settings(_env_file=None).
In `@tox.toml`:
- Around line 1-3: Replace the legacy tox keys: change the "skipsdist" setting
to the canonical "no_package" and rename "envlist" to "env_list" so the tox.toml
uses tox 4 standard keys (update the entries for "skipsdist" and "envlist"
accordingly).
---
Outside diff comments:
In `@docs/plans/2026-03-04-docsfy-design.md`:
- Around line 112-116: The AI_MODEL default contains a stray terminal escape
artifact "claude-opus-4-6[1m]"; update the default value for AI_MODEL to the
correct model identifier (e.g., "claude-opus-4-6") wherever it's defined in this
document so the table entry reads `AI_MODEL | claude-opus-4-6` and remove the
"[1m]" sequence.
---
Minor comments:
In `@docs/plans/2026-03-04-docsfy-implementation-plan.md`:
- Around line 18-21: The plan references local filesystem paths for artifacts
(.pre-commit-config.yaml, .flake8, tox.toml, .gitleaks.toml) which won't resolve
for other devs; replace those local path references in the docs/plans entry with
either the actual content (inline the files or paste their contents into the
repo under the same names) or point to a stable public location (a project repo
URL or gist) where the files can be fetched, and update the bullet lines to
reference the new repository/URLs or the relative paths within this repo instead
of /home/... paths.
- Around line 131-132: The AI_MODEL default contains a terminal escape artifact
("claude-opus-4-6[1m]"); remove the ANSI fragment and replace all occurrences
with the intended model name (e.g., "claude-opus-4-6") wherever AI_MODEL is
defined or referenced in this document (notably the instances matching the shown
diff), ensuring other related variables like AI_CLI_TIMEOUT remain unchanged;
search for "AI_MODEL" and replace any value containing "[1m]" with the clean
model string.
In `@README.md`:
- Around line 17-25: Update the Quick Start commands to avoid
platform-specific/misleading instructions: change the instructions that
currently use "docker compose up" to recommend "docker compose up -d" (or note
foreground behavior) and replace the macOS-only "open
http://localhost:8000/docs/repo/" with cross-platform guidance (e.g., mention
using the URL directly or platform commands like "xdg-open" on Linux and "start"
on Windows), and keep the existing curl POST example as-is for generating docs.
In `@src/docsfy/static/codelabels.js`:
- Line 5: The regex in the classes.match call is too narrow and misses language
identifiers with characters like '-', '+', or '#'; update the pattern used in
the classes.match invocation (the variable match assignment in
src/docsfy/static/codelabels.js) to allow those characters (e.g., include - + #
alongside word chars in the capture group) so code blocks with names containing
hyphens or symbols are correctly detected and labeled.
In `@src/docsfy/static/github.js`:
- Around line 9-13: The current regex in the repoUrl.match call uses ([^/.]+)
which stops at a dot and truncates valid repo names; update the pattern used in
the github.com match (the repoUrl.match invocation and its resulting match
handling for owner and repo) to allow dots in repo names and optionally strip a
trailing .git (e.g. match the owner with ([^/]+) and the repo with
([^/]+)(?:\.git)?), then assign owner = match[1] and repo = match[2] as before
so full repo names like owner/my.repo are preserved.
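The file itself is JavaScript, but the pattern transfers directly; a Python sketch of the broadened regex (the helper name is invented):

```python
from __future__ import annotations

import re

# Allow dots in repo names and strip an optional trailing ".git".
GITHUB_RE = re.compile(r"github\.com/([^/]+)/([^/]+?)(?:\.git)?/?$")


def parse_owner_repo(repo_url: str) -> tuple[str, str] | None:
    match = GITHUB_RE.search(repo_url)
    if not match:
        return None
    return match.group(1), match.group(2)
```

The non-greedy repo group plus the optional `(?:\.git)?` suffix keeps `owner/my.repo` intact while still stripping a trailing `.git`.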
In `@src/docsfy/static/search.js`:
- Around line 74-76: The current index.filter callback uses
item.title.toLowerCase() and item.content.toLowerCase() which will throw if
title or content are missing; update the filter in the search logic (the
index.filter callback that produces matches) to safely coerce title/content to
strings (e.g., let title = (item.title || '').toLowerCase(); let content =
(item.content || '').toLowerCase()) and then use title.includes(q) ||
content.includes(q); apply the same defensive check to the other search
occurrence around the code that performs the second filter (the similar block at
lines 91-95) so all searches tolerate malformed entries.
In `@tests/test_config.py`:
- Line 42: Replace the generic exception assertion in the test (the "with
pytest.raises(Exception):" block) with a specific ValidationError by changing it
to "with pytest.raises(ValidationError):" and ensure the test file imports
ValidationError (e.g., "from pydantic import ValidationError" or the project's
ValidationError class) so the test only catches the intended validation failure.
In `@tests/test_models.py`:
- Around line 35-36: Replace broad Exception assertions in the tests with
Pydantic's ValidationError: change pytest.raises(Exception) to
pytest.raises(ValidationError) when instantiating GenerateRequest (and the other
failing cases noted for the same model), and import ValidationError from
pydantic at the top of tests/test_models.py so the tests assert the specific
validation error raised by GenerateRequest's Pydantic validators.
---
Nitpick comments:
In @.pre-commit-config.yaml:
- Line 34: The listed VCS dependency
git+https://github.com/RedHatQE/flake8-plugins.git for the hook flake8-mutable
is unpinned and should be fixed to an immutable ref; update the entry in
.pre-commit-config.yaml that contains the string
"git+https://github.com/RedHatQE/flake8-plugins.git" (and the hook name
"flake8-mutable") to reference a specific tag or commit SHA (e.g., append
@<tag-or-sha> or set rev: to a SHA) so the pre-commit hook is pinned and
reproducible.
In `@src/docsfy/static/style.css`:
- Around line 628-633: The selector
"blockquote:not(.callout-note):not(.callout-warning):not(.callout-tip)" is
flagged by Stylelint for chained :not() usage; replace the chained simple :not()
pseudo-classes with a single complex :not() containing the comma-separated list
of the three callout classes so Stylelint passes and the rule remains
equivalent, updating the block where the selector is defined in style.css (the
blockquote selector) and ensuring spacing and variable usage (border-left,
padding, margin, color) remain unchanged.
In `@src/docsfy/storage.py`:
- Around line 49-78: The dynamic SQL in update_project_status builds the fields
list from hardcoded column names, which is safe but triggered an S608 false
positive; add a brief comment above the fields/values construction (near the
symbols fields, values and DB_PATH in update_project_status) stating that fields
are only populated with predetermined column names (no user-controlled input)
and therefore safe from SQL injection, or alternatively replace the implicit
appends with an explicit allowed_columns/column-to-placeholder mapping to make
the guarantee obvious to future maintainers and linters.
- Around line 8-10: DB_PATH, DATA_DIR, and PROJECTS_DIR are computed at import
time which makes tests that mutate module attributes fragile; change them to be
computed lazily by replacing the module-level constants with accessor functions
(e.g., get_db_path(), get_data_dir(), get_projects_dir()) or properties that
read os.getenv() on each call, update all call sites and tests to use these
accessors, and ensure tests set environment variables (or monkeypatch the
accessors) before calling the accessors so parallel/reimport scenarios no longer
rely on mutable module state.
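A sketch of the lazy-accessor shape, assuming an environment variable named DOCSFY_DATA_DIR (the real variable name and defaults may differ):

```python
import os
from pathlib import Path


def get_data_dir() -> Path:
    # Read the environment on every call so tests can monkeypatch
    # DOCSFY_DATA_DIR without re-importing the module.
    return Path(os.getenv("DOCSFY_DATA_DIR", "data"))


def get_db_path() -> Path:
    return get_data_dir() / "docsfy.db"


def get_projects_dir() -> Path:
    return get_data_dir() / "projects"
```

Because nothing is cached at import time, two tests running in the same process can point the accessors at different directories simply by setting the variable.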
In `@src/docsfy/templates/index.html`:
- Around line 43-44: The sidebar toggle button with id "sidebar-toggle" must
expose its state to ATs: add an aria-controls attribute pointing to the
controlled sidebar element id (e.g., "sidebar") and ensure the button maintains
an accurate aria-expanded boolean that is updated when the toggle runs; locate
the toggle element (id "sidebar-toggle") and the sidebar element (class or id
"sidebar") and update the toggle handler (e.g., the click listener or function
that shows/hides the sidebar) to set button.setAttribute('aria-expanded',
String(isOpen)) and button.setAttribute('aria-controls', sidebarId) whenever the
visibility changes so screen readers can detect the relationship and current
state.
In `@tests/test_renderer.py`:
- Around line 56-73: Update the test_search_index_generated test to not only
check existence of search-index.json but to open and json.load the file produced
by render_site, assert it is valid JSON (list or dict as expected by your
renderer), and assert at least one entry contains the expected page data (e.g.,
an entry with slug "intro" and/or title "Intro" or searchable content
substring). Use the same output_dir / "search-index.json" path, call json.loads
or json.load on that file, and add assertions on the structure and presence of
the expected keys/values to catch serialization regressions.
In `@tests/test_repository.py`:
- Around line 23-27: Add assertions that the patched subprocess.run was called
with the exact expected arguments to lock behavior: after the test exercise of
docsfy.repository functions that invokes subprocess.run, assert
mock_run.assert_any_call(...) (or inspect mock_run.call_args_list) includes the
expected argv list containing the command elements and the '--' separator and
that timeout kwarg is present with the expected value; reference the patched
symbol mock_run (from patch("docsfy.repository.subprocess.run")) and the
subprocess invocation in the repository code to check both positional argv
contents and keyword args (timeout) rather than only return values.
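A self-contained sketch of that pattern, with a hypothetical clone helper standing in for the real docsfy.repository call (argv and timeout values are invented):

```python
import subprocess
from unittest.mock import patch


def clone_repo(repo_url: str, dest: str) -> None:
    # Stand-in for the real helper; "--" stops option parsing so a
    # hostile repo_url cannot inject git flags.
    subprocess.run(
        ["git", "clone", "--depth", "1", "--", repo_url, dest],
        check=True,
        timeout=300,
    )


with patch("subprocess.run") as mock_run:
    clone_repo("https://example.com/repo.git", "/tmp/dest")
    # Lock in both the argv (including the "--" separator) and the kwargs.
    mock_run.assert_called_once_with(
        ["git", "clone", "--depth", "1", "--", "https://example.com/repo.git", "/tmp/dest"],
        check=True,
        timeout=300,
    )
```

Asserting the full call signature rather than just the return value means any future change to the argv or timeout fails the test loudly.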
In `@tox.toml`:
- Around line 13-26: The tox env 'env.unittests' currently shells out to "uv run
--extra dev pytest -n auto" which delegates dependency management to uv; change
it to a tox-native setup by removing the "uv run" invocation and instead declare
explicit deps like "pytest" and "pytest-xdist" in the env.unittests deps list
and run "pytest -n auto tests" directly in the commands array; update the
env.unittests block (look for the deps/commands entries) so tox installs and
controls test runner dependencies rather than relying on the project's dev
extra.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a047a6c9-c6ca-44e9-bdf8-c9a414ed8eeb
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (44)
.env.example, .flake8, .gitleaks.toml, .pre-commit-config.yaml, Dockerfile, README.md, docker-compose.yaml, docs/plans/2026-03-04-docsfy-design.md, docs/plans/2026-03-04-docsfy-implementation-plan.md, pyproject.toml, src/docsfy/__init__.py, src/docsfy/ai_client.py, src/docsfy/config.py, src/docsfy/generator.py, src/docsfy/json_parser.py, src/docsfy/main.py, src/docsfy/models.py, src/docsfy/prompts.py, src/docsfy/renderer.py, src/docsfy/repository.py, src/docsfy/static/callouts.js, src/docsfy/static/codelabels.js, src/docsfy/static/copy.js, src/docsfy/static/github.js, src/docsfy/static/scrollspy.js, src/docsfy/static/search.js, src/docsfy/static/style.css, src/docsfy/static/theme.js, src/docsfy/storage.py, src/docsfy/templates/index.html, src/docsfy/templates/page.html, tests/__init__.py, tests/test_ai_client.py, tests/test_config.py, tests/test_generator.py, tests/test_integration.py, tests/test_json_parser.py, tests/test_main.py, tests/test_models.py, tests/test_prompts.py, tests/test_renderer.py, tests/test_repository.py, tests/test_storage.py, tox.toml
@@ -0,0 +1,21 @@
# AI Configuration
AI_PROVIDER=claude
AI_MODEL=claude-opus-4-6[1m]
🧩 Analysis chain
🌐 Web query:
Anthropic Claude model IDs supported environment variable 2025 2026
💡 Result:
Environment variables (Claude Code) and what model IDs you can set (2025–2026)
1) ANTHROPIC_MODEL (directly sets the model)
Claude Code supports setting a full Anthropic model ID via ANTHROPIC_MODEL. The Claude Help Center lists these supported model IDs (at least): claude-opus-4-1-20250805, claude-opus-4-20250514, claude-sonnet-4-20250514, claude-3-7-sonnet-20250219, claude-3-5-haiku-20241022. [1]
2) Alias-mapping env vars (set what sonnet / opus / haiku mean)
Claude Code also supports environment variables that control which full model names the aliases map to: ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL, ANTHROPIC_DEFAULT_HAIKU_MODEL, plus CLAUDE_CODE_SUBAGENT_MODEL for subagents. [2]
(And ANTHROPIC_SMALL_FAST_MODEL is deprecated in favor of ANTHROPIC_DEFAULT_HAIKU_MODEL.) [2]
3) If you mean “supported model IDs” for the Anthropic API (not just Claude Code)
Use the Models API (GET /v1/models) to retrieve the current list of available model IDs; the response returns model id values (example shown: claude-sonnet-4-20250514). [3]
4) Deprecations/retirements that affect whether an ID still works
For example, Claude Sonnet 3.5 model IDs claude-3-5-sonnet-20240620 and claude-3-5-sonnet-20241022 were deprecated and scheduled for retirement (Oct 22, 2025), with a recommendation to migrate to claude-sonnet-4-20250514. [4]
Sources: [1] [2] [3] [4]
Replace the malformed default AI_MODEL value with a valid Claude model ID.
claude-opus-4-6[1m] contains a trailing ANSI escape sequence ([1m) and an incomplete model identifier. Claude model IDs require date suffixes; the correct format is claude-opus-4-20250514 (or claude-opus-4-1-20250805 for the latest version).
Suggested fix
-AI_MODEL=claude-opus-4-6[1m]
+AI_MODEL=claude-opus-4-20250514
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
AI_MODEL=claude-opus-4-20250514
🧰 Tools
🪛 dotenv-linter (4.0.0)
[warning] 3-3: [UnorderedKey] The AI_MODEL key should go before the AI_PROVIDER key
(UnorderedKey)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.env.example at line 3, The AI_MODEL value in .env.example is malformed
("claude-opus-4-6[1m"); replace it with a valid Claude model ID (for example
"claude-opus-4-20250514" or the newer "claude-opus-4-1-20250805") so the
AI_MODEL environment variable uses a correct, complete model identifier without
ANSI sequences.
[allowlist]
paths = [
    '''tests/.*\.py''',
]
Narrow the GitLeaks allowlist scope.
This allowlist exempts all Python test files from secret scanning, which can mask real leaked credentials in test code/fixtures. Prefer rule-level allowlists for specific known false positives instead of path-wide exclusions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.gitleaks.toml around lines 4 - 7, The allowlist entry under the [allowlist]
section currently exempts all Python test files via the paths list
('''tests/.*\.py''') which is too broad; instead, narrow the scope by removing
the blanket paths pattern and migrate exemptions to rule-level allowlists for
specific known false positives (or list explicit filenames rather than a
directory-wide regex). Update the config to remove or tighten the
'''tests/.*\.py''' path entry and add rule-specific allowlist entries
referencing the offending rule IDs or exact test fixture filenames so only known
benign files are exempted.
    )

    ai_provider: str = "claude"
    ai_model: str = "claude-opus-4-6[1m]"
🧩 Analysis chain
🏁 Script executed:
cat -n src/docsfy/config.py | head -30
Repository: myk-org/docsfy
Length of output: 800
🌐 Web query:
Anthropic Claude valid model IDs API documentation latest 2024 2025
💡 Result:
Where to get the currently valid Claude model IDs (authoritative)
- Use the Models API: GET /v1/models to list all models your API key can use; the returned data[].id values are the valid model IDs. Example docs + sample response are in Anthropic’s “List Models” reference. [1]
- To validate/resolve a specific ID (or alias), use GET /v1/models/{model_id} (“Get a Model”). [2]
2024–2025 model IDs explicitly referenced in Anthropic docs / release notes
- Claude Sonnet 3.5 (deprecated Aug 13, 2025; retired Oct 22, 2025): claude-3-5-sonnet-20240620, claude-3-5-sonnet-20241022 [3]
- Claude Sonnet 4 (recommended migration target in the same deprecation notice): claude-sonnet-4-20250514 [3]
- Claude Opus 3 (snapshot ID from 2024; retired Jan 5, 2026, so no longer valid today): claude-3-opus-20240229 [4]
“Latest” note about documentation support in late 2024
- Anthropic’s release notes state the Models API became generally available on Dec 17, 2024, enabling querying available models, validating model IDs, and resolving aliases. [3]
Sources:
[1] Anthropic docs — List Models (GET /v1/models) https://platform.claude.com/docs/en/api/models/list
[2] Anthropic docs — Get a Model (GET /v1/models/{model_id}) https://platform.claude.com/docs/en/api/models/retrieve
[3] Anthropic docs — API release notes (includes Sonnet 3.5 deprecation + retirement date; migration to Sonnet 4 claude-sonnet-4-20250514; Models API GA Dec 17, 2024) https://platform.claude.com/docs/en/release-notes/overview
[4] Claude Developer Platform release notes — Opus 3 retirement (claude-3-opus-20240229) https://platform.claude.com/docs/en/release-notes/overview
Default ai_model is invalid and will fail API calls.
Line 17 contains "claude-opus-4-6[1m]" which has two problems:
- The [1m] suffix is an ANSI escape sequence, not a valid model identifier
- Even without it, claude-opus-4-6 is not a supported Anthropic Claude model ID
Current valid model IDs per Anthropic documentation include claude-sonnet-4-20250514. Update to a supported model identifier.
💡 Proposed fix
- ai_model: str = "claude-opus-4-6[1m]"
+ ai_model: str = "claude-sonnet-4-20250514"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/docsfy/config.py` at line 17, The default ai_model string is invalid;
remove the ANSI suffix and replace the unsupported model id by setting the
ai_model variable (ai_model: str) to a supported Anthropic model identifier such
as "claude-sonnet-4-20250514" so API calls will succeed (i.e., change ai_model
from "claude-opus-4-6[1m]" to "claude-sonnet-4-20250514").
    cache_file = cache_dir / f"{slug}.md"
    if use_cache and cache_file.exists():
Reject unsafe page slugs before filesystem writes.
slug is AI-controlled and used directly in file paths. A value like ../../tmp/pwn can escape cache_dir and write outside intended storage.
🛡️ Proposed fix
+import re as _re
 ...
 MAX_CONCURRENT_PAGES = 5
+_SLUG_RE = _re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")
+
+
+def _cache_file_for_slug(cache_dir: Path, slug: str) -> Path:
+    if not _SLUG_RE.fullmatch(slug):
+        msg = f"Invalid page slug: {slug!r}"
+        raise ValueError(msg)
+    cache_root = cache_dir.resolve()
+    cache_file = (cache_root / f"{slug}.md").resolve()
+    cache_file.relative_to(cache_root)
+    return cache_file
 ...
-    cache_file = cache_dir / f"{slug}.md"
+    cache_file = _cache_file_for_slug(cache_dir, slug)

Also applies to: 87-88, 107-109
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/docsfy/generator.py` around lines 67 - 68, Reject or sanitize unsafe
AI-controlled slugs before using them to construct filesystem paths: validate
the `slug` used when creating `cache_file = cache_dir / f"{slug}.md"` (and the
other occurrences around the `use_cache` checks at the later blocks) to ensure
it contains only allowed characters (e.g., alphanumerics, hyphen, underscore),
does not contain path separators like "..", "/", or "\" and does not start with
a dot; alternatively resolve the resulting path and assert it is inside
`cache_dir` (e.g., compare resolved parents) before any open/write operations,
and raise/return an error for invalid slugs so no file can be written outside
`cache_dir`.
@app.post("/api/generate", status_code=202)
async def generate(request: GenerateRequest) -> dict[str, str]:
    settings = get_settings()
    ai_provider = request.ai_provider or settings.ai_provider
    ai_model = request.ai_model or settings.ai_model
    project_name = request.project_name

    if project_name in _generating:
        raise HTTPException(
            status_code=409,
            detail=f"Project '{project_name}' is already being generated",
        )

    _generating.add(project_name)

    await save_project(
        name=project_name,
        repo_url=request.repo_url or request.repo_path or "",
        status="generating",
    )

    asyncio.create_task(
        _run_generation(
            repo_url=request.repo_url,
            repo_path=request.repo_path,
            project_name=project_name,
            ai_provider=ai_provider,
            ai_model=ai_model,
            ai_cli_timeout=request.ai_cli_timeout or settings.ai_cli_timeout,
            force=request.force,
        )
    )

    return {"project": project_name, "status": "generating"}
Harden generation admission: validate project names and unwind _generating on enqueue failure.
Right now enqueue can accept a project name that later fails route validation (e.g., local directory names with spaces), and _generating can get stuck if save_project/create_task fails after insertion.
✅ Proposed fix
 @app.post("/api/generate", status_code=202)
 async def generate(request: GenerateRequest) -> dict[str, str]:
     settings = get_settings()
     ai_provider = request.ai_provider or settings.ai_provider
     ai_model = request.ai_model or settings.ai_model
-    project_name = request.project_name
+    project_name = _validate_project_name(request.project_name)

     if project_name in _generating:
         raise HTTPException(
             status_code=409,
             detail=f"Project '{project_name}' is already being generated",
         )

     _generating.add(project_name)
-
-    await save_project(
-        name=project_name,
-        repo_url=request.repo_url or request.repo_path or "",
-        status="generating",
-    )
-
-    asyncio.create_task(
-        _run_generation(
-            repo_url=request.repo_url,
-            repo_path=request.repo_path,
-            project_name=project_name,
-            ai_provider=ai_provider,
-            ai_model=ai_model,
-            ai_cli_timeout=request.ai_cli_timeout or settings.ai_cli_timeout,
-            force=request.force,
-        )
-    )
+    try:
+        await save_project(
+            name=project_name,
+            repo_url=request.repo_url or request.repo_path or "",
+            status="generating",
+        )
+        asyncio.create_task(
+            _run_generation(
+                repo_url=request.repo_url,
+                repo_path=request.repo_path,
+                project_name=project_name,
+                ai_provider=ai_provider,
+                ai_model=ai_model,
+                ai_cli_timeout=request.ai_cli_timeout or settings.ai_cli_timeout,
+                force=request.force,
+            )
+        )
+    except Exception:
+        _generating.discard(project_name)
+        raise

     return {"project": project_name, "status": "generating"}

🧰 Tools
🪛 Ruff (0.15.2)
[warning] 96-106: Store a reference to the return value of asyncio.create_task
(RUF006)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/docsfy/main.py` around lines 75 - 108, The generate route is adding
project_name into the _generating set before validating it and not removing it
if downstream save_project or task creation fails; update the generate function
to first validate/normalize project_name (e.g., disallow spaces/special chars or
run existing route-name validation) and reject with HTTPException if invalid,
then add to _generating only after validation, and wrap the save_project call
and asyncio.create_task invocation in try/except/finally so that on any
exception you remove project_name from _generating and re-raise or return a
500/appropriate HTTPException; reference the generate function, the _generating
set, save_project, and _run_generation/asyncio.create_task when implementing
these changes.
.page-nav {
  display: flex;
  justify-content: space-between;
  gap: 1rem;
  margin-top: 3rem;
  padding-top: 2rem;
  border-top: 1px solid var(--border-color);
}

.page-nav-link {
  display: flex;
  flex-direction: column;
  padding: 1rem 1.25rem;
  border: 1px solid var(--border-color);
  border-radius: 8px;
  text-decoration: none;
  transition: all 0.15s ease;
  max-width: 50%;
}

.page-nav-link:hover {
  border-color: var(--accent-color);
  box-shadow: 0 2px 8px rgba(79, 70, 229, 0.1);
}

.page-nav-next {
  text-align: right;
  margin-left: auto;
}

.page-nav-label {
  font-size: 0.75rem;
  text-transform: uppercase;
  letter-spacing: 0.05em;
  color: var(--text-secondary);
  margin-bottom: 0.25rem;
}

.page-nav-title {
  font-size: 0.95rem;
  font-weight: 600;
  color: var(--accent-color);
}
Additional undefined CSS variable references in page navigation styles.
The .page-nav and related classes also reference undefined --border-color and --accent-color variables.
🐛 Proposed fix for page navigation styles
.page-nav {
display: flex;
justify-content: space-between;
gap: 1rem;
margin-top: 3rem;
padding-top: 2rem;
- border-top: 1px solid var(--border-color);
+ border-top: 1px solid var(--border-primary);
}
.page-nav-link {
display: flex;
flex-direction: column;
padding: 1rem 1.25rem;
- border: 1px solid var(--border-color);
+ border: 1px solid var(--border-primary);
border-radius: 8px;
text-decoration: none;
transition: all 0.15s ease;
max-width: 50%;
}
.page-nav-link:hover {
- border-color: var(--accent-color);
+ border-color: var(--accent);
box-shadow: 0 2px 8px rgba(79, 70, 229, 0.1);
}

.page-nav-title {
font-size: 0.95rem;
font-weight: 600;
- color: var(--accent-color);
+ color: var(--accent);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/docsfy/static/style.css` around lines 1075 - 1117, The CSS uses undefined
variables (--border-color, --accent-color, and --text-secondary) in selectors
like .page-nav, .page-nav-link, .page-nav-link:hover, .page-nav-label, and
.page-nav-title; fix by adding default fallbacks or defining those variables at
the root (e.g., :root) so the styles render predictably — update the stylesheet
to either declare --border-color, --accent-color, and --text-secondary with
appropriate values or change usages to var(--border-color, <fallback>),
var(--accent-color, <fallback>), and var(--text-secondary, <fallback>) in the
.page-nav and related rules (page-nav, page-nav-link, page-nav-link:hover,
page-nav-label, page-nav-title).
var stored = localStorage.getItem('theme');
if (stored) {
  document.documentElement.setAttribute('data-theme', stored);
} else if (window.matchMedia && window.matchMedia('(prefers-color-scheme: dark)').matches) {
  document.documentElement.setAttribute('data-theme', 'dark');
}
if (toggle) {
  toggle.addEventListener('click', function() {
    var current = document.documentElement.getAttribute('data-theme');
    var next = current === 'dark' ? 'light' : 'dark';
    document.documentElement.setAttribute('data-theme', next);
    localStorage.setItem('theme', next);
  });
🧩 Analysis chain
🏁 Script executed:
# First, let's find and read the file mentioned in the review
fd -type f -name "theme.js" | grep -E "docsfy|static"
Repository: myk-org/docsfy
Length of output: 228
🏁 Script executed:
# If the above doesn't find it, let's search more broadly
find . -type f -name "theme.js" 2>/dev/null | head -20Repository: myk-org/docsfy
Length of output: 86
🏁 Script executed:
# Also check the repository structure to understand the context
git ls-files | grep -E "(theme|static)" | head -20Repository: myk-org/docsfy
Length of output: 287
🏁 Script executed:
# Read the theme.js file to verify the current code state
cat -n ./src/docsfy/static/theme.jsRepository: myk-org/docsfy
Length of output: 858
🌐 Web query: localStorage access restrictions browser security contexts where it throws exceptions

💡 Result:

**When localStorage access is restricted and may throw**

Per the HTML spec, `window.localStorage` throws a SecurityError DOMException when either (a) the document's origin is opaque, or (b) the access violates a user-agent policy decision (for example, storage is disabled / persistence not allowed). [1]

1) Opaque origin → SecurityError

Common ways to end up with an opaque origin (often exposed as `location.origin === "null"`) include:

- Sandboxed `<iframe>` without origin retention (i.e., sandboxing that results in a unique/opaque origin) — then storage access is blocked by virtue of being opaque. [1]
- Navigations/URLs that create opaque origins (commonly discussed as `data:` and some sandboxed/initial `about:*` cases), which again triggers the spec's "opaque origin ⇒ throw" rule. [1]

2) Browser privacy / policy blocks (especially in embedded/3rd-party contexts) → SecurityError

- Firefox "Storage Access Policy": if a resource is classified as a tracker and loaded in a third-party context, `Window.localStorage` read/write attempts throw `SecurityError`. [2]
- More broadly, browsers that block "third-party cookies and other site storage" may restrict access in third-party iframes; the Storage Access API exists specifically so embedded cross-site content can request access that would otherwise be denied. [3]

3) Quota exceeded on write → QuotaExceededError

- `localStorage.setItem(...)` can throw `QuotaExceededError` when the origin hits its storage limit. [4][5]

What to do defensively

- Treat `localStorage` as "may be unavailable" and wrap both the getter and writes in `try/catch` (because failure can happen at access-time, not just at `setItem`). This follows directly from the spec-defined throws (SecurityError) [1] and the quota behavior (QuotaExceededError). [4][5]

Sources: [1] WHATWG HTML Standard (Web Storage) • [2] MDN Storage Access Policy (Firefox) • [3] MDN Storage Access API • [4] MDN Storage.setItem() • [5] MDN Storage quotas & eviction criteria
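Following that advice, the defensive wrapping can be packaged as two small helpers. The function names below are illustrative, not from the PR; the pattern is simply "catch and degrade" for both read and write.

```javascript
// Hypothetical helpers: treat localStorage as optional and swallow
// SecurityError / QuotaExceededError (and a missing `window` entirely).
function safeGet(key) {
  try {
    return window.localStorage.getItem(key);
  } catch (_) {
    return null; // restricted context: behave as if nothing is stored
  }
}

function safeSet(key, value) {
  try {
    window.localStorage.setItem(key, value);
    return true;
  } catch (_) {
    return false; // write failed (quota/policy); the caller can ignore this
  }
}
```

With these in place, theme initialization falls back to `prefers-color-scheme` and toggling still updates `data-theme` even when the write fails.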
Guard localStorage access to prevent runtime crashes.
localStorage.getItem() and localStorage.setItem() throw SecurityError in restricted contexts (opaque origins, Firefox tracker blocking, private browsing mode, disabled storage) and QuotaExceededError when storage quota is exceeded. These exceptions will break theme initialization and toggling. Wrap both calls in try-catch blocks:
Suggested fix

```diff
-  var stored = localStorage.getItem('theme');
+  var stored = null;
+  try {
+    stored = window.localStorage.getItem('theme');
+  } catch (_) {}
@@
-  localStorage.setItem('theme', next);
+  try {
+    window.localStorage.setItem('theme', next);
+  } catch (_) {}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```javascript
var stored = null;
try {
  stored = window.localStorage.getItem('theme');
} catch (_) {}
if (stored) {
  document.documentElement.setAttribute('data-theme', stored);
} else if (window.matchMedia && window.matchMedia('(prefers-color-scheme: dark)').matches) {
  document.documentElement.setAttribute('data-theme', 'dark');
}
if (toggle) {
  toggle.addEventListener('click', function() {
    var current = document.documentElement.getAttribute('data-theme');
    var next = current === 'dark' ? 'light' : 'dark';
    document.documentElement.setAttribute('data-theme', next);
    try {
      window.localStorage.setItem('theme', next);
    } catch (_) {}
  });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/docsfy/static/theme.js` around lines 3 - 15, Wrap all accesses to
localStorage in try-catch to avoid SecurityError/QuotaExceededError: when
reading the initial theme, guard the call to localStorage.getItem('theme') (the
variable stored) with try-catch and treat failures as "no stored theme" so the
existing prefers-color-scheme fallback runs; likewise, inside the toggle click
handler, wrap localStorage.setItem('theme', next) in try-catch so toggling still
updates data-theme even if storage write fails. Update the code around the
stored variable and the toggle.addEventListener callback to catch and ignore
storage exceptions (optionally log) without breaking theme application.
```html
<article class="article">
  <h1 class="article-title">{{ title }}</h1>
  <div class="article-body">
    {{ content | safe }}
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Step 1: Find all uses of |safe filter in templates
echo "=== Finding |safe usage in templates ==="
rg -n "safe" src/docsfy/templates --type html -B2 -A2
echo ""
echo "=== Searching for sanitization-related code ==="
rg -n "bleach|sanitize|clean|html|escape" src/docsfy -i --type py -A3 -B1
echo ""
echo "=== Looking for renderer.py ==="
find . -name "renderer.py" -type f 2>/dev/null
```

🏁 Script executed:

```shell
cat -n src/docsfy/renderer.py | head -100
```

🏁 Script executed:

```shell
# Find where render_page is called
rg -n "render_page|render_site" src/docsfy -B3 -A3
# Check for input validation on markdown content
rg -n "markdown_content|md_content" src/docsfy -B2 -A2 | head -60
```

🏁 Script executed:

```shell
# Find generate_all_pages definition and usage
rg -n "def generate_all_pages|generate_all_pages" src/docsfy -B2 -A5
# Check generator.py to see where markdown comes from
head -150 src/docsfy/generator.py | tail -100
```

🏁 Script executed:

```shell
# Let me check if there's any markdown configuration that might strip HTML
rg -n "markdown|Markdown" src/docsfy/generator.py -B3 -A3
# Also check what markdown library version and extensions are used
grep -r "markdown" setup.py pyproject.toml requirements.txt 2>/dev/null || echo "No dependency files found in expected locations"
find . -name "*.txt" -o -name "pyproject.toml" -o -name "setup.py" | head -10
```

🏁 Script executed:

```shell
cat pyproject.toml | grep -A 50 "\[project\]" | grep -E "dependencies|markdown|bleach|sanitize"
```
Sanitize AI-generated HTML before rendering as raw content.
`content` and `toc` are rendered with the `|safe` filter, bypassing Jinja2's autoescaping. Since both originate from AI-generated markdown (via `generate_page`), malicious or jailbroken AI output can inject stored XSS into the served documentation, affecting all users. The Python markdown library does not sanitize HTML; it preserves raw tags passed through the markdown source.

Add sanitization using a library like bleach before rendering:

```python
from bleach import clean

def _md_to_html(md_text: str) -> tuple[str, str]:
    """Convert markdown to HTML. Returns (content_html, toc_html)."""
    md = markdown.Markdown(...)
    content_html = md.convert(md_text)
    # Sanitize the generated HTML
    content_html = clean(content_html, tags=allowed_tags, strip=True)
    toc_html = clean(getattr(md, "toc", ""), tags=allowed_tags, strip=True)
    return content_html, toc_html
```

Then remove `|safe` from the template, or rename the context variables to `content_sanitized` and `toc_sanitized` to clarify the contract.
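bleach would be a new project dependency. For illustration, the same allowlist-and-strip idea can be sketched with only the standard library (a teaching sketch with an assumed minimal tag set; not a substitute for a vetted sanitizer like bleach):

```python
import html
from html.parser import HTMLParser

# Hypothetical allowlist; the real set should match what the renderer emits.
ALLOWED_TAGS = {"p", "h1", "h2", "h3", "ul", "ol", "li", "pre", "code", "em", "strong"}

class _AllowlistStripper(HTMLParser):
    """Keep only allowlisted tags (all attributes dropped), escape all text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes (onclick, href, ...) discarded

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data))

def sanitize_html(fragment: str) -> str:
    parser = _AllowlistStripper()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)

print(sanitize_html('<p onclick="x()">ok</p><script>alert(1)</script>'))
# -> <p>ok</p>alert(1)
```

Like bleach's `strip=True`, this drops disallowed tags but keeps their (escaped) inner text; unlike bleach it also discards every attribute, which is a simplification.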
Also applies to: 102-102
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/docsfy/templates/page.html` at line 77, The template currently renders
AI-generated HTML with the Jinja2 |safe filter for variables content and toc,
which bypasses autoescaping and permits XSS; to fix, sanitize the
markdown-generated HTML before passing it to the template by updating the
markdown-to-HTML pipeline (e.g., the function _md_to_html or whichever converter
is used in generate_page) to call a sanitizer like bleach.clean on both the
converted content_html and toc_html (specifying allowed_tags/attributes and
strip=True), then either remove |safe from the template or rename the sanitized
values to content_sanitized/toc_sanitized and use those in page.html to ensure
only cleaned HTML is rendered.
```python
with patch.dict(os.environ, {}, clear=True):
    settings = Settings()
```
🧩 Analysis chain

🏁 Script executed:

```shell
# First, find the relevant files
fd -t f "config.py|test_config.py" --type f
```

🏁 Script executed:

```shell
# Check the structure
git ls-files | grep -E "(config|test)" | head -20
```

🏁 Script executed:

```shell
# Examine the Settings class definition
cat -n src/docsfy/config.py
```

🏁 Script executed:

```shell
# Look at the test file
cat -n tests/test_config.py
```

🏁 Script executed:

```shell
# Check if .env file exists
ls -la .env 2>&1 || echo "No .env file found"
fd -t f ".env" --max-one-result 2>&1 || true
```

🏁 Script executed:

```shell
# Check pydantic-settings version and documentation on env_file parameter
rg -n "pydantic" pyproject.toml setup.py requirements.txt 2>/dev/null | head -10
```

🏁 Script executed:

```shell
# Search for any env_file configuration overrides or _env_file usage
rg -n "_env_file|env_file" src/ tests/
```
🌐 Web query: pydantic-settings BaseSettings disable env_file loading at instantiation

💡 Result:

To disable .env / env_file loading for a specific instantiation (even if your BaseSettings class has `model_config = SettingsConfigDict(env_file=...)`), pass `_env_file=None`:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

s = Settings(_env_file=None)  # disables dotenv file loading for this instance
```

Pydantic explicitly documents that `_env_file` passed at instantiation overrides `model_config['env_file']`, and that passing None means "do not load any env file". [1][2]
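The underlying distinction (patch.dict controls the process environment, while env_file reads from disk) can be seen with the stdlib alone. The variable name below is made up for illustration:

```python
import os
from unittest.mock import patch

# Hypothetical variable name, for illustration only.
os.environ["DOCSFY_AI_PROVIDER"] = "claude"

with patch.dict(os.environ, {}, clear=True):
    # Inside the patch, the process environment really is empty...
    assert "DOCSFY_AI_PROVIDER" not in os.environ
    # ...but nothing here stops pydantic-settings from reading a .env file
    # on disk, which is why Settings(_env_file=None) is still needed.

# The original value is restored on exit.
assert os.environ["DOCSFY_AI_PROVIDER"] == "claude"
```

In other words, patch.dict isolates one input source (process env vars) but not the other (dotenv files), so both must be controlled for a deterministic test.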
Tests are non-deterministic because .env files bypass patch.dict.
patch.dict(os.environ, ..., clear=True) only clears process env vars. Settings() still loads .env if present, so local files can alter these tests unexpectedly.
Apply to lines 12-13, 30-31, and 41-43:
Fix

```diff
-    settings = Settings()
+    settings = Settings(_env_file=None)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_config.py` around lines 12 - 13, The tests are nondeterministic
because Settings() still reads .env files even when patch.dict(os.environ, {},
clear=True) is used; update each test that currently calls Settings() (the
instances created alongside patch.dict(os.environ, ..., clear=True)) to
instantiate Settings with _env_file=None (e.g., Settings(_env_file=None)) so the
Settings class won’t load any .env file and the environment is fully controlled
by the patched os.environ; keep the patch.dict(...) usage but replace bare
Settings() calls with Settings(_env_file=None).
```toml
skipsdist = true

envlist = ["unused-code", "unittests"]
```
🧩 Analysis chain

🌐 Web query: tox 4 tox.toml configuration canonical keys skipsdist envlist no_package env_list

💡 Result:

In tox 4 using TOML (tox.toml or pyproject.toml's tool.tox), the canonical (preferred) keys are the snake_case ones:

- `env_list` (canonical) — legacy alias: `envlist`. Defines the default environments to run. [2]
- `no_package` (canonical) — legacy alias: `skipsdist`. When `true`, tox skips the packaging/build step (useful for apps vs libraries). [2]
- Related but different: `skip_install` (per-env) skips installing your project into that env (it does not control whether tox builds packages globally). [2]

If you want a quick "what's canonical vs legacy alias" list (including envlist → env_list, skipsdist → no_package, etc.), tox-toml-fmt documents the alias normalization it applies for tox 4 TOML configs. [1]
Use canonical tox TOML keys for consistency with tox 4 standards.
At Line 1 and Line 3, replace legacy key aliases with canonical names. In tox 4 tox.toml, use no_package instead of skipsdist and env_list instead of envlist.
Proposed change

```diff
-skipsdist = true
+no_package = true

-envlist = ["unused-code", "unittests"]
+env_list = ["unused-code", "unittests"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tox.toml` around lines 1 - 3, Replace the legacy tox keys: change the
"skipsdist" setting to the canonical "no_package" and rename "envlist" to
"env_list" so the tox.toml uses tox 4 standard keys (update the entries for
"skipsdist" and "envlist" accordingly).
Summary
Implements docsfy, an AI-powered documentation generator that creates polished static HTML documentation from GitHub repositories. It uses Claude, Gemini, or Cursor CLI as AI backends to analyze codebases and produce comprehensive, browseable documentation served via a FastAPI web application.
Key Features
Architecture Overview
Flow: API request --> clone repo --> AI plans doc structure --> concurrent page generation --> render HTML --> serve/download
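The "concurrent page generation" step in this flow can be sketched with asyncio.gather. The function names and plan shape below are illustrative, not the PR's actual API:

```python
import asyncio

async def generate_page(title: str) -> str:
    # Stand-in for the real AI CLI call, which is the slow, independent step.
    await asyncio.sleep(0)
    return f"# {title}\n\n(generated)"

async def generate_all_pages(plan: list[str]) -> dict[str, str]:
    # Fan out one task per planned page and await them together.
    bodies = await asyncio.gather(*(generate_page(t) for t in plan))
    return dict(zip(plan, bodies))

pages = asyncio.run(generate_all_pages(["Overview", "API", "Configuration"]))
print(sorted(pages))  # -> ['API', 'Configuration', 'Overview']
```

Because each page depends only on the repo and the plan, the per-page AI calls have no ordering constraints, which is what makes this fan-out safe.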
Commits
Documentation & Planning
- `docs: add docsfy implementation plan with 13 TDD tasks`

Project Setup

- `feat: project scaffolding with build config, linters, and container setup`

Core Modules

- `feat: add configuration module with pydantic-settings`
- `feat: add pydantic models for requests, doc plans, and project status`
- `feat: add SQLite storage layer for project metadata`
- `feat: add AI CLI provider module with claude, gemini, and cursor support`
- `feat: add multi-strategy JSON response parser for AI CLI output`
- `feat: add repository cloning with shallow clone support`
- `feat: add AI prompt templates for planner and page generation`
- `feat: add documentation generator with planner and concurrent page generation`
- `feat: add HTML renderer with Jinja2 templates, dark/light theme, and search`
- `feat: add FastAPI application with all API endpoints`

Tests

- `test: add integration test for full generate-serve-download flow`

Fixes & Improvements

- `fix: resolve all pre-commit hook issues`
- `fix: remove gcloud and cursor volume mounts from docker-compose`
- `fix: add uv.lock and copy it in Dockerfile for frozen sync`
- `fix: optimize Dockerfile layer caching - CLI installs cached independently of code changes`
- `feat: add llms.txt generation, on-this-page TOC, strip AI preamble, fix HTML rendering`
- `feat: add UI improvements, ai-cli-runner migration, local repo support, llms.txt fixes, minimal README`
- `chore: remove all Mintlify references`
- `fix: address all code review findings - security, bugs, error handling`

Stats
Test Plan
- `docker compose up` builds and starts the application
- `POST /api/generate` with a public GitHub repo
- `/docs/{project}/`
- `pytest`/`tox` for unit and integration tests

Summary by CodeRabbit
New Features
Chores