Skip to content

[aw][test audit] vscode_parser: three untested caching and discovery scenarios — symlink exclusion, child-dir cache invalidation [Content truncated due to length] #876

@microsasa

Description

@microsasa

Summary

Three related untested scenarios in src/copilot_usage/vscode_parser.py that together reduce confidence in the caching and discovery logic.


Gap 1 — _scan_child_ids silently excludes symlinked directories

Location: vscode_parser.py:149–153

st = entry.stat(follow_symlinks=False)
...
if not stat.S_ISDIR(st.st_mode):
    continue

DirEntry.stat(follow_symlinks=False) returns the symlink's own stat, and S_ISDIR(symlink_mode) is False. So any child directory that is itself a symlink (e.g., window1 -> /tmp/vscode-session/window1) is excluded from child_ids.

Consequence: If a symlinked window directory gains new log files, neither root_id nor child_ids changes → _cached_discover_vscode_logs returns a stale (empty or outdated) path list indefinitely.

On some Linux configurations (e.g., XDG dirs redirected via symlinks, or docker volume mounts), this is a real failure mode.

No tests exist for the symlink case. It's also unclear whether exclusion is intentional. If intentional, a test should document and lock in the behaviour; if not, the fix is follow_symlinks=True.

Tests to add:

def test_scan_child_ids_excludes_symlinked_dirs(tmp_path: Path) -> None:
    """Symlinks to directories must be handled deterministically."""
    real_dir = tmp_path / "real_window"
    real_dir.mkdir()
    link = tmp_path / "link_window"
    link.symlink_to(real_dir)
    ids = _scan_child_ids(tmp_path)
    # Document whether symlinked dirs are included or excluded.
    # Current behaviour (follow_symlinks=False) excludes them:
    names = {name for name, _ in ids}
    assert "real_window" in names
    assert "link_window" not in names  # or assert "link_window" in names if fixed

Gap 2 — _cached_discover_vscode_logs does not test cache-miss on child-identity change

Location: vscode_parser.py:254–267

The cache key is (root_id, child_ids). A child-identity change (new window directory, deleted window directory, or window directory with updated mtime) should trigger a glob re-run.

No test verifies that creating a new child directory under a watched root causes a cache miss on the next call. Existing tests in test_vscode_parser.py cover the basic hit/miss logic for the root itself but not for changes driven solely by child_ids.

Tests to add:

def test_cache_misses_when_child_dir_added(tmp_path: Path) -> None:
    """Adding a new child directory must invalidate the discovery cache."""
    # First call — no log files, cache is populated
    logs_before = _cached_discover_vscode_logs(tmp_path)
    assert logs_before == []

    # Create window dir + log file matching _GLOB_PATTERN
    window = tmp_path / "window1" / "exthost" / "GitHub.copilot-chat"
    window.mkdir(parents=True)
    log = window / "GitHub Copilot Chat.log"
    log.touch()

    # Second call — child_ids changed → cache miss → glob reruns
    logs_after = _cached_discover_vscode_logs(tmp_path)
    assert log in logs_after

Gap 3 — get_vscode_summary partial OSError during multi-file parsing is untested

Location: vscode_parser.py:535–604

get_vscode_summary iterates over discovered log files and calls _get_cached_vscode_requests for each. If some files raise OSError (e.g., file deleted between discovery and read, permissions change), the per-file cache helper propagates the exception.

No test verifies that a partial failure (some files readable, some not) either:

  • (a) returns a partial summary with only the successfully parsed files, or
  • (b) raises the error to the caller.

The correct behaviour should be documented and tested. Currently the code likely propagates the first OSError and loses data from already-parsed files.

Tests to add:

def test_get_vscode_summary_partial_oserror(tmp_path: Path, monkeypatch: MonkeyPatch) -> None:
    """OSError on one log file should not suppress data from others."""
    # Create two valid log files, monkeypatch one to raise OSError on read
    ...
    summary = get_vscode_summary(base_path=tmp_path)
    # Assert either partial summary or propagated error — whichever is intended

Acceptance Criteria

  • _scan_child_ids symlink behaviour is covered by at least one test that explicitly documents whether symlinks are included or excluded from child_ids.
  • _cached_discover_vscode_logs has a test that adds a child directory between two calls and asserts a cache miss.
  • get_vscode_summary partial-failure behaviour is tested and the expected result is asserted.

Generated by Test Suite Analysis · ● 1.9M ·

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions