Skip to content

RFC: Address codebase anti-patterns and encapsulation issues #18

Description

@evansenter

Summary

The codebase audit identified several anti-patterns and encapsulation violations that should be addressed to improve maintainability. The most critical issue is that storage._connect() (a private method) is accessed directly from 4 external modules, creating tight coupling.

Problem / Motivation

During /audit-codebase review, these issues were discovered:

P1: Critical

  1. Encapsulation violation: queries.py and patterns.py directly call storage._connect(), bypassing the SQLiteStorage public API
  2. Hardcoded version: "0.1.0" is hardcoded in server.py:73 rather than reading from package metadata

P2: Code Quality

  1. Import aliasing: do_* prefix pattern in server.py adds noise (lines 22-30)
  2. Monolithic format_output(): 80-line if-elif chain in cli.py:18-95 violates Open/Closed principle
  3. Duplicated query building: Same WHERE clause construction pattern repeated 4 times in queries.py
  4. Empty __init__.py: No public API definition, no __all__ exports

P3: Minor

  1. Schema migration: SCHEMA_VERSION=1 defined but no migration logic
  2. Empty pass branch: Unnecessary code in ingest.py:95-98
  3. Timestamp handling: Timezone stripping undocumented

P4: Polish

  1. CLI help: Subcommand descriptions not visible in main --help

Context

  • Discovered during: /audit-codebase command
  • Relevant files: storage.py (588 lines), queries.py (431 lines), patterns.py (369 lines), cli.py (259 lines), server.py (286 lines)
  • Related PRs: Phase 7: Polish and CLI tests #16 (Phase 7: Polish)

Proposed Solution

Phase A: Encapsulation Fix (P1)

  1. Add SQLiteStorage.execute_query() public method
  2. Migrate all _connect() usage in queries.py and patterns.py
  3. Read version from importlib.metadata

Phase B: Query Builder (P2)

  1. Extract build_where_clause() helper to reduce duplication
  2. Define public API in __init__.py with __all__

Phase C: CLI Refactor (P2)

  1. Refactor format_output() to use formatter registry pattern
  2. Remove do_* import aliasing in server.py

Phase D: Polish (P3-P4)

  1. Add schema migration infrastructure
  2. Remove empty pass branch
  3. Document timestamp handling
  4. Improve CLI help text

Assumptions

Assumption Confidence Impact if Wrong
No external consumers of current internal APIs High Low - internal project
SQLite will remain the only backend Medium High - validates need for abstraction
84 tests provide sufficient coverage High Medium - may need additions

Open Questions

  1. Should execute_query() be the only public query method, or add specialized methods per use case?
  2. For schema migrations - use simple version check or full migration framework?

Actionable Requirements

# Requirement Owner Blocked By
1 Add execute_query() to SQLiteStorage Claude None
2 Migrate queries.py to public API Claude #1
3 Migrate patterns.py to public API Claude #1
4 Read version from importlib.metadata Claude None
5 Extract query builder helper Claude #2
6 Define __all__ in __init__.py Claude None
7 Refactor format_output() Claude None
8 Add schema migration logic Claude None
9 CLI help improvements Claude None

Implementation Checklist

  • Add SQLiteStorage.execute_query() public method
  • Migrate queries.py to use public API (4 locations)
  • Migrate patterns.py to use public API (4 locations)
  • Extract build_where_clause() helper
  • Read version from package metadata
  • Define __all__ in __init__.py
  • Refactor format_output() to use registry pattern
  • Remove do_* import aliasing in server.py
  • Add basic schema migration support
  • Remove empty pass branch in ingest.py
  • Add timestamp handling documentation
  • Improve CLI help text with epilog
  • Run make check to verify no regressions

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions