The folder-datetime-fix tool is designed with a modular, extensible architecture that separates concerns into distinct layers:
- Command-line Interface (CLI) - User interaction layer
- DazzleTreeLib Integration - Universal tree traversal with adapters
- Analysis Strategies - Different approaches to folder traversal (now using DazzleTreeLib)
- Folder Scanner - Core timestamp computation engine (powered by DazzleTreeLib)
- Cache System - Performance optimization with completeness tracking
- Visualization - Tree rendering and progress display
- Timestamp Fixer - Actual filesystem modification
As of version 0.7.0, folder-datetime-fix has been migrated to use DazzleTreeLib as its tree traversal engine. This provides:
- 78% code reduction - Removed ~1,094 lines of custom traversal code
- O(1) depth tracking - 5-10x performance improvement over O(depth) recalculation
- Composable adapters - Stack adapters for timestamp calculation, caching, and depth tracking
- Async/await patterns - Modern Python async support throughout
- Post-order traversal - Native bottom-up processing for TreeStrategy
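The O(1) depth tracking point can be illustrated with a small standalone sketch. It does not use DazzleTreeLib's actual API; `walk_with_depth` and `depth_by_recalculation` are hypothetical names chosen for this example. The idea is that each folder's depth travels with it on the traversal stack, so it never has to be re-derived from the path.

```python
import os
from pathlib import Path
from typing import Iterator, Tuple


def walk_with_depth(root: Path) -> Iterator[Tuple[Path, int]]:
    """Yield (folder, depth) pairs; depth is carried on the stack, not recomputed."""
    stack = [(root, 0)]
    while stack:
        folder, depth = stack.pop()
        yield folder, depth
        try:
            with os.scandir(folder) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        # A child's depth is its parent's depth plus one: O(1).
                        stack.append((Path(entry.path), depth + 1))
        except OSError:
            continue  # unreadable folder: skip its children


def depth_by_recalculation(root: Path, folder: Path) -> int:
    """O(depth) alternative: count path components on every call."""
    return len(folder.relative_to(root).parts)
```

Both functions agree on the depth of every folder; the difference is only in how much work each lookup costs on deep trees.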
User Input (CLI)
↓
┌─────────────────────────────────────────────────┐
│ cli.py (main) │
│ - Parses arguments (--depth, --strategy, etc) │
│ - Selects analysis strategy via --analyze │
│ - Configures scanner and fixer │
└─────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────┐
│ StrategyFactory.create_strategy() │
│ - Creates appropriate strategy instance │
│ - Applies modifiers (no-cache, etc) │
└─────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────┐
│ DazzleStrategy (Abstract) │
│ - StandardDazzleStrategy (adaptive caching) │
│ - LowMemoryDazzleStrategy (no cache) │
│ - TreeDazzleStrategy (bottom-up with tree) │
│ - FolderOnlyDazzleStrategy (minimal memory) │
│ - AutoStrategy → StandardDazzleStrategy │
└─────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────┐
│ DazzleTreeLib Stack │
│ - AsyncFileSystemAdapter (base I/O) │
│ - TimestampCalculationAdapter (shallow/deep) │
│ - CompletenessAwareCacheAdapter (LRU cache) │
│ - DepthTrackingAdapter (O(1) depth) │
│ - Post-order traversal functions │
└─────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────┐
│ SmartStreamingCache │
│ - get_or_compute() with CacheCompleteness │
│ - Tracks: NONE, SHALLOW, PARTIAL_2/3, COMPLETE │
│ - Memory-bounded with LRU eviction │
└─────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────┐
│ FolderTimestampFixer │
│ - fix_folder_timestamp() │
│ - Applies computed timestamps to folders │
└─────────────────────────────────────────────────┘
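The final stage of the pipeline applies a computed timestamp to a folder. The sketch below shows one way this could be done with the standard library's `os.utime`; `apply_folder_timestamp` and its `dry_run` parameter are hypothetical and are not the signature of `fix_folder_timestamp()`.

```python
import os
from datetime import datetime
from pathlib import Path


def apply_folder_timestamp(folder: Path, when: datetime, dry_run: bool = False) -> bool:
    """Return True if the folder's mtime differs from `when`.

    The mtime is updated unless dry_run is set; the access time is preserved.
    """
    target = when.timestamp()
    stat = folder.stat()
    if abs(stat.st_mtime - target) < 1e-3:
        return False  # already correct; nothing to do
    if not dry_run:
        os.utime(folder, (stat.st_atime, target))
    return True
```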
Different analysis strategies implement the same AnalysisStrategy interface:

    from abc import ABC, abstractmethod
    from datetime import datetime
    from pathlib import Path
    from typing import List, Optional, Tuple

    class AnalysisStrategy(ABC):
        @abstractmethod
        def analyze(self, base_path: Path, depths: List[int]) -> List[Tuple[Path, Optional[datetime]]]:
            pass

This allows runtime selection via --analyze:
- --analyze=tree → TreeStrategy
- --analyze=folder-only → FolderOnlyStrategy
- --analyze=low-memory → LowMemoryStrategy
- --analyze=auto → AutoStrategy (selects based on path characteristics)
The cache system tracks how thoroughly each folder has been scanned:

    from enum import Enum

    class CacheCompleteness(Enum):
        NONE = 0         # Not scanned
        SHALLOW = 1      # Immediate children only
        PARTIAL_2 = 2    # 2 levels deep
        PARTIAL_3 = 3    # 3 levels deep
        COMPLETE = 999   # Fully recursive

This enables intelligent cache reuse:
- If a folder is marked COMPLETE, there is no need to rescan
- If it is marked PARTIAL_2 but we need depth 3, a rescan is required
- If it is marked COMPLETE but we only need depth 1, the cached data is used
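The reuse rules above can be expressed as a single comparison. The following is a minimal sketch, assuming the enum values denote how many levels were scanned (as the comments on the enum suggest); `is_sufficient` is a hypothetical helper name, not necessarily what the cache uses internally.

```python
from enum import Enum


class CacheCompleteness(Enum):
    NONE = 0         # Not scanned
    SHALLOW = 1      # Immediate children only
    PARTIAL_2 = 2    # 2 levels deep
    PARTIAL_3 = 3    # 3 levels deep
    COMPLETE = 999   # Fully recursive


def is_sufficient(cached: CacheCompleteness, needed_depth: int) -> bool:
    """Return True when the cached scan reaches at least `needed_depth` levels."""
    if cached is CacheCompleteness.NONE:
        return False  # nothing was scanned, so nothing can be reused
    if cached is CacheCompleteness.COMPLETE:
        return True   # a full scan satisfies any requested depth
    return needed_depth <= cached.value
```

For example, `is_sufficient(CacheCompleteness.PARTIAL_2, 3)` is false (rescan required), while `is_sufficient(CacheCompleteness.COMPLETE, 1)` is true (cached data can be used).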
Each component has a single responsibility:
| Component | Responsibility |
|---|---|
| CLI | Parse arguments, coordinate components |
| AnalysisStrategy | Decide HOW to traverse folders |
| FolderScanner | Compute timestamps for folders |
| SmartStreamingCache | Store and retrieve computed results |
| FolderTimestampFixer | Apply timestamps to filesystem |
| TreeVisualizer | Render folder structures |
- CLI creates StandardStrategy with a FolderScanner
- StandardStrategy calls scanner.scan_and_collect()
- Scanner uses cache via cache.get_or_compute()
- Cache checks completeness and either returns the cached result or computes a new one
- Results are returned to strategy → CLI → Fixer

- CLI creates TreeStrategy with a FolderScanner
- TreeStrategy builds its own tree structure
- During tree building, checks cache for complete branches
- Computes timestamps bottom-up (children before parents)
- Stores results back to cache with completeness levels
- Returns results for requested depths
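The bottom-up step in the tree flow can be sketched with the standard library alone. This example assumes a folder's "deep" timestamp is the newest file modification time anywhere beneath it, which the source does not spell out; `compute_deep_mtimes` is a hypothetical name, and the real TreeStrategy additionally consults the cache.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def compute_deep_mtimes(root: Path) -> Dict[Path, Optional[float]]:
    """Map each folder to the newest file mtime anywhere beneath it (None if empty)."""
    newest: Dict[Path, Optional[float]] = {}
    # topdown=False yields children before their parents (post-order).
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        folder = Path(dirpath)
        candidates = []
        for name in filenames:
            try:
                candidates.append((folder / name).stat().st_mtime)
            except OSError:
                continue  # file vanished or is unreadable
        for name in dirnames:
            # Child results already exist because children were visited first.
            child = newest.get(folder / name)
            if child is not None:
                candidates.append(child)
        newest[folder] = max(candidates) if candidates else None
    return newest
```

Because each parent reuses its children's results, every file is examined exactly once regardless of tree depth.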

- CLI creates FolderOnlyStrategy with a FolderScanner
- Strategy does its own os.walk() traversal
- For each folder:
  - Checks cache for sufficient completeness
  - If cache hit, uses cached timestamp
  - If cache miss, computes timestamp on-the-fly
  - Stores result in cache with completeness level
- Prunes traversal when complete folders found
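The pruning step in the folder-only flow relies on a documented feature of `os.walk`: editing the `dirnames` list in place controls which subfolders are visited. Here is a minimal sketch, assuming a caller-supplied `is_complete` predicate in place of the real cache lookup; `walk_folders` and `SKIP_NAMES` are hypothetical names.

```python
import os
from pathlib import Path
from typing import Callable, Iterator, Set

SKIP_NAMES: Set[str] = {"__pycache__", ".git"}  # illustrative, not the tool's full list


def walk_folders(root: Path, max_depth: int,
                 is_complete: Callable[[Path], bool]) -> Iterator[Path]:
    """Yield folders up to max_depth, not descending into complete or skipped ones."""
    root_parts = len(root.parts)
    for dirpath, dirnames, _files in os.walk(root):
        folder = Path(dirpath)
        depth = len(folder.parts) - root_parts
        yield folder
        if depth >= max_depth or is_complete(folder):
            dirnames[:] = []  # in-place edit tells os.walk not to descend
        else:
            dirnames[:] = [d for d in dirnames if d not in SKIP_NAMES]
```

The slice assignment matters: rebinding `dirnames` to a new list would leave the traversal unchanged.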
- FolderScanner methods:
  - get_shallow_timestamp() → cache.get_or_compute(path, "shallow")
  - get_deep_timestamp() → cache.get_or_compute(path, "deep")
  - get_smart_timestamp() → cache.get_or_compute(path, "smart")
- FolderOnlyStrategy:
  - Checks cache before computing
  - Stores results with completeness
  - Prunes traversal on complete folders
- TreeStrategy (after fixes):
  - Checks cache during tree building
  - Stores computed timestamps with completeness
  - Prunes complete branches

    # When storing in cache:
    if not has_subdirs:
        completeness = COMPLETE
    elif at_max_depth:
        completeness = SHALLOW
    else:
        remaining_depth = max_depth - current_depth
        completeness = from_depth(remaining_depth)

    # When checking cache sufficiency:
    if cached.completeness == COMPLETE:
        return True   # Always sufficient
    if needed_depth <= cached_actual_depth:
        return True   # Have enough depth
    return False      # Need to recompute

- Create class extending AnalysisStrategy
- Implement analyze() method
- Add to StrategyFactory.create_strategy()
- Update get_available_strategies()
Example:

    class CustomStrategy(AnalysisStrategy):
        def analyze(self, base_path: Path, depths: List[int]):
            # Custom traversal logic
            # Can use self.scanner for timestamp computation
            # Should integrate with cache for performance
            pass

- Extend TreeVisualizer or create new visualizer
- Add to CLI options
- Integrate with processing loop
The cache interface is simple:

    cache.get_or_compute(path, strategy) -> (mtime, completeness)
    cache.cache[path] = SmartCacheEntry(...)

Because the interface is this small, the in-memory cache could be replaced with Redis, SQLite, or other backends.
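To show how small a replacement backend could be, here is a sketch of a memory-bounded cache with LRU eviction. It is not SmartStreamingCache: `InMemoryLRUCache` is a hypothetical class, the completeness level is reduced to a plain integer, and the extra `compute` callback is an assumption added so the example is self-contained.

```python
from collections import OrderedDict
from pathlib import Path
from typing import Callable, Optional, Tuple

Entry = Tuple[Optional[float], int]  # (mtime, completeness level)


class InMemoryLRUCache:
    """Hypothetical stand-in for a cache backend with a bounded number of entries."""

    def __init__(self, max_entries: int = 1000) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[Tuple[Path, str], Entry]" = OrderedDict()

    def get_or_compute(self, path: Path, strategy: str,
                       compute: Callable[[Path, str], Entry]) -> Entry:
        key = (path, strategy)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        entry = compute(path, strategy)
        self._entries[key] = entry
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return entry
```

A Redis or SQLite backend would keep the same `get_or_compute` shape and swap the `OrderedDict` for calls to the external store.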
| Strategy | Memory per 10K folders | Use Case |
|---|---|---|
| folder-only | ~1MB | Ultra-minimal |
| low-memory | <1MB | Massive trees |
| tree | ~2MB | Deep hierarchies |
| standard | ~3.5MB | General use |
With completeness tracking:
- First run: 0% hit rate (cold cache)
- Second run (same depths): 95-100% hit rate
- Incremental deeper: 50-80% hit rate (reuses shallow)
- Pruning: Stop traversing when complete folders found
- Depth filtering: Only process requested depths
- System file skipping: Ignore __pycache__, .git, etc.
CLI Arguments
↓
--analyze=tree,no-cache (comma-separated modifiers)
↓
StrategyFactory.create_strategy()
↓
Parse: strategy="tree", modifiers=["no-cache"]
↓
Create TreeStrategy
↓
Apply modifiers (disable cache)
↓
Return configured strategy
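The parsing step in this flow might look like the sketch below. It assumes the first comma-separated token is the strategy and everything after it is a modifier; `parse_analyze` is a hypothetical function, and the two name sets list only the values mentioned in this document, so the real factory may accept more.

```python
from typing import List, Tuple

# Names taken from this document; the actual tool may support others.
KNOWN_STRATEGIES = {"auto", "standard", "tree", "folder-only", "low-memory"}
KNOWN_MODIFIERS = {"no-cache"}


def parse_analyze(value: str) -> Tuple[str, List[str]]:
    """Split a value such as 'tree,no-cache' into ('tree', ['no-cache'])."""
    parts = [p.strip().lower() for p in value.split(",") if p.strip()]
    if not parts:
        raise ValueError("--analyze requires a strategy name")
    strategy, modifiers = parts[0], parts[1:]
    if strategy not in KNOWN_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    unknown = [m for m in modifiers if m not in KNOWN_MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifiers: {', '.join(unknown)}")
    return strategy, modifiers
```

Validating modifiers up front means a typo such as `no-cahce` fails immediately rather than silently leaving the cache enabled.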
- test_analysis_strategies.py - Strategy implementations
- test_cache_completeness.py - Cache completeness logic
- test_folder_scanner.py - Scanner methods
- test_tree_strategy.py - Tree-specific tests
- test_folderonly_completeness.py - FolderOnly cache integration
- Full pipeline tests with real directory structures
- Cache persistence across runs
- Performance benchmarks
- --depth-to N - Specify range 0 to N easily
- Parallel processing - Multi-threaded traversal
- Remote caching - Shared cache across machines
- Watch mode - Monitor and fix in real-time
- Plugin system for custom strategies
- Export formats for analysis results
- Web UI for visualization
- API mode for integration
The architecture is designed for:
- Modularity: Each component has a single responsibility
- Extensibility: Easy to add new strategies and features
- Performance: Cache with completeness tracking
- Flexibility: Multiple strategies for different use cases
- Testability: Clear interfaces and separation of concerns
The key insight is that analysis strategies control HOW to traverse, while FolderScanner computes timestamps, and SmartStreamingCache optimizes performance with completeness tracking. This separation allows for powerful combinations like --analyze=tree,no-cache or --analyze=auto --strategy=deep.