Ensure 100% accurate TypeScript/React parsing with NO PHP labels in the database for webSlicer project.
- TypeScript files are being labeled with PHP types (PHPSymbol, PHPProperty, etc.)
- Wrong relationship types or missing relationships
- Parser is not correctly identifying TypeScript/React constructs
- Create this plan document
- Check current database content in detail
- Trace through the parser pipeline to find where PHP labels are introduced
- Check plugin registration and selection logic
- Verify TypeScript parser is actually being used
- Document all findings with specific code references
- Compile list of specific issues with examples
- Create suggestions for fixes
- Identify exact files and functions that need changes
- Prepare clear, actionable request for Codex
- Send detailed findings to Codex with suggestions
- Review Codex's proposed solution
- Apply Codex's fixes to the codebase
- Clean Neo4j database completely
- Delete existing SQLite database
- Re-parse webSlicer with fixed parser
- Import fresh data to Neo4j
- Query Neo4j for verification
- Query for PHP* labels (should be 0)
- Check ReactComponent nodes match actual .tsx files
- Verify TSFunction nodes match actual functions
- Check RENDERS relationships match JSX in code
- Verify CALLS relationships match actual function calls
- Cross-reference 5-10 files manually for 100% accuracy
// Should return 0
MATCH (n) WHERE n.type STARTS WITH 'PHP' RETURN count(n)
// Should match actual React components
MATCH (n:ReactComponent) RETURN n.name, n.file_path LIMIT 10
// Should show correct relationships
MATCH (c:ReactComponent)-[:RENDERS]->(element) RETURN c.name, element.name LIMIT 20- Zero PHP labels in TypeScript project
- All React components correctly identified
- All TypeScript types/interfaces correctly labeled
- Relationships match actual code structure
- Manual spot checks pass 100%
✅ COMPLETED - ALL ISSUES FIXED!
- ✅ ZERO PHP labels in TypeScript project (0 out of 5,041 nodes)
- ✅ 473 TypeScript types correctly preserved (TSFunction, TSInterface, TSClass, TSType)
- ✅ 291 React components correctly identified
- ✅ 31 API routes properly labeled
- ✅ All relationships preserved: RENDERS (1,211), CALLS (517), USES (768), etc.
- ✅ 244 React components with JSX relationships
- ✅ 65 TypeScript functions with call relationships
Codex successfully modified tools/ultra_fast_neo4j_import.py to preserve original type names from SQLite instead of mapping them to PHP labels. The function now only maps 'file' → 'File' and 'directory' → 'Directory', preserving all other types as-is.
The issue is NOT in the parser! The TypeScript parser is working correctly. The problem is in the Neo4j import script.
Location: tools/ultra_fast_neo4j_import.py
Function: _get_label_for_type() (lines 568-583)
The Problem:
- SQLite database has CORRECT types:
ReactComponent,TSFunction,TSInterface, etc. - The import script has a hardcoded mapping that converts generic types to PHP labels
- Line 583:
return type_map.get(symbol_type, 'PHPSymbol')- defaults to PHPSymbol for unmapped types!
Current Broken Mapping:
type_map = {
'class': 'PHPClass', # Maps TSClass → PHPClass
'function': 'PHPFunction', # Maps TSFunction → PHPFunction
'property': 'PHPProperty', # Maps property → PHPProperty
'interface': 'PHPInterface', # Maps TSInterface → PHPInterface
...
}
return type_map.get(symbol_type, 'PHPSymbol') # Everything else → PHPSymbolEvidence:
- SQLite has correct types: TSFunction (174), ReactComponent (291), TSInterface (147)
- Neo4j shows same counts but under generic labels
- The import script is stripping the TS/React prefixes and mapping to PHP labels
File to fix: tools/ultra_fast_neo4j_import.py
Function to fix: _get_label_for_type() (lines 568-583)
Required changes:
- Remove the hardcoded PHP mapping
- Use the actual type from SQLite as the Neo4j label
- Keep special handling only for file/directory
- Simple solution: just return the symbol_type directly (with capital first letter)
Suggested implementation:
def _get_label_for_type(self, symbol_type: str) -> str:
"""Map symbol type to Neo4j label"""
# Special cases that need different labels
special_cases = {
'file': 'File',
'directory': 'Directory',
}
# For special cases, use the mapping
if symbol_type in special_cases:
return special_cases[symbol_type]
# For everything else, use the type as-is
# This preserves ReactComponent, TSFunction, TSInterface, etc.
return symbol_type if symbol_type else 'Symbol'Additional context for Codex:
- The parser already creates correct types like ReactComponent, TSFunction, TSInterface
- We need to preserve these exact types in Neo4j
- No PHP labels should exist for TypeScript/React projects
- The import script should be language-agnostic