analysis python_parser
Python-specific parser using the built-in ast module. Extracts functions, classes, decorators, arguments, base classes, and source code segments.
| Term | Definition | Example |
|---|---|---|
| AST | Abstract Syntax Tree: a tree representation of source code structure, where each node is a language construct (function, class, if-statement, etc.). | `def add(a, b): return a+b` becomes a tree: FunctionDef → [args: a, b] → [body: Return → BinOp(a + b)]. |
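
To make the table's example concrete, here is a minimal sketch that parses the same one-line function with the standard `ast` module and inspects the node types. The variable names are illustrative only.

```python
import ast

# Parse the one-line function from the table and look at the resulting tree.
tree = ast.parse("def add(a, b): return a + b")

func = tree.body[0]      # the FunctionDef node
ret = func.body[0]       # the Return statement inside the function body
binop = ret.value        # the BinOp expression being returned

arg_names = [a.arg for a in func.args.args]

print(type(func).__name__, func.name, arg_names)   # FunctionDef add ['a', 'b']
print(type(ret).__name__, type(binop).__name__)    # Return BinOp
```

Calling `ast.dump(tree, indent=2)` (Python 3.9+) prints the whole tree if a fuller view is needed.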
Source: src/codewalk/analysis/python_parser.py
Parses a Python file using ast.parse() and extracts all function and class definitions with their metadata.
Input file_path: "/repo/src/codewalk/config.py"
The file contains:

    import os

    class Settings:
        host = "localhost"
        port = 8080

    @staticmethod
    def load_config(path: str) -> Settings:
        return Settings()

Line 7: source = Path(file_path).read_text(encoding="utf-8") → the full file text
Line 12: tree = ast.parse(source) → Python AST
Line 15: lines = source.splitlines() → ["import os", "", "class Settings:", '    host = "localhost"', "    port = 8080", "", "@staticmethod", "def load_config(path: str) -> Settings:", "    return Settings()"]
Line 16: items = []
Line 18: ast.walk(tree) — walks every node in the AST in no particular order:
Node: class Settings (ClassDef)
- Line 28: isinstance(node, ast.ClassDef) ✓
- Line 29–35:
  - type = "class"
  - name = "Settings"
  - start_line = 3, end_line = 5
  - code = get_source_segment(lines, 3, 5) → "class Settings:\n    host = \"localhost\"\n    port = 8080"
  - bases = [] (no parent classes)
  - methods = [] (no methods inside)
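
As a rough illustration of the fields above, the sketch below reads the same values straight off the `ClassDef` node for the sample file. How `bases` and `methods` are collected here is an assumption made for the example, not necessarily what python_parser.py does.

```python
import ast

# The sample file from the walk-through (line 1 is "import os").
SOURCE = '''import os

class Settings:
    host = "localhost"
    port = 8080

@staticmethod
def load_config(path: str) -> Settings:
    return Settings()
'''

tree = ast.parse(SOURCE)
cls = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))

# lineno and end_lineno are 1-indexed and inclusive (end_lineno needs Python 3.8+).
span = (cls.lineno, cls.end_lineno)

# Assumption: only plain-name bases are handled here; dotted bases need a helper.
bases = [b.id for b in cls.bases if isinstance(b, ast.Name)]

# Assumption: "methods" means function definitions directly in the class body.
methods = [
    n.name for n in cls.body
    if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
]

print(cls.name, span, bases, methods)   # Settings (3, 5) [] []
```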
Node: load_config (FunctionDef)
- Line 19: isinstance(node, ast.FunctionDef) ✓
- Line 20–27:
  - type = "function"
  - name = "load_config"
  - start_line = 8, end_line = 9. The decorator sits on line 7, but on Python 3.8 and later node.lineno for a decorated function points to the def line (8), not the decorator.
  - code = get_source_segment(lines, 8, 9) → lines[7:9] → "def load_config(path: str) -> Settings:\n    return Settings()". The decorator line is therefore not part of the extracted code.
  - decorators = [get_decorator_name(d) for d in node.decorator_list] → ["staticmethod"]
  - args = [arg.arg for arg in node.args.args] → ["path"]
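
The line-number behaviour for decorated functions is easy to check directly. The sketch below parses the sample file and compares the decorator's line with the function node's `lineno`. It assumes Python 3.8 or later. `start_with_decorators` is a hypothetical variant, shown only to illustrate how the decorator line could be included.

```python
import ast

# The sample file from the walk-through (line 1 is "import os").
SOURCE = '''import os

class Settings:
    host = "localhost"
    port = 8080

@staticmethod
def load_config(path: str) -> Settings:
    return Settings()
'''

tree = ast.parse(SOURCE)
func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))

decorator_line = func.decorator_list[0].lineno   # line of "@staticmethod"
def_line = func.lineno                           # line of "def ..." on 3.8+

# Slice the source the same way the walk-through does: 1-indexed, inclusive.
lines = SOURCE.splitlines()
code = "\n".join(lines[func.lineno - 1:func.end_lineno])
args = [a.arg for a in func.args.args]

# Hypothetical variant: start at the first decorator so it is part of the code.
start_with_decorators = min([d.lineno for d in func.decorator_list] + [func.lineno])

print(decorator_line, def_line, func.end_lineno)  # 7 8 9
print(args)                                       # ['path']
```

If the extracted code should show decorators, slicing from `start_with_decorators` instead of `func.lineno` would be one way to do it.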
Return:

    [
        {
            "type": "class",
            "name": "Settings",
            "start_line": 3,
            "end_line": 5,
            "code": "class Settings:\n    host = \"localhost\"\n    port = 8080",
            "bases": [],
            "methods": [],
        },
        {
            "type": "function",
            "name": "load_config",
            "start_line": 8,
            "end_line": 9,
            "code": "def load_config(path: str) -> Settings:\n    return Settings()",
            "decorators": ["staticmethod"],
            "args": ["path"],
        },
    ]

get_source_segment extracts source code lines from a list, converting from 1-indexed line numbers to 0-indexed list indices.
Input lines: ["import os", "", "class Settings:", "    host = \"localhost\"", "    port = 8080"]
Input start: 3
Input end: 5
Line 39: lines[3-1 : 5] → lines[2:5] → ["class Settings:", '    host = "localhost"', "    port = 8080"]
Line 39: "\n".join(...) → Return: "class Settings:\n    host = \"localhost\"\n    port = 8080"
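
The slicing step above can be sketched in a couple of lines. This is a minimal reconstruction from the trace (a slice followed by a join), and the type hints and docstring are additions for the example.

```python
from typing import List


def get_source_segment(lines: List[str], start: int, end: int) -> str:
    """Return lines start..end (1-indexed, inclusive) joined with newlines."""
    # start - 1 converts the 1-indexed start to a 0-indexed slice start;
    # end is left as-is because a slice end is exclusive.
    return "\n".join(lines[start - 1:end])


lines = ["import os", "", "class Settings:", '    host = "localhost"', "    port = 8080"]
segment = get_source_segment(lines, 3, 5)
print(segment)
```

The standard library has a function with the same name, `ast.get_source_segment(source, node)`, but it takes the full source text and a node rather than a list of lines and two line numbers.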
get_decorator_name extracts a decorator's name from an AST node. It handles simple names, dotted names, and call-style decorators.
Input node: <ast.Name id="staticmethod">
Line 43: isinstance(node, ast.Name) ✓ → Return: "staticmethod"
Input node: <ast.Attribute: value=<Name "app">, attr="route">
Line 45: isinstance(node, ast.Attribute) ✓
Line 46: get_name(node.value) → "app", node.attr → "route"
Return: "app.route"
Input node: <ast.Call func=<Attribute "app.route">>
Line 47: isinstance(node, ast.Call) ✓
Line 48: get_decorator_name(node.func) → recurses into the Attribute case → Return: "app.route"
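
The three cases can be sketched as a small recursive function. This follows the branches in the trace. The empty-string fallback for unsupported nodes is an assumption, since the walk-through does not show what the real code does there.

```python
import ast


def get_name(node: ast.AST) -> str:
    """Build a dotted name from Name/Attribute nodes."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return f"{get_name(node.value)}.{node.attr}"
    return ""  # assumption: fallback for node types the trace does not cover


def get_decorator_name(node: ast.AST) -> str:
    if isinstance(node, ast.Name):           # @staticmethod
        return node.id
    if isinstance(node, ast.Attribute):      # @app.route
        return f"{get_name(node.value)}.{node.attr}"
    if isinstance(node, ast.Call):           # @app.route("/home")
        return get_decorator_name(node.func)
    return ""  # assumption: fallback for node types the trace does not cover


SOURCE = '''
@staticmethod
def a(): ...

@app.route
def b(): ...

@app.route("/home", methods=["GET"])
def c(): ...
'''

names = [
    get_decorator_name(d)
    for n in ast.parse(SOURCE).body
    if isinstance(n, ast.FunctionDef)
    for d in n.decorator_list
]
print(names)   # ['staticmethod', 'app.route', 'app.route']
```

Note that the call-style case drops the arguments, so `@app.route("/home")` and `@app.route` produce the same name.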
get_name extracts a dotted name string from an AST node. It is used by both get_decorator_name and the bases extraction for classes.
Input node: <ast.Name id="Settings">
Line 52: isinstance(node, ast.Name) ✓ → Return: "Settings"
Input node: <ast.Attribute value=<Attribute value=<Name "src">, attr="codewalk">, attr="config">
Line 54: isinstance(node, ast.Attribute) ✓
Line 55: get_name(node.value) → recurses:
- node.value is <Attribute value=<Name "src">, attr="codewalk">
- get_name(<Name "src">) → "src"
- Returns "src.codewalk"

Back in the outer call: "src.codewalk" is joined with node.attr ("config") → Return: "src.codewalk.config"
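
The recursion can be reproduced on real nodes by parsing a dotted expression. The sketch below is an illustration under assumptions: the empty-string fallback is not taken from the source, and the `Child` class is a made-up example of base-class extraction.

```python
import ast


def get_name(node: ast.AST) -> str:
    """Build a dotted name by recursing through nested Attribute nodes."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        # Resolve everything to the left of the last dot first, then append attr.
        return f"{get_name(node.value)}.{node.attr}"
    return ""  # assumption: fallback for node types the trace does not cover


# mode="eval" parses a single expression; .body is the outermost Attribute node.
expr = ast.parse("src.codewalk.config", mode="eval").body
dotted = get_name(expr)
print(dotted)   # src.codewalk.config

# Hypothetical class used to show the same helper applied to base classes.
cls = ast.parse("class Child(base.Parent, Mixin): pass").body[0]
bases = [get_name(b) for b in cls.bases]
print(bases)    # ['base.Parent', 'Mixin']
```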