Summary
hcl2.dumps() becomes slow when generating large HCL documents from Python dictionaries / Builder output.
From local profiling, the slow path appears to be generation/reconstruction rather than parsing. The current pipeline builds a LarkElement tree, formats it, converts it into raw lark.Tree / lark.Token objects, then recursively reconstructs text.
This creates a large number of Python and Lark objects for data that is already structured.
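As a toy illustration of the cost (the class and function names below are stand-ins, not the library's actual types), each stage of such a pipeline materializes the whole tree again:

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    # Stand-in for an IR node in a LarkElement-style tree.
    name: str
    children: list = field(default_factory=list)

def to_lark_like(node):
    # Mirrors the to_lark() step: a full recursive copy of the tree
    # into freshly allocated objects, one per node.
    return (node.name, [to_lark_like(c) for c in node.children])

root = Element("start", [Element("block", [Element("attr")]) for _ in range(3)])
copied = to_lark_like(root)  # every one of the 7 nodes is allocated a second time
```

For a document with hundreds of thousands of IR nodes, this per-node reallocation dominates, even though the data was already structured.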
Environment
- python-hcl2 version / commit: main at e76c9ae
- Python: 3.13.5
- OS: macOS arm64
Example benchmark
Synthetic Terraform-like document with 500 resource blocks:
- deserialize: ~351 ms
- format: ~98 ms
- to_lark: ~676 ms
- reconstruct: ~142 ms
- total: ~1267 ms
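A sketch of the harness used for these numbers (the document shape is hypothetical; the real Builder/dict layout may differ, and the hcl2 call is left commented so the snippet stands alone):

```python
import time

def make_document(n_resources=500):
    # Hypothetical Terraform-like structure with n_resources resource blocks.
    return {
        "resource": [
            {"aws_instance": {f"srv_{i}": {
                "ami": "${var.ami_id}",
                "instance_type": "t3.micro",
                "tags": {"Name": f"srv-{i}"},
            }}}
            for i in range(n_resources)
        ]
    }

def timed(label, fn, *args):
    # Time a single stage and report it in milliseconds.
    start = time.perf_counter()
    result = fn(*args)
    print(f"{label:12s} {(time.perf_counter() - start) * 1e3:7.1f} ms")
    return result

doc = make_document(500)
# timed("dumps", hcl2.dumps, doc)  # uncomment with python-hcl2 installed
```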
Skipping the format pass reduced the total time for the same case considerably.
In one 500-resource sample, formatting increased the IR from about 179k nodes to 228k nodes, and to_lark() then copied those nodes into new Lark objects.
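The node counts were obtained with a small recursive counter; a generic version (using a minimal stand-in class rather than the library's real node types) looks like this:

```python
class Node:
    # Minimal stand-in for a rule node; LarkElement and lark.Tree
    # objects expose their children in the same way.
    def __init__(self, *children):
        self.children = list(children)

def count_nodes(node):
    # Leaves (tokens, strings) have no `children` attribute and count as 1.
    children = getattr(node, "children", None)
    if children is None:
        return 1
    return 1 + sum(count_nodes(c) for c in children)

tree = Node(Node("a", "b"), Node(Node("c")))
print(count_nodes(tree))  # root + 2 rules + 1 nested rule + 3 leaves = 7
```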
Observed hot spots
Relevant code paths:
- hcl2.api.dumps() calls from_dict() then reconstruct()
- from_dict() always applies BaseFormatter by default
- reconstruct() converts StartRule to raw Lark via tree.to_lark()
- LarkRule.to_lark() and LarkToken.to_lark() allocate new lark.Tree / lark.Token objects for the whole document
- expression strings such as "${var.x}" are reparsed as small HCL snippets in _deserialize_expression()
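The hot spots above can be confirmed with a small cProfile harness; swap in hcl2.dumps and a large synthetic document to reproduce the profile (the sorted call here is only a self-contained placeholder):

```python
import cProfile
import io
import pstats

def profile(fn, *args, top=10):
    # Run fn under cProfile and print the `top` entries by cumulative time.
    prof = cProfile.Profile()
    result = prof.runcall(fn, *args)
    out = io.StringIO()
    pstats.Stats(prof, stream=out).sort_stats("cumulative").print_stats(top)
    print(out.getvalue())
    return result

# Placeholder target; replace with: profile(hcl2.dumps, make_document(500))
profile(sorted, list(range(1000)))
```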