Commit 1464fe0

Add template automation regression suite
1 parent e932f8c commit 1464fe0

137 files changed

Lines changed: 2153 additions & 11 deletions

CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -2,6 +2,17 @@

 All notable changes are recorded in this document. The format follows [Keep a Changelog](https://keepachangelog.com/ko/1.1.0/) and [Semantic Versioning](https://semver.org/lang/ko/).

+## [2.8.1] - 2026-03-08
+### Added
+- Added a template automation regression suite (`tests/template_automation/`). It checks simple tokens, repeated tokens, split-run tokens, whitespace normalization, tables/headers/footers/multiple sections, checkbox toggles, extract-repack, and nonstandard rootfile patterns using representative fixtures plus scenario contracts.
+- Added `DevDoc/template-automation-regression-suite.md`, which documents what the suite guarantees, its limits, and the procedure for adding fixtures.
+
+### Changed
+- Reworked the header/footer creation path to be XML-engine compatible so that `set_header_text()`/`set_footer_text()` work on real `lxml`-based documents.
+- Reworked the path that creates section properties (`secPr`) when they are missing to be XML-engine compatible.
+- Fixed `add_section()` creating new sections in the wrong namespace.
+- Included the template automation helper/generator modules added in this change in the mypy/pyright gradual scope.
+
 ## [2.8] - 2026-03-08
 ### Changed
 - Aligned `HwpxPackage` and OXML loading/saving so that they actually follow rootfile/manifest-relative paths.
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
# Template Automation Regression Suite

## Goal

This suite is for **repeatable template automation patterns that the project explicitly covers**.
It is not a claim that every arbitrary HWPX template is safe to automate.

The regression cases focus on four questions:

1. Is the automation step reproducible?
2. Does it validate after modification?
3. Does it preserve structure closely enough to avoid obvious package/layout drift?
4. Does it make silent failure visible through explicit counts or explicit errors?

## Fixture Layout

Fixtures live under `tests/template_automation/fixtures/<fixture-id>/`.

Each fixture contains:

- `scenario.json`
- `package/` - a pack-ready extracted HWPX workspace with `.hwpx-pack-metadata.json`

The suite repacks these workspaces with `hwpx-pack` behavior during tests instead of storing opaque `.hwpx` binaries directly. That keeps fixture diffs reviewable and exercises the real pack/unpack path.

To regenerate the extracted fixture packages:

```bash
PYTHONPATH=src python3 tests/template_automation/generate_fixtures.py
```

## Covered Fixture Categories

- `simple-placeholder`: single token in a normal body paragraph
- `repeated-placeholder`: one logical value repeated across multiple locations
- `split-run-placeholder`: token split across runs, where exact token replacement must not silently pretend success
- `whitespace-variant`: uneven internal spacing that only matches when normalized replacement is requested explicitly
- `table-placeholder`: token inside a table cell
- `header-footer-placeholder`: header/footer token handling
- `multi-section-placeholder`: section-targeted replacement
- `checkbox-toggle`: explicit checkbox/symbol toggles
- `extract-repack`: analyze -> extract -> patch -> repack workflow
- `nonstandard-rootfile`: engine-openable package with a nondefault rootfile path

## What The Suite Protects Against

- Exact token replacement returning success when nothing actually matched
- Split-run placeholders being mistaken for normal contiguous tokens
- Missing-token operations silently doing no work when the caller required a replacement
- Multi-location replacement losing count information
- Table/header/footer/section-specific automation accidentally being tested only against top-level body text
- Extracted workspaces that repack into invalid or engine-unopenable archives
- Simple text substitutions causing unexpected structural drift according to `hwpx-page-guard`

## What The Suite Does Not Guarantee

- Correct final rendering in Hancom Office
- True rendered page counts
- Safety for arbitrary real-world templates with unknown controls, fields, or editor-specific behaviors
- Semantic correctness of a template beyond the covered operation contract

`hwpx-page-guard` is used here exactly as documented: a **layout-drift proxy**, not a renderer.

## Operation Terms

### Exact Replacement

Literal search and replacement against explicit target surfaces such as body paragraphs, table paragraphs, headers, or footers.
This is the safest covered mode when the placeholder is actually contiguous text.

### Normalized-Text Replacement

Matches a logical phrase after removing whitespace differences.
This is broader than exact token replacement and should only be used when the caller explicitly wants whitespace tolerance.
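
As a rough illustration of whitespace-tolerant matching, the sketch below works on a plain string and assumes that "normalized" means whitespace is ignored entirely in both the phrase and the text. `replace_normalized` is a hypothetical name for this example; it is not the suite's helper, whose implementation is not shown in this commit view.

```python
import re


def replace_normalized(text: str, phrase: str, replacement: str) -> tuple[str, int]:
    """Replace `phrase` in `text` while ignoring whitespace differences.

    Hypothetical helper for illustration only. Returns the new text and the
    number of replacements, so a caller can tell "matched" from "did nothing".
    """
    # Keep only the non-whitespace characters of the phrase, escaped for regex use.
    chars = [re.escape(ch) for ch in phrase if not ch.isspace()]
    if not chars:
        raise ValueError("phrase must contain at least one non-whitespace character")
    # Allow any amount of whitespace between consecutive characters.
    pattern = re.compile(r"\s*".join(chars))
    # A function replacement keeps backslashes in `replacement` literal.
    return pattern.subn(lambda _match: replacement, text)


new_text, count = replace_normalized("Name: { { NAME } }", "{{NAME}}", "Kim")
```

Returning the count alongside the text keeps this broader mode consistent with the suite's stated goal of making silent no-ops visible.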

### Token-Based Replacement

An exact replacement flow aimed at explicit placeholders such as `{{NAME}}`.
It is intentionally conservative: if the token is split across runs, the suite expects zero matches or an explicit error, not magical reconstruction.
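
The conservative contract described above can be sketched with the standard library alone. The function and exception names here (`replace_token`, `TokenNotFoundError`) are assumptions made for this example and are not taken from the project's helper layer; only the paragraph namespace URI comes from the source.

```python
import xml.etree.ElementTree as ET

# Paragraph namespace as it appears in the project's sources.
HP = "{http://www.hancom.co.kr/hwpml/2011/paragraph}"


class TokenNotFoundError(LookupError):
    """Raised when a required token has no contiguous match (hypothetical)."""


def replace_token(paragraph: ET.Element, token: str, value: str, *, required: bool = True) -> int:
    """Replace `token` only where it is contiguous inside a single text node.

    Returns the number of replacements. When nothing matched and the caller
    required a replacement, an explicit error is raised instead of a silent no-op.
    """
    count = 0
    for node in paragraph.iter(f"{HP}t"):
        if node.text and token in node.text:
            count += node.text.count(token)
            node.text = node.text.replace(token, value)
    if count == 0 and required:
        # Diagnose the split-run case without trying to repair it.
        joined = "".join(node.text or "" for node in paragraph.iter(f"{HP}t"))
        hint = " (token appears to be split across runs)" if token in joined else ""
        raise TokenNotFoundError(f"{token!r} not found{hint}")
    return count
```

A caller that can tolerate a missing token passes `required=False` and inspects the returned count; every other caller gets an exception rather than an unchanged document.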

### Structural Safety vs Semantic Template Correctness

Structural safety means the package still opens, validates, and stays within expected structure/layout-drift thresholds.
Semantic template correctness is a stronger claim about whether the template still means the right thing to a human reader. This suite does not try to prove the latter in the general case.

## Adding A New Regression From A Bug Report

1. Reduce the bug to the smallest template pattern that still reproduces the failure.
2. Add a new fixture directory under `tests/template_automation/fixtures/`.
3. Capture the package in extracted form with the smallest possible synthetic content.
4. Add one or more scenarios to `scenario.json` that describe:
   - the operation
   - the expected replacement count or expected explicit failure
   - the postconditions that should stay true
5. If you need a new automation mode in the helper layer, keep it narrow and evidence-driven.
6. Regenerate the fixture package workspace if the source builder changed.
7. Run the targeted template automation tests plus validators/type checks you touched.

If a bug only reproduces for one very specific document, do not describe the fix as “general template support” unless the operation truly generalizes.
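
The real `scenario.json` schema is not visible in this commit view, so the following is only a guess at how a "count or explicit failure" contract might be checked. Every key name (`operation`, `expected_count`, `expected_error`) and the `check_outcome` function are hypothetical.

```python
from __future__ import annotations

import json

# Hypothetical scenario entries; the key names are assumptions, not the real schema.
SCENARIOS_JSON = """
[
  {"operation": "replace_token", "token": "{{NAME}}", "expected_count": 2},
  {"operation": "replace_token", "token": "{{SPLIT}}", "expected_error": "TokenNotFoundError"}
]
"""


def check_outcome(scenario: dict, *, count: int | None, error: Exception | None) -> None:
    """Raise AssertionError when the observed outcome violates the scenario contract."""
    if "expected_error" in scenario:
        if error is None:
            raise AssertionError("expected an explicit failure, but the operation succeeded")
        if type(error).__name__ != scenario["expected_error"]:
            raise AssertionError(f"unexpected error type: {type(error).__name__}")
    else:
        if error is not None:
            raise AssertionError(f"unexpected failure: {error!r}")
        if count != scenario["expected_count"]:
            raise AssertionError(f"expected {scenario['expected_count']} replacements, got {count}")


scenarios = json.loads(SCENARIOS_JSON)
```

Keeping the expected count and the expected failure as separate, mutually exclusive fields makes each scenario state one unambiguous outcome.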

pyproject.toml

Lines changed: 5 additions & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "python-hwpx"
-version = "2.8"
+version = "2.8.1"
 description = "Hancom HWPX 패키지를 로드하고 편집하기 위한 Python 유틸리티 모음"
 readme = { file = "README.md", content-type = "text/markdown" }
 license = { file = "LICENSE" }
@@ -89,6 +89,8 @@ files = [
     "src/hwpx/tools/template_analyzer.py",
     "src/hwpx/tools/text_extract_cli.py",
     "src/hwpx/tools/text_extractor.py",
+    "tests/template_automation/helpers.py",
+    "tests/template_automation/generate_fixtures.py",
 ]
 ignore_missing_imports = true

@@ -108,6 +110,8 @@ include = [
     "src/hwpx/tools/template_analyzer.py",
     "src/hwpx/tools/text_extract_cli.py",
     "src/hwpx/tools/text_extractor.py",
+    "tests/template_automation/helpers.py",
+    "tests/template_automation/generate_fixtures.py",
 ]
 pythonVersion = "3.10"
 typeCheckingMode = "basic"

src/hwpx/oxml/document.py

Lines changed: 16 additions & 10 deletions
@@ -47,6 +47,8 @@

 _HP_NS = "http://www.hancom.co.kr/hwpml/2011/paragraph"
 _HP = f"{{{_HP_NS}}}"
+_HS_NS = "http://www.hancom.co.kr/hwpml/2011/section"
+_HS = f"{{{_HS_NS}}}"
 _HH_NS = "http://www.hancom.co.kr/hwpml/2011/head"
 _HH = f"{{{_HH_NS}}}"

@@ -535,18 +537,22 @@ def _initial_sublist_attributes(self) -> dict[str, str]:
     def _ensure_text_element(self) -> ET.Element:
         sublist = self.element.find(f"{_HP}subList")
         if sublist is None:
-            sublist = ET.SubElement(self.element, f"{_HP}subList", self._initial_sublist_attributes())
+            sublist = _append_child(
+                self.element,
+                f"{_HP}subList",
+                self._initial_sublist_attributes(),
+            )
         paragraph = sublist.find(f"{_HP}p")
         if paragraph is None:
             paragraph_attrs = dict(_DEFAULT_PARAGRAPH_ATTRS)
             paragraph_attrs["id"] = _paragraph_id()
-            paragraph = ET.SubElement(sublist, f"{_HP}p", paragraph_attrs)
+            paragraph = _append_child(sublist, f"{_HP}p", paragraph_attrs)
         run = paragraph.find(f"{_HP}run")
         if run is None:
-            run = ET.SubElement(paragraph, f"{_HP}run", {"charPrIDRef": "0"})
+            run = _append_child(paragraph, f"{_HP}run", {"charPrIDRef": "0"})
         text = run.find(f"{_HP}t")
         if text is None:
-            text = ET.SubElement(run, f"{_HP}t")
+            text = _append_child(run, f"{_HP}t")
         return text

     @property
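
The body of `_append_child` is not part of the hunks shown here. The changelog describes the change as making element creation "XML-engine compatible", which suggests the motivation is that the standard library's `ET.SubElement` typically rejects a parent that is an `lxml` element. One way such a helper could work, offered purely as an assumption about its shape, is to let the parent create the child through `makeelement`, a method that both `xml.etree.ElementTree.Element` and `lxml.etree._Element` provide:

```python
import xml.etree.ElementTree as ET


def append_child(parent, tag, attrib=None):
    """Create a child using the parent's own element factory and append it.

    Hypothetical stand-in for the project's `_append_child`, whose real body
    is not visible in this diff. `makeelement` returns an element of the same
    implementation as `parent`, so the child always matches the parent's engine.
    """
    child = parent.makeelement(tag, dict(attrib or {}))
    parent.append(child)
    return child


# Demonstration with the standard library engine only.
root = ET.Element("{http://www.hancom.co.kr/hwpml/2011/paragraph}p")
run = append_child(root, "{http://www.hancom.co.kr/hwpml/2011/paragraph}run", {"charPrIDRef": "0"})
```

Copying the attribute mapping with `dict(...)` avoids sharing one dictionary between the caller and the new element.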
@@ -851,7 +857,7 @@ def _ensure_header_footer_apply(
             attrs = {"applyPageType": page_type}
             if header_id is not None:
                 attrs[self._apply_id_attributes(tag)[0]] = header_id
-            apply = ET.SubElement(self.element, f"{_HP}{tag}Apply", attrs)
+            apply = _append_child(self.element, f"{_HP}{tag}Apply", attrs)
             changed = True
         else:
             if apply.get("applyPageType") != page_type:
@@ -897,7 +903,7 @@ def _ensure_header_footer(self, tag: str, page_type: str) -> ET.Element:
         element = self._find_header_footer(tag, page_type)
         changed = False
         if element is None:
-            element = ET.SubElement(
+            element = _append_child(
                 self.element,
                 f"{_HP}{tag}",
                 {"id": _object_id(), "applyPageType": page_type},
@@ -3493,11 +3499,11 @@ def _ensure_section_properties_element(self) -> ET.Element:
             if paragraph is None:
                 paragraph_attrs = dict(_DEFAULT_PARAGRAPH_ATTRS)
                 paragraph_attrs["id"] = _paragraph_id()
-                paragraph = ET.SubElement(self._element, f"{_HP}p", paragraph_attrs)
+                paragraph = _append_child(self._element, f"{_HP}p", paragraph_attrs)
             run = paragraph.find(f"{_HP}run")
             if run is None:
-                run = ET.SubElement(paragraph, f"{_HP}run", {"charPrIDRef": "0"})
-            element = ET.SubElement(run, f"{_HP}secPr")
+                run = _append_child(paragraph, f"{_HP}run", {"charPrIDRef": "0"})
+            element = _append_child(run, f"{_HP}secPr")
             self._properties_cache = None
             self.mark_dirty()
         return element
@@ -4660,7 +4666,7 @@ def add_section(self, *, after: int | None = None) -> HwpxOxmlSection:
         part_name = f"Contents/{section_id}.xml"

         # Build minimal section XML
-        section_element = ET.Element(f"{_HP}sec")
+        section_element = ET.Element(f"{_HS}sec")
         para_attrs = {"id": _paragraph_id(), **_DEFAULT_PARAGRAPH_ATTRS}
         para = ET.SubElement(section_element, f"{_HP}p", para_attrs)
         run = ET.SubElement(para, f"{_HP}run", {"charPrIDRef": "0"})
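
The one-character change in the last hunk matters because ElementTree tags carry their namespace in Clark notation (`{uri}local`): a root built with the paragraph prefix is a different element from `hs:sec`, even though both have the local name `sec`. The standalone sketch below shows the effect on serialization. The two namespace URIs come from the diff; `build_minimal_section` is an illustrative name, not the project's `add_section()`.

```python
import xml.etree.ElementTree as ET

HP_NS = "http://www.hancom.co.kr/hwpml/2011/paragraph"
HS_NS = "http://www.hancom.co.kr/hwpml/2011/section"

# Register readable prefixes so the output uses hp:/hs: instead of ns0:/ns1:.
ET.register_namespace("hp", HP_NS)
ET.register_namespace("hs", HS_NS)


def build_minimal_section() -> ET.Element:
    """Build a section root in the section namespace with one empty paragraph."""
    section = ET.Element(f"{{{HS_NS}}}sec")  # root belongs to the section namespace
    para = ET.SubElement(section, f"{{{HP_NS}}}p")  # children stay in the paragraph namespace
    run = ET.SubElement(para, f"{{{HP_NS}}}run", {"charPrIDRef": "0"})
    ET.SubElement(run, f"{{{HP_NS}}}t")
    return section


serialized = ET.tostring(build_minimal_section(), encoding="unicode")
wrong_root = ET.tostring(ET.Element(f"{{{HP_NS}}}sec"), encoding="unicode")
```

With the fix the serialized root is `hs:sec`, matching the fixture section files in this commit, whereas the pre-fix form serializes as `hp:sec`.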

tests/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
{
  "format_version": 1,
  "entries": [
    {
      "path": "mimetype",
      "compress_type": 0
    },
    {
      "path": "Contents/content.hpf",
      "compress_type": 8
    },
    {
      "path": "Contents/header.xml",
      "compress_type": 8
    },
    {
      "path": "Contents/section0.xml",
      "compress_type": 8
    },
    {
      "path": "META-INF/container.rdf",
      "compress_type": 8
    },
    {
      "path": "META-INF/container.xml",
      "compress_type": 8
    },
    {
      "path": "META-INF/manifest.xml",
      "compress_type": 8
    },
    {
      "path": "Preview/PrvImage.png",
      "compress_type": 8
    },
    {
      "path": "Preview/PrvText.txt",
      "compress_type": 8
    },
    {
      "path": "settings.xml",
      "compress_type": 8
    },
    {
      "path": "version.xml",
      "compress_type": 8
    }
  ]
}
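
The file name is not visible in this view, but the manifest above looks like the `.hwpx-pack-metadata.json` that the fixture documentation mentions. Its `compress_type` values line up with the `zipfile` constants `ZIP_STORED` (0) and `ZIP_DEFLATED` (8), so a repack step could plausibly replay the entries in order with their recorded compression. The sketch below is an assumption about that idea and is not the `hwpx-pack` implementation; `pack_workspace` is a hypothetical name.

```python
import io
import zipfile
from pathlib import Path


def pack_workspace(workspace: Path, metadata: dict, out) -> None:
    """Write the workspace files into a ZIP archive, following the metadata.

    Entries are written in metadata order, each with its recorded compression
    method (0 = stored, 8 = deflated). `out` is a path or a binary file object.
    """
    with zipfile.ZipFile(out, "w") as archive:
        for entry in metadata["entries"]:
            data = (workspace / entry["path"]).read_bytes()
            archive.writestr(entry["path"], data, compress_type=entry["compress_type"])


def describe(archive_bytes: bytes) -> list[tuple[str, int]]:
    """Return (name, compress_type) pairs in archive order, for inspection."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [(info.filename, info.compress_type) for info in archive.infolist()]
```

Writing `mimetype` first and uncompressed mirrors the ordering in the manifest. This sketch does not reproduce timestamps or other per-entry ZIP attributes.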
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?><opf:package xmlns:ha="http://www.hancom.co.kr/hwpml/2011/app" xmlns:hp="http://www.hancom.co.kr/hwpml/2011/paragraph" xmlns:hp10="http://www.hancom.co.kr/hwpml/2016/paragraph" xmlns:hs="http://www.hancom.co.kr/hwpml/2011/section" xmlns:hc="http://www.hancom.co.kr/hwpml/2011/core" xmlns:hh="http://www.hancom.co.kr/hwpml/2011/head" xmlns:hhs="http://www.hancom.co.kr/hwpml/2011/history" xmlns:hm="http://www.hancom.co.kr/hwpml/2011/master-page" xmlns:hpf="http://www.hancom.co.kr/schema/2011/hpf" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf/" xmlns:ooxmlchart="http://www.hancom.co.kr/hwpml/2016/ooxmlchart" xmlns:hwpunitchar="http://www.hancom.co.kr/hwpml/2016/HwpUnitChar" xmlns:epub="http://www.idpf.org/2007/ops" xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0" version="" unique-identifier="" id=""><opf:metadata><opf:title/><opf:language>ko</opf:language><opf:meta name="creator" content="text">kokyu</opf:meta><opf:meta name="subject" content="text"/><opf:meta name="description" content="text"/><opf:meta name="lastsaveby" content="text">kokyu</opf:meta><opf:meta name="CreatedDate" content="text">2025-09-17T04:32:50Z</opf:meta><opf:meta name="ModifiedDate" content="text">2025-09-17T04:33:13Z</opf:meta><opf:meta name="date" content="text">2025년 9월 17일 수요일 오후 1:32:50</opf:meta><opf:meta name="keyword" content="text"/></opf:metadata><opf:manifest><opf:item id="header" href="Contents/header.xml" media-type="application/xml"/><opf:item id="section0" href="Contents/section0.xml" media-type="application/xml"/><opf:item id="settings" href="settings.xml" media-type="application/xml"/></opf:manifest><opf:spine><opf:itemref idref="header" linear="yes"/><opf:itemref idref="section0" linear="yes"/></opf:spine></opf:package>

tests/template_automation/fixtures/checkbox-toggle/package/Contents/header.xml

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
<?xml version='1.0' encoding='utf-8'?>
<hs:sec xmlns:hp="http://www.hancom.co.kr/hwpml/2011/paragraph" xmlns:hs="http://www.hancom.co.kr/hwpml/2011/section"><hp:p id="3121190098" paraPrIDRef="0" styleIDRef="0" pageBreak="0" columnBreak="0" merged="0"><hp:run charPrIDRef="0"><hp:secPr id="" textDirection="HORIZONTAL" spaceColumns="1134" tabStop="8000" tabStopVal="4000" tabStopUnit="HWPUNIT" outlineShapeIDRef="1" memoShapeIDRef="0" textVerticalWidthHead="0" masterPageCnt="0"><hp:grid lineGrid="0" charGrid="0" wonggojiFormat="0" /><hp:startNum pageStartsOn="BOTH" page="0" pic="0" tbl="0" equation="0" /><hp:visibility hideFirstHeader="0" hideFirstFooter="0" hideFirstMasterPage="0" border="SHOW_ALL" fill="SHOW_ALL" hideFirstPageNum="0" hideFirstEmptyLine="0" showLineNumber="0" /><hp:lineNumberShape restartType="0" countBy="0" distance="0" startNumber="0" /><hp:pagePr landscape="WIDELY" width="59528" height="84186" gutterType="LEFT_ONLY"><hp:margin header="4252" footer="4252" gutter="0" left="8504" right="8504" top="5668" bottom="4252" /></hp:pagePr><hp:footNotePr><hp:autoNumFormat type="DIGIT" userChar="" prefixChar="" suffixChar=")" supscript="0" /><hp:noteLine length="-1" type="SOLID" width="0.12 mm" color="#000000" /><hp:noteSpacing betweenNotes="283" belowLine="567" aboveLine="850" /><hp:numbering type="CONTINUOUS" newNum="1" /><hp:placement place="EACH_COLUMN" beneathText="0" /></hp:footNotePr><hp:endNotePr><hp:autoNumFormat type="DIGIT" userChar="" prefixChar="" suffixChar=")" supscript="0" /><hp:noteLine length="14692344" type="SOLID" width="0.12 mm" color="#000000" /><hp:noteSpacing betweenNotes="0" belowLine="567" aboveLine="850" /><hp:numbering type="CONTINUOUS" newNum="1" /><hp:placement place="END_OF_DOCUMENT" beneathText="0" /></hp:endNotePr><hp:pageBorderFill type="BOTH" borderFillIDRef="1" textBorder="PAPER" headerInside="0" footerInside="0" fillArea="PAPER"><hp:offset left="1417" right="1417" top="1417" bottom="1417" /></hp:pageBorderFill><hp:pageBorderFill type="EVEN" borderFillIDRef="1" 
textBorder="PAPER" headerInside="0" footerInside="0" fillArea="PAPER"><hp:offset left="1417" right="1417" top="1417" bottom="1417" /></hp:pageBorderFill><hp:pageBorderFill type="ODD" borderFillIDRef="1" textBorder="PAPER" headerInside="0" footerInside="0" fillArea="PAPER"><hp:offset left="1417" right="1417" top="1417" bottom="1417" /></hp:pageBorderFill></hp:secPr><hp:ctrl><hp:colPr id="" type="NEWSPAPER" layout="LEFT" colCount="1" sameSz="1" sameGap="0" /></hp:ctrl><hp:t>□ Option A</hp:t></hp:run></hp:p><hp:p id="3380610467" paraPrIDRef="0" styleIDRef="0" pageBreak="0" columnBreak="0" merged="0"><hp:run charPrIDRef="0"><hp:t>■ Option B</hp:t></hp:run></hp:p></hs:sec>
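
The fixture above encodes checkboxes as a leading symbol in the paragraph text (`□ Option A`, `■ Option B`). A toggle over that representation might look like the sketch below. It assumes the symbol is always the first character of an `hp:t` node and that the label follows it. `set_checkbox` is a hypothetical name, not the project's helper.

```python
import xml.etree.ElementTree as ET

HP = "{http://www.hancom.co.kr/hwpml/2011/paragraph}"
UNCHECKED, CHECKED = "□", "■"


def set_checkbox(root: ET.Element, label: str, checked: bool) -> int:
    """Put every checkbox with the given label into the requested state.

    Returns how many text nodes matched the label, so zero matches is visible
    to the caller instead of looking like success.
    """
    symbol = CHECKED if checked else UNCHECKED
    matched = 0
    for node in root.iter(f"{HP}t"):
        text = node.text or ""
        # Only touch nodes that start with a checkbox symbol followed by the label.
        if text[:1] in (UNCHECKED, CHECKED) and text[1:].strip() == label:
            node.text = symbol + text[1:]
            matched += 1
    return matched


SAMPLE = (
    '<hs:sec xmlns:hp="http://www.hancom.co.kr/hwpml/2011/paragraph" '
    'xmlns:hs="http://www.hancom.co.kr/hwpml/2011/section">'
    "<hp:p><hp:run><hp:t>□ Option A</hp:t></hp:run></hp:p>"
    "<hp:p><hp:run><hp:t>■ Option B</hp:t></hp:run></hp:p>"
    "</hs:sec>"
)
section = ET.fromstring(SAMPLE)
```

Calling `set_checkbox(section, "Option A", True)` on this sample changes only the first paragraph, while a label that does not exist returns 0.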
