Architecture: The Cognitive Logic API Pattern

The problem this library solves

When you ask an LLM to build a Goldman Sachs financial table, it can describe perfectly what should be on the slide:

Right-aligned dollar columns
Bold revenue / EBITDA / net income rows
Italic gray % growth and % margin rows
Header row in navy with white text
Source citation at the bottom in 8pt italic gray

But when you ask the same LLM to write the Python code that places those elements (Inches(2.5), Pt(11), RGBColor(0x1B, 0x36, 0x5D)), it makes mistakes that recur across iterations:

Right-alignment gets applied inconsistently
Bar heights are not proportional to data
Reference line labels overlap titles
Source text gets cut off below the footer
Text boxes are sized too narrow, causing wrap

These aren't model-intelligence problems. They're separation-of-concerns problems. The LLM is being asked to do two fundamentally different tasks:

Content reasoning — what should the slide say, which data matters, what's the action title (LLM is great at this)
Spatial execution — pixel placement, EMU coordinates, proportional sizing (LLM is bad at this)

The solution: Cognitive Logic API

Separate the two layers entirely:

┌─────────────────────────────────────┐
│  LAYER 1: CONTENT REASONING         │
│                                     │
│  LLM (or human) decides:            │
│  - Which template fits this slide?  │
│  - What's the action title?         │
│  - What data goes in the table?     │
│  - Which row should be highlighted? │
└────────────────┬────────────────────┘
                 │
                 │  Output: structured JSON
                 │
                 ▼
┌─────────────────────────────────────┐
│  LAYER 2: STRUCTURED SPEC           │
│                                     │
│  {                                  │
│    "title": "...",                  │
│    "headers": [...],                │
│    "rows": [                        │
│      {"label", "values", "style"}   │
│    ],                               │
│    "source": "..."                  │
│  }                                  │
└────────────────┬────────────────────┘
                 │
                 │  Pure data — no coordinates
                 │
                 ▼
┌─────────────────────────────────────┐
│  LAYER 3: DETERMINISTIC RENDERER    │
│                                     │
│  Pre-built Python functions that:   │
│  - Compute column widths            │
│  - Compute proportional bar heights │
│  - Apply right-alignment ALWAYS     │
│  - Apply bold/italic per row style  │
│  - Position source line dynamically │
│  - Manage spacing/padding/colors    │
└────────────────┬────────────────────┘
                 │
                 │  python-pptx output
                 │
                 ▼
            output.pptx

The LLM never touches Inches(), Pt(), RGBColor(), or any coordinate primitive. It only writes the JSON spec. The renderer handles every pixel.
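As a minimal sketch of what the Layer 2 spec could look like once it reaches Python, the block below parses the JSON and rejects malformed input before any rendering happens. The field names come from the diagram above; the `RowSpec` and `TableSpec` classes, the `parse_spec` helper, the style names, and the rule that `headers` includes the label column are all assumptions for illustration, not the library's actual API.

```python
import json
from dataclasses import dataclass

# Hypothetical style vocabulary; the real library may use different names.
ALLOWED_STYLES = {"normal", "bold", "italic_gray"}


@dataclass
class RowSpec:
    label: str
    values: list
    style: str = "normal"


@dataclass
class TableSpec:
    title: str
    headers: list
    rows: list
    source: str = ""


def parse_spec(raw: str) -> TableSpec:
    """Parse LLM-produced JSON into a TableSpec, rejecting malformed input."""
    data = json.loads(raw)
    headers = data["headers"]
    rows = []
    for r in data["rows"]:
        row = RowSpec(r["label"], r["values"], r.get("style", "normal"))
        if row.style not in ALLOWED_STYLES:
            raise ValueError(f"unknown row style: {row.style!r}")
        # Assumption: headers[0] labels the first column, so each row
        # carries one value for every remaining header.
        if len(row.values) != len(headers) - 1:
            raise ValueError(f"row {row.label!r} has the wrong number of values")
        rows.append(row)
    return TableSpec(data["title"], headers, rows, data.get("source", ""))
```

Note that nothing in the spec carries a coordinate, a font size, or a color value; a bad spec fails loudly at this boundary instead of producing a misaligned slide.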

Why this works

Right-alignment is guaranteed because the renderer always sets PP_ALIGN.RIGHT on cells in numeric columns. The LLM can't forget.
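One way to read this guarantee: alignment is a function of the column, never of anything the LLM wrote. The sketch below is hypothetical (the `cell_format` helper, the style names, and the convention that column 0 holds row labels are assumptions); in a python-pptx renderer the `"right"` value would translate to setting PP_ALIGN.RIGHT on the cell's paragraph.

```python
# Hypothetical row-style vocabulary; the library's actual names may differ.
ROW_STYLES = {
    "normal": {"bold": False, "italic": False},
    "bold": {"bold": True, "italic": False},
    "italic_gray": {"bold": False, "italic": True},
}


def cell_format(col_index, row_style="normal"):
    """Return formatting for one cell, derived only from column and row style.

    Assumption: column 0 holds row labels; every other column is numeric.
    """
    fmt = dict(ROW_STYLES[row_style])  # copy so callers cannot mutate the table
    fmt["align"] = "left" if col_index == 0 else "right"
    return fmt
```

Because the spec has no alignment field at all, there is nothing for the LLM to get wrong.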

Proportional bars are guaranteed because the renderer computes bar heights as (value / max_value) * chart_height. The LLM can't approximate.
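The formula above can be sketched in a few lines. The `bar_heights` function name and the choice to clamp negative values to zero are assumptions made for this example; lengths are expressed in EMU, the integer unit python-pptx uses internally (914,400 EMU per inch).

```python
EMU_PER_INCH = 914400  # English Metric Units per inch


def bar_heights(values, chart_height_emu):
    """Scale each value to a bar height: (value / max_value) * chart_height."""
    if not values:
        return []
    max_value = max(values)
    if max_value <= 0:
        # Nothing positive to scale against; draw zero-height bars.
        return [0 for _ in values]
    # Assumption: negative values are clamped to zero rather than drawn downward.
    return [round(max(v, 0) / max_value * chart_height_emu) for v in values]


heights = bar_heights([50, 100, 25], 4 * EMU_PER_INCH)
# The tallest bar fills the 4-inch chart; the others scale linearly.
```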

Tables are real table objects because the renderer always uses slide.shapes.add_table(...). The LLM can't accidentally use text boxes with tabs.

Layout doesn't break across edits because there are no coordinates to update — change the data, the renderer recomputes positions.
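To illustrate how positions can be recomputed from the data alone, here is a rough sketch of a layout function. Everything in it is an assumption for the example: the `table_layout` name, the default dimensions (a 10-inch-wide slide, 0.5-inch margins, 0.35-inch rows), and the 40% share given to the label column. All lengths are in EMU.

```python
def table_layout(n_rows, n_cols, slide_width=9144000, margin=457200,
                 top=1371600, row_height=320040, label_pct=40, gap=91440):
    """Derive every position from the shape of the data (lengths in EMU)."""
    usable = slide_width - 2 * margin
    label_w = usable * label_pct // 100
    numeric_cols = max(n_cols - 1, 1)
    numeric_w = (usable - label_w) // numeric_cols
    table_h = n_rows * row_height
    return {
        "col_widths": [label_w] + [numeric_w] * (n_cols - 1),
        "table_top": top,
        "table_height": table_h,
        # The source line always sits just below the table, however tall it is.
        "source_top": top + table_h + gap,
    }


before = table_layout(n_rows=5, n_cols=4)
after = table_layout(n_rows=6, n_cols=4)
# Adding one row pushes the source line down by exactly one row height.
```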

Why current AI slide tools struggle

Most current AI slide tools fall into one of three architectures:

Architecture 1: LLM writes Office JS / python-pptx directly

slide.shapes.add_textbox(Inches(2.5), Inches(3.1), Inches(4.8), Inches(2.3))
# LLM has to remember: 2.5? Or 2.6? What was the title position?
# Result: misaligned, overflowing, unprofessional

This is where the most common failure modes come from. The LLM is doing both reasoning AND pixel work. It's bad at the second job.

Architecture 2: LLM modifies a passive .pptx template

1. User uploads template.pptx
2. LLM parses XML, identifies placeholders
3. LLM writes Office JS to fill placeholders
4. Result: still depends on LLM understanding XML correctly

Better, but the LLM still has to interpret the template structure each time. Multi-turn editing degrades: by turn 5, the model may have lost track of the color scheme the template used in turn 1.

Architecture 3 (this library): LLM fills a JSON spec

1. LLM picks a template by name (e.g., "render_financial_summary")
2. LLM fills the JSON schema for that template
3. Renderer (deterministic Python code) produces pixel-perfect output
4. Editing means changing the JSON, not rewriting code

The LLM never sees coordinates. The template guarantees correctness by construction.
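The four steps above could be wired together with a simple name-to-function registry. This is a sketch under assumptions: the template name `render_financial_summary` appears in the text, but the `register` decorator, the `render` dispatcher, and the stub renderer body are hypothetical and stand in for real python-pptx drawing code.

```python
RENDERERS = {}


def register(name):
    """Decorator that adds a renderer to the registry under `name`."""
    def decorator(fn):
        RENDERERS[name] = fn
        return fn
    return decorator


@register("render_financial_summary")
def render_financial_summary(spec):
    # Placeholder: a real renderer would build the slide with python-pptx.
    return {"template": "financial_summary", "n_rows": len(spec["rows"])}


def render(template_name, spec):
    """Dispatch a spec to the renderer the LLM selected by name."""
    try:
        renderer = RENDERERS[template_name]
    except KeyError:
        raise ValueError(
            f"unknown template {template_name!r}; choose from {sorted(RENDERERS)}"
        ) from None
    return renderer(spec)
```

An unknown template name fails immediately with the list of valid choices, which gives the LLM something concrete to correct on the next turn.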

Comparable patterns in production

This isn't a new insight. Slide tools with a reputation for polished output tend to share this architecture:

Tool	Pattern
Beautiful.ai	Web-based design engine. AI selects layout, engine computes spatial placement using CSS-like rules. PPTX is an export step.
Gamma.app	Web rendering with smart layouts. AI generates content, layout algorithms handle visual hierarchy.
Pitch	Component-based design system. AI fills component slots; components handle their own layout.
UpSlide	Designer-built native PPTX templates. The tool links data to placeholders but never generates layout code.
Macabacus	Excel-PowerPoint linking. Templates are designer-built; the tool only handles data flow.

What they all have in common: the LLM (or AI) never writes spatial code. Layout is handled by either a deterministic engine or by humans who built the templates upfront.

Trade-offs

Pros:

Pixel-perfect output by construction
Consistent across multi-turn edits
LLM only needs to be good at content reasoning
Easy to QA — the rendering is deterministic
Template library scales horizontally (add more templates without changing the LLM)

Cons:

Limited to the templates in the library — can't generate arbitrary new layouts
Adding a new layout requires Python code, not just a prompt
The library has to be maintained by humans (someone has to add new templates)

For investment banking use cases, the trade-off is overwhelmingly worth it. IB decks follow roughly 20-30 standardized patterns, so a library that covers those patterns should produce around 90% of bank decks correctly the first time.

What this means for AI slide tools

If you're building an AI presentation tool and you want it to produce output that looks professional, the architecture matters more than the model. Here's the order of operations:

1. Build the template library first: ~30 templates covering the patterns your users need
2. Define JSON schemas: what data does each template need?
3. Train the LLM to fill schemas, not to write coordinate code
4. Keep the LLM's job to selection and content: pick the right template, write the right action title, populate the right data

Without step 1, no amount of better prompting or smarter models will produce reliably professional output. With step 1, even a small model can produce bank-ready slides.

Read the source

ib_deck_engine/templates.py — The 14-template library
ib_deck_engine/SLIDE_CATALOG.md — Catalog of patterns from real bank decks
docs/slide_architecture_proposal.md — Deeper architectural argument

License

MIT — see LICENSE.
