diff --git a/.github/plugin/marketplace.json b/.github/plugin/marketplace.json index f45cd7b24..f04bfeff1 100644 --- a/.github/plugin/marketplace.json +++ b/.github/plugin/marketplace.json @@ -736,6 +736,12 @@ "description": "Complete toolkit for developing Power Platform custom connectors with Model Context Protocol integration for Microsoft Copilot Studio", "version": "1.0.0" }, + { + "name": "pptify-slides-creation", + "source": "pptify-slides-creation", + "description": "Generate production-ready, editable PowerPoint (PPTX) decks from Copilot chat: narrative strategy, design-context selection, coordinate-explicit slide specs, visual asset planning, import-only extraction tooling, and audit-driven quality gates.", + "version": "1.0.0" + }, { "name": "project-documenter", "source": "project-documenter", diff --git a/agents/pptify-slides-builder.agent.md b/agents/pptify-slides-builder.agent.md new file mode 100644 index 000000000..b3589edc2 --- /dev/null +++ b/agents/pptify-slides-builder.agent.md @@ -0,0 +1,92 @@ +--- +name: pptify-slides-builder +description: "Help create editable PowerPoint PPTX deck specifications focusing on structure, content strategy, and slide design." +tools: [read, search, edit, execute/getTerminalOutput, execute/runInTerminal, read/terminalLastCommand, read/terminalSelection, browser, agent, todo] +--- + +You are a PPTify slides-building specialist. You guide users through the full deck creation workflow — from narrative strategy to production-ready, coordinate-explicit JSON specifications and build scripts. 
+ +## Skills Reference + +- **pptify-context-prep** — Business framework, narrative strategy, story spine +- **pptify-slide-spec** — Coordinate-explicit layout trees, JSON specification, spec→PPTX build contract +- **pptify-visual-assets** — Design asset guidance (icons, typography, color) +- **pptify-tooling** — Import-only extraction APIs for reference-deck analysis and PPTX package inspection +- **pptify-quality-gates** — Specification validation and quality checklist + +## Workflow + +Before starting any step, confirm the supporting references are readable with the `read` tool: the design profile catalog (`references/design-profiles.md`), the spec→PPTX build contract in pptify-slide-spec, and the audit checklist in pptify-quality-gates. If any is unreadable, halt and notify the user with the file path rather than proceeding from assumed content. + +### 1. Understand the Goal + +1. Collect key information: audience, business decision, narrative framework, language, slide count target, reference deck, brand constraints, deadline. +2. Group all required inputs into a single upfront message of no more than 5 questions. When all inputs are unknown, ask about the five highest-priority ones — audience, business decision, slide count, brand constraints, and deadline — and assume defaults for the rest (language = English, narrative framework = Situation-Complication-Resolution, reference deck = none) unless the user volunteers them. Pre-fill reasonable defaults inline (e.g., "Slide count: 10 — adjust?") so the user can confirm rather than fill blanks. +3. Document the business framework choice in `summary.business_framework`. +4. If only a topic is given, create an executive narrative and mark assumptions clearly. If the topic alone is insufficient to determine audience, business decision, or at least two concrete content points, ask one targeted clarifying question before generating the narrative. 
Do not generate placeholder content (e.g., "[Insert data here]") — surface the gap explicitly instead. + +### 2. Design Strategy + +1. Select a design direction (modern, minimal, corporate, creative, formal, etc.) +2. If the user has a reference PPTX, use its style as guidance. If the reference PPTX style conflicts with stated brand constraints (colors, fonts, logos), brand constraints take precedence. Document the conflict and resolution in `summary.design_context.conflict_notes`. If reference PPTX analysis fails, the path cannot be read (file not found, permission error, or unsupported format), or it returns insufficient style data, notify the user with the exact error, fall back to the design profile catalog in `references/design-profiles.md`, and document the fallback in `summary.design_context.source: "catalog-fallback"` with the error reason. +3. If no brand constraints are provided and no reference PPTX is available, load the design profile catalog at `references/design-profiles.md`, select a profile based on the stated design direction, and document the selection in `summary.design_context.source: "catalog-default"`. +4. Document the chosen design system in `summary.design_context` (palette, typography, spacing, signature elements) +5. If the deck language is not English, verify the selected design system fonts support the required character set (e.g., CJK glyphs). If they do not, flag the gap and recommend a substitute font that supports the language before authoring the spec. +6. Every visible content slide must include at least one non-text decorative object (e.g., color band, card background, accent shape) whose fill or stroke color is drawn from the palette defined in `summary.design_context`. + +### 3. Plan Slide Structure + +1. Map business framework to deck outline (title, setup, evidence, decision, appendix) +2. 
If the user's target slide count cannot accommodate the full business framework outline, surface the conflict explicitly: list which sections would be merged or dropped, and ask the user to confirm the trade-off before proceeding. +3. One clear message per slide +4. Choose slide form: title, agenda, comparison, process, metrics, roadmap, risk, architecture, evidence, decision +5. 3–5 content groups per slide — groups whose `role` is NOT `background` or `decoration` (background and decoration groups are excluded from this count) — each represented as a distinct `group` entry in `layout_tree.groups` with a descriptive `role`. Exception: slides of form `title`, `agenda`, or `decision` may have 1–2 content groups. Slides of form `comparison` or `evidence` may have up to 6 content groups only when each group maps to a distinct data entity (e.g., a separate product, vendor, or risk item) that cannot be merged without losing meaning; state the reason in the slide's `notes` field. +6. Preserve user terminology, metrics, dates, tone + +### 4. Spec Authoring + +1. Return JSON with `slides` array and optional `summary` +2. Each slide: `id`, `title`, `layout_tree` +3. Each layout_tree: `slide_size`, `root_group_id`, `groups`, `objects` +4. Each group: `id`, `role`, `layout_mode`, `object_ids`, `group_ids`, `bbox` +5. Each object: `id`, `kind`, `role`, `classification`, `content`, `style`, `bbox`, `z_index` +6. Explicit coordinates and styling for all objects (no shortcuts) +7. Content text ≥9pt (body/evidence 10–12pt, labels/captions 9–10pt); only decorative `layout_design` text may go below 9pt. Decorative `layout_design` text is any text object whose `classification` equals `layout_design` and whose `role` equals `decoration`, carrying no user-readable content that conveys slide meaning. +8. All shapes/lines: explicit fill, stroke, endpoints +9. 
Keep every object inside slide bounds with a minimum content-safe margin of 0.25 in (18 pt) on all four edges, unless the object is a full-bleed background element with `role: background`; follow the spec→PPTX build contract in pptify-slide-spec +10. Translate design direction into objects, colors, spacing, and typography + +**Constraint precedence:** When per-object constraints conflict (Steps 4–5), resolve in this order: (1) slide bounds + margin, (2) minimum font size, (3) palette color, (4) decorative-object presence. Document any trade-off in the object's `notes` field. + +### 5. Quality Validation + +1. Verify `summary.design_context` exists (design system name, palette, typography documented) +2. Verify every visible content slide has ≥1 non-text decorative object using a `summary.design_context` palette color +3. Check typography consistency and spacing rhythm +4. No `content` text below 9pt; body/evidence text must be 10–12pt and labels/captions 9–10pt; only decorative `layout_design` text may go below 9pt +5. Verify zero content collisions, zero text/table overflow, and no object outside slide bounds (run the full audit checklist in pptify-quality-gates) +6. Reject specs with plain white backgrounds, Calibri-only text, unstyled bullets, or missing design context. On rejection, do not return the failing spec. Instead: + 1. List each failed check by slide ID. + 2. Classify each failure as **Type A** (the fix is deterministic from the documented design system, e.g., swap Calibri for the design system font) or **Type B** (the fix needs a value absent from the spec or design system, e.g., a font or brand color that has not been defined). + 3. Apply all Type A fixes and re-validate silently. Never invent a value that is absent from the spec or design system; surface it as Type B instead. + 4. List all Type B fixes together and ask the user in a single message before proceeding. + 5. 
If re-validation after Type A correction still fails, do not attempt a third autonomous pass; list the remaining failures by slide ID and ask the user for guidance before proceeding. + +### 6. Response Contract + +1. When all required inputs are available and you are authoring a spec, output strict JSON (no markdown fences unless user asks for prose) +2. If required inputs are missing, ask questions in plain prose before authoring JSON; do not wrap clarifying questions in JSON +3. When a turn needs both a blocking question and spec output (e.g., after a Type B correction), ask the question in plain prose and withhold the JSON spec until the user resolves it; never embed questions inside JSON or bury spec output inside prose +4. Mark assumptions clearly in summary +5. Report spec path and validation status + +## Boundaries + +This plugin provides guidance, design context, and import-only analysis APIs — not hosted infrastructure. Keep the following external and user-managed: + +- Environment bootstrap. The plugin works standalone with no install step and no bundled setup scripts. +- LLM access for source summarization. context-prep guides *how* to summarize; you bring the API (OpenAI, Azure OpenAI, etc.). +- Image/infographic generation. visual-assets provides runnable inline snippets, but the provider, model, and credentials are user-managed (`.env` / `az login`). +- The python-pptx build step. The agent authors the JSON spec **and** the build script; no general renderer is bundled. See the build contract in pptify-slide-spec. + +Design context is **not** out of scope: the bundled catalog in pptify-context-prep (`references/design-profiles.md`) is always available and should be loaded for every deck. 
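As an illustration of two per-object rules from Steps 4 and 5 of the agent file above (the 0.25 in content-safe margin and the 9pt floor for content text), a minimal Python sketch might look like the following. It is not part of the plugin. The field shapes are assumptions made for the example: `bbox` as `{"x", "y", "w", "h"}` in inches, `slide_size` as `{"width", "height"}` in inches, `style["font_size"]` in points, and `kind == "text"` for text objects. The actual shapes are defined by the build contract in pptify-slide-spec, which is not shown here.

```python
# Hypothetical field shapes (assumptions, not taken from the plugin):
#   bbox = {"x", "y", "w", "h"} in inches
#   slide_size = {"width", "height"} in inches
#   style["font_size"] in points, kind == "text" for text objects
MARGIN_IN = 0.25      # content-safe margin from Step 4
MIN_CONTENT_PT = 9    # minimum content text size from Step 4


def check_object(obj, slide_size):
    """Return a list of rule violations for one layout_tree object."""
    problems = []
    box = obj["bbox"]

    # Full-bleed backgrounds are exempt from the margin, but must still
    # stay inside the slide bounds.
    margin = 0.0 if obj.get("role") == "background" else MARGIN_IN
    if (box["x"] < margin
            or box["y"] < margin
            or box["x"] + box["w"] > slide_size["width"] - margin
            or box["y"] + box["h"] > slide_size["height"] - margin):
        problems.append("outside content-safe area")

    # Only decorative layout_design text may go below the 9pt floor.
    decorative = (obj.get("classification") == "layout_design"
                  and obj.get("role") == "decoration")
    size = obj.get("style", {}).get("font_size")
    if (obj.get("kind") == "text" and size is not None
            and not decorative and size < MIN_CONTENT_PT):
        problems.append("text below 9pt")

    return problems


if __name__ == "__main__":
    slide_size = {"width": 13.333, "height": 7.5}
    title = {
        "id": "t1", "kind": "text", "role": "title",
        "classification": "content",
        "bbox": {"x": 0.5, "y": 0.4, "w": 8.0, "h": 0.8},
        "style": {"font_size": 8},
    }
    print(check_object(title, slide_size))
```

A check like this covers only the margin and font-size rules; collisions, overflow, and palette usage would still need the full audit checklist in pptify-quality-gates.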
diff --git a/docs/README.agents.md b/docs/README.agents.md index 359085a07..0934afd1a 100644 --- a/docs/README.agents.md +++ b/docs/README.agents.md @@ -165,6 +165,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-agents) for guidelines on how to | [Power BI Visualization Expert Mode](../agents/power-bi-visualization-expert.agent.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpower-bi-visualization-expert.agent.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpower-bi-visualization-expert.agent.md) | Expert Power BI report design and visualization guidance using Microsoft best practices for creating effective, performant, and user-friendly reports and dashboards. | | | [Power Platform Expert](../agents/power-platform-expert.agent.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpower-platform-expert.agent.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpower-platform-expert.agent.md) | Power Platform expert providing guidance on Code Apps, canvas apps, Dataverse, connectors, and Power Platform best practices | | | [Power Platform MCP Integration Expert](../agents/power-platform-mcp-integration-expert.agent.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpower-platform-mcp-integration-expert.agent.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpower-platform-mcp-integration-expert.agent.md) | Expert in Power Platform custom connector development with MCP integration for Copilot Studio - comprehensive knowledge of schemas, protocols, and integration patterns | | +| [Pptify Slides Builder](../agents/pptify-slides-builder.agent.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpptify-slides-builder.agent.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fpptify-slides-builder.agent.md) | Help create editable PowerPoint PPTX deck specifications focusing on structure, content strategy, and slide design. | | | [Principal software engineer](../agents/principal-software-engineer.agent.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fprincipal-software-engineer.agent.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fprincipal-software-engineer.agent.md) | Provide principal-level software engineering guidance with focus on engineering excellence, technical leadership, and pragmatic implementation. | | | [Project Architecture Planner](../agents/project-architecture-planner.agent.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fproject-architecture-planner.agent.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fproject-architecture-planner.agent.md) | Holistic software architecture planner that evaluates tech stacks, designs scalability roadmaps, performs cloud-agnostic cost analysis, reviews existing codebases, and delivers interactive Mermaid diagrams with HTML preview and draw.io export | | | [Project Documenter](../agents/project-documenter.agent.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fproject-documenter.agent.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/agent?url=vscode-insiders%3Achat-agent%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Fagents%2Fproject-documenter.agent.md) | Generates professional MS Word project documentation with draw.io architecture diagrams and embedded PNG images. Automatically discovers any project's technology stack, architecture, and code structure. Produces Markdown, draw.io diagrams, PNG exports, and .docx output. | | diff --git a/docs/README.plugins.md b/docs/README.plugins.md index 1ce848780..b844fdb4b 100644 --- a/docs/README.plugins.md +++ b/docs/README.plugins.md @@ -72,6 +72,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-plugins) for guidelines on how t | [power-bi-development](../plugins/power-bi-development/README.md) | Comprehensive Power BI development resources including data modeling, DAX optimization, performance tuning, visualization design, security best practices, and DevOps/ALM guidance for building enterprise-grade Power BI solutions. | 8 items | power-bi, dax, data-modeling, performance, visualization, security, devops, business-intelligence | | [power-platform-architect](../plugins/power-platform-architect/README.md) | Solution Architect for the Microsoft Power Platform, turning business requirements into functioning Power Platform solution architectures. 
| 1 items | power-platform, power-platform-architect, power-apps, dataverse, power-automate, power-pages, power-bi | | [power-platform-mcp-connector-development](../plugins/power-platform-mcp-connector-development/README.md) | Complete toolkit for developing Power Platform custom connectors with Model Context Protocol integration for Microsoft Copilot Studio | 3 items | power-platform, mcp, copilot-studio, custom-connector, json-rpc | +| [pptify-slides-creation](../plugins/pptify-slides-creation/README.md) | Generate production-ready, editable PowerPoint (PPTX) decks from Copilot chat: narrative strategy, design-context selection, coordinate-explicit slide specs, visual asset planning, import-only extraction tooling, and audit-driven quality gates. | 6 items | powerpoint, pptx, presentations, slide-decks, deck-generation, slide-design, document-generation | | [project-documenter](../plugins/project-documenter/README.md) | Generate professional project documentation with draw.io architecture diagrams and Word (.docx) output with embedded images. Automatically discovers any project's technology stack and produces Markdown, diagrams, PNG exports, and a formatted Word document. | 3 items | documentation, architecture-diagrams, drawio, word-document, docx, png-images, c4-model, project-summary, auto-discovery | | [project-planning](../plugins/project-planning/README.md) | Tools and guidance for software project planning, feature breakdown, epic management, implementation planning, and task organization for development teams. | 15 items | planning, project-management, epic, feature, implementation, task, architecture, technical-spike | | [python-mcp-development](../plugins/python-mcp-development/README.md) | Complete toolkit for building Model Context Protocol (MCP) servers in Python using the official SDK with FastMCP. Includes instructions for best practices, a prompt for generating servers, and an expert chat mode for guidance. 
| 2 items | python, mcp, model-context-protocol, fastmcp, server-development | diff --git a/docs/README.skills.md b/docs/README.skills.md index 6b38622ce..e0fb7e670 100644 --- a/docs/README.skills.md +++ b/docs/README.skills.md @@ -286,6 +286,11 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to | [power-platform-architect](../skills/power-platform-architect/SKILL.md)
`gh skills install github/awesome-copilot power-platform-architect` | Use this skill when the user needs to transform business requirements, use case descriptions, or meeting transcripts into a technical Power Platform solution architecture, including component selection and Mermaid.js diagrams. | None | | [power-platform-mcp-connector-suite](../skills/power-platform-mcp-connector-suite/SKILL.md)
`gh skills install github/awesome-copilot power-platform-mcp-connector-suite` | Generate complete Power Platform custom connector with MCP integration for Copilot Studio - includes schema generation, troubleshooting, and validation | None | | [powerbi-modeling](../skills/powerbi-modeling/SKILL.md)
`gh skills install github/awesome-copilot powerbi-modeling` | Power BI semantic modeling assistant for building optimized data models. Use when working with Power BI semantic models, creating measures, designing star schemas, configuring relationships, implementing RLS, or optimizing model performance. Triggers on queries about DAX calculations, table relationships, dimension/fact table design, naming conventions, model documentation, cardinality, cross-filter direction, calculation groups, and data model best practices. Always connects to the active model first using power-bi-modeling MCP tools to understand the data structure before providing guidance. | `references/MEASURES-DAX.md`
`references/PERFORMANCE.md`
`references/RELATIONSHIPS.md`
`references/RLS.md`
`references/STAR-SCHEMA.md` | +| [pptify-context-prep](../skills/pptify-context-prep/SKILL.md)
`gh skills install github/awesome-copilot pptify-context-prep` | Prepare narrative framework, source material, and design context before authoring a pptify deck spec. Use when selecting a business/storytelling framework, converting documents, summarizing long sources, analyzing reference PPTX decks, or selecting and loading bundled design profiles. | `references/design-profiles.md` | +| [pptify-quality-gates](../skills/pptify-quality-gates/SKILL.md)
`gh skills install github/awesome-copilot pptify-quality-gates` | Validate and repair pptify PPTX artifacts. Use when checking deck specs, PPTX packages, audits, coordinate-explicit layout trees, collisions, text overflows, warnings, visual hierarchy, asset layering, or reference deck alignment. | `references/audit-checklist.md` | +| [pptify-slide-spec](../skills/pptify-slide-spec/SKILL.md)
`gh skills install github/awesome-copilot pptify-slide-spec` | Author or repair coordinate-explicit pptify JSON deck specs. Use when writing layout_tree groups, objects, bboxes, tables, images, lines, shapes, type scale, or collision-safe content. | None | +| [pptify-tooling](../skills/pptify-tooling/SKILL.md)
`gh skills install github/awesome-copilot pptify-tooling` | Core PPTX tooling: extraction, style analysis, deck diagnostics, and integration contracts without heavy runtime scripts. | `references/pptx_extractor.py`
`references/pptx_style_master.py`
`references/toolkit-setup.md` | +| [pptify-visual-assets](../skills/pptify-visual-assets/SKILL.md)
`gh skills install github/awesome-copilot pptify-visual-assets` | Plan and place visual assets for pptify PPTX decks. Use when adding icons, images, SVGs, infographics, image placeholders, or asset-backed slide objects. | `references/visual-asset-adapters.md` | | [pr-dashboard](../skills/pr-dashboard/SKILL.md)
`gh skills install github/awesome-copilot pr-dashboard` | Open a GitHub PR dashboard in the browser. Use when the user asks to see their pull requests, open the PR dashboard, show PRs for a date range, or check PR status. Trigger phrases include "show my PRs", "open PR dashboard", "pull request dashboard". | `assets/dashboard.html`
`scripts/lib`
`scripts/pr-dashboard-cli.mjs` | | [pr-screenshots](../skills/pr-screenshots/SKILL.md)
`gh skills install github/awesome-copilot pr-screenshots` | Embed before/after screenshots and annotated images in pull request descriptions. Covers PR description patterns, image upload for Azure DevOps and GitHub, and sizing best practices. | None | | [prd](../skills/prd/SKILL.md)
`gh skills install github/awesome-copilot prd` | Generate high-quality Product Requirements Documents (PRDs) for software systems and AI-powered features. Includes executive summaries, user stories, technical specifications, and risk analysis. | None | diff --git a/plugins/pptify-slides-creation/.github/plugin/plugin.json b/plugins/pptify-slides-creation/.github/plugin/plugin.json new file mode 100644 index 000000000..ecbcf3fd3 --- /dev/null +++ b/plugins/pptify-slides-creation/.github/plugin/plugin.json @@ -0,0 +1,29 @@ +{ + "name": "pptify-slides-creation", + "description": "Generate production-ready, editable PowerPoint (PPTX) decks from Copilot chat: narrative strategy, design-context selection, coordinate-explicit slide specs, visual asset planning, import-only extraction tooling, and audit-driven quality gates.", + "version": "1.0.0", + "keywords": [ + "powerpoint", + "pptx", + "presentations", + "slide-decks", + "deck-generation", + "slide-design", + "document-generation" + ], + "author": { + "name": "kimtth" + }, + "repository": "https://github.com/github/awesome-copilot", + "license": "MIT", + "agents": [ + "./agents/pptify-slides-builder.md" + ], + "skills": [ + "./skills/pptify-context-prep/", + "./skills/pptify-quality-gates/", + "./skills/pptify-slide-spec/", + "./skills/pptify-tooling/", + "./skills/pptify-visual-assets/" + ] +} diff --git a/plugins/pptify-slides-creation/README.md b/plugins/pptify-slides-creation/README.md new file mode 100644 index 000000000..ec3c7d920 --- /dev/null +++ b/plugins/pptify-slides-creation/README.md @@ -0,0 +1,49 @@ +# pptify-slides-creation + +Generate production-ready, **editable** PowerPoint (PPTX) decks with GitHub Copilot — from narrative strategy to coordinate-explicit slide specifications and audit-driven quality gates. 
+ +The `pptify-slides-creation` plugin bundles one agent and five skills that cover the full deck-creation workflow: framing the business story, selecting a design direction, authoring collision-safe layout trees, planning visual assets, analyzing reference decks, and validating the final package. + +## Install + +```bash +copilot plugin install pptify-slides-creation@awesome-copilot +``` + +If the marketplace isn't registered yet: + +```bash +copilot plugin marketplace add github/awesome-copilot +copilot plugin install pptify-slides-creation@awesome-copilot +``` + +## What's included + +### Agent + +- **pptify-slides-builder** — Guides the end-to-end deck workflow, from narrative strategy to production-ready, coordinate-explicit JSON specifications and build scripts. + +### Skills + +- **pptify-context-prep** — Choose a business/storytelling framework, convert and summarize source material, analyze reference PPTX decks, and load bundled design profiles before authoring a spec. +- **pptify-slide-spec** — Author or repair coordinate-explicit JSON deck specs: layout-tree groups, objects, bounding boxes, tables, images, lines, shapes, and a type scale that avoids collisions and overflow. +- **pptify-visual-assets** — Plan and place icons, images, SVGs, infographics, and asset-backed slide objects with placement and decision guidance. +- **pptify-tooling** — Import-only PPTX extraction and style-analysis helpers for reference-deck analysis and package inspection, without heavy runtime scripts. +- **pptify-quality-gates** — Validate and repair deck specs and PPTX packages against an 11-dimension audit checklist (collisions, overflows, hierarchy, asset layering, reference alignment). + +## Typical workflow + +1. **Context prep** — pick a narrative framework, ingest sources, select a design profile. +2. **Slide spec** — write a coordinate-explicit `layout_tree` JSON spec. +3. **Visual assets** — plan icons/images/infographics and bind them to slide objects. +4. 
**Tooling** — analyze a reference deck or inspect a generated package when needed. +5. **Quality gates** — run the audit checklist and repair issues before sign-off. + +## Notes + +- The plugin is **standalone**: it provides guidance, design context, and import-only analysis APIs — there is no install step and no bundled setup scripts. +- LLM access for source summarization is user-managed (bring your own OpenAI, Azure OpenAI, etc.). + +## Credits + +Upstream project: [kimtth/agent-pptify-kit](https://github.com/kimtth/agent-pptify-kit). Licensed under MIT. diff --git a/skills/pptify-context-prep/SKILL.md b/skills/pptify-context-prep/SKILL.md new file mode 100644 index 000000000..0c511c11d --- /dev/null +++ b/skills/pptify-context-prep/SKILL.md @@ -0,0 +1,116 @@ +--- +name: pptify-context-prep +description: "Prepare narrative framework, source material, and design context before authoring a pptify deck spec. Use when selecting a business/storytelling framework, converting documents, summarizing long sources, analyzing reference PPTX decks, or selecting and loading bundled design profiles." +--- + +# PPTify Context Prep + +Use this skill before writing a deck spec. It covers three parallel preparation tracks: **narrative framework** (the business story spine), **source context** (documents, research, reference PPTX), and **design context** (predefined style profiles in [`references/design-profiles.md`](references/design-profiles.md)). + +## Business Framework & Narrative + +The business framework is defined by the user, not by the assistant. If the user has already specified a framework, use it directly. If the user names a framework that does not match any entry in the table, treat it as a `custom` framework request, confirm the interpretation with the user, and proceed with the custom elicitation questions before planning. If no framework has been specified, present the available options and ask which one to use before planning the deck. 
Include `custom` when the user wants to provide their own structure, naming convention, or slide sequence. Do not auto-select a framework on the user's behalf. + +| Framework | Best for | +|---|---| +| `mckinsey` | Executive proposals, consulting deliverables, strategic recommendations | +| `scqa` | Problem-solving presentations, situation analysis, incident reports | +| `pyramid` | Complex arguments requiring strong logical structure | +| `mece` | Issue decomposition, audits, multi-workstream analysis | +| `action-title` | Executive communications where every slide must drive action | +| `assertion-evidence` | Technical or academic presentations, research findings | +| `exec-summary-first` | C-suite briefings, board decks, press releases | +| `custom` | User-defined structures, organization-specific playbooks, hybrid narrative patterns | + +Use the selected framework as the starting narrative spine, then adapt slide count and evidence density to the user's source material. + +| Framework | Default slide spine | +|---|---| +| `mckinsey` | Title → executive summary → situation → complication → key question → recommendation → 2-3 evidence slides → options → roadmap → appendix | +| `scqa` | Title → situation → complication → question → answer → evidence → implementation plan → summary | +| `pyramid` | Title → main answer → argument 1 → argument 2 → argument 3 → evidence → summary | +| `mece` | Title → issue tree → workstream slides → synthesis | +| `action-title` | Title → action summary → action-titled content slides → next steps | +| `assertion-evidence` | Title → overview assertion → assertion/evidence slides → conclusion | +| `exec-summary-first` | Title → full answer on slide 2 → supporting detail → appendix | +| `custom` | Ask for framework name, objective, slide sequence, title rules, layout preferences, and evidence expectations before planning. 
If the user provides only partial answers, apply Pyramid Principle defaults for any unspecified field, such as assertion-style title rules, and document each assumption in `summary.business_framework` | + +Record the resolved framework in `summary.business_framework`, including source, slide sequence, title rules, and approved assumptions. + +### Storytelling Principles + +- Use the selected framework's slide order as the deck spine. Apply the Pyramid Principle as title discipline and synthesis guidance: make each slide title state the slide's conclusion or assertion when the framework stage supports it, avoid vague labels, and treat SCQA `question` or McKinsey `key question` steps as narrative roles rather than mandatory question-form slide titles unless the user explicitly requests literal question titles. +- Make every key message answer "So what?" for the audience. +- Keep topics MECE: mutually exclusive and collectively exhaustive. +- Write specific slide titles, such as "Azure AI cuts development costs by 40%" or "3 implementation patterns enable rapid onboarding," instead of generic labels like "About Azure AI" or "Implementation Patterns Overview." +- Include concrete data, numbers, dates, owners, sources, or quantified directional signals in bullets when the source material supports them. +- Keep speaker notes useful: two to three sentences, never empty and never just a dash. +- Avoid generic statements; every bullet should be specific, defensible, and tied to the selected framework's role in the story. + +## Source Documents + +- For long source documents, ask the user to convert to markdown or paste key sections directly. +- If the source exceeds approximately 1,500 words or covers more than three distinct topics, ask the user to pre-summarize it using their preferred tool, such as an LLM API or summarization pipeline, and paste the result here before proceeding. Do not attempt to call external APIs directly. 
+- Record the corpus path, summary path, source count, and source URLs in `summary.source_enrichment` so enrichment evidence survives review. +- Use summaries to identify audience, thesis, slide sequence, evidence, risks, and decision points. +- Do not paste entire long documents into the deck spec; summarize into concise slide messages and cite sources in footers when needed. + +## Reference PPTX + +- Use the importable helpers in `skills/pptify-tooling/references`, or unzip the `.pptx` file and parse its XML contents directly, to inspect production complexity, slide text, style, brand, template, and layout-rhythm facts. +- If reference PPTX inspection fails or returns no usable data, notify the user, skip reference-derived context, and proceed using the selected design profile as the sole design source. Document this in `summary.design_context`. +- Use the extracted facts as agent context when the new deck should follow a source deck's language, slide count, topic sequence, executive tone, colors, fonts, template conventions, and layout rhythm. +- When authoring the new spec, translate `brands.primary_color`, `brands.accent_colors`, `brands.fonts`, `template.slide_size`, `template.layout_usage`, and `layout.slides[*].dominant_flow` into explicit `layout_tree` primitives, colors, typography, spacing, and coordinates. +- Use extraction helpers when the goal is reconstructing or preserving an existing production deck rather than authoring a new editable deck. +- For new editable decks, treat reference layout rhythm as prompt context; generated coordinates must be authored directly by the agent in `layout_tree`. +- Never copy or mutate a referenced PPTX as the generation strategy. Use analysis as context and build a new PPTX artifact. + +## Design Profile Selection + +Load [`references/design-profiles.md`](references/design-profiles.md) for the full profile catalog with IDs, `best_for` guidance, key style signals, and license information. 
If `references/design-profiles.md` cannot be loaded, notify the user that design profile selection is unavailable, fall back to `fluent-ui-design-tokens` defaults using only the inline descriptions in this prompt, and flag the limitation in `summary.design_context`. + +Use bundled design profiles; do not invent a new design template when the user asks for predefined templates. + +Apply profile rules in this priority order: (1) explicit user request for a named profile, (2) `likaku-mck-ppt-design-skill` if a consulting or strategy framework is selected, (3) `primer-primitives` if the deck is developer or GitHub-focused, (4) `corazzon-pptx-design-styles` only if the user explicitly requests multiple style options, and (5) `fluent-ui-design-tokens` for all remaining cases. + +- Use `fluent-ui-design-tokens` as the default for remaining new decks, including Microsoft, M365, Teams, Power Platform, enterprise-aligned, general modern, stylish, product, app, pitch, or unspecified visual style requests. +- Use `primer-primitives` for GitHub-style product, developer, or token-driven engineering decks. +- Use `corazzon-pptx-design-styles` when a broader modern style catalog or multiple visual direction options are explicitly useful. Pick one style from the catalog and lock its palette, typography, spacing, and signature element before layout planning. +- Use `likaku-mck-ppt-design-skill` for consulting, strategy, governance, or operations decks that need action-title discipline and structured native PPTX layouts. +- Use `awesome-copilot-design-agents` when the agent prompt itself needs design review, UX discovery, visual hierarchy, or accessibility framing. +- Keep source attribution and license metadata attached to the context used. +- If no catalog profile fits, use reference PPTX analysis, search for another public source, or ask the user for a source template. 
+- Record selected profile IDs, source URLs, and style lock details in `summary.design_context` before building the PPTX. + +Profile descriptions are in [`references/design-profiles.md`](references/design-profiles.md) — load that file for the full catalog. + +## Applying Context to Spec Authoring + +1. Put the selected profile payload into the agent context before writing `deck-spec.json`. + +### Layout Authoring Rules + +1. Translate source signals into explicit `layout_tree` objects, colors, fills, lines, typography, spacing, bboxes, and z-order. +2. Use source CSS or reference deck rhythm only as design evidence; final coordinates must be authored directly in inches. + +### Design Quality Gates + +1. Keep meaningful slide content as `classification: "content"` objects and decorative/background elements as `classification: "layout_design"` objects. +2. Add at least one style-derived visible design element to every slide except the title slide, section dividers, and appendix slides: accent band, rule, card shell, grid cell, diagram primitive, shape motif, image treatment, or pattern. A plain title-plus-bullets slide fails the design gate. +3. Do not treat design profiles as content source material; they are design context only. + +## Source-to-Deck Planning + +- Map the selected business framework to the deck outline before authoring visuals, and document the resolved framework in `summary.business_framework`. +- Convert source material into one message per slide before authoring visual structure. +- Treat charts and dashboard-style slides as source-evidence-driven exhibits; do not create generic metric or dashboard slides when the source corpus does not provide relevant data. +- Preserve important terminology, product names, metrics, dates, and user-provided wording. +- Reduce dense narrative into executive slide titles plus short sections. +- Track open assumptions in speaker notes or audit-facing summary fields instead of overcrowding slides. 
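Rule 2 under Design Quality Gates above (every non-exempt slide carries at least one style-derived design element) can be pre-screened mechanically. The following is a minimal sketch, not part of the plugin: the function name and the `exempt_ids` parameter are hypothetical, hidden slides are assumed to be the appendix slides, and `kind: "shape"` is used only for illustration. It detects only the presence of a `layout_design` object; whether that object is actually derived from the selected profile still needs a manual look.

```python
from typing import Iterable


def slides_missing_design_element(
    slides: Iterable[dict], exempt_ids: frozenset = frozenset()
) -> list:
    """Return IDs of slides whose layout_tree has no `layout_design` object."""
    missing = []
    for slide in slides:
        # Assumption: hidden slides are appendix slides; title and section
        # divider slides must be named explicitly in exempt_ids.
        if slide.get("hidden") or slide["id"] in exempt_ids:
            continue
        objects = slide["layout_tree"]["objects"].values()
        if not any(o.get("classification") == "layout_design" for o in objects):
            missing.append(slide["id"])
    return missing


# Illustrative three-slide deck (object payloads trimmed to the fields used).
deck = [
    {"id": "s01_title", "layout_tree": {"objects": {}}},
    {"id": "s02_plain", "layout_tree": {"objects": {
        "body": {"kind": "text", "classification": "content"}}}},
    {"id": "s03_cards", "layout_tree": {"objects": {
        "band": {"kind": "shape", "classification": "layout_design"},
        "body": {"kind": "text", "classification": "content"}}}},
]

print(slides_missing_design_element(deck, exempt_ids=frozenset({"s01_title"})))
```

Slides returned by the sketch are candidates for the design-gate failure; adding an accent band, rule, or card shell from the locked style would clear them.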
+ +## Restrictions + +- Do not copy external fonts, icon packs, photos, or binary assets unless their license and source are explicitly added. +- Do not claim the output is a Primer, Fluent UI, or Awesome Copilot artifact; these are context sources for a new `pptify` deck. +- Do not let source CSS override pptify quality gates: built decks still need zero content collisions and zero text overflows. +- Do not accept default PowerPoint theme colors, Calibri-only text boxes, plain white backgrounds, or placeholder-style bullet layouts as a finished design. diff --git a/skills/pptify-context-prep/references/design-profiles.md b/skills/pptify-context-prep/references/design-profiles.md new file mode 100644 index 000000000..dd0345690 --- /dev/null +++ b/skills/pptify-context-prep/references/design-profiles.md @@ -0,0 +1,167 @@ +# PPTify Design Profile Catalog + +**This is the bundled reference for design profiles.** Use it for all new decks. Design context is built-in and always available. + +## Quick-Select Guide + +| Profile ID | Best for | +|---|---| +| `fluent-ui-design-tokens` | Microsoft, M365, Teams, Power Platform, enterprise — **default for new decks** | +| `primer-primitives` | GitHub-style, developer products, token-driven UI reviews, engineering docs | +| `corazzon-pptx-design-styles` | 30 modern style catalog; use when visual variety or multiple direction options are needed | +| `likaku-mck-ppt-design-skill` | Consulting, strategy, governance, operations — strict action-title discipline | +| `sunbigfly-ppt-agent-skills` | Source-backed, stage-gated delivery pipelines with human approval at each phase | +| `awesome-copilot-design-agents` | Prompting agent for design review, UX discovery, visual hierarchy reasoning | +| `nexu-io-open-design` | Direction-picker workflows with explicit style-lock and artifact-lint gates | +| `alchaincyf-huashu-design` | Brand-constrained enterprise decks requiring exact color/type fidelity | +| `pptwork-oh-my-slides` | 
HTML-prototype-first workflows; raster fidelity + constrained PPTX editability as separate deliverables | +| `erickittelson-slidemason` | Cautionary reference: JSX primitive composition — auto-layout incompatibility | +| `gabberflast-academic-pptx-skill` | High-stakes governance, board, or investor presentations needing narrative rigour | + +## Profiles + +### `fluent-ui-design-tokens` +**Name:** Fluent UI Design Token Guidance +**Kind:** design-system-context +**License:** MIT — Copyright (c) Microsoft Corporation +**Source:** [microsoft/fluentui](https://github.com/microsoft/fluentui/blob/master/docs/architecture/design-tokens.md) +**Token categories:** color, spacing, border radius, font, line height, stroke, shadow, duration, easing +**Themes:** webLightTheme, webDarkTheme, teamsLightTheme, teamsDarkTheme, teamsHighContrastTheme +**Agent rule:** Use design tokens instead of hardcoded colors, spacing, or typography values. +**Best for:** Microsoft-aligned decks, Teams, M365, Power Platform governance, enterprise product reviews + +--- + +### `primer-primitives` +**Name:** Primer Primitives Design Tokens +**Kind:** design-system-context +**License:** MIT — Copyright (c) 2018 GitHub Inc. 
+**Source:** [primer/primitives](https://github.com/primer/primitives) +**Token categories:** color, spacing, typography, motion, z-index +**Spacing scale:** xxs, xs, sm, md, lg, xl +**Typography roles:** display, title, subtitle, body, caption, codeBlock, codeInline +**Color examples:** `#ffffff`, `#1f2328`, `#F6F8FA`, `#0969da`, `#1a7f37`, `#cf222e` +**Best for:** GitHub-style decks, developer products, token-driven UI reviews, engineering documentation + +--- + +### `corazzon-pptx-design-styles` +**Name:** corazzon/pptx-design-styles — 30 Modern PPTX Style Templates +**Kind:** pptx-style-template-context +**License:** MIT — Copyright TodayCode / corazzon contributors +**Source:** [corazzon/pptx-design-styles](https://github.com/corazzon/pptx-design-styles) +**30 styles:** Glassmorphism, Neo-Brutalism, Bento Grid, Dark Academia, Gradient Mesh, Claymorphism, Swiss International, Aurora Neon Glow, Retro Y2K, Nordic Minimalism, Typographic Bold, Duotone Color Split, Monochrome Minimal, Cyberpunk Outline, Editorial Magazine, Pastel Soft UI, Dark Neon Miami, Hand-crafted Organic, Isometric 3D Flat, Vaporwave, Art Deco Luxe, Brutalist Newspaper, Stained Glass Mosaic, Liquid Blob Morphing, Memphis Pop Pattern, Dark Forest Nature, Architectural Blueprint, Maximalist Collage, SciFi Holographic Data, Risograph Print +**Style families:** modern-ui, editorial, retro, technical, luxury, organic, experimental +**Source inputs per style:** hex colors, font pairings, layout rules, signature elements, avoid lists +**Agent rule:** Pick one style, lock its palette and typography, then translate visual effects into explicit pptify `layout_tree` primitives or documented raster accents. Do not mix styles accidentally. 
+**Best for:** Choosing a predefined modern style from a broad catalog; generating multiple visual direction options before deck production + +--- + +### `likaku-mck-ppt-design-skill` +**Name:** likaku/Mck-ppt-design-skill — McKinsey-Style Native PPTX Layout Runtime +**Kind:** pptx-pattern-context +**License:** MIT — Copyright likaku contributors +**Source:** [likaku/Mck-ppt-design-skill](https://github.com/likaku/Mck-ppt-design-skill) +**Pattern count:** ~70 consulting-style layout patterns +**Pattern families:** structure-navigation, data-metrics, frameworks-matrices, content-narrative +**Action title discipline:** required on every content slide +**Geometry norms (inches):** kicker_y=0.48, title_y=0.72, rule_y=1.12, content_top_y=1.30 +**Agent rule:** Use the source taxonomy as design inspiration only. Author exact `layout_tree` coordinates, sizes, and primitives. Action titles on every content slide. +**Best for:** Consulting decks for strategy, governance, or operations reviews; strict action-title discipline + +--- + +### `sunbigfly-ppt-agent-skills` +**Name:** sunbigfly/ppt-agent-skills — Staged Deck Generation Pipeline +**Kind:** agent-pipeline-context +**License:** MIT — Copyright sunbigfly contributors +**Source:** [sunbigfly/ppt-agent-skills](https://github.com/sunbigfly/ppt-agent-skills) +**Pipeline stages:** interview → source-compression → outline → style-lock → slide-plan → visual-qa → dual-export +**Stage outputs:** structured brief JSON, compressed source ≤800 words, outline JSON, style_lock JSON, complete spec.json, qa-report, pptx + raster +**Agent rule:** Never skip the interview. Source compression before outline. Style lock is stage-gated. Per-slide plans are full specs. Action titles mandatory. 
+**Best for:** Source-grounded decks; high-stakes presentations; workflows requiring explicit human approval at each phase + +--- + +### `awesome-copilot-design-agents` +**Name:** Awesome Copilot Design Agent and Prompt Context +**Kind:** agent-prompt-context +**License:** MIT — Copyright GitHub, Inc. +**Source:** [github/awesome-copilot](https://github.com/github/awesome-copilot) +**Key files:** `agents/gem-designer.agent.md`, `agents/se-ux-ui-designer.agent.md`, `skills/penpot-uiux-design/SKILL.md`, `skills/prompt-optimizer/SKILL.md` +**Prompt focus:** existing design systems, visual hierarchy, UX discovery, accessibility, slides and reports design intentionality +**Best for:** Prompting an LLM to reason about deck design; UX discovery before deck planning; design review checklists; visual hierarchy guidance + +--- + +### `nexu-io-open-design` +**Name:** nexu-io/open-design — Claude Design Style +**Kind:** agent-skill-context +**License:** MIT — Copyright nexu-io contributors +**Source:** [nexu-io/open-design](https://github.com/nexu-io/open-design) +**Key patterns:** direction-picker, sandbox-preview, artifact-lint, design-critique +**Stage gates:** direction selection → style lock → preview approval → artifact lint → critique gate +**Agent rule:** Never start layout without a locked direction. Run artifact lint after every build. Preview before full deck. 
+**Best for:** Reasoning about deck design direction before committing; parallel design options for selection; lint and critique gates on generated decks + +--- + +### `alchaincyf-huashu-design` +**Name:** alchaincyf/huashu-design — HTML-Native Brand Design Pipeline +**Kind:** agent-skill-context +**License:** MIT — Copyright alchaincyf contributors +**Source:** [alchaincyf/huashu-design](https://github.com/alchaincyf/huashu-design) +**Key patterns:** brand-asset-protocol, visual-directions, html-to-editable-pptx, playwright-check +**Brand lock fields:** primary_palette, neutral_palette, typeface_display, typeface_body, tone +**Agent rule:** Brand lock is non-negotiable. Parallel directions before deck plan. Every text frame must be individually editable. +**Best for:** Brand-constrained enterprise decks requiring exact color/type fidelity; multi-direction style exploration before committing + +--- + +### `pptwork-oh-my-slides` +**Name:** PPTWork/oh-my-slides — HTML-as-Source PPTX Build Artifact Pipeline +**Kind:** pptx-export-context +**License:** MIT — Copyright PPTWork contributors +**Source:** [PPTWork/oh-my-slides](https://github.com/PPTWork/oh-my-slides) +**Key patterns:** html-source, preset-picker, mini-preview, raster-export, constrained-editable +**Export model:** HTML (design source) → raster export (fidelity) + constrained PPTX (editability) +**Forbidden in editable PPTX:** background images, raster embeds of slide content, CSS transform rotate, SVG filter effects +**Agent rule:** Never promise both pixel fidelity and full editability from the same export path. Raster embeds in editable PPTX are a quality failure. 
+**Best for:** HTML-prototype-first workflows; design fidelity and PowerPoint editability as separate deliverables; Playwright-in-the-loop generation + +--- + +### `erickittelson-slidemason` +**Name:** erickittelson/slidemason — JSX Primitive Composition (Cautionary Reference) +**Kind:** agent-skill-context +**License:** MIT — Copyright erickittelson contributors +**Source:** [erickittelson/slidemason](https://github.com/erickittelson/slidemason) +**Key patterns:** jsx-primitives, jsx-bento, bespoke-slide, primitive-composition +**Primitive map:** Card→`_shape(round_rect)`, Text→`_text()`, Line→`_line()`, Image→`_image()`, Oval→`_shape(oval)` +**Editability failure modes:** nested flex containers, auto-sized text frames, SVG filter effects, rotated text boxes, image fills on shapes +**Agent rule:** Auto-layout is the enemy of editability. All coordinates are in inches. Bespoke layout is a last resort. +**Best for:** Understanding limits of programmatic slide composition; cautionary reference for auto-layout / PPTX editability incompatibility + +--- + +### `gabberflast-academic-pptx-skill` +**Name:** Gabberflast/academic-pptx-skill — Narrative Discipline Gates +**Kind:** agent-skill-context +**License:** MIT — Copyright Gabberflast contributors +**Source:** [Gabberflast/academic-pptx-skill](https://github.com/Gabberflast/academic-pptx-skill) +**Key patterns:** action-title, ghost-deck-test, one-exhibit-discipline, evidence-slide, citation-slide +**Narrative gates:** action title on every content slide; ghost deck test passes; one exhibit per slide; last slide names a specific next action; every quantitative claim has a source +**Agent rule:** Run ghost deck test before building slides. Rewrite descriptive titles as action titles. One exhibit per slide is a hard rule. The closing slide must name a decision, deadline, and owner. 
+**Best for:** High-stakes governance or board presentations requiring narrative rigour; decks reviewed by investors, regulators, or boards + +--- + +## Using This Catalog + +Use the entries above to: + +1. Select the profile ID that best matches the user's audience, topic, and delivery context. +2. Lock the palette, typography, and signature element conventions described in the profile's `source_signals`. +3. Record the selected profile ID, source URL, and license in `summary.design_context` before building the deck spec. +4. Translate the style signals directly into explicit `layout_tree` primitives — colors, fills, rules, card shells, accent bands, and bboxes. diff --git a/skills/pptify-quality-gates/SKILL.md b/skills/pptify-quality-gates/SKILL.md new file mode 100644 index 000000000..0b4f88912 --- /dev/null +++ b/skills/pptify-quality-gates/SKILL.md @@ -0,0 +1,63 @@ +--- +name: pptify-quality-gates +description: "Validate and repair pptify PPTX artifacts. Use when checking deck specs, PPTX packages, audits, coordinate-explicit layout trees, collisions, text overflows, warnings, visual hierarchy, asset layering, or reference deck alignment." +--- + +# PPTify Quality Gates + +> **Prerequisite:** Apply the manual audit by loading [`references/audit-checklist.md`](references/audit-checklist.md); it covers all 11 audit dimensions. Use the import-only extraction APIs in `pptify-tooling` when reference-deck or PPTX package inspection is needed. + +Use this skill before considering a generated PPTX complete. + +## Workflow + +1. Confirm required artifacts exist or collect missing paths before validating. +2. Confirm the python-pptx build script applies the spec→PPTX build contract (word-wrap on, autofit off, zeroed text insets, explicit anchors) so the rendered deck matches the spec. +3. Load `references/audit-checklist.md` and apply the manual checks. +4. Repair the spec or generation script, rebuild the PPTX, and rerun the audit. +5. 
Stop only when collisions, overflows, off-slide objects, small fonts, package checks, and design-context checks are clean or clearly reported. + +## Required Artifacts + +- If required artifact paths or names are missing, collect them with the VS Code prompt input dialog (`vscode_askQuestions` or equivalent) before building, validating, or repairing. +- Keep the generated spec, PPTX, and audit together: `deck-spec.json`, `deck.pptx`, and `deck-audit.json`. +- Keep the agent-authored JSON spec or generation script on disk so it can be reviewed, repaired, and rebuilt. +- Save analysis or extraction manifests when reference PPTX context was used. +- Save selected design profile IDs, source URLs, license IDs, and style lock details in `summary.design_context` for every newly generated deck unless a user-provided brand guide or reference PPTX is the primary style source. + +## Audit Checks + +- A production-ready generated deck should have zero content collisions. +- A production-ready generated deck should have zero text overflows. +- A production-ready generated deck should have zero `classification: "content"` objects outside the slide bounds or inside the content-safe margin (only `layout_design` full-bleed bands may cross an edge). +- A production-ready generated deck should keep every child object inside its parent group `bbox`, and keep on-shape text within the shape minus its inner padding. +- Tables must fit: column widths sum to the table width, no cell text overflows, and long tables are split across slides rather than shrunk below the font floor. +- A production-ready generated deck should have zero `classification: "content"` objects with `style.font_size` below 9 pt. Apply the font-size check in `references/audit-checklist.md`. +- For CJK/full-width text, estimate capacity at half the Latin characters-per-line value so dense non-Latin copy is not falsely marked as fitting. 
+- Review audit `warnings` for each slide even when collisions and overflows are zero. +- Check that slide count, language, tone, and major topic sequence match the user request or reference context. +- Check that the selected design context profile matches the user request and that source-backed context was translated into explicit primitives, colors, spacing, typography, and bboxes. +- Fail generated decks that have no `summary.design_context`, plain white backgrounds throughout, Calibri-only text, default theme colors, or placeholder-like title-plus-bullet layouts unless the user explicitly requested that style. +- Confirm every normal content slide contains at least one style-derived visual element such as an accent band, card shell, grid, divider, shape motif, image treatment, or pattern. +- When a deck includes hidden appendix slides, inspect `ppt/presentation.xml` for `p:sldId show="0"` and confirm the hidden slides are last unless the user asked otherwise. +- When a generated infographic has both raster and SVG assets, verify the visible slide uses the raster for text fidelity and the SVG appears only in the hidden appendix slide. +- For important deliverables, open the generated PPTX with `python-pptx` or inspect the zip package to confirm slide count, relationships, media, and hidden-slide metadata. + +## Repair Loop + +- If content collides, move or resize objects, reduce content density, split slides, or change the coordinate plan. +- If text overflows, shorten bullets, split sections, enlarge target bboxes, or split slides. **Lower explicit `font_size` only as a last resort, and never below 9 pt for content objects.** +- If visual hierarchy is weak, edit explicit colors, type scale, dividers, metric cards, callouts, or whitespace in the layout tree. 
+- If the deck looks like default `python-pptx`, load a design profile from bundled references, add `summary.design_context`, choose a style lock, and rebuild with explicit background/accent/card/rule primitives. +- If an asset covers text, lower its `z_index`, move it to `layout_design`, resize it, or change its bbox. +- If coordinates are cramped or inconsistent, repair the agent-authored bboxes directly; current plugin scripts will not run a browser or auto-layout pass. +- Rebuild after each spec repair and inspect the new audit or package checks. + +## Verification Commands + +- Apply the manual checklist and package inspection to validate generated decks. +- Audit a layout-tree spec with `references/audit-checklist.md`, then run the full test suite: + +```powershell +uv run python -m unittest discover -s tests -v +``` diff --git a/skills/pptify-quality-gates/references/audit-checklist.md b/skills/pptify-quality-gates/references/audit-checklist.md new file mode 100644 index 000000000..b5c7eb2b1 --- /dev/null +++ b/skills/pptify-quality-gates/references/audit-checklist.md @@ -0,0 +1,103 @@ +# PPTify Manual Audit Checklist + +Apply every check manually to `deck-spec.json` before considering a deck production-ready. + +## 1. Content Collisions + +For every slide, inspect all `layout_tree` objects. Two `classification: "content"` objects collide when their bounding boxes overlap: + +``` +A.x < B.x + B.w AND B.x < A.x + A.w +A.y < B.y + B.h AND B.y < A.y + A.h +``` + +- **Pass:** zero overlapping content objects per slide. +- **Fail:** any overlap → move objects, resize bboxes, reduce content density, or split the slide. + +## 2. Text Overflows + +For each text object estimate whether its text fits within its bbox. 
+ +Rough capacity (Latin): +- Characters per line ≈ `(bbox.w × 72) / (font_size × 0.5)` +- Lines available ≈ `(bbox.h × 72) / (font_size × 1.2)` + _(bbox in inches, font_size in pt; 0.5 is an assumed average Latin glyph advance in em, 1.2 the line-height factor)_ + +Adjustments: +- **CJK / full-width text:** halve the characters-per-line value (full-width glyphs ≈ 2× Latin advance). The extractor reports `non_ascii_text` — use it to flag CJK-heavy slides. +- **Text on a shape/card:** subtract ≈0.1 in of inner padding from each side of the shape before computing capacity; the text occupies the inset inner area, not the full shape. + +- **Pass:** estimated text volume ≤ available capacity. +- **Fail:** likely overflow → shorten bullets, enlarge bbox, or split slide. + **Never set `font_size` below 9 pt for `classification: "content"` objects.** + +## 3. Font Size Minimums + +Scan every object with `classification: "content"`. Check `style.font_size`. + +- **Pass:** all content objects ≥ 9 pt. +- **Fail:** any content object < 9 pt → increase font size and split content if needed. + +## 4. Design Context Presence + +Inspect `summary.design_context` in the spec root. + +- **Pass:** field present and contains `profile_id`, source URL, and license ID. +- **Fail — any of the following:** + - `summary.design_context` absent → load a design profile from [`references/design-profiles.md`](../../pptify-context-prep/references/design-profiles.md) in `pptify-context-prep` and rebuild. + - Plain white backgrounds throughout with no accent elements. + - Calibri-only text with default theme colors across all slides. + - All slides are title-plus-bullets only (no cards, shapes, rules, or image treatments). + +## 5. Visual Design Per Slide + +For each normal content slide (exclude section headers and hidden appendix slides): + +- **Pass:** at least one style-derived visual element present — accent band, card shell, grid cell, rule/divider, shape motif, image treatment, or background pattern. 
+- **Fail:** slide is plain white with only text objects → add a design element derived from the selected profile's `source_signals`. + +## 6. Narrative and Count + +- Slide count is within ±2 of the user's requested count. +- Topic sequence matches the requested business framework (McKinsey, SCQA, pyramid, etc.) or the user's stated structure. +- If `likaku-mck-ppt-design-skill` or `gabberflast-academic-pptx-skill` context was used: every content slide has an **action title** (not a descriptive label). Run the ghost-deck test: read only slide titles — they must tell the full story on their own. + +## 7. Hidden Slides + +If the deck contains hidden slides (`hidden: true`): + +- **Pass:** hidden slides are last in the `slides` array unless the user specified otherwise. +- In the rendered PPTX, confirm `ppt/presentation.xml` contains `p:sldId show="0"` on the correct entries. + +## 8. Asset Layering + +For slides mixing image/SVG objects with text: + +- **Pass:** image/SVG `z_index` is lower than all overlapping text objects. +- **Fail:** image covers text → lower `z_index`, adjust bbox, or reclassify as `classification: "layout_design"`. +- When a generated infographic exists as both raster and SVG: the raster must be on the **visible** slide; the SVG must be in a **hidden appendix** slide only. +- **Image aspect ratio:** the object `bbox` aspect should match the image's native aspect (fit or crop-to-fill); a mismatched bbox stretches the image. Keep captions in adjacent space, not overlaid on the image. + +## 9. Slide Bounds & Safe Margins + +For every slide, check each object against the slide rectangle (0,0)–(`slide_size.width`, `slide_size.height`): + +- **Pass:** every `classification: "content"` object lies fully inside the slide and inside the content-safe margin (default 0.5 in per edge). +- **Fail:** an object extends off-slide or into the margin → move or resize it inside. Only `classification: "layout_design"` full-bleed bands may touch or cross an edge. 
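Check 9 above reduces to a rectangle comparison per `content` object. The sketch below is one way to script it under stated assumptions: the function name is hypothetical, bboxes use the `x`/`y`/`width`/`height` keys from the spec skeleton, and the default margin of 0.5 in comes from the checklist. `layout_design` objects are skipped because full-bleed bands are allowed to touch or cross an edge.

```python
def content_bounds_violations(layout_tree: dict, margin: float = 0.5) -> list:
    """Return IDs of content objects outside the slide or inside the safe margin."""
    size = layout_tree["slide_size"]
    eps = 1e-9  # tolerance for floating-point rounding in authored coordinates
    bad = []
    for obj_id, obj in layout_tree["objects"].items():
        if obj.get("classification") != "content":
            continue  # only content objects are bound by the safe margin
        b = obj["bbox"]
        inside = (
            b["x"] >= margin - eps
            and b["y"] >= margin - eps
            and b["x"] + b["width"] <= size["width"] - margin + eps
            and b["y"] + b["height"] <= size["height"] - margin + eps
        )
        if not inside:
            bad.append(obj_id)
    return bad


# Illustrative slide: the footer's bottom edge (7.4 in) enters the bottom margin.
example_tree = {
    "slide_size": {"width": 13.333, "height": 7.5},
    "objects": {
        "band": {"classification": "layout_design",
                 "bbox": {"x": 0, "y": 0, "width": 13.333, "height": 0.3}},
        "title": {"classification": "content",
                  "bbox": {"x": 0.75, "y": 0.55, "width": 8.5, "height": 0.65}},
        "footer": {"classification": "content",
                   "bbox": {"x": 0.75, "y": 7.1, "width": 4.0, "height": 0.3}},
    },
}

print(content_bounds_violations(example_tree))
```

An empty list corresponds to a pass for this check; any returned ID should be moved or resized inside the margin before re-auditing.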
+ +## 10. Containment + +- **Pass:** every child object and child group fits inside its parent group `bbox`; on-shape text fits inside the shape minus ≈0.1 in inner padding. +- **Fail:** a child spills out of its group, or card text spills past the card padding → resize the child or the parent, or split content. + +## 11. Table Fit + +For every `kind: "table"` object: + +- **Pass:** column widths sum to the table `bbox.width`, each cell's wrapped text fits its row height at the cell font size, and row count is within the per-slide budget (≈8–10 body rows at 10–11 pt). +- **Fail:** columns overflow the table width, cells clip, or the table is too tall → rebalance columns, raise row height, or split the table across slides (repeat the header). + +## Completion Criterion + +All 11 checks pass before delivery. +Any failure triggers the repair loop in `pptify-quality-gates`: fix the spec, rebuild, and re-audit. diff --git a/skills/pptify-slide-spec/SKILL.md b/skills/pptify-slide-spec/SKILL.md new file mode 100644 index 000000000..8debd3c7d --- /dev/null +++ b/skills/pptify-slide-spec/SKILL.md @@ -0,0 +1,161 @@ +--- +name: pptify-slide-spec +description: "Author or repair coordinate-explicit pptify JSON deck specs. Use when writing layout_tree groups, objects, bboxes, tables, images, lines, shapes, type scale, or collision-safe content." +--- + +# PPTify Slide Spec + +Use this skill when writing or repairing a coordinate-explicit JSON deck spec. + +Author final coordinates directly in `layout_tree`; current plugin scripts will not choose layouts, measure browser boxes, or shrink text to fit. Split dense material across slides rather than relying on tiny fonts. + +## Workflow + +1. Define slide messages, design context, and slide size before writing objects. +2. Create each slide with `id`, `title`, and a complete `layout_tree`. +3. Place groups and objects with final inch-based bboxes, z-order, and style values. +4. 
Add at least one style-derived `layout_design` element on every normal content slide. +5. Audit collisions, text density, font sizes, and default-theme failures before shipping. + +## Deck Shape + +- Return a JSON object with a top-level `slides` array for generated decks. +- Keep slide IDs stable and readable, such as `s01_overview`. +- Use top-level `summary` for deck metadata that belongs in the audit but not on slides. +- Record selected design profile IDs, source URLs, and license IDs in `summary.design_context` when using design references. +- For newly generated decks, `summary.design_context` is required unless a user-provided brand guide or reference PPTX is documented as the primary style source. +- Use `render_mode: "layout"` or omit it for generated decks; OOXML mode is for extracted specs with `ooxml_elements`. +- Every generated slide must include `layout_tree`; do not rely on shorthand layout specs. + +## Slide Fields + +- Each generated slide must include `id`, `title`, and `layout_tree`. +- Use `hidden: true` only for appendix/reference slides that should remain in the PPTX package but not appear during normal presentation. +- Do not use `pattern`, `layout_pattern`, `composition.pattern`, `layout`, `sections`, `bullets`, `objects`, or `theme` as render-time shorthand. +- Do not overfill a slide: prefer three to five major content groups. +- Decide all positions, sizes, z-order, colors, font sizes, and object relationships in the JSON before rendering. +- Do not ship default `python-pptx`-looking slides: plain white background, Calibri-only text, default theme colors, and bullet-only layouts are design failures unless explicitly requested. + +## Layout Tree + +- Include `slide_size` with explicit `width` and `height` in inches. +- Include `root_group_id`. +- Include `groups`, keyed by group ID. +- Include `objects`, keyed by object ID. +- Add `notes` only when notes are useful for audit or speaker context. 
+ +Example skeleton: + +```json +{ + "id": "s01_overview", + "title": "Overview", + "layout_tree": { + "id": "s01_overview", + "title": "Overview", + "slide_size": { "width": 13.333, "height": 7.5 }, + "root_group_id": "root", + "groups": { + "root": { + "id": "root", + "role": "slide", + "layout_mode": "absolute", + "object_ids": ["title"], + "group_ids": [], + "bbox": { "x": 0, "y": 0, "width": 13.333, "height": 7.5 } + } + }, + "objects": { + "title": { + "id": "title", + "kind": "text", + "role": "title", + "classification": "content", + "content": { "text": "Overview" }, + "style": { "font_size": 30, "bold": true, "color": "#111827" }, + "bbox": { "x": 0.75, "y": 0.55, "width": 8.5, "height": 0.65 }, + "z_index": 2 + } + }, + "notes": [] + } +} +``` + +## Layout Grid & Safe Margins + +- Reserve a content-safe margin on every edge; default 0.5 in for 13.333×7.5 in slides. Only `layout_design` full-bleed bands may touch or cross an edge. +- Author on a consistent column grid (for example 12 columns with a 0.2–0.25 in gutter). Snap card and panel left/right edges to column lines so multi-panel layouts align. +- Keep a vertical rhythm: a title band, then a content band that starts below the title rule (for example content top y ≈ 1.3 in). Align sibling cards to a shared top y and a shared height. +- No `content` object may extend past the slide bounds (0,0)–(width,height) or into the safe margin. + +## Groups + +- Each group must include `id`, `role`, `layout_mode`, `object_ids`, `group_ids`, and `bbox`. +- Use `layout_mode: "absolute"` for generated slides to make the coordinate contract explicit. +- Keep group IDs unique and stable so audit repairs can target them. +- Keep every child object and child group inside its parent group `bbox`; siblings at the same level must not overlap unless one is `layout_design` behind the other. +- Use groups for semantic organization and audit readability; coordinates are still final object coordinates. 
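The containment rule for groups can be checked mechanically. The sketch below is illustrative only: `find_containment_violations` and `_inside` are hypothetical names, and it assumes every `bbox` (group or object) is expressed in absolute slide inches, which is how this skill describes final coordinates. It does not cover the sibling-overlap rule.

```python
def _inside(child, parent, eps=1e-6):
    """True when the child bbox lies fully within the parent bbox."""
    return (
        child["x"] >= parent["x"] - eps
        and child["y"] >= parent["y"] - eps
        and child["x"] + child["width"] <= parent["x"] + parent["width"] + eps
        and child["y"] + child["height"] <= parent["y"] + parent["height"] + eps
    )


def find_containment_violations(layout_tree):
    """Return (parent_group_id, child_id) pairs where the child spills out."""
    groups = layout_tree["groups"]
    objects = layout_tree["objects"]
    violations = []
    for group_id, group in groups.items():
        # Collect both child objects and child groups of this group.
        children = [(cid, objects[cid]) for cid in group["object_ids"]]
        children += [(cid, groups[cid]) for cid in group["group_ids"]]
        for child_id, child in children:
            if not _inside(child["bbox"], group["bbox"]):
                violations.append((group_id, child_id))
    return violations
```

Each returned pair names the parent and the spilling child, so a repair can resize either one.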
+ +## Objects + +- Every object must include `id`, `kind`, `role`, `classification`, `content`, `style`, `bbox`, and `z_index`. +- Supported `kind` values: `text`, `shape`, `image`, `line`, `table`. +- Supported shape names (`content.shape`): `rect`, `round_rect`, `oval`, `triangle`, `diamond`, `hexagon`, `parallelogram`, `chevron`, `pentagon`, `trapezoid`, and arrow variants. +- Use `classification: "layout_design"` for decorative or background objects. +- Use `classification: "content"` for meaningful text, tables, lines, and media. +- Shape content must include `content.shape`; text on a shape uses `content.text`. +- Image content uses `content.path`, `content.blob_base64`, and `content.alt`. +- Table content uses `content.rows` as a list of row arrays. Budget column widths to sum to the table `bbox.width`, size row height for the wrapped cell text, cap rows per slide (≈8–10 body rows at 10–11 pt on a 7.5 in slide), and split long tables across slides — repeating the header row — instead of shrinking text. +- Line content must include `content.x1`, `content.y1`, `content.x2`, and `content.y2`. +- Connectors: anchor each diagram line or arrow to the edge midpoint of its source and target shapes (not the shape center), leave a small gap from the node border, and route around — never through — other nodes. +- Do not use `chart` objects; render charts as explicit primitives. Minimal recipe: a plot-area rectangle, an axis baseline (`line`), evenly spaced gridlines, one bar or point per category with equal gutters, and value plus category labels placed outside the plotted marks so nothing overlaps. Or embed a pre-rendered chart image via `content.path`. + +## Styling + +- Every text-bearing object and table must include `style.font_size` and `style.color`. +- Every line object must include `style.line` and `style.line_width`. +- Every shape object must include `content.shape`, `style.fill`, and `style.line`. 
+- Specify text color with `style.color`; do not rely on a later tool to infer contrast or default text color. +- Use a consistent `z_index` stack: background fill or band (0) < card shell or panel (1) < divider or rule (2) < image or diagram (3) < body text (4) < label or badge (5) < callout or number (6). Decorative overlaps are allowed only when the lower object is `classification: "layout_design"`. +- When text sits on a shape or card, inset the text bbox by ≈0.1 in on each side from the shape bbox and size the text to that inner area, so on-card text never overflows the card. +- Every normal content slide must include at least one `layout_design` object or style-derived visual structure such as an accent band, card shell, grid, divider rule, signature shape, or image treatment. +- If a vector-traced SVG is provided only for editability, keep the readable raster image in the visible slide and put the SVG on a separate hidden final slide. + +### Type Scale + +| Role | Recommended (pt) | Minimum (pt) | +|---|---|---| +| Slide title | 24–32 | 20 | +| Section heading / H2 | 16–20 | 14 | +| Claim / callout | 13–15 | 12 | +| Body / narrative | 11–12 | 10 | +| Evidence / bullet | 10–11 | 10 | +| Label / caption | 9–10 | 9 | +| Footer / meta (Courier) | 8–9 | 8 | + +Decorative text (`classification: "layout_design"`) such as monogram numerals, rule labels, or background watermarks is exempt from the minimum floor. Footer or meta text rendered below 9 pt must use `classification: "layout_design"` to claim this exemption; any `classification: "content"` text must stay at 9 pt or above so the audit content font floor passes. + +## Build Contract (spec → PPTX) + +No general renderer is bundled. You author the JSON spec **and** a small `python-pptx` build script that maps the spec to a `.pptx`. 
To keep the rendered deck matching the audited coordinates: + +- Start each slide from a blank layout (`slide_layouts[6]`) so no inherited placeholders, theme text, or bullet styles leak in. +- Place every object from its `bbox` with `Inches(...)` geometry; never rely on placeholder auto-position. +- For every text frame set `word_wrap = True` and `auto_size = MSO_AUTO_SIZE.NONE` so PowerPoint never resizes the text or the shape after you measured it. +- Zero or shrink the default text insets (`margin_left/right/top/bottom`); python-pptx defaults (0.1 in / 0.05 in) silently shrink usable width. If you keep them, subtract them from the capacity estimate. +- Set vertical anchor (`MSO_ANCHOR`) and horizontal alignment (`PP_ALIGN`) explicitly. +- Map `style.font_size`→`Pt`, colors→`RGBColor`, `style.line_width`→`Pt`/`Emu`, dash→`MSO_LINE_DASH_STYLE`. +- Disable shape autofit/auto-grow; the shape size is the `bbox`, not the text. +- Preserve image aspect ratio (see pptify-visual-assets); do not stretch to a mismatched bbox. +- Mark hidden slides with `show="0"` and keep them last. + +A spec that passes the JSON audit can still overflow on screen if these are skipped, because real overflow depends on python-pptx text-frame behavior, not just the coordinate math. + +## Repair Rules + +- If content collides, edit bboxes, z-order, grouping, slide density, or split the slide. +- If text overflows, shorten copy, enlarge the bbox, or split content across slides. **Lower `font_size` only as a last resort, and never below the type scale minimum.** +- For CJK or other full-width text, halve the Latin character-capacity estimate (full-width glyphs are ≈2× the advance width of Latin characters) so dense Japanese/Chinese/Korean copy does not silently overflow. +- If an object sits outside the slide bounds or inside the safe margin, move or resize it back inside; only `layout_design` full-bleed bands may cross an edge. 
+- If an object is misplaced, repair the final coordinates directly; do not add layout hints expecting a later tool to resolve them. diff --git a/skills/pptify-tooling/SKILL.md b/skills/pptify-tooling/SKILL.md new file mode 100644 index 000000000..5d45ebfe1 --- /dev/null +++ b/skills/pptify-tooling/SKILL.md @@ -0,0 +1,107 @@ +--- +name: pptify-tooling +description: "Core PPTX tooling: extraction, style analysis, deck diagnostics, and integration contracts without heavy runtime scripts." +--- + +# PPTify Tooling + +Use this skill when you need practical tooling support for PPTX workflows while keeping the repository lightweight. + +## Allowed Directories + +- `references/` for static documentation and bundled dependency modules + +Do not add other directories under this skill. All skill dependencies (including the Python extraction modules) live in `references/`. + +## Core Tooling Capabilities + +This skill intentionally avoids heavy setup/download scripts, but it still provides core tooling coverage: + +1. **Deck prompt context extraction** +2. **Full deck extraction to PPTify JSON** +3. **Batch extraction across folders** +4. **Deck-level diagnostics and complexity summaries** +5. **Style-master and brand/theme analysis** +6. 
**Integration contracts for external summarization/image pipelines** + +## Extraction APIs (Import-Only) + +Bundled in `references/`: + +- **pptx_extractor.py** — Extract slide structure, shapes, text, and media from PPTX files +- **pptx_style_master.py** — Extract design, theme, colors, typography from reference decks + +### Available Methods + +From `PptxExtractor`: + +- `prompt_context(path, max_chars=16000)` + - Returns compact deck context for LLM prompting (slides, styles, brand, template, layout) +- `extract_file(path, output_dir=None, extract_media=True)` + - Returns full deck extraction with `layout_tree`, summary, and OOXML render elements +- `extract_path(path, output_dir, extract_media=True)` + - Batch extracts `.pptx` files in a folder and writes manifest/json outputs +- `analyze_path(path)` + - Returns summary-only diagnostics for one deck or many decks + +From `pptx_style_master.py`: + +- `PptxStyleMaster().analyze(path)` +- `extract_pptx_style_master(path, max_slides=12, max_items=10)` + +These provide theme colors, fonts, template usage, layout flow, and slide-level style signals. + +Load with Python's `importlib.util.spec_from_file_location()`: + +```python +import importlib.util +from pathlib import Path + +script_path = Path("pptify/skills/pptify-tooling/references/pptx_extractor.py") +spec = importlib.util.spec_from_file_location("pptx_extractor", script_path) +extractor = importlib.util.module_from_spec(spec) +spec.loader.exec_module(extractor) + +# Use: extractor.PptxExtractor().extract_file(pptx_path) +``` + +If the file at the expected path does not exist, raise a FileNotFoundError with the message: 'Required module {module_name} not found at {path}. Ensure references/ directory is populated.' Do not attempt to download or regenerate the missing file. + +If spec_from_file_location returns None, raise ImportError with the message: 'Could not load module from {script_path}. Verify the file exists and is a valid .py file.' 
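The two failure rules above can be folded into one loader helper. This is a hedged sketch: `load_reference_module` is a hypothetical name, not a function shipped in `references/`, and it only wraps the standard `importlib.util` calls already used in the snippet above, with the error messages taken from the text.

```python
import importlib.util
from pathlib import Path


def load_reference_module(module_name, script_path):
    """Load a bundled module by file path, failing loudly if it is missing."""
    path = Path(script_path)
    if not path.is_file():
        # Never download or regenerate the file; just report the gap.
        raise FileNotFoundError(
            f"Required module {module_name} not found at {path}. "
            "Ensure references/ directory is populated."
        )
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(
            f"Could not load module from {path}. "
            "Verify the file exists and is a valid .py file."
        )
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

A call such as `load_reference_module("pptx_extractor", script_path)` would then return the module object, from which `PptxExtractor` can be instantiated as in the snippet above.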
+ +## Core Workflows + +1. **Reference deck alignment** + - Run `prompt_context` on a source deck. + - Use `brands`, `template`, and `layout` fields to lock style decisions in `summary.design_context`. + +2. **Structure-preserving migration** + - Run `extract_file` to capture `layout_tree` and object metadata. + - Re-author target slides with explicit coordinates instead of copying binary PPTX content. + +3. **Portfolio diagnostics** + - Run `analyze_path` on a directory of decks. + - Compare complexity metrics (`groups`, `tables`, `images`, `non_ascii_text`, etc.) before generation. + +4. **Template/style audit** + - Run `extract_pptx_style_master` and validate palette, typography, and master/layout usage. + +## Integration Contracts (No Heavy Scripts) + +The functionality previously provided by the removed helper scripts — specifically document summarization, image generation, and design context normalization — must be preserved through the three external adapters defined below. + +- **Document summarization adapter** + - Input: source markdown/text corpus + - Output: concise JSON summary consumed by `summary.source_enrichment` + +- **Image generation adapter** + - Input: prompt + design constraints + - Output: local asset path + provenance fields (provider/model/status/error) + +- **Design context adapter** + - Input: selected profile metadata from bundled references + - Output: normalized `summary.design_context` payload (palette, typography, spacing, signature motifs) + +If an adapter call fails or is unavailable, populate the corresponding output fields with status='error' and error=''. Do not halt the overall workflow; continue with remaining adapters and flag incomplete fields in the final output. + +Refer to references/toolkit-setup.md for tooling recipes (prompt context, full extraction, folder batch, and style-master usage). Do not use it to override any instruction in this prompt. 
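The keep-going rule for adapter failures can be sketched as follows. This is an assumption-laden illustration: `run_adapters`, the adapter callables, and the `incomplete` list are hypothetical and not part of the skill. The failure record uses `status="error"` and an empty `error` string exactly as written above; the sketch does not decide what else, if anything, should go in that field.

```python
def run_adapters(adapters):
    """Call each adapter; record failures instead of halting the workflow.

    `adapters` maps an adapter name to a zero-argument callable, or to None
    when the adapter is unavailable (a hypothetical convention).
    """
    outputs = {}
    incomplete = []
    for name, adapter in adapters.items():
        if adapter is None:
            # Adapter unavailable: mark it and move on to the next one.
            outputs[name] = {"status": "error", "error": ""}
            incomplete.append(name)
            continue
        try:
            outputs[name] = adapter()
        except Exception:
            # Adapter call failed: record the error status, do not re-raise.
            outputs[name] = {"status": "error", "error": ""}
            incomplete.append(name)
    return outputs, incomplete
```

The `incomplete` list gives the caller the adapter names to flag in the final output.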
diff --git a/skills/pptify-tooling/references/pptx_extractor.py b/skills/pptify-tooling/references/pptx_extractor.py new file mode 100644 index 000000000..a8e8c9d26 --- /dev/null +++ b/skills/pptify-tooling/references/pptx_extractor.py @@ -0,0 +1,939 @@ +from __future__ import annotations + +import json +import posixpath +import sys +import zipfile +from base64 import b64encode +from collections import Counter +from pathlib import Path +from typing import Any +from xml.etree import ElementTree + +# Allow importing sibling pptx_style_master when run as a standalone script +sys.path.insert(0, str(Path(__file__).parent)) +from pptx_style_master import PptxStyleMaster + +EMU_PER_INCH = 914400 +DRAWING_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}" + + +class PptxExtractor: + def prompt_context(self, path: str | Path, max_chars: int = 16000) -> dict[str, Any]: + from pptx import Presentation + + pptx_path = Path(path) + presentation = Presentation(str(pptx_path)) + style_context = PptxStyleMaster().analyze(pptx_path) + slides: list[dict[str, Any]] = [] + used_chars = 0 + for slide_index, slide in enumerate(presentation.slides, start=1): + texts = _slide_text_fragments(slide.shapes) + trimmed_texts: list[str] = [] + for text in texts: + if used_chars >= max_chars: + break + cleaned = _compact_text(text) + if not cleaned: + continue + remaining = max_chars - used_chars + clipped = cleaned[: min(500, remaining)] + trimmed_texts.append(clipped) + used_chars += len(clipped) + title = trimmed_texts[0] if trimmed_texts else f"Slide {slide_index}" + slides.append( + { + "index": slide_index, + "title": title[:120], + "text": trimmed_texts[:12], + "shape_count": len(slide.shapes), + } + ) + media_files, embedded_files = _package_asset_counts(pptx_path) + return { + "source": str(pptx_path), + "slide_count": len(slides), + "slide_size": { + "width": _inches(presentation.slide_width), + "height": _inches(presentation.slide_height), + }, + "package_media_files": 
media_files, + "embedded_files": embedded_files, + "styles": style_context["styles"], + "brands": style_context["brands"], + "template": style_context["template"], + "layout": style_context["layout"], + "slides": slides, + } + + def extract_file(self, path: str | Path, output_dir: str | Path | None = None, extract_media: bool = True) -> dict[str, Any]: + from pptx import Presentation + + pptx_path = Path(path) + presentation = Presentation(str(pptx_path)) + asset_dir = None + embed_media = extract_media and output_dir is None + if output_dir and extract_media: + asset_dir = Path(output_dir) / f"{pptx_path.stem}_assets" + asset_dir.mkdir(parents=True, exist_ok=True) + _extract_package_media(pptx_path, asset_dir) + + notes_by_slide = _notes_by_slide(pptx_path) + slides: list[dict[str, Any]] = [] + stats: Counter[str] = Counter() + max_shapes = 0 + max_nested = 0 + for slide_index, slide in enumerate(presentation.slides, start=1): + tree, slide_stats, render_elements = self._extract_slide( + slide=slide, + slide_index=slide_index, + slide_size=(_inches(presentation.slide_width), _inches(presentation.slide_height)), + source_path=pptx_path, + asset_dir=asset_dir, + embed_media=embed_media, + notes=notes_by_slide.get(slide_index, []), + ) + stats.update(slide_stats) + max_shapes = max(max_shapes, slide_stats["top_level_shapes"]) + max_nested = max(max_nested, slide_stats["nested_shapes"]) + slides.append( + { + "id": tree["id"], + "title": tree["title"], + "slide_size": tree["slide_size"], + "preserve_coordinates": True, + "render_mode": "ooxml", + "ooxml_elements": render_elements, + "layout_tree": tree, + } + ) + + media_files, embedded_files = _package_asset_counts(pptx_path) + style_context = PptxStyleMaster().analyze(pptx_path) + summary = { + "source": str(pptx_path), + "slide_count": len(slides), + "slide_size": { + "width": _inches(presentation.slide_width), + "height": _inches(presentation.slide_height), + }, + "top_level_shapes": 
int(stats["top_level_shapes"]), + "nested_shapes": int(stats["nested_shapes"]), + "max_shapes_on_slide": max_shapes, + "max_nested_shapes_on_slide": max_nested, + "groups": int(stats["groups"]), + "tables": int(stats["tables"]), + "charts": int(stats["charts"]), + "images": int(stats["images"]), + "text_objects": int(stats["text_objects"]), + "placeholders": int(stats["placeholders"]), + "lines_or_freeforms": int(stats["lines_or_freeforms"]), + "connectors": int(stats["connectors"]), + "smartart": int(stats["smartart"]), + "ole_objects": int(stats["ole_objects"]), + "non_ascii_text": bool(stats["non_ascii_text"]), + "notes_slides": int(stats["notes_slides"]), + "package_media_files": media_files, + "embedded_files": embedded_files, + "styles": style_context["styles"], + "brands": style_context["brands"], + "template": style_context["template"], + "layout": style_context["layout"], + } + return { + "source_pptx": str(pptx_path.resolve()), + "render_mode": "ooxml", + "summary": summary, + "slides": slides, + } + + def extract_path(self, path: str | Path, output_dir: str | Path, extract_media: bool = True) -> dict[str, Any]: + source = Path(path) + output = Path(output_dir) + output.mkdir(parents=True, exist_ok=True) + files = sorted(source.glob("*.pptx")) if source.is_dir() else [source] + decks = [] + for pptx_file in files: + deck = self.extract_file(pptx_file, output, extract_media=extract_media) + json_path = output / f"{pptx_file.stem}.pptify.json" + json_path.write_text(json.dumps(deck, indent=2, ensure_ascii=False), encoding="utf-8") + decks.append({"pptx": str(pptx_file), "json": str(json_path), "summary": deck["summary"]}) + manifest = {"source": str(source), "decks": decks} + (output / "manifest.json").write_text(json.dumps(manifest, indent=2, ensure_ascii=False), encoding="utf-8") + return manifest + + def analyze_path(self, path: str | Path) -> dict[str, Any]: + source = Path(path) + files = sorted(source.glob("*.pptx")) if source.is_dir() else [source] + 
return {"source": str(source), "decks": [self.extract_file(file, extract_media=False)["summary"] for file in files]} + + def _extract_slide( + self, + slide, + slide_index: int, + slide_size: tuple[float, float], + source_path: Path, + asset_dir: Path | None, + embed_media: bool, + notes: list[str], + ) -> tuple[dict[str, Any], Counter[str], list[dict[str, Any]]]: + root_id = f"slide_{slide_index}_root" + root_group: dict[str, Any] = { + "id": root_id, + "role": "slide", + "layout_mode": "absolute", + "object_ids": [], + "group_ids": [], + "constraints": {}, + "collision_policy": "relaxed", + "bbox": {"x": 0, "y": 0, "width": slide_size[0], "height": slide_size[1]}, + } + groups: dict[str, dict[str, Any]] = {root_id: root_group} + objects: dict[str, dict[str, Any]] = {} + stats: Counter[str] = Counter(top_level_shapes=len(slide.shapes), notes_slides=1 if notes else 0) + z_index = 0 + + def walk(shapes, parent_group_id: str, prefix: str) -> None: + nonlocal z_index + for shape_index, shape in enumerate(shapes, start=1): + z_index += 1 + stats["nested_shapes"] += 1 + shape_type = _shape_type_name(shape) + if _is_group(shape): + group_id = f"{prefix}_group_{shape_index}" + groups[parent_group_id]["group_ids"].append(group_id) + groups[group_id] = { + "id": group_id, + "role": "extracted_group", + "layout_mode": "absolute", + "object_ids": [], + "group_ids": [], + "constraints": {}, + "collision_policy": "relaxed", + "bbox": _bbox(shape), + } + stats["groups"] += 1 + walk(shape.shapes, group_id, group_id) + continue + + object_id = f"{prefix}_shape_{shape_index}" + slide_object = self._extract_object(shape, object_id, z_index, shape_type, source_path, asset_dir) + objects[object_id] = slide_object + groups[parent_group_id]["object_ids"].append(object_id) + stats[_stat_key(slide_object)] += 1 + if getattr(shape, "is_placeholder", False): + stats["placeholders"] += 1 + if _contains_non_ascii(slide_object["content"].get("text", "")): + stats["non_ascii_text"] += 1 + + 
walk(slide.shapes, root_id, f"slide_{slide_index}") + render_elements = [ + _render_element(shape, f"slide_{slide_index}_element_{element_index}", source_path, asset_dir, embed_media) + for element_index, shape in enumerate(slide.shapes, start=1) + ] + title = _slide_title(objects.values()) or f"Slide {slide_index}" + background = _slide_background(slide) + tree: dict[str, Any] = { + "id": f"slide_{slide_index}", + "title": title, + "slide_size": {"width": slide_size[0], "height": slide_size[1]}, + "root_group_id": root_id, + "groups": groups, + "objects": objects, + "notes": notes, + "background": background, + } + return tree, stats, render_elements + + def _extract_object(self, shape, object_id: str, z_index: int, shape_type: str, source_path: Path, asset_dir: Path | None) -> dict[str, Any]: + kind = _kind(shape, shape_type) + content: dict[str, Any] = {"source_shape_type": shape_type} + style: dict[str, Any] = {} + if kind == "text": + content["text"] = getattr(shape, "text", "") + paragraphs = _rich_text(shape) + if paragraphs: + content["paragraphs"] = paragraphs + style.update(_text_style(shape)) + elif kind == "table": + content["rows"] = [[cell.text for cell in row.cells] for row in shape.table.rows] + table_detail = _table_detail(shape) + if table_detail: + content["table"] = table_detail + style["font_size"] = 8 + elif kind == "image": + content["alt"] = getattr(shape, "name", "image") + crop = _image_crop(shape) + if crop: + content["crop"] = crop + if asset_dir is not None: + image_data = _image_data(shape) + if image_data is None: + content["missing_embedded_image"] = True + else: + blob, extension, relationship_id, content_type = image_data + asset_path = asset_dir / f"{source_path.stem}_{object_id}.{extension}" + asset_path.write_bytes(blob) + content["path"] = str(asset_path) + content["content_type"] = content_type + if relationship_id: + content["media_relationship_id"] = relationship_id + elif kind == "chart": + 
content.update(_chart_detail(shape)) + elif kind in {"smartart", "ole"}: + content["title"] = getattr(shape, "name", kind) + if getattr(shape, "has_text_frame", False) and getattr(shape, "text", "").strip(): + content["text"] = shape.text + elif kind in {"line", "connector"}: + box = _bbox(shape) + x, y, w, h = box["x"], box["y"], box["width"], box["height"] + content.update(_line_endpoints(shape, x, y, w, h)) + line_style = _line_style(shape) + style["line"] = (line_style or {}).get("line_color") or "#6B7280" + if line_style: + style.update(line_style) + arrows = _connector_arrows(shape) + if arrows: + content["arrows"] = arrows + elif getattr(shape, "has_text_frame", False) and getattr(shape, "text", ""): + content["text"] = shape.text + paragraphs = _rich_text(shape) + if paragraphs: + content["paragraphs"] = paragraphs + style.update(_text_style(shape)) + else: + content["shape"] = _geometry_name(shape) or "rect" + + _apply_visual_style(shape, kind, content, style) + + classification = "content" if kind in {"text", "table", "image", "chart", "smartart"} else "layout_design" + if kind == "shape" and content.get("text"): + classification = "content" + return { + "id": object_id, + "kind": kind, + "role": _role(shape, kind), + "classification": classification, + "content": content, + "style": style, + "constraints": {"source_name": getattr(shape, "name", "")}, + "bbox": _bbox(shape), + "z_index": z_index, + } + + +def _inches(value: int) -> float: + return round(int(value) / EMU_PER_INCH, 4) + + +def _bbox(shape) -> dict[str, float]: + return { + "x": _inches(getattr(shape, "left", 0) or 0), + "y": _inches(getattr(shape, "top", 0) or 0), + "width": max(0.0, _inches(getattr(shape, "width", 0) or 0)), + "height": max(0.0, _inches(getattr(shape, "height", 0) or 0)), + } + + +def _shape_type_name(shape) -> str: + shape_type = getattr(shape, "shape_type", "unknown") + return str(getattr(shape_type, "name", shape_type)).lower() + + +def _is_group(shape) -> bool: + 
return hasattr(shape, "shapes") and "group" in _shape_type_name(shape) + + +def _kind(shape, shape_type: str) -> str: + if getattr(shape, "has_table", False): + return "table" + if getattr(shape, "has_chart", False): + return "chart" + graphic_uri = _graphic_data_uri(shape) + if "diagram" in graphic_uri: + return "smartart" + if "ole" in graphic_uri: + return "ole" + if "picture" in shape_type or _has_image(shape): + return "image" + if _is_connector(shape): + return "connector" + if "line" in shape_type or "freeform" in shape_type or "connector" in shape_type: + return "line" + if getattr(shape, "has_text_frame", False) and getattr(shape, "text", "").strip(): + return "text" + return "shape" + + +def _role(shape, kind: str) -> str: + if getattr(shape, "is_placeholder", False): + return "placeholder" + if kind == "text": + return "text" + return kind + + +def _text_style(shape) -> dict[str, Any]: + style: dict[str, Any] = {"font_size": 12} + try: + paragraph = shape.text_frame.paragraphs[0] + run = paragraph.runs[0] if paragraph.runs else None + font = run.font if run is not None else paragraph.font + if font.size is not None: + style["font_size"] = round(font.size.pt, 2) + if font.bold is not None: + style["bold"] = bool(font.bold) + if font.name: + style["font"] = font.name + except (AttributeError, IndexError): + pass + return style + + +def _safe_attr(value: Any, name: str) -> Any: + if value is None: + return None + try: + return getattr(value, name) + except (AttributeError, TypeError, ValueError): + return None + + +def _enum_name(value: Any) -> str | None: + if value is None: + return None + text = str(value).split("(")[0].strip().split(".")[-1] + return text.lower() or None + + +def _normalize_hex(value: str) -> str: + stripped = str(value).strip().lstrip("#") + if len(stripped) >= 6: + return f"#{stripped[:6].upper()}" + return f"#{stripped.upper()}" + + +def _color_to_hex(color_format: Any) -> str | None: + if color_format is None: + return None + try: + 
if color_format.type is None: + return None + except (AttributeError, TypeError, ValueError): + return None + try: + rgb = color_format.rgb + if rgb is not None: + return _normalize_hex(str(rgb)) + except (AttributeError, TypeError, ValueError): + pass + token = _enum_name(_safe_attr(color_format, "theme_color")) + if token: + return f"theme:{token}" + return None + + +def _graphic_data_uri(shape) -> str: + element = getattr(shape, "_element", None) + if element is None or not element.tag.endswith("}graphicFrame"): + return "" + data = element.find(f".//{DRAWING_NS}graphicData") + return data.get("uri", "") if data is not None else "" + + +def _is_connector(shape) -> bool: + element = getattr(shape, "_element", None) + return element is not None and element.tag.endswith("}cxnSp") + + +def _has_image(shape) -> bool: + try: + return getattr(shape, "image", None) is not None + except (AttributeError, TypeError, ValueError): + return False + + +def _image_crop(shape) -> dict[str, float] | None: + crop: dict[str, float] = {} + for attribute, key in (("crop_left", "left"), ("crop_right", "right"), ("crop_top", "top"), ("crop_bottom", "bottom")): + value = _safe_attr(shape, attribute) + if value: + crop[key] = round(float(value), 4) + return crop or None + + +def _geometry_name(shape) -> str | None: + element = getattr(shape, "_element", None) + if element is None: + return None + preset = element.find(f".//{DRAWING_NS}prstGeom") + if preset is not None and preset.get("prst"): + return preset.get("prst") + if element.find(f".//{DRAWING_NS}custGeom") is not None: + return "custom" + return None + + +def _geometry(shape) -> dict[str, Any] | None: + name = _geometry_name(shape) + if name == "custom": + return {"type": "custom"} + if name: + return {"type": "preset", "preset": name} + return None + + +def _transform(shape) -> dict[str, Any] | None: + transform: dict[str, Any] = {} + rotation = getattr(shape, "rotation", 0) or 0 + if rotation: + transform["rotation"] = 
round(float(rotation), 2) + element = getattr(shape, "_element", None) + if element is not None: + xfrm = element.find(f".//{DRAWING_NS}xfrm") + if xfrm is not None: + if xfrm.get("flipH") == "1": + transform["flip_h"] = True + if xfrm.get("flipV") == "1": + transform["flip_v"] = True + return transform or None + + +def _gradient_stops(shape) -> list[str] | None: + element = getattr(shape, "_element", None) + if element is None: + return None + gradient = element.find(f".//{DRAWING_NS}gradFill") + if gradient is None: + return None + stops: list[str] = [] + for stop in gradient.findall(f".//{DRAWING_NS}gs"): + srgb = stop.find(f"{DRAWING_NS}srgbClr") + if srgb is not None and srgb.get("val"): + stops.append(_normalize_hex(srgb.get("val"))) + continue + scheme = stop.find(f"{DRAWING_NS}schemeClr") + if scheme is not None and scheme.get("val"): + stops.append(f"theme:{scheme.get('val')}") + return stops or None + + +def _fill_style(shape) -> dict[str, Any] | None: + fill = _safe_attr(shape, "fill") + if fill is None: + return None + try: + fill_type = fill.type + except (AttributeError, TypeError, ValueError, NotImplementedError): + return None + name = _enum_name(fill_type) + if name in {None, "background"}: + return None + style: dict[str, Any] = {"fill_type": name} + if name == "solid": + color = _color_to_hex(_safe_attr(fill, "fore_color")) + if color: + style["fill_color"] = color + elif name == "gradient": + stops = _gradient_stops(shape) + if stops: + style["gradient_stops"] = stops + return style + + +def _line_style(shape) -> dict[str, Any] | None: + line = _safe_attr(shape, "line") + if line is None: + return None + style: dict[str, Any] = {} + color = _color_to_hex(_safe_attr(line, "color")) + if color: + style["line_color"] = color + try: + width = line.width + if width is not None: + style["line_width_pt"] = round(width.pt, 2) + except (AttributeError, TypeError, ValueError): + pass + dash = _enum_name(_safe_attr(line, "dash_style")) + if dash: + 
style["line_dash"] = dash + return style or None + + +def _line_endpoints(shape, x: float, y: float, width: float, height: float) -> dict[str, float]: + transform = _transform(shape) or {} + x1, x2 = (x + width, x) if transform.get("flip_h") else (x, x + width) + y1, y2 = (y + height, y) if transform.get("flip_v") else (y, y + height) + return {"x1": x1, "y1": y1, "x2": x2, "y2": y2} + + +def _connector_arrows(shape) -> dict[str, str] | None: + element = getattr(shape, "_element", None) + if element is None: + return None + line = element.find(f".//{DRAWING_NS}ln") + if line is None: + return None + arrows: dict[str, str] = {} + for tag, key in (("headEnd", "head"), ("tailEnd", "tail")): + end = line.find(f"{DRAWING_NS}{tag}") + if end is not None: + arrow_type = end.get("type") + if arrow_type and arrow_type != "none": + arrows[key] = arrow_type + return arrows or None + + +def _rich_text(shape) -> list[dict[str, Any]] | None: + text_frame = getattr(shape, "text_frame", None) + if text_frame is None: + return None + paragraphs: list[dict[str, Any]] = [] + for paragraph in text_frame.paragraphs: + runs: list[dict[str, Any]] = [] + for run in paragraph.runs: + run_info: dict[str, Any] = {"text": run.text} + font = run.font + size = _safe_attr(font, "size") + if size is not None: + run_info["font_size"] = round(size.pt, 2) + if font.bold is not None: + run_info["bold"] = bool(font.bold) + if font.italic is not None: + run_info["italic"] = bool(font.italic) + if font.underline: + run_info["underline"] = True + if font.name: + run_info["font"] = font.name + color = _color_to_hex(_safe_attr(font, "color")) + if color: + run_info["color"] = color + address = _safe_attr(_safe_attr(run, "hyperlink"), "address") + if address: + run_info["hyperlink"] = address + runs.append(run_info) + paragraph_info: dict[str, Any] = {} + alignment = _enum_name(_safe_attr(paragraph, "alignment")) + if alignment: + paragraph_info["align"] = alignment + level = getattr(paragraph, "level", 0) 
or 0 + if level: + paragraph_info["level"] = int(level) + if runs: + paragraph_info["runs"] = runs + elif paragraph.text: + paragraph_info["text"] = paragraph.text + if paragraph_info: + paragraphs.append(paragraph_info) + return paragraphs or None + + +def _table_detail(shape) -> dict[str, Any] | None: + try: + table = shape.table + except (AttributeError, ValueError): + return None + rows = list(table.rows) + columns = list(table.columns) + detail: dict[str, Any] = {"row_count": len(rows), "col_count": len(columns)} + try: + detail["col_widths_in"] = [_inches(column.width or 0) for column in columns] + except (AttributeError, TypeError, ValueError): + pass + try: + detail["row_heights_in"] = [_inches(row.height or 0) for row in rows] + except (AttributeError, TypeError, ValueError): + pass + for flag in ("first_row", "last_row", "first_col", "last_col", "horz_banding", "vert_banding"): + if getattr(table, flag, None): + detail[flag] = True + merged: list[dict[str, int]] = [] + for row_index, row in enumerate(rows): + for column_index, cell in enumerate(row.cells): + if getattr(cell, "is_merge_origin", False): + merged.append( + { + "row": row_index, + "col": column_index, + "span_rows": int(getattr(cell, "span_height", 1) or 1), + "span_cols": int(getattr(cell, "span_width", 1) or 1), + } + ) + if merged: + detail["merged_cells"] = merged + return detail + + +def _chart_detail(shape) -> dict[str, Any]: + detail: dict[str, Any] = {"title": getattr(shape, "name", "chart")} + try: + chart = shape.chart + except (AttributeError, ValueError): + return detail + chart_type = _enum_name(_safe_attr(chart, "chart_type")) + if chart_type: + detail["chart_type"] = chart_type + try: + if chart.has_title and chart.chart_title.text_frame.text.strip(): + detail["chart_title"] = chart.chart_title.text_frame.text.strip() + except (AttributeError, TypeError, ValueError): + pass + try: + categories = [str(category) for category in chart.plots[0].categories if category is not None] + 
if categories: + detail["categories"] = categories[:50] + except (AttributeError, IndexError, TypeError, ValueError): + pass + series_payload: list[dict[str, Any]] = [] + try: + for series in chart.series: + entry: dict[str, Any] = {} + name = _safe_attr(series, "name") + if name: + entry["name"] = str(name) + try: + entry["values"] = [value for value in series.values][:50] + except (AttributeError, TypeError, ValueError): + pass + if entry: + series_payload.append(entry) + except (AttributeError, TypeError, ValueError): + pass + if series_payload: + detail["series"] = series_payload + return detail + + +def _slide_background(slide) -> dict[str, Any] | None: + try: + fill = slide.background.fill + fill_type = fill.type + except (AttributeError, TypeError, ValueError, NotImplementedError): + return None + name = _enum_name(fill_type) + if name in {None, "background"}: + return None + background: dict[str, Any] = {"fill_type": name} + if name == "solid": + color = _color_to_hex(_safe_attr(fill, "fore_color")) + if color: + background["color"] = color + elif name == "gradient": + stops = _gradient_stops(slide.background) + if stops: + background["gradient_stops"] = stops + return background + + +def _apply_visual_style(shape, kind: str, content: dict[str, Any], style: dict[str, Any]) -> None: + geometry = _geometry(shape) + if geometry: + content["geometry"] = geometry + transform = _transform(shape) + if transform: + content["transform"] = transform + fill = _fill_style(shape) + if fill: + style.update(fill) + if kind not in {"line", "connector"}: + line_style = _line_style(shape) + if line_style: + style.update(line_style) + + +def _image_data(shape) -> tuple[bytes, str, str | None, str] | None: + try: + image = shape.image + except (AttributeError, ValueError): + image = None + if image is not None: + return image.blob, str(image.ext), None, image.content_type + + for relationship_id in _embedded_relationship_ids(shape): + try: + part = shape.part.related_part(relationship_id) +
except KeyError: + continue + blob = getattr(part, "blob", None) + if not blob: + continue + extension = Path(str(getattr(part, "partname", "media.bin"))).suffix.lstrip(".") or _extension_from_content_type( + getattr(part, "content_type", "") + ) + return blob, extension or "bin", relationship_id, getattr(part, "content_type", "") + return None + + +def _render_element(shape, element_id: str, source_path: Path, asset_dir: Path | None, embed_media: bool) -> dict[str, Any]: + return { + "id": element_id, + "xml": shape._element.xml, + "relationships": _relationship_payloads(shape, element_id, source_path, asset_dir, embed_media), + } + + +def _relationship_payloads( + shape, + element_id: str, + source_path: Path, + asset_dir: Path | None, + embed_media: bool, +) -> list[dict[str, Any]]: + payloads: list[dict[str, Any]] = [] + for relationship_id in _embedded_relationship_ids(shape): + try: + relationship = shape.part.rels[relationship_id] + except KeyError: + continue + payload: dict[str, Any] = { + "source_rid": relationship_id, + "reltype": relationship.reltype, + "target_ref": relationship.target_ref, + "is_external": bool(relationship.is_external), + } + if relationship.is_external: + payloads.append(payload) + continue + part = relationship.target_part + blob = getattr(part, "blob", None) + if blob: + content_type = getattr(part, "content_type", "") + extension = Path(str(getattr(part, "partname", "media.bin"))).suffix.lstrip(".") or _extension_from_content_type(content_type) + payload.update({"content_type": content_type, "extension": extension or "bin"}) + if asset_dir is not None: + asset_path = asset_dir / f"{source_path.stem}_{element_id}_{relationship_id}.{payload['extension']}" + asset_path.write_bytes(blob) + payload["path"] = str(asset_path) + elif embed_media: + payload["blob_base64"] = b64encode(blob).decode("ascii") + payloads.append(payload) + return payloads + + +def _embedded_relationship_ids(shape) -> list[str]: + relationship_ids: list[str] = 
[] + try: + nodes = shape._element.iter() + except AttributeError: + return relationship_ids + for node in nodes: + for attribute_name, value in node.attrib.items(): + if attribute_name.endswith("}embed") and value not in relationship_ids: + relationship_ids.append(value) + return relationship_ids + + +def _extension_from_content_type(content_type: str) -> str: + mapping = { + "image/svg+xml": "svg", + "image/png": "png", + "image/jpeg": "jpg", + "image/jpg": "jpg", + "image/gif": "gif", + "image/bmp": "bmp", + "image/x-emf": "emf", + "image/x-wmf": "wmf", + } + return mapping.get(content_type.lower(), "bin") + + +def _stat_key(slide_object: dict[str, Any]) -> str: + kind = slide_object["kind"] + if kind == "table": + return "tables" + if kind == "chart": + return "charts" + if kind == "image": + return "images" + if kind == "text": + return "text_objects" + if kind == "line": + return "lines_or_freeforms" + if kind == "connector": + return "connectors" + if kind == "smartart": + return "smartart" + if kind == "ole": + return "ole_objects" + return "shapes" + + +def _slide_title(objects) -> str: + for slide_object in objects: + text = str(slide_object["content"].get("text", "")).strip() + if text: + return text.splitlines()[0][:100] + return "" + + +def _contains_non_ascii(value: str) -> bool: + return any(ord(character) > 127 for character in value) + + +def _slide_text_fragments(shapes) -> list[str]: + fragments: list[str] = [] + for shape in shapes: + if hasattr(shape, "shapes"): + fragments.extend(_slide_text_fragments(shape.shapes)) + if getattr(shape, "has_table", False): + for row in shape.table.rows: + row_text = " | ".join(_compact_text(cell.text) for cell in row.cells if _compact_text(cell.text)) + if row_text: + fragments.append(row_text) + text = getattr(shape, "text", "") + if text: + fragments.append(text) + deduped: list[str] = [] + seen: set[str] = set() + for fragment in fragments: + cleaned = _compact_text(fragment) + if cleaned and cleaned not in 
seen: + seen.add(cleaned) + deduped.append(cleaned) + return deduped + + +def _compact_text(value: str) -> str: + return " ".join(str(value).split()) + + +def _package_asset_counts(path: Path) -> tuple[int, int]: + with zipfile.ZipFile(path) as package: + names = package.namelist() + media = sum(1 for name in names if name.startswith("ppt/media/")) + embedded = sum(1 for name in names if name.startswith("ppt/embeddings/")) + return media, embedded + + +def _extract_package_media(path: Path, asset_dir: Path) -> None: + with zipfile.ZipFile(path) as package: + for name in package.namelist(): + if not name.startswith("ppt/media/"): + continue + target = asset_dir / Path(name).name + target.write_bytes(package.read(name)) + + +def _notes_by_slide(path: Path) -> dict[int, list[str]]: + notes: dict[int, list[str]] = {} + with zipfile.ZipFile(path) as package: + names = set(package.namelist()) + slide_rels = sorted(name for name in names if name.startswith("ppt/slides/_rels/slide") and name.endswith(".xml.rels")) + for rel_path in slide_rels: + slide_number = int(Path(rel_path).name.removeprefix("slide").removesuffix(".xml.rels")) + rel_root = ElementTree.fromstring(package.read(rel_path)) + for rel in rel_root: + if "notesSlide" not in rel.attrib.get("Type", ""): + continue + target = rel.attrib.get("Target", "") + notes_path = posixpath.normpath(posixpath.join("ppt/slides", target)) + if notes_path not in names: + notes_path = posixpath.normpath(posixpath.join("ppt/slides/_rels", target)) + if notes_path not in names: + continue + notes_root = ElementTree.fromstring(package.read(notes_path)) + text = "\n".join(node.text for node in notes_root.iter(f"{DRAWING_NS}t") if node.text) + if text.strip(): + notes[slide_number] = [text.strip()] + return notes diff --git a/skills/pptify-tooling/references/pptx_style_master.py b/skills/pptify-tooling/references/pptx_style_master.py new file mode 100644 index 000000000..e5f02874a --- /dev/null +++ 
b/skills/pptify-tooling/references/pptx_style_master.py @@ -0,0 +1,505 @@ +from __future__ import annotations + +import zipfile +from collections import Counter +from pathlib import Path +from typing import Any, Iterable +from xml.etree import ElementTree + +EMU_PER_INCH = 914400 +DRAWING_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}" + + +class PptxStyleMaster: + """Extract compact design context from a reference PPTX for prompt-based generation.""" + + def __init__(self, max_slides: int = 12, max_items: int = 10) -> None: + self.max_slides = max_slides + self.max_items = max_items + + def analyze(self, path: str | Path) -> dict[str, Any]: + from pptx import Presentation + + pptx_path = Path(path) + presentation = Presentation(str(pptx_path)) + slide_size = { + "width": _inches(presentation.slide_width), + "height": _inches(presentation.slide_height), + } + theme = _theme_from_package(pptx_path) + + colors: Counter[str] = Counter() + fonts: Counter[str] = Counter() + font_sizes: Counter[float] = Counter() + shape_styles: Counter[str] = Counter() + layout_names: Counter[str] = Counter() + master_names: Counter[str] = Counter() + slide_layouts: list[dict[str, Any]] = [] + + _count_theme_tokens(theme, colors, fonts) + for slide_index, slide in enumerate(presentation.slides, start=1): + if slide_index > self.max_slides: + break + slide_context = _slide_design_context(slide, slide_index, slide_size, self.max_items) + slide_layouts.append(slide_context) + layout_names[slide_context["template_layout"]] += 1 + master_names[slide_context["template_master"]] += 1 + colors.update(slide_context.pop("_colors")) + fonts.update(slide_context.pop("_fonts")) + font_sizes.update(slide_context.pop("_font_sizes")) + shape_styles.update(slide_context.pop("_shape_styles")) + + return { + "styles": { + "colors": _top_items(colors, self.max_items), + "fonts": _top_items(fonts, self.max_items), + "font_sizes": _top_items(font_sizes, self.max_items), + "shape_styles": 
_top_items(shape_styles, self.max_items), + }, + "brands": _brand_context(colors, fonts, theme, self.max_items), + "template": _template_context(presentation, slide_size, theme, layout_names, master_names, self.max_items), + "layout": { + "analyzed_slide_count": len(slide_layouts), + "layout_usage": _top_items(layout_names, self.max_items), + "master_usage": _top_items(master_names, self.max_items), + "slides": slide_layouts, + }, + } + + +def extract_pptx_style_master(path: str | Path, max_slides: int = 12, max_items: int = 10) -> dict[str, Any]: + return PptxStyleMaster(max_slides=max_slides, max_items=max_items).analyze(path) + + +def _slide_design_context(slide, slide_index: int, slide_size: dict[str, float], max_items: int) -> dict[str, Any]: + colors: Counter[str] = Counter() + fonts: Counter[str] = Counter() + font_sizes: Counter[float] = Counter() + shape_styles: Counter[str] = Counter() + object_counts: Counter[str] = Counter() + regions: Counter[str] = Counter() + placeholders: list[dict[str, Any]] = [] + object_samples: list[dict[str, Any]] = [] + boxes: list[dict[str, float]] = [] + + for shape in _iter_shapes(slide.shapes): + kind = _shape_kind(shape) + bbox = _bbox(shape) + boxes.append(bbox) + object_counts[kind] += 1 + regions[_region(bbox, slide_size)] += 1 + + shape_colors = _shape_colors(shape) + colors.update(shape_colors.values()) + shape_text = _text_preview(shape) + text_styles = _text_styles(shape) + fonts.update(text_styles["fonts"]) + font_sizes.update(text_styles["font_sizes"]) + colors.update(text_styles["colors"]) + + style_signature = _style_signature(shape_colors, text_styles) + if style_signature: + shape_styles[style_signature] += 1 + + if getattr(shape, "is_placeholder", False) and len(placeholders) < max_items: + placeholders.append(_placeholder_context(shape, bbox)) + + if len(object_samples) < max_items: + sample: dict[str, Any] = { + "kind": kind, + "role": _shape_role(shape, kind), + "bbox": bbox, + "region": _region(bbox, 
slide_size), + } + if shape_text: + sample["text"] = shape_text + if shape_colors: + sample["colors"] = shape_colors + if text_styles["fonts"]: + sample["fonts"] = _top_items(text_styles["fonts"], 3) + if text_styles["font_sizes"]: + sample["font_sizes"] = _top_items(text_styles["font_sizes"], 3) + object_samples.append(sample) + + return { + "index": slide_index, + "template_layout": _slide_layout_name(slide), + "template_master": _slide_master_name(slide), + "object_counts": dict(sorted(object_counts.items())), + "placeholder_count": len(placeholders), + "placeholders": placeholders, + "dominant_regions": _top_items(regions, max_items), + "dominant_flow": _dominant_flow(boxes, slide_size), + "occupied_area_ratio": _occupied_area_ratio(boxes, slide_size), + "objects": object_samples, + "_colors": colors, + "_fonts": fonts, + "_font_sizes": font_sizes, + "_shape_styles": shape_styles, + } + + +def _template_context( + presentation, + slide_size: dict[str, float], + theme: dict[str, Any], + layout_names: Counter[str], + master_names: Counter[str], + max_items: int, +) -> dict[str, Any]: + masters: list[dict[str, Any]] = [] + try: + for master_index, master in enumerate(presentation.slide_masters, start=1): + masters.append( + { + "index": master_index, + "name": str(getattr(master, "name", f"Master {master_index}") or f"Master {master_index}"), + "layout_count": len(master.slide_layouts), + } + ) + except (AttributeError, TypeError): + masters = [] + + return { + "slide_size": slide_size, + "theme": theme, + "masters": masters[:max_items], + "layout_usage": _top_items(layout_names, max_items), + "master_usage": _top_items(master_names, max_items), + } + + +def _brand_context(colors: Counter[str], fonts: Counter[str], theme: dict[str, Any], max_items: int) -> dict[str, Any]: + theme_colors = theme.get("colors", {}) if isinstance(theme.get("colors"), dict) else {} + theme_accents = [value for name, value in theme_colors.items() if str(name).startswith("accent")] + 
palette = _ranked_colors(colors, include_neutral=False) + if not palette: + palette = [color for color in theme_accents if _is_hex_color(color)] + neutral_palette = _ranked_colors(colors, include_neutral=True, only_neutral=True) + font_values = [str(item["value"]) for item in _top_items(fonts, max_items)] + primary_color = palette[0] if palette else None + accent_colors = _dedupe([*palette, *theme_accents])[:max_items] + + return { + "theme_name": theme.get("name"), + "primary_color": primary_color, + "accent_colors": accent_colors, + "neutral_colors": neutral_palette[:max_items], + "fonts": font_values[:max_items], + "theme_colors": theme_colors, + "theme_fonts": theme.get("fonts", {}), + } + + +def _theme_from_package(path: Path) -> dict[str, Any]: + theme_paths: list[str] + try: + with zipfile.ZipFile(path) as package: + theme_paths = sorted(name for name in package.namelist() if name.startswith("ppt/theme/theme") and name.endswith(".xml")) + if not theme_paths: + return {"name": None, "colors": {}, "fonts": {}} + root = ElementTree.fromstring(package.read(theme_paths[0])) + except (zipfile.BadZipFile, KeyError, ElementTree.ParseError): + return {"name": None, "colors": {}, "fonts": {}} + + theme = { + "name": root.attrib.get("name"), + "path": theme_paths[0], + "colors": {}, + "fonts": {}, + } + color_scheme = root.find(f".//{DRAWING_NS}clrScheme") + if color_scheme is not None: + colors: dict[str, str] = {} + for color_node in list(color_scheme): + color_value = _theme_color_value(color_node) + if color_value: + colors[color_node.tag.rsplit("}", 1)[-1]] = color_value + theme["colors"] = colors + + font_scheme = root.find(f".//{DRAWING_NS}fontScheme") + if font_scheme is not None: + fonts: dict[str, str] = {} + for key, node_name in (("major", "majorFont"), ("minor", "minorFont")): + latin = font_scheme.find(f".//{DRAWING_NS}{node_name}/{DRAWING_NS}latin") + if latin is not None and latin.attrib.get("typeface"): + fonts[key] = latin.attrib["typeface"] + 
theme["fonts"] = fonts + return theme + + +def _theme_color_value(color_node: ElementTree.Element) -> str | None: + srgb = color_node.find(f".//{DRAWING_NS}srgbClr") + if srgb is not None and srgb.attrib.get("val"): + return _normalize_hex(srgb.attrib["val"]) + system = color_node.find(f".//{DRAWING_NS}sysClr") + if system is not None and system.attrib.get("lastClr"): + return _normalize_hex(system.attrib["lastClr"]) + return None + + +def _count_theme_tokens(theme: dict[str, Any], colors: Counter[str], fonts: Counter[str]) -> None: + for color in theme.get("colors", {}).values() if isinstance(theme.get("colors"), dict) else []: + if _is_hex_color(color): + colors[color] += 1 + for font in theme.get("fonts", {}).values() if isinstance(theme.get("fonts"), dict) else []: + if font: + fonts[str(font)] += 1 + + +def _iter_shapes(shapes) -> Iterable[Any]: + for shape in shapes: + yield shape + if hasattr(shape, "shapes"): + yield from _iter_shapes(shape.shapes) + + +def _shape_kind(shape) -> str: + shape_type = str(getattr(getattr(shape, "shape_type", "unknown"), "name", "unknown")).lower() + if getattr(shape, "has_table", False): + return "table" + if getattr(shape, "has_chart", False): + return "chart" + if "picture" in shape_type or _has_image(shape): + return "image" + if "line" in shape_type or "connector" in shape_type or "freeform" in shape_type: + return "line" + if getattr(shape, "has_text_frame", False) and getattr(shape, "text", "").strip(): + return "text" + if hasattr(shape, "shapes"): + return "group" + return "shape" + + +def _has_image(shape) -> bool: + try: + return getattr(shape, "image", None) is not None + except (AttributeError, TypeError, ValueError): + return False + + +def _shape_role(shape, kind: str) -> str: + if getattr(shape, "is_placeholder", False): + try: + return str(shape.placeholder_format.type).split(".")[-1].lower() + except (AttributeError, ValueError): + return "placeholder" + name = str(getattr(shape, "name", "") or 
"").strip().lower() + if "title" in name: + return "title" + return kind + + +def _shape_colors(shape) -> dict[str, str]: + colors: dict[str, str] = {} + fill = _format_color(_safe_attr(_safe_attr(shape, "fill"), "fore_color")) + if fill: + colors["fill"] = fill + line = _format_color(_safe_attr(_safe_attr(shape, "line"), "color")) + if line: + colors["line"] = line + return colors + + +def _safe_attr(value: Any, name: str) -> Any: + if value is None: + return None + try: + return getattr(value, name) + except (AttributeError, TypeError, ValueError): + return None + + +def _text_styles(shape) -> dict[str, Counter[Any]]: + fonts: Counter[str] = Counter() + font_sizes: Counter[float] = Counter() + colors: Counter[str] = Counter() + text_frame = getattr(shape, "text_frame", None) + if text_frame is None: + return {"fonts": fonts, "font_sizes": font_sizes, "colors": colors} + + for paragraph in text_frame.paragraphs: + _count_font(paragraph.font, fonts, font_sizes, colors) + for run in paragraph.runs: + _count_font(run.font, fonts, font_sizes, colors) + return {"fonts": fonts, "font_sizes": font_sizes, "colors": colors} + + +def _count_font(font, fonts: Counter[str], font_sizes: Counter[float], colors: Counter[str]) -> None: + name = getattr(font, "name", None) + if name: + fonts[str(name)] += 1 + size = getattr(font, "size", None) + if size is not None: + font_sizes[round(size.pt, 2)] += 1 + color = _format_color(getattr(font, "color", None)) + if color: + colors[color] += 1 + + +def _format_color(color_format) -> str | None: + if color_format is None: + return None + try: + rgb = color_format.rgb + except (AttributeError, TypeError, ValueError): + rgb = None + if rgb is not None: + return _normalize_hex(str(rgb)) + try: + theme_color = color_format.theme_color + except (AttributeError, TypeError, ValueError): + theme_color = None + if theme_color: + token = str(theme_color).split(".")[-1].lower() + return f"theme:{token}" + return None + + +def 
_style_signature(shape_colors: dict[str, str], text_styles: dict[str, Counter[Any]]) -> str: + parts: list[str] = [] + if shape_colors.get("fill"): + parts.append(f"fill={shape_colors['fill']}") + if shape_colors.get("line"): + parts.append(f"line={shape_colors['line']}") + font = _top_value(text_styles["fonts"]) + if font: + parts.append(f"font={font}") + font_size = _top_value(text_styles["font_sizes"]) + if font_size: + parts.append(f"font_size={font_size}") + return "; ".join(parts) + + +def _placeholder_context(shape, bbox: dict[str, float]) -> dict[str, Any]: + context: dict[str, Any] = { + "name": str(getattr(shape, "name", "") or ""), + "bbox": bbox, + } + try: + context["type"] = str(shape.placeholder_format.type).split(".")[-1].lower() + context["idx"] = int(shape.placeholder_format.idx) + except (AttributeError, ValueError): + context["type"] = "placeholder" + return context + + +def _text_preview(shape, max_chars: int = 120) -> str: + text = " ".join(str(getattr(shape, "text", "")).split()) + return text[:max_chars] + + +def _bbox(shape) -> dict[str, float]: + return { + "x": _inches(getattr(shape, "left", 0)), + "y": _inches(getattr(shape, "top", 0)), + "width": max(0.0, _inches(getattr(shape, "width", 0))), + "height": max(0.0, _inches(getattr(shape, "height", 0))), + } + + +def _region(bbox: dict[str, float], slide_size: dict[str, float]) -> str: + width = max(slide_size["width"], 0.01) + height = max(slide_size["height"], 0.01) + center_x = (bbox["x"] + bbox["width"] / 2) / width + center_y = (bbox["y"] + bbox["height"] / 2) / height + horizontal = "left" if center_x < 0.34 else "right" if center_x > 0.66 else "center" + vertical = "top" if center_y < 0.34 else "bottom" if center_y > 0.66 else "middle" + return f"{vertical}_{horizontal}" + + +def _dominant_flow(boxes: list[dict[str, float]], slide_size: dict[str, float]) -> str: + if len(boxes) < 2: + return "single" + centers_x = [(box["x"] + box["width"] / 2) / max(slide_size["width"], 0.01) for 
box in boxes] + centers_y = [(box["y"] + box["height"] / 2) / max(slide_size["height"], 0.01) for box in boxes] + spread_x = max(centers_x) - min(centers_x) + spread_y = max(centers_y) - min(centers_y) + if len(boxes) >= 4 and spread_x > 0.32 and spread_y > 0.32: + return "grid" + if spread_x > 0.42 and spread_y > 0.42: + return "grid" + if len(boxes) >= 3 and spread_x > 0.42: + return "grid" + if spread_x > spread_y * 1.4: + return "row" + if spread_y > spread_x * 1.4: + return "column" + return "overlay_or_balanced" + + +def _occupied_area_ratio(boxes: list[dict[str, float]], slide_size: dict[str, float]) -> float: + slide_area = max(slide_size["width"] * slide_size["height"], 0.01) + object_area = sum(box["width"] * box["height"] for box in boxes) + return round(min(object_area / slide_area, 1.0), 3) + + +def _slide_layout_name(slide) -> str: + try: + return str(slide.slide_layout.name or "unnamed_layout") + except AttributeError: + return "unknown_layout" + + +def _slide_master_name(slide) -> str: + try: + master = slide.slide_layout.slide_master + return str(master.name or "unnamed_master") + except AttributeError: + return "unknown_master" + + +def _top_items(counter: Counter[Any], limit: int) -> list[dict[str, Any]]: + return [{"value": value, "count": count} for value, count in counter.most_common(limit)] + + +def _top_value(counter: Counter[Any]) -> Any | None: + if not counter: + return None + return counter.most_common(1)[0][0] + + +def _ranked_colors(colors: Counter[str], include_neutral: bool, only_neutral: bool = False) -> list[str]: + ranked: list[str] = [] + for color, _count in colors.most_common(): + if not _is_hex_color(color): + continue + neutral = _is_neutral(color) + if only_neutral and not neutral: + continue + if not include_neutral and neutral: + continue + ranked.append(color) + return ranked + + +def _is_neutral(color: str) -> bool: + if not _is_hex_color(color): + return False + red = int(color[1:3], 16) + green = int(color[3:5], 16) + 
blue = int(color[5:7], 16) + return max(red, green, blue) - min(red, green, blue) <= 18 + + +def _is_hex_color(value: Any) -> bool: + return isinstance(value, str) and len(value) == 7 and value.startswith("#") + + +def _normalize_hex(value: str) -> str: + stripped = value.strip().lstrip("#") + if len(stripped) >= 6: + return f"#{stripped[:6].upper()}" + return f"#{stripped.upper()}" + + +def _dedupe(values: Iterable[str]) -> list[str]: + deduped: list[str] = [] + for value in values: + if value and value not in deduped: + deduped.append(value) + return deduped + + +def _inches(value: int) -> float: + return round(int(value) / EMU_PER_INCH, 4) diff --git a/skills/pptify-tooling/references/toolkit-setup.md b/skills/pptify-tooling/references/toolkit-setup.md new file mode 100644 index 000000000..40814228a --- /dev/null +++ b/skills/pptify-tooling/references/toolkit-setup.md @@ -0,0 +1,48 @@ +# PPTify Toolkit Reference + +This directory is for lightweight guidance that preserves tooling abilities without reintroducing heavy scripts. + +## Scope + +- Keep only static guidance and notes. +- Do not place runtime setup scripts, model assets, or generated artifacts here. +- Keep implementation centered on the bundled `references/` import APIs. + +## Core Tooling Recipes + +### 1. Prompt Context Recipe + +Use `PptxExtractor.prompt_context(...)` to build compact LLM-ready context from a reference deck. + +Expected result: + +- `slide_count`, `slide_size` +- `styles`, `brands`, `template`, `layout` +- Per-slide title/text snippets and shape counts + +### 2. Full Extraction Recipe + +Use `PptxExtractor.extract_file(...)` for full JSON extraction including: + +- `summary` complexity metrics +- `slides[*].layout_tree` with groups/objects +- `ooxml_elements` for render-aware inspection + +### 3. Folder Batch Recipe + +Use `PptxExtractor.extract_path(...)` on a folder to produce: + +- One `.pptify.json` file per deck +- A `manifest.json` to track outputs + +### 4. 
Style Master Recipe + +Use `extract_pptx_style_master(...)` when you need style-only analysis for design lock: + +- Palette and accent colors +- Typography and font-size distribution +- Master/layout usage and flow patterns + +## Adapter Guidance + +When document summarization or image generation is needed, use external adapters and pass normalized outputs into the deck-spec contract. Keep provenance fields explicit (`provider`, `model_or_deployment`, `status`, `error`). diff --git a/skills/pptify-visual-assets/SKILL.md b/skills/pptify-visual-assets/SKILL.md new file mode 100644 index 000000000..79dba9e9f --- /dev/null +++ b/skills/pptify-visual-assets/SKILL.md @@ -0,0 +1,73 @@ +--- +name: pptify-visual-assets +description: "Plan and place visual assets for pptify PPTX decks. Use when adding icons, images, SVGs, infographics, image placeholders, or asset-backed slide objects." +--- + +# PPTify Visual Assets + +Use this skill when a deck needs icons, images, diagrams, infographics, or media-backed slide objects. This skill provides **placement and decision guidance** plus **runnable how-to guidelines**; the skill itself bundles no helper scripts. + +Each asset capability has an inline, self-contained guideline in +[references/visual-asset-adapters.md](references/visual-asset-adapters.md) (icon search, web image +search, raster→SVG, text→infographic, NotebookLM bridge). Run the snippet in an ephemeral scratch +file or terminal at request time, then feed the resulting local asset path into `layout_tree.objects`. + +## Workflow + +1. Choose the asset type: icon, image, SVG, or generated infographic. +2. Run the matching guideline snippet from the reference (do not save it into the skill). +3. Add the asset to `layout_tree.objects` with final `bbox`, `z_index`, `content.alt`, and `classification`. +4. Recheck layering so assets do not cover readable text. 
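
Steps 3 and 4 above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the skill: the skill names only `bbox`, `z_index`, `content.path`, `content.alt`, and `classification`, so the `id` and `kind` fields, the `x`/`y`/`width`/`height` bbox keys (inches), the `"decorative"` classification value, the asset path, and the helper names `overlaps` and `assets_covering_text` are all hypothetical.

```python
from typing import Any


def overlaps(a: dict[str, float], b: dict[str, float]) -> bool:
    """Return True when two bboxes share a non-zero area."""
    return (
        a["x"] < b["x"] + b["width"]
        and b["x"] < a["x"] + a["width"]
        and a["y"] < b["y"] + b["height"]
        and b["y"] < a["y"] + a["height"]
    )


def assets_covering_text(objects: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """List (asset_id, text_id) pairs where an asset is layered above overlapping text."""
    texts = [obj for obj in objects if obj.get("kind") == "text"]
    assets = [obj for obj in objects if obj.get("kind") in {"image", "icon", "svg"}]
    return [
        (asset["id"], text["id"])
        for asset in assets
        for text in texts
        if asset.get("z_index", 0) > text.get("z_index", 0)
        and overlaps(asset["bbox"], text["bbox"])
    ]


# Hypothetical contents of one slide's layout_tree.objects.
objects = [
    {
        "id": "title",
        "kind": "text",
        "z_index": 2,
        "bbox": {"x": 0.5, "y": 0.4, "width": 6.0, "height": 1.0},
        "content": {"text": "Quarterly outlook"},
    },
    {
        "id": "hero",
        "kind": "image",
        "z_index": 3,
        "classification": "decorative",  # assumed value
        "bbox": {"x": 5.0, "y": 0.5, "width": 4.0, "height": 3.0},
        "content": {"path": "assets/hero.png", "alt": "Team reviewing a chart"},
    },
]

conflicts = assets_covering_text(objects)
```

A non-empty `conflicts` list suggests either moving the asset's `bbox` or lowering its `z_index` before the spec is finalized.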
+ +If [references/visual-asset-adapters.md](references/visual-asset-adapters.md) cannot be read — or the specific guideline section required for the current asset type is absent or unreadable — halt that asset's acquisition, report what is missing to the user, and do not attempt to reconstruct guideline snippets from memory. + +When replacing an existing asset (for example, swapping an icon for an infographic), remove the existing object from `layout_tree.objects` before inserting the replacement. Remove superseded objects rather than hiding them; the only `hidden: true` objects that should remain are intentional reference copies defined by these guidelines (such as an SVG trace alongside its visible raster — see **SVG and Raster Handling**). + +## Icons + +- Use simple, single-color SVG icons that match the theme accent. +- Icons must always appear alongside a visible text label or caption; do not use an icon as the sole representation of a concept that the slide outline specifies in words. +- Acquire icons with the **Icon Search** guideline (see [references/visual-asset-adapters.md](references/visual-asset-adapters.md)) and store local paths. If the search returns multiple candidates, prefer the one with the fewest total shape elements (``, ``, ``, ``, ``, ``) that still clearly represents the concept, then the closest match to the theme accent color; if candidates are otherwise equal, select the first result. If the Icon Search returns no candidates, omit the icon object and note in the slide outline that a suitable icon could not be found for the concept. + +## Images + +- Use local file paths in image objects: `content.path` plus `content.alt`. +- Give images an explicit `bbox`. Place a short caption of 10 words or fewer in the space adjacent to the image bbox; do not overlay text on the image itself. +- Match the `bbox` aspect ratio to the image's native aspect ratio (fit, or crop-to-fill); a mismatched bbox stretches the image. 
Keep captions in adjacent space, not overlaid on the image. +- Acquire candidates with the **Web Image Search** guideline (see [references/visual-asset-adapters.md](references/visual-asset-adapters.md)). Only select images confirmed royalty-free or licensed for commercial reuse. If a candidate's license status cannot be confirmed (license information is missing, ambiguous, or the source does not clearly state reuse terms), treat it as not meeting the bar — do not infer or assume a permissive license. If no candidate meets this bar, omit the image object and note in the slide outline that a licensed asset is needed. +- Never insert placeholder or stand-in graphics (e.g., dummy boxes or "image here" art) as a substitute for a real asset; when no approved asset is available, omit the image object. + +## SVG and Raster Handling + +- Convert raster art with the **Raster → SVG** guideline (see [references/visual-asset-adapters.md](references/visual-asset-adapters.md)). +- Keep SVGs suitable for PowerPoint editing: each individual SVG file must contain fewer than 50 total shape elements (``, ``, ``, ``, ``, ``) counted across the whole file, avoid embedded raster data, avoid clip-paths, and use only named fill colors rather than gradients. +- Vector tracing raster infographics can lose or distort text. Keep the original raster as the visible slide asset whenever the source image contains any text, labels, numbers, or legend entries that must remain legible, and add any traced SVG as a `hidden: true` object on the same slide for editability/reference. +- If a converted SVG violates any of the above constraints (element count, embedded raster data, clip-paths, or gradients), do not use it as a visible slide asset: keep the original raster as the visible object, add the non-compliant SVG as a `hidden: true` reference object, and note the specific violation in the slide outline. 
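
The SVG constraints above can be checked mechanically before an SVG is made visible. The following is a minimal sketch using only the standard library. `svg_violations` and `SHAPE_TAGS` are hypothetical names, and the shape set (the SVG basic shapes plus `path`) is an assumption, because the element names are not legible in the list above. It detects gradients by their definition elements and clip-paths by element or attribute, so clip-paths declared inside `style` strings are not caught.

```python
import xml.etree.ElementTree as ET

# Assumed shape set: SVG basic shapes plus <path>; adjust to the intended list.
SHAPE_TAGS = {"path", "rect", "circle", "ellipse", "line", "polyline", "polygon"}
MAX_SHAPES = 50  # the guideline requires fewer than 50 shape elements per file


def _local(tag):
    """Strip an XML namespace such as '{http://www.w3.org/2000/svg}rect'."""
    return tag.rsplit("}", 1)[-1]


def svg_violations(svg_text):
    """Return the violated constraints; an empty list means compliant."""
    root = ET.fromstring(svg_text)
    elements = list(root.iter())
    tags = [_local(el.tag) for el in elements]
    violations = []
    shape_count = sum(1 for tag in tags if tag in SHAPE_TAGS)
    if shape_count >= MAX_SHAPES:
        violations.append(f"element count: {shape_count} shapes")
    if "image" in tags:
        violations.append("embedded raster data")
    if "clipPath" in tags or any("clip-path" in el.attrib for el in elements):
        violations.append("clip-paths")
    if "linearGradient" in tags or "radialGradient" in tags:
        violations.append("gradients")
    return violations
```

A non-empty result maps directly onto the last rule: keep the raster visible, add the SVG as a `hidden: true` reference object, and copy the returned strings into the slide outline note.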
+ +## Infographics (External Provider) + +Infographic generation runs through a user-managed provider; the skill bundles no generation scripts. Follow the **Text → Infographic** and optional **NotebookLM** guidelines in [references/visual-asset-adapters.md](references/visual-asset-adapters.md). + +**Infographic generation sequence (follow in order):** + +1. **Collect required inputs** via the VS Code prompt input dialog (`vscode_askQuestions` or equivalent): provider, model or deployment, prompt, image size, and output path. + - On failure (dialog unavailable, or returns without all required fields): do not proceed. List the missing fields in chat, ask the user to supply them, then retry from this step. +2. **Verify provider access.** Generate only through user-managed providers (e.g., OpenAI or Azure OpenAI). The optional NotebookLM bridge (see [references/visual-asset-adapters.md](references/visual-asset-adapters.md)) is available only when the user has configured it. + - On failure (no provider access): if a NotebookLM bridge is configured (it needs no image-generation provider), use it; otherwise omit the asset and tell the user that infographic generation requires a configured provider. Never silently switch to a model provider the user has not configured, including any on-device or local model. +3. **Never request secrets in chat.** Do not ask the user to paste API keys, tokens, or connection strings into chat or the prompt dialog. If Entra auth is preferred, tell them to run `az login` in a terminal. +4. **Call the provider** to generate the infographic. + - On failure (call errors or returns no usable output): record the attempt manifest as failed (step 5) and omit the infographic object — never place a local placeholder or stub. If that manifest write also fails, follow step 5's halt rule and report both the provider failure and the write failure in chat. +5. 
**Record an attempt manifest** beside the image output with provider, model or deployment, auth mode, prompt path, output path, success status, and error details. + - On failure (manifest cannot be written, e.g., output directory missing or write permission denied): report the write failure and its reason in chat and halt; do not place the asset or skip manifest recording silently. +6. **Place the output.** Prefer the raster output as the visible slide asset. Add any raster-to-SVG vector trace as a `hidden: true` object on the same slide rather than replacing the visible infographic. + +## Asset Placement + +- Put decorative asset containers in `layout_tree.objects` with `classification: "layout_design"`. +- Put meaningful icons, diagrams, images, and infographics in `layout_tree.objects` with `classification: "content"`. +- Every asset object needs final inch-based `bbox` coordinates and a deliberate `z_index`. +- Validate that every `bbox` lies within the slide boundary, from (0, 0) to (slide_width, slide_height) in inches. If a computed `bbox` falls outside this range, clip it to the boundary and note the adjustment in the slide outline. If `slide_width` or `slide_height` cannot be read from the current `layout_tree`, assume standard widescreen dimensions (13.33 × 7.5 inches) and note the assumption in the slide outline. Do not proceed with bbox validation using unresolved dimensions. +- If clipping produces a `bbox` with zero or negative width or height, omit the asset object, note the degenerate bbox in the slide outline, and flag the placement conflict to the user. +- Use `z_index` so assets do not cover readable text. +- When overlapping assets include a `layout_design` (decorative) asset, give it the lower `z_index` so it never covers a `content` asset or readable text. +- When two `content` assets overlap, give the lower `z_index` to whichever is secondary or illustrative (e.g., a background texture or a supplementary icon). 
If neither is clearly secondary, keep readable text uncovered and flag the overlap to the user for a final decision rather than silently finalizing the order. \ No newline at end of file diff --git a/skills/pptify-visual-assets/references/visual-asset-adapters.md b/skills/pptify-visual-assets/references/visual-asset-adapters.md new file mode 100644 index 000000000..d9c888b1b --- /dev/null +++ b/skills/pptify-visual-assets/references/visual-asset-adapters.md @@ -0,0 +1,162 @@ +# Visual Asset Guidelines + +This skill cannot bundle helper scripts, so these guidelines show how to **perform each visual-asset +ability inline** using public APIs and short, self-contained snippets you run at request time +(scratch cell or terminal). + +How to use a guideline: + +1. Pick the ability you need below. +2. Run the inline snippet (adjust inputs) in an ephemeral scratch file or terminal — do not save it + into the skill, since the skill keeps only `references/`. +3. Place the returned local asset path into `layout_tree.objects` with `content.path`, `content.alt`, + `bbox`, `z_index`, and `classification`. + +Shared rules: + +- Always return a **local file path** for any placed asset, plus `content.alt`. +- Record provenance (`source`, `license`, `provider`, `model`) for audits. +- On failure, write a failure manifest; never substitute a placeholder and call it generated. +- Never request secrets in chat or a prompt dialog. For cloud auth use `.env` or `az login`. + +--- + +## 1. Icon Search + +Use the public Iconify API — no key required. 
+ +- Search: `https://api.iconify.design/search?query=<query>&limit=<limit>` +- Download SVG: `https://api.iconify.design/<prefix>/<name>.svg?color=%23<hex>` + +```python +import json, urllib.parse, urllib.request +from pathlib import Path + +def icon_search(query, limit=8, prefix=None, color=None, out_dir="assets/icons"): + q = urllib.parse.quote(query) + url = f"https://api.iconify.design/search?query={q}&limit={limit}" + if prefix: + url += f"&prefix={urllib.parse.quote(prefix)}" + data = json.load(urllib.request.urlopen(url, timeout=15)) + Path(out_dir).mkdir(parents=True, exist_ok=True) + results = [] + for icon_id in data.get("icons", []): + pfx, name = icon_id.split(":", 1) + svg_url = f"https://api.iconify.design/{pfx}/{name}.svg" + if color: + svg_url += f"?color=%23{color}" + svg_path = Path(out_dir) / f"{pfx}_{name}.svg" + svg_path.write_bytes(urllib.request.urlopen(svg_url, timeout=15).read()) + results.append({"id": icon_id, "svg_path": str(svg_path), "license": "per-set (see iconify.design)"}) + return {"query": query, "results": results} +``` + +- Prefer simple, single-color icons matching the theme accent. +- Use icons as supporting cues, not replacements for required text. + +--- + +## 2. Web Image Search + +Prefer the VS Code fetch tools (`fetch_webpage`) or an MCP image-search tool you have available. +When you already have a direct image URL (from search results or the user), download it locally: + +```python +import urllib.request +from pathlib import Path + +def download_image(url, out_path="assets/images/img1.jpg"): + Path(out_path).parent.mkdir(parents=True, exist_ok=True) + req = urllib.request.Request(url, headers={"User-Agent": "pptify/1.0"}) + Path(out_path).write_bytes(urllib.request.urlopen(req, timeout=20).read()) + return {"url": url, "local_path": out_path} +``` + +- Capture `source` and `license` for attribution; verify usage rights before placing. +- Reference the saved file via `content.path`; do not hotlink remote URLs into the deck. 
+- Do not use image placeholders as fallback assets; select an approved asset or omit the object. + +--- + +## 3. Raster → SVG + +**Wrap mode (default, no dependencies)**: embed the raster inside an SVG so it can live in an +editable container: + +```python +import base64, mimetypes +from pathlib import Path + +def raster_to_svg_wrap(source, output_path): + src = Path(source) + mime = mimetypes.guess_type(src.name)[0] or "image/png" + b64 = base64.b64encode(src.read_bytes()).decode() + # Assumed minimal markup: one data-URI <image> inside a bare <svg>; add width/height as needed. + svg = ( + '<svg xmlns="http://www.w3.org/2000/svg">' + f'<image href="data:{mime};base64,{b64}"/></svg>' + ) + Path(output_path).write_text(svg, encoding="utf-8") + return {"source": source, "output_path": output_path, "mode": "wrap", "status": "ok"} +``` + +**Vector-trace mode (optional)**: use only when a true vector result is needed and `vtracer` is +available in the environment: + +```python +# pip/uv add vtracer first; tracing can distort text. +import vtracer +vtracer.convert_image_to_svg_py("logo.png", "logo.svg") +``` + +- Vector tracing can lose or distort text. Keep the original raster on visible slides when text + fidelity matters, and place any traced SVG on a separate hidden appendix slide for reference. + +--- + +## 4. Text → Infographic + +Generate through a user-managed provider (OpenAI or Azure OpenAI). Read credentials from the environment; +never accept secrets via chat. 
+ +```python +import base64, json, os +from pathlib import Path +from openai import OpenAI, AzureOpenAI # provided by the user's environment + +def text_to_infographic(prompt, output_path, provider="openai", + model_or_deployment="gpt-image-1", size="1024x1024"): + manifest = {"provider": provider, "model_or_deployment": model_or_deployment, + "output_path": output_path} + try: + if provider == "azure-openai": + client = AzureOpenAI( + azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"], + api_key=os.environ.get("AZURE_OPENAI_API_KEY"), + api_version=os.environ.get("AZURE_OPENAI_API_VERSION", "2024-02-01"), + ) + else: + client = OpenAI(api_key=os.environ["OPENAI_API_KEY"]) + result = client.images.generate(model=model_or_deployment, prompt=prompt, size=size) + Path(output_path).parent.mkdir(parents=True, exist_ok=True) + Path(output_path).write_bytes(base64.b64decode(result.data[0].b64_json)) + manifest["status"] = "ok" + except Exception as exc: # report, never fake-generate + manifest.update(status="error", error=str(exc)) + Path(output_path).with_suffix(".manifest.json").write_text(json.dumps(manifest, indent=2)) + return manifest +``` + +- Collect missing values via `vscode_askQuestions`: provider, prompt, model/deployment, size, output path. +- Use `.env` or `az login` for auth; never ask for keys/tokens in chat or the dialog. +- Prefer the raster output as the visible slide asset; add any vector trace as `hidden: true`. + +--- + +## 5. NotebookLM Infographic Bridge + +NotebookLM has no public generation API, so treat this as an optional, user-configured bridge. + +- If the user has a NotebookLM/MCP bridge tool configured, call it with `source_refs` + `prompt`, + then save the returned image locally and record provenance. +- If no bridge is configured, **fall back to Text → Infographic (section 4)** or omit the asset. +- Apply the same provenance and failure-manifest rules as the other generation guidelines.