Collaborator

@SAHIL-Sharma21 SAHIL-Sharma21 commented Dec 25, 2025

✨ What’s new

This PR adds Markdown (.md) output support using Microsoft MarkItDown as a new CLI-backed converter.

Supported conversions:

  • PDF → Markdown
  • DOCX → Markdown
  • PPTX → Markdown
  • HTML → Markdown

🤔 Why MarkItDown

MarkItDown provides:

  • High-quality, semantic Markdown output
  • Better structure preservation (headings, lists, links)
  • Automatic input type detection
  • A CLI interface that aligns well with ConvertX’s architecture

This makes it a strong alternative to existing document converters for Markdown output.
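For readers unfamiliar with the "CLI-backed converter" pattern, here is a minimal sketch of how such a wrapper might look. It is not the code from this PR: the names `convertToMarkdown`, `buildMarkitdownArgs`, and `ExecFileFn` are hypothetical, and the only assumption about MarkItDown itself is the `markitdown <input> -o <output>` invocation documented in its README. The process runner is injectable so the wrapper can be exercised without the CLI installed.

```typescript
import { execFile as nodeExecFile } from "node:child_process";

// Hypothetical minimal signature for an injectable process runner.
type ExecFileFn = (
  cmd: string,
  args: string[],
  callback: (error: Error | null, stdout: string, stderr: string) => void,
) => void;

// Builds the argument list for `markitdown <input> -o <output>`.
function buildMarkitdownArgs(filePath: string, targetPath: string): string[] {
  return [filePath, "-o", targetPath];
}

// Runs the MarkItDown CLI and resolves once the output file has been written.
// Assumes `markitdown` is available in PATH.
function convertToMarkdown(
  filePath: string,
  targetPath: string,
  execFile: ExecFileFn = nodeExecFile as unknown as ExecFileFn,
): Promise<string> {
  return new Promise((resolve, reject) => {
    execFile(
      "markitdown",
      buildMarkitdownArgs(filePath, targetPath),
      (error, _stdout, stderr) => {
        if (error) {
          reject(new Error(`markitdown failed: ${error.message}`));
          return;
        }
        // Surface warnings from the CLI without failing the conversion.
        if (stderr) console.error(`markitdown stderr: ${stderr}`);
        resolve("Done");
      },
    );
  });
}
```

A caller would use it as `await convertToMarkdown("report.pdf", "report.md")`. Passing arguments as an array to `execFile`, rather than building a shell string, avoids quoting problems with file names that contain spaces.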


Summary by cubic

Adds Markdown output via Microsoft MarkItDown, integrated as a new CLI-backed converter. Converts PDF, DOCX, PPTX, HTML, and Excel files to .md with better structure preservation.

  • New Features

    • Added MarkItDown converter and registered it in the main converters map.
    • Inputs: pdf, docx, pptx, html, excel; Output: md.
    • Uses the MarkItDown CLI for structured, semantic Markdown.
  • Migration

    • Docker image now installs MarkItDown via pipx; markitdown is available in PATH.
    • For local runs, install the MarkItDown CLI and ensure markitdown is in PATH. No config changes required.

Written for commit f2dcc7f. Summary will update automatically on new commits.

@github-actions github-actions bot added Feature and removed Feature labels Dec 25, 2025

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 2 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/converters/main.ts">

<violation number="1" location="src/converters/main.ts:131">
P2: Inconsistent naming convention: all other converter keys use lowercase (e.g., `pandoc`, `libreoffice`, `ffmpeg`), but `MarkitDown` uses PascalCase. This could cause lookup failures if callers pass `markitdown` (following the established pattern). Consider renaming to `markitdown` for consistency.</violation>
</file>
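The lookup failure described in the violation above can be illustrated with a small sketch. The `converters` object and `findConverter` helper here are hypothetical, simplified stand-ins for the map in `src/converters/main.ts`, not the actual ConvertX code; they assume only that converters are looked up by a case-sensitive string key.

```typescript
// Hypothetical, simplified shape of a converter entry.
interface ConverterEntry {
  outputs: string[];
}

// Hypothetical stand-in for the converters map; object keys are case-sensitive.
const converters: Record<string, ConverterEntry> = {
  pandoc: { outputs: ["md"] },
  MarkitDown: { outputs: ["md"] }, // PascalCase key, as flagged in the review
};

// Looks up a converter by its exact key.
function findConverter(name: string): ConverterEntry | undefined {
  return converters[name];
}

const exactMatch = findConverter("MarkitDown"); // found
const lowercase = findConverter("markitdown"); // undefined: key case differs

console.log(exactMatch !== undefined, lowercase !== undefined);
```

Renaming the key to `markitdown` would make both the key and the callers follow the same lowercase convention as the other converters, so no case normalization is needed at lookup time.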

@github-actions github-actions bot added Feature and removed Feature labels Dec 25, 2025
@github-actions github-actions bot added Feature and removed Feature labels Dec 26, 2025
@SAHIL-Sharma21 SAHIL-Sharma21 linked an issue Dec 26, 2025 that may be closed by this pull request
2 tasks
Owner

@C4illin C4illin left a comment

Great work! Feel free to merge when you want :)

@github-actions github-actions bot added Feature and removed Feature labels Dec 27, 2025
@SAHIL-Sharma21 SAHIL-Sharma21 merged commit f2a92aa into main Dec 27, 2025
10 checks passed
@SAHIL-Sharma21 SAHIL-Sharma21 deleted the feat/markitdown-399 branch December 27, 2025 07:29
Development

Successfully merging this pull request may close these issues.

[Converter Request] MarkItDown

3 participants