fix: add divider row to markdown tables with auto-generated integer headers #95

Open
vibeyclaw wants to merge 1 commit into alphanome-ai:main from vibeyclaw:fix/table-to-markdown-missing-divider

@vibeyclaw

Problem

Fixes #90

When an HTML table has no semantic <thead>, pd.read_html() assigns integer column names (0, 1, 2, ...). The previous code stripped both the integer header row and the separator row, producing output like:

| Revenue | 2024 |
| 5.2B    | 2023 |
| 4.8B    | 2022 |

This is not a valid Markdown table: without a divider row beneath the header, GitHub, Obsidian, and other Markdown renderers show the lines as plain text instead of rendering a table.
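The missing-divider condition can be checked mechanically. The following is a minimal sketch, not code from this PR: `has_divider_row` and `DELIMITER_ROW` are hypothetical names, and the regular expression only approximates the delimiter-row rule of GitHub-Flavored Markdown.

```python
import re

# Approximation of a GFM delimiter row: cells made only of hyphens, with
# optional alignment colons, separated by pipes. (Hypothetical helper.)
DELIMITER_ROW = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def has_divider_row(markdown_table: str) -> bool:
    """Return True if the second non-empty line looks like a delimiter row."""
    lines = [ln for ln in markdown_table.splitlines() if ln.strip()]
    return len(lines) >= 2 and bool(DELIMITER_ROW.match(lines[1]))


broken = "| Revenue | 2024 |\n| 5.2B    | 2023 |\n| 4.8B    | 2022 |"
print(has_divider_row(broken))  # the second line is a data row, not a divider
```

A check like this could serve as a regression test for the converter, assuming the converter returns the table as a single string.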

Fix

After stripping the auto-generated integer headers, promote the first data row to serve as the visual header and insert a generated --- divider beneath it:

| Revenue | 2024 |
|---------|------|
| 5.2B    | 2023 |
| 4.8B    | 2022 |

The elif branch (for tables with meaningful column names) is unchanged, since those tables already produce a valid divider.
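The promotion step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code changed in this PR: `promote_first_row_to_header` is a hypothetical name, the input is assumed to be a pipe-delimited table whose first two lines are the integer header row and its separator, and escaped pipes inside cells are not handled.

```python
def promote_first_row_to_header(markdown_table: str) -> str:
    """Drop the integer header and its separator, then insert a generated
    divider under the first data row. (Hypothetical helper.)"""
    lines = [ln for ln in markdown_table.strip().splitlines() if ln.strip()]
    body = lines[2:]  # skip the integer header row and its separator row
    if not body:
        return ""
    header = body[0].strip()
    # Split on the pipes between the outer delimiters; each cell keeps its
    # padding so the divider matches the column widths.
    cells = header[1:-1].split("|")
    divider = "|" + "|".join("-" * max(3, len(cell)) for cell in cells) + "|"
    return "\n".join([header, divider] + [ln.strip() for ln in body[1:]])


raw = "\n".join([
    "|   0     |   1  |",
    "|---------|------|",
    "| Revenue | 2024 |",
    "| 5.2B    | 2023 |",
    "| 4.8B    | 2022 |",
])
print(promote_first_row_to_header(raw))
```

Generating the divider from the promoted row's cell widths, rather than reusing the stripped separator, keeps the output aligned even if the integer header was narrower than the data.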

Root Cause

Many SEC filing tables use <td> for both header and data rows without a <thead> element, so pandas cannot distinguish the visual header from data. The fix preserves the original intent of stripping integer placeholders while ensuring the output is always valid Markdown.
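To illustrate why such tables carry no header signal, here is a small standard-library sketch that looks for <thead> or <th> elements in a table's HTML. The names `HeaderSignalFinder` and `has_semantic_header` are hypothetical, and the check is only a heuristic; it does not reproduce how pandas decides on column names.

```python
from html.parser import HTMLParser


class HeaderSignalFinder(HTMLParser):
    """Record whether any <thead> or <th> start tag appears. (Hypothetical.)"""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in ("thead", "th"):
            self.found = True


def has_semantic_header(html: str) -> bool:
    finder = HeaderSignalFinder()
    finder.feed(html)
    return finder.found


# A table shaped like the ones described above: every cell is a <td>.
td_only = (
    "<table>"
    "<tr><td>Revenue</td><td>2024</td></tr>"
    "<tr><td>5.2B</td><td>2023</td></tr>"
    "</table>"
)
print(has_semantic_header(td_only))
```

For the <td>-only table the markup gives a parser nothing to distinguish the visual header row from the data rows.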

…eaders

When an HTML table has no semantic <thead>, pandas auto-assigns integer
column names (0, 1, 2, ...).  The previous code stripped both the integer
header row and the separator row, producing invalid Markdown (no divider).

Fix: after stripping the integer header row, treat the first data row as
the visual header and insert a generated '---' divider beneath it, so the
output is always a valid GitHub-Flavored Markdown table.

Fixes alphanome-ai#90

Development

Successfully merging this pull request may close these issues.

Converted markdown table has no divider row below the column header, making it an invalid markdown table
