Alternative: render from_markdown via markdown-it-py by MattFisher · Pull Request #61 · ma2za/python-substack

MattFisher · 2026-06-26T02:37:51Z

What this is

An alternative implementation of Post.from_markdown() that delegates parsing to markdown-it-py (a CommonMark parser) plus the standard footnote plugin, with a small renderer (mdrender.py) that maps the syntax tree onto Substack's node schema. Node construction is centralised in a new nodes.py module so the (undocumented) schema lives in one place.

Why I'm opening it

First off — this approach wasn't discussed beforehand, and I realise it's a bigger change than a typical PR, so I'm putting it up purely for your consideration; entirely your call on whether it's a direction you want to take.

I started out adding footnote support to the existing hand-rolled parser (#56). As that grew — footnotes, fenced/inline-code edge cases, multi-paragraph definitions — it started to feel like we were re-implementing a Markdown parser. So I prototyped this alternative to see how it compared, and it turned out significantly simpler:

- from_markdown() drops from a ~270-line hand-rolled parser to a few lines delegating to the renderer (net post.py ≈ −215 lines).
- Footnotes come essentially for free from the footnote plugin (including multi-paragraph definitions), rather than via bespoke pre-parse text extraction.
- A real CommonMark parser handles nested structures and edge cases according to the spec, and removes the fragile overlapping-regex inline parsing.

Trade-offs (flagging honestly)

- Adds two runtime dependencies: markdown-it-py and mdit-py-plugins (both widely used and well maintained). This is the main thing to weigh.
- Two intentional, CommonMark-correct behaviour changes vs the old parser (tests updated to match, with comments):
  - Consecutive > lines are one paragraph; blank > lines split paragraphs (standard CommonMark). The old parser made one paragraph per line.
  - Footnote definitions that are never referenced are dropped (not appended to the end).
- parse_inline() / tokens_to_text_nodes() remain as public helpers (still used by the manual footnote() builder), but from_markdown() no longer relies on them.

Tests

- All existing from_markdown/parse_inline tests pass (the two semantic-difference tests above were updated).
- Added test_from_markdown_features.py with end-to-end coverage of every feature listed in the from_markdown() docstring (headings 1–6, bold/italic/bold+italic/inline-code/strikethrough, links, images, linked images, code blocks with/without language, blockquotes, bullet/ordered lists, horizontal rules, paragraphs).
- Footnote tests include references/definitions, named labels, numbering, multi-paragraph definitions, and code-span safety.
- 86 tests pass in total (excluding the pre-existing live-API tests that require credentials).

Relationship to #56

This is an alternative to #56 — if you'd prefer this approach, #56 can be closed in its favour; if not, no harm done and #56 stands on its own.

Replace the hand-rolled Markdown parser in from_markdown() with markdown-it-py plus the standard footnote plugin, and a small renderer (mdrender.py) that maps the syntax tree to Substack's node schema. Node construction is centralised in a new nodes.py module so the schema lives in one place. Footnotes (including multi-paragraph definitions) come from the footnote plugin. Adds end-to-end from_markdown feature tests covering every documented feature. Two intentional, CommonMark-correct behaviour changes vs the old parser: consecutive '>' lines are one paragraph (blank '>' lines split them), and unreferenced footnote definitions are dropped rather than appended.

MattFisher mentioned this pull request Jun 26, 2026

Add Markdown footnote support to from_markdown #56

Open

MattFisher changed the title ~~Alternative: render from_markdown via markdown-it-py (simpler, footnotes for free)~~ Alternative: render from_markdown via markdown-it-py Jun 26, 2026

MattFisher force-pushed the markdown-it-from-markdown branch from f257c26 to 97ccc9e June 26, 2026 02:52

MattFisher wants to merge 1 commit into ma2za:main from MattFisher:markdown-it-from-markdown
