Markdown Rendering Pipeline: From MDX to HTML

Technical Architecture (Updated May 29, 2026)

Challenges in Markdown Rendering

ToolsKu’s blog and tutorial system must turn MDX/Markdown into safe HTML. It looks simple, but involves:

  • Syntax extensions: GFM tables, task lists, strikethrough
  • Sanitization: XSS prevention for untrusted content
  • Code highlighting: syntax-colored <pre><code> blocks
  • Heading anchors: auto slugs for table-of-contents links
  • Internal links: paths such as /pdf/merge are resolved with a locale prefix at build time
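The internal-link step can be pictured with a small sketch. It is not ToolsKu's actual code: the function name localizeHref, the SUPPORTED_LOCALES list, and the resulting URL shape (/ja/pdf/merge) are assumptions made for illustration.

```typescript
// Hypothetical locale list; the codes mirror the per-locale content files.
const SUPPORTED_LOCALES = ['zh-CN', 'zh-TW', 'en', 'ja'] as const;
type Locale = (typeof SUPPORTED_LOCALES)[number];

// Prefix a root-relative internal path with the locale (assumed URL shape).
function localizeHref(href: string, locale: Locale): string {
  // External URLs, protocol-relative URLs, anchors, and relative paths pass through.
  if (!href.startsWith('/') || href.startsWith('//')) return href;
  // Paths that already start with a known locale are left unchanged.
  const firstSegment = href.split(/[/?#]/)[1] ?? '';
  if ((SUPPORTED_LOCALES as readonly string[]).includes(firstSegment)) return href;
  return `/${locale}${href}`;
}
```

A rehype plugin could call such a helper on every a element's href while visiting the tree, so the rewrite happens once at build time rather than in the browser.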

unified Pipeline

MDX/Markdown source
     ↓ remark-parse        (Markdown → MDAST)
     ↓ remark-gfm          (GFM extensions)
     ↓ remark-rehype        (MDAST → HAST)
     ↓ rehype-slug          (ids on headings)
     ↓ rehype-stringify     (HAST → HTML)
HTML output

ToolsKu implements this in src/lib/blog/render-mdx.ts:

import { unified } from 'unified';
import remarkParse from 'remark-parse';
import remarkGfm from 'remark-gfm';
import remarkRehype from 'remark-rehype';
import rehypeSlug from 'rehype-slug';
import rehypeStringify from 'rehype-stringify';

const processor = unified()
  .use(remarkParse)
  .use(remarkGfm)
  .use(remarkRehype, { allowDangerousHtml: false })
  .use(rehypeSlug)
  .use(rehypeStringify);
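To make the rehype-slug step in this chain more concrete, here is a dependency-free sketch of how heading text might be turned into unique ids. It is a simplified approximation written for this article, not the algorithm rehype-slug actually ships, and createSlugger is a hypothetical name.

```typescript
// Simplified slugger: lowercases, strips punctuation, hyphenates whitespace,
// and appends -1, -2, ... when the same heading text appears again.
function createSlugger(): (text: string) => string {
  const seen = new Map<string, number>();
  return (text: string): string => {
    const base = text
      .trim()
      .toLowerCase()
      .replace(/[^\p{L}\p{N}\s_-]/gu, '') // keep letters and digits of any script
      .replace(/\s/g, '-');
    const count = seen.get(base) ?? 0;
    seen.set(base, count + 1);
    return count === 0 ? base : `${base}-${count}`;
  };
}

const slug = createSlugger();
console.log(slug('Frontmatter Parsing')); // frontmatter-parsing
console.log(slug('Frontmatter Parsing')); // frontmatter-parsing-1
```

Keeping the counter inside a closure means each rendered document gets its own slugger, so duplicate headings in one article do not affect ids in another.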

Frontmatter Parsing

Article metadata is extracted from the MDX header with gray-matter:

---
title: "Article title"
description: "SEO description"
author: "Zhang (ToolsKu Founder)"
---

Body content...

import matter from 'gray-matter';

const { data: frontmatter, content } = matter(rawMdx);
// frontmatter.title → "Article title"
// content → body Markdown

Frontmatter feeds SEO metadata, Open Graph tags, and JSON-LD structured data.
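As a rough illustration of the JSON-LD part, the sketch below maps the three frontmatter fields shown earlier onto a schema.org BlogPosting object. The Frontmatter interface, the buildArticleJsonLd name, the choice of BlogPosting, and the url parameter are assumptions, not ToolsKu's implementation.

```typescript
// Shape of the frontmatter fields used in the example header.
interface Frontmatter {
  title: string;
  description: string;
  author: string;
}

// Serialize frontmatter as a JSON-LD string for a <script type="application/ld+json"> tag.
function buildArticleJsonLd(fm: Frontmatter, url: string): string {
  const data = {
    '@context': 'https://schema.org',
    '@type': 'BlogPosting',
    headline: fm.title,
    description: fm.description,
    author: { '@type': 'Person', name: fm.author },
    url,
  };
  // Escape "<" so a value containing "</script>" cannot close the surrounding tag.
  return JSON.stringify(data).replace(/</g, '\\u003c');
}
```

Escaping the "<" character matters here because frontmatter values end up inside an inline script element, where plain JSON.stringify output is not safe on its own.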


Security

Block raw HTML injection

.use(remarkRehype, { allowDangerousHtml: false })

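With false (the default), raw HTML in the Markdown source, such as a <script> tag, is dropped when the tree is converted to HAST, so it never reaches the output. This setting only covers raw HTML: it does not filter URLs such as javascript: links, so fully untrusted content also needs a sanitizer such as rehype-sanitize.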

External links should get rel="noopener noreferrer" and target="_blank":

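import { visit } from 'unist-util-visit';

// Hypothetical helper (not shown in the original): treat absolute http(s) URLs as external
const isExternal = (href) => typeof href === 'string' && /^https?:\/\//i.test(href);

// rehype plugin: add safe attrs to external links
function rehypeExternalLinks() {
  return (tree) => {
    // Walk every element node in the HAST tree
    visit(tree, 'element', (node) => {
      if (node.tagName === 'a' && isExternal(node.properties.href)) {
        node.properties.target = '_blank';
        node.properties.rel = 'noopener noreferrer';
      }
    });
  };
}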
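One possible host-aware version of the external-link test is sketched below. It assumes a placeholder site origin (https://example.com) and a hypothetical function name isExternalHref; neither comes from ToolsKu's code.

```typescript
// Placeholder origin for illustration only; a real site would use its own.
const SITE_ORIGIN = 'https://example.com';

// Returns true only for http(s) links that point to a different origin.
function isExternalHref(href: unknown, siteOrigin: string = SITE_ORIGIN): boolean {
  if (typeof href !== 'string') return false;
  let url: URL;
  try {
    // Relative and root-relative hrefs resolve against the site origin.
    url = new URL(href, siteOrigin);
  } catch {
    return false; // unparseable href: leave the link untouched
  }
  const isWeb = url.protocol === 'http:' || url.protocol === 'https:';
  return isWeb && url.origin !== new URL(siteOrigin).origin;
}
```

Comparing origins rather than matching on an "http" prefix keeps absolute links back to the same site from opening in a new tab, and it leaves mailto: and anchor links alone.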

Multilingual Content Layout

ToolsKu content files:

src/content/blog/my-article/
  ├── meta.json          ← shared metadata (category, tags, date)
  ├── zh-CN.mdx          ← Simplified Chinese body
  ├── zh-TW.mdx          ← Traditional Chinese body
  ├── en.mdx             ← English body
  └── ja.mdx             ← Japanese body

meta.json holds locale-agnostic data; *.mdx holds per-locale body and frontmatter. At build time, generateStaticParams enumerates all slug × locale pairs for pre-rendering.
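The slug × locale enumeration amounts to a cross product. The sketch below shows the idea as a pure function; the buildStaticParams name and the locale/slug parameter keys are assumptions (they would have to match the dynamic segments of the actual route), and a real generateStaticParams would read the slugs from src/content/blog instead of receiving them as an argument.

```typescript
// Locale codes taken from the content layout above.
const LOCALES = ['zh-CN', 'zh-TW', 'en', 'ja'] as const;

interface BlogParams {
  locale: string;
  slug: string;
}

// Build one params object per slug × locale pair.
function buildStaticParams(
  slugs: readonly string[],
  locales: readonly string[] = LOCALES,
): BlogParams[] {
  return slugs.flatMap((slug) => locales.map((locale) => ({ locale, slug })));
}

// Two articles across four locales yield eight pages to pre-render.
console.log(buildStaticParams(['my-article', 'another-article']).length); // 8
```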


Performance

Build-time vs runtime rendering

ToolsKu uses build-time rendering (SSG):

next build → read MDX → unified → HTML written to out/
runtime → serve static HTML (zero Markdown cost)

Compared to per-request Markdown processing, SSG cuts TTFB from ~200ms to ~50ms.

Incremental content updates

New or edited posts require a full next build. With 18+ articles × 4 locales (at least 72 pages), a full build takes about 3–5 minutes.


Relation to MDX Components

ToolsKu currently uses plain Markdown (not React-component MDX), which means:

  • Simple pipeline, no React SSR overhead for content
  • Files editable in any Markdown editor
  • Translators work with plain text only, with no component markup to preserve

To embed interactive components (e.g. live demos) later, upgrade to @next/mdx + React components.


Summary

The Markdown pipeline is core infrastructure for content sites. remark/rehype provides a composable processor chain; gray-matter handles metadata; build-time rendering maximizes performance. ToolsKu’s blog is a practical application of this stack.
