I’ve used many content formats over the years, and while I love Markdown, I encounter its limitations daily when working on large documentation projects.
In this issue, you’ll look at Markdown and find out why it might not be best suited for technical content, and what else might work instead.
Markdown is everywhere. It’s human-readable, accessible, and has enough syntax to make documents look good on GitHub or a static site. Its ease of use has made it the default choice for developer documentation. I’m currently using Markdown to write this newsletter issue. I love it.
But Markdown’s biggest advantage is also its biggest drawback: It doesn’t describe content like other formats.
Think about how your content is consumed. Your content isn’t just for human readers. Machines also use it. Search engines index it and LLMs parse it, and both work from the well-formed HTML your system publishes. The basic syntax of Markdown emits only a small subset of the semantic tags available in HTML.
IDE integrations can also use your documents. And AI agents rely on the structure to answer developer questions. If you’re just feeding them plain-text Markdown documents to reduce the number of tokens you send, you’re not providing as much context as you could.
Even worse, when you want to reuse your content or syndicate it to another system, you quickly discover that Markdown behaves like a lowest common denominator rather than a source of truth, because not all Markdown flavors are the same.
There are other options you can use that give you more control. But first, let’s take a deeper look at why you should move away from Markdown for serious work.
Markdown is “implicitly typed” content
If you are a developer, you know all about type systems in programming languages. Some languages use implicit typing, in which the compiler or interpreter infers the data type from the value. These languages give you flexibility, but no guarantees. That’s why many developers prefer languages that use explicit typing, where you define data types as you write the code. In those languages, the compiler doesn’t just build your code; it guarantees that specific rules are followed. This is the main reason TypeScript thrives over JavaScript: compile-time guarantees.
Markdown is implicitly typed. This lets you write faster, but without any constraints or guarantees. There is no schema. There is no way to enforce consistency. A heading in one file might be a concept; in another it might be a step. And there is no machine-readable difference between the two.
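To make the typing analogy concrete, here is a minimal Python sketch of the kind of check that explicit structure makes possible. The element names (task, title, steps, step) and the rule being enforced are hypothetical, chosen only for illustration. A plain Markdown file offers nothing comparable to check against.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured sources: element names are illustrative only.
GOOD = "<task><title>Install</title><steps><step>Run the installer.</step></steps></task>"
BAD = "<task><steps></steps></task>"


def check_task(xml_text):
    """Return a list of rule violations for a hypothetical <task> document."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "task":
        problems.append("root element must be <task>")
    if root.find("title") is None:
        problems.append("missing <title>")
    if not root.findall("./steps/step"):
        problems.append("needs at least one <step>")
    return problems


print(check_task(GOOD))  # no violations
print(check_task(BAD))   # two violations
```

A real project would more likely lean on a schema language and an off-the-shelf validator than on hand-written checks like these, but the idea is the same: the rules live outside the writer's head.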
To make things even more complicated, there are several flavors of Markdown, each with its own features and markup. Here are just a few:
You may think you’re writing “Markdown”, but what works in one tool may not render in another. Some Markdown processors allow footnotes, others ignore soft line breaks. And some even require different formatting for code blocks. Inconsistency makes Markdown an unstable base for anything beyond the most basic documentation.
And then there’s MDX, which people often use to extend Markdown to support things it doesn’t:
Here is a typical MDX snippet:
# Install
npm install my-library
The tag wrapping the command is not Markdown at all; it is a React component. Instead of using code blocks, the author chose to create a special component to standardize how all commands are displayed in the documentation.
This works beautifully on their site because their publishing system knows what that component means. But if they try to syndicate this content to another system, it breaks because that system also needs to implement that component. And even if it is supported elsewhere, there is no guarantee that the component is implemented the same way.
MDX shows that even in a Markdown-centric ecosystem, people intuitively add more expressive markup. They know that plain Markdown is not enough. They’re reinventing semantic markup, but in a way that’s custom, brittle, and not portable.
Why semantic markup matters
Semantic markup describes what the content is, not just what it should look like. There's a difference between saying “Here's a bullet with some text” and “Here's a step in a process.” To a human, they may look identical on a page. To a machine or a publishing pipeline, they are completely different.
Web developers have already been through this with HTML. Before HTML5, you mostly had generic containers. HTML5 added elements such as <article>, <section>, and <nav>, and many other elements that describe the content.
Semantic markup matters for two important and related reasons:
- Transformation and reuse: With semantic markup, you can publish the same content to HTML, PDF, ePub, or even plain Markdown. With Markdown as your source, you can't easily move to another format. You can't turn a bullet into a step, or a paragraph into something more specific, without guessing. You can't add context that wasn't there initially, but when you convert a document you can remove what you don't need, and you can choose how to present everything in a consistent manner.
- Machine consumption: LLMs and agents can make better use of structured content. Content marked as a step is unambiguous. A bullet point can be a step, or a note, or just a list item. The machine has to guess. This is why XML was a preferred mechanism for web services for a long time, and why JSON Schema exists.
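The difference between a bullet and a step can be sketched in a few lines of Python. Both inputs below are made up for illustration, and the element names in the structured version are hypothetical rather than taken from any particular standard.

```python
import re
import xml.etree.ElementTree as ET

MARKDOWN = """\
- Install the library.
- This requires a recent runtime.
- Run the tests.
"""

# Hypothetical structured version; the element names are illustrative.
STRUCTURED = """\
<task>
  <step>Install the library.</step>
  <note>This requires a recent runtime.</note>
  <step>Run the tests.</step>
</task>
"""


def markdown_bullets(text):
    """Every bullet looks the same: the syntax carries no type."""
    return re.findall(r"^- (.+)$", text, flags=re.MULTILINE)


def structured_steps(text):
    """Steps are selected by element name, so no guessing is needed."""
    return [el.text for el in ET.fromstring(text).iter("step")]


print(markdown_bullets(MARKDOWN))
print(structured_steps(STRUCTURED))
```

The Markdown version returns three indistinguishable strings, while the structured version returns only the two steps. Any program that wants the steps out of the Markdown has to guess which bullets qualify.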
Let's explore four formats that give you more control over structure than plain Markdown.
reStructuredText
reStructuredText is a plain-text markup language from the Python/Docutils ecosystem that supports directives, roles, and structural semantics. It is the default format used by Sphinx to generate documentation.
Installation
============

.. code-block:: bash

   npm install my-library

.. note::

   This library requires Node.js ≥ 22.

See also :ref:`usage-guide`.

Here you see a code-block directive, an admonition (note), and an explicit cross-reference through the :ref: role. You'll also get support for images, figures, topics, sidebars, pull quotes, epigraphs, and citations.
These all encode semantics, not just presentation.
AsciiDoc
AsciiDoc aims to be human-readable but semantically expressive. It includes attributes, conditional content, include mechanisms, and more.
Here's an example from AsciiDoc:
= Installation
:revnumber: 1.2
:platform: linux
:prev_section: introduction
:next_section: create-project

[source,bash]
----
npm install my-library
----

NOTE: This library requires Node.js ≥ 22.

See <> for examples.

AsciiDoc has native support for document front matter. Attributes like :revnumber: or :platform: let you standardize content.
The double angle brackets (<< and >>) mark a cross-reference.
Like reStructuredText, AsciiDoc supports admonitions like NOTE and WARNING, so you don't need to create your own custom renderer. It also has support for sidebars, and you can add line highlighting and callouts to your code blocks without additional extensions.
And if you're writing technical documentation, there's explicit support for marking up UI elements and keyboard shortcuts.
Using Asciidoctor, you can convert AsciiDoc to other formats, including HTML, PDF, ePub, and DocBook, which you'll see next.
DocBook (XML)
DocBook is an XML-based document model explicitly designed for technical publishing. It expresses hierarchical and semantic structure in tags and attributes, enabling industrial-grade transformations.
Here is an example:
id="install-library">
Installation
npm install my-library
This library requires Node.JS >= 22
linkend="usage-chapter">Usage Guide
Each tag is meaningful. You'll find predefined tags for function names, variables, application names, keyboard shortcuts, UI elements, and more. Being able to mark up the specific product names and terminology you use makes creating glossaries and indexes much easier. DocBook also has tags for defining index terms.
DocBook's rich ecosystem of XSLT stylesheets supports converting to HTML, PDF, man pages, and even Markdown. Using DocBook gives you structure and validation at scale, as long as you use the tags it provides.
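To give a feel for what exporting downward looks like, here is a small Python sketch that flattens a DocBook-style fragment into Markdown. The input is my own illustrative fragment using a handful of DocBook element names (section, title, programlisting, note, para), and the converter handles only those elements. It is a toy stand-in for the XSLT stylesheets, not a description of how they work.

```python
import xml.etree.ElementTree as ET

FENCE = "`" * 3  # Markdown code fence, built here to keep this listing simple

# Illustrative input using a handful of DocBook element names.
SOURCE = """\
<section id="install-library">
  <title>Installation</title>
  <programlisting language="bash">npm install my-library</programlisting>
  <note><para>This library requires Node.js 22 or later.</para></note>
</section>
"""


def to_markdown(xml_text):
    """Flatten a tiny subset of DocBook into Markdown, dropping semantics."""
    root = ET.fromstring(xml_text)
    blocks = []
    for child in root:
        if child.tag == "title":
            blocks.append(f"# {child.text}")
        elif child.tag == "programlisting":
            lang = child.get("language", "")
            blocks.append(f"{FENCE}{lang}\n{child.text}\n{FENCE}")
        elif child.tag == "note":
            text = "".join(child.itertext()).strip()
            blocks.append(f"> **Note:** {text}")
        elif child.tag == "para":
            blocks.append("".join(child.itertext()).strip())
    return "\n\n".join(blocks)


print(to_markdown(SOURCE))
```

Notice what the output loses: the note becomes a blockquote with bold text, and nothing in the Markdown says it was ever a note. Going in this direction is a few lines of code. Going back requires guessing.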
Then there is DITA.
DITA (Darwin Information Typing Architecture)
DITA is a standard for authoring, managing, and publishing content. It is a topic-based XML architecture with built-in reuse, specialization, and modular content design. It is an open standard, and it is widely used in enterprises for multi-channel, structured content that requires standardization and reuse.
Here is an example:
id="install">
Installation
npm install my-library
This library requires Node.js >= 22
DITA defines element types that clearly map the procedural structure. You can compose topics, reuse them through content references (conrefs), and specialize as your domain grows.
One of the more interesting features DITA offers is the ability to filter content and create multiple versions of the same document.
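The filtering idea can be imitated in a few lines of Python. The topic below is an illustrative fragment that uses DITA element names and the platform attribute, with platform values made up for this sketch. The filter_platform function is hypothetical; real DITA toolchains apply their own filtering rules during the build, so treat this only as a picture of the concept.

```python
import xml.etree.ElementTree as ET

# Illustrative topic; the platform values are made up for this sketch.
TOPIC = """\
<task id="install">
  <title>Installation</title>
  <taskbody>
    <steps>
      <step platform="linux"><cmd>Install with apt.</cmd></step>
      <step platform="mac"><cmd>Install with Homebrew.</cmd></step>
      <step><cmd>Run npm install my-library.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def filter_platform(xml_text, platform):
    """Drop elements tagged for other platforms; keep untagged ones."""
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so removals don't disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            wanted = child.get("platform")
            if wanted is not None and platform not in wanted.split():
                parent.remove(child)
    return root


linux = filter_platform(TOPIC, "linux")
print([cmd.text for cmd in linux.iter("cmd")])
```

One source topic yields a Linux version and a Mac version, and the shared step appears in both. Doing the same with Markdown usually means maintaining separate files or inventing a custom preprocessing convention.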
The DITA Open Toolkit and several enterprise tools handle rendering, transformation, and reuse pipelines.
Ew. XML.
Yes, XML. The syntax is more verbose than Markdown. The tooling is less ubiquitous than Markdown. Migration requires effort, and your team may resist the learning curve. For small documents, Markdown's features are often sufficient.
But if you're already adding semantics to Markdown with MDX or plugins or custom scripts, you're paying that complexity cost anyway, and you don't get the benefit of standardization or portability. You're building a fragile, custom semantic layer rather than adopting one that already works.
So, where does that leave you?
If you're writing a quick README or a short-lived doc, Markdown is fine. It's fast, accessible, and it works. If you're creating a developer documentation site that needs some structure, reStructuredText or AsciiDoc are better choices. They balance expressiveness with usability. And if you're managing a large document set that requires syndication, reuse, and multi-channel publishing, DocBook and DITA provide you with the semantics and tooling to make that process more manageable.
The key is to start with the richest format you can manage and export downwards. Markdown makes a great output format for developers. It's accessible and familiar. But be careful not to lock yourself into it as a source of truth, because you can't add context back in as easily as you can remove it.
- I have a new book out. Check out Write Better with Vale. This book guides you through applying Vale, the Prose Linter, to your next writing project to create consistent, quality content.
- Tidewave.ai is a full-stack coding agent from the creators of the Elixir programming language. It supports Ruby on Rails, Phoenix, and React applications and has a free tier. You will need an API key for OpenAI, Anthropic, or GitHub Copilot to use it.
- Google's Chrome for Developers blog has a post on creating an accessible carousel. It's worth a read if you need to implement one of these on your site.
Before the next issue, here are some things you should try to get some practical experience with a different format.
As always, thanks for reading. Share this issue with someone you think it would help.
I would love to talk with you about this issue on Bluesky, Mastodon, Twitter, or LinkedIn. Let's connect!
Please support this newsletter and my work by encouraging others to subscribe and by referring a friend to Write Better with Vale, tmux 3, Exercises for Programmers, Small, Sharp Software Tools, or any of my other books.