Last month I published Think Weirder: The Best Science Fiction Ideas of the Year, a 16-story anthology featuring Greg Egan, Isabel J. Kim, Ray Nayler, Caroline M. Yoachim, and twelve other wonderful authors. The book was the #1 new release in the short story anthology category on Amazon for a short time, beating out many other newly released short story anthologies published by large NYC publishers with large marketing departments.
I am not a professional publisher. I have a full-time job and two small children, so all of this work was done after my kids went to bed. I had to use my time judiciously, which meant creating an efficient process. Luckily I’m a programmer, and it turns out that programming skills translate surprisingly well to book publishing. This post is about how I built an entire publishing pipeline using Python, YAML files, and LaTeX – and why you might want to do something like this if you’re considering publishing a book. I know writing this will cause professional designers to question my choices, but hopefully the software concepts will be helpful.
My initial thought: Can I really do all this?
I had some concerns when I started this project. Professional publishers have entire departments of experts. How could I possibly handle all of it myself?
The answer turned out to be: create tools that automate repetitive parts, and use simple file formats that make everything transparent and debuggable.
Step 1: Tracking Stories with Plain Text Files
The first challenge was to keep track of hundreds of candidate stories from different magazines. I read 391 stories published in 2024 before selecting the final 16. That’s a lot of stories to keep organized.
I could have used a spreadsheet, but I used plain YAML files instead. Here’s why it worked well for me:
- Git-friendly: Every decision I made was tracked in version control
- Human-readable: I can open any file in a text editor and understand what I’m seeing
- Easy to script: I wrote several Python functions for different kinds of metadata introspection, which I describe below
The structure looks like this:
data/
  story-progress.yaml        # Central tracking file
  markets.yaml               # Magazine metadata
  themes.yaml                # Theme occurrence tracking
  subgenres.yaml             # Subgenre tallies
  stories/
    clarkesworld-magazine/
      nelson_11_24.yaml      # Individual story files
      pak_06_24.yaml
    reactor-magazine/
      larson_breathing.yaml
    ...
Each story file is pure YAML containing the full story text and metadata:
title: "Twenty-Four Hours"
author: H.H. Pak
market: clarkesworld-magazine
url: https://clarkesworldmagazine.com/pak_06_24/
word_count: 4540
year: 2024
slug: pak_06_24
summary: ...
Not all stories have public URLs available, but that’s okay because all fields are optional. The central story-progress.yaml file tracks editorial status:
clarkesworld-magazine-nelson_11_24:
  title: "LuvHome™"
  author: Resa Nelson
  market: clarkesworld-magazine
  status: accepted # or: not_started/relevant/rejected
  date_added: '2024-09-08T08:22:47.033192'
Step 2: A Simple Command-Line Tool
I created a small Python CLI tool (se.py) to help me navigate all this data. Since I do all this work at night after my kids go to sleep, I wanted something quick that could mirror other tasks I do on the command line. The tool is simple:
python se.py --help
usage: se.py [-h] {markets,stories,relevant,decide,accepted,compile} ...
Story Evaluator CLI
positional arguments:
  {markets,stories,relevant,decide,accepted,compile}
                        Available commands
    markets             List markets
    stories             Manage stories
    relevant            List URLs for stories marked as relevant
    decide              Make accept/reject decisions on relevant stories
    accepted            Manage accepted stories
    compile             Show anthology compilation statistics
optional arguments:
  -h, --help            show this help message and exit
The compile command proved really useful, giving me immediate feedback on the anthology’s size and structure:
ANTHOLOGY COMPILATION STATISTICS
============================================================
Total Stories: 16
Total Word Count: 115,093 words
Average Word Count: 7,193 words
Unique Authors: 16
Markets Represented: 4
STORIES BY MARKET:
analog-magazine: 2 stories (12.5%)
asimovs-magazine: 2 stories (12.5%)
clarkesworld-magazine: 10 stories (62.5%)
reactor-magazine: 2 stories (12.5%)
This was really helpful during the selection process. I could instantly check how close I was to my ~120k word goal and make sure I hadn’t accidentally included multiple stories by the same author.
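A report like this takes very little code. The following is a minimal sketch rather than the actual se.py implementation. It assumes each accepted story has already been loaded into a plain dict with the author, market, and word_count fields shown earlier (for example with PyYAML’s yaml.safe_load). The function names compile_stats and format_report and the sample data are made up for illustration.

```python
from collections import Counter


def compile_stats(stories):
    """Summarize a list of story-metadata dicts (hypothetical helper)."""
    # All fields are optional in the story files, so use .get with defaults.
    total_words = sum(s.get("word_count", 0) for s in stories)
    by_market = Counter(s.get("market", "unknown") for s in stories)
    by_author = Counter(s.get("author", "unknown") for s in stories)
    return {
        "total_stories": len(stories),
        "total_words": total_words,
        "average_words": round(total_words / len(stories)) if stories else 0,
        "unique_authors": len(by_author),
        "by_market": dict(by_market),
        # Authors who appear more than once, to catch accidental duplicates.
        "repeat_authors": sorted(a for a, n in by_author.items() if n > 1),
    }


def format_report(stats):
    """Render the stats in roughly the layout of the compile output."""
    lines = [
        f"Total Stories: {stats['total_stories']}",
        f"Total Word Count: {stats['total_words']:,} words",
        f"Average Word Count: {stats['average_words']:,} words",
        f"Unique Authors: {stats['unique_authors']}",
        f"Markets Represented: {len(stats['by_market'])}",
        "STORIES BY MARKET:",
    ]
    for market, count in sorted(stats["by_market"].items()):
        share = 100 * count / stats["total_stories"]
        noun = "story" if count == 1 else "stories"
        lines.append(f"{market}: {count} {noun} ({share:.1f}%)")
    return "\n".join(lines)


# Made-up sample data, not the real anthology contents.
sample = [
    {"author": "Author A", "market": "clarkesworld-magazine", "word_count": 4000},
    {"author": "Author B", "market": "clarkesworld-magazine", "word_count": 6000},
    {"author": "Author A", "market": "reactor-magazine", "word_count": 5000},
]
stats = compile_stats(sample)
print(format_report(stats))
print("Repeat authors:", stats["repeat_authors"])
```

Keeping the number crunching separate from the formatting means the same stats dict can feed other checks, such as a warning when the total drifts too far from the word-count goal.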
Step 3: Typesetting the Print Book
This part surprised me the most. I initially thought I would have to learn Adobe InDesign or pay someone to do the typesetting. But I decided to use LaTeX instead, because I had some previous experience with it (another publishing friend had sent me some of his example files, and I had some academic experience). This process went better than expected.
I used XeLaTeX with the memoir document class. What I like about this approach:
- reproducible: I can rebuild an entire book from source in a few seconds, and I can use the same template next year
- Professional typography: LaTeX handles ligatures, kerning, and line breaking better than I could by hand
- Custom fonts: I used Crimson Pro for the body text and Rajdhani for the headings
- Version control, again: The entire book is just text files in Git
The main file for the book is really simple:
\documentclass[final,11pt,twoside]{memoir}
\usepackage{compelling}
\begin{document}
\begin{frontmatter}
\include{title}
\tableofcontents
\end{frontmatter}
\begin{mainmatter}
\include{introduction}
\include{death-and-the-gorgon}
\include{the-best-version-of-yourself}
% ... 14 more stories
\include{acknowledgements}
\end{mainmatter}
\end{document}
All the formatting rules live in compelling.sty, a custom style package. Here’s a link to the full, disorganized file. Some key points:
% 6x9 inch trade paperback size
\setstocksize{9in}{6in}
\settrimmedsize{9in}{6in}{*}
% Margins
\setlrmarginsandblock{1.00in}{0.75in}{*}
\setulmarginsandblock{0.75in}{0.75in}{*}
% Typography nerding
\usepackage[final,protrusion=true,factor=1125,
stretch=70,shrink=70]{microtype}
% Custom fonts loaded from local files
\setromanfont[
Ligatures=TeX,
Path=./Crimson_Pro/static/,
UprightFont=CrimsonPro-Regular,
BoldFont=CrimsonPro-Bold,
ItalicFont=CrimsonPro-Italic,
BoldItalicFont=CrimsonPro-BoldItalic
]{Crimson Pro}
\setsansfont[
Path=./Rajdhani/,
UprightFont=Rajdhani-Bold,
BoldFont=Rajdhani-Bold,
ItalicFont=Rajdhani-Bold,
BoldItalicFont=Rajdhani-Bold
]{Rajdhani}
% Chinese font family for CJK characters
\newfontfamily\chinesefont{PingFang SC}
The microtype package makes very subtle adjustments to character spacing and line breaking that make the text look professionally typeset.
I wanted the story titles in bold sans-serif and the author names in light gray below. Here’s how I set it up:
\renewcommand{\chapter}[2]{
\pagestyle{DefaultStyle}
\stdchapter*{
\sffamily
\LARGE
\textbf{\MakeUppercase{#1}}
\\
\large
\color{dark-gray}
{\MakeUppercase{#2}}
}
\addcontentsline{toc}{chapter}{
\protect\parbox[t]{\dimexpr\textwidth-3em}{
\sffamily#1
\\
\protect\small
\protect\color{gray}
\protect\textit{#2}
}
}
\def\leftmark{#1}
\def\rightmark{#2}
}
This redefines the \chapter command to take two arguments, a title and a byline. It sets both the chapter formatting and the TOC formatting, and ensures that the title and byline are printed in the headers on alternating pages.
Now every story file just says:
\chapter{Death and the Gorgon}{by Greg Egan}
[story content]
Most authors sent me their stories as HTML, PDF, or Word files, so I needed a way to convert them to LaTeX. I wrote a simple Python script to do this, which saved me a huge amount of manual formatting work.
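A converter for the HTML case doesn’t need to be complicated. The sketch below is an illustration, not the actual script: it assumes the input uses only paragraph, emphasis, and bold tags, relies on Python’s standard-library html.parser, and the names HtmlToLatex, escape_latex, and story_to_latex are hypothetical. It also emits the two-argument \chapter line described above.

```python
from html.parser import HTMLParser

# Characters that LaTeX treats specially, mapped to safe replacements.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}

# Inline HTML tags and the LaTeX command each one opens.
INLINE_COMMANDS = {
    "em": r"\emph{",
    "i": r"\emph{",
    "strong": r"\textbf{",
    "b": r"\textbf{",
}


def escape_latex(text):
    """Escape special characters one at a time, so nothing is escaped twice."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


class HtmlToLatex(HTMLParser):
    """Convert a small subset of HTML into LaTeX paragraphs."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_COMMANDS:
            self.parts.append(INLINE_COMMANDS[tag])

    def handle_endtag(self, tag):
        if tag in INLINE_COMMANDS:
            self.parts.append("}")
        elif tag == "p":
            self.parts.append("\n\n")  # a blank line ends a LaTeX paragraph

    def handle_data(self, data):
        self.parts.append(escape_latex(data))


def story_to_latex(title, author, html):
    """Build a story file: the chapter line followed by the converted body."""
    parser = HtmlToLatex()
    parser.feed(html)
    parser.close()
    body = "".join(parser.parts).strip()
    header = f"\\chapter{{{escape_latex(title)}}}{{by {escape_latex(author)}}}"
    return f"{header}\n\n{body}\n"


# Made-up sample input.
print(story_to_latex(
    "An Example Story",
    "A. Writer",
    "<p>Profit rose 5% &amp; <em>nobody</em> noticed.</p><p>The end.</p>",
))
```

Escaping matters more than it looks: a single unescaped % or & in a story silently comments out the rest of a line or breaks the build. PDF and Word sources would need a different front end, which this sketch doesn’t attempt.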
Step 4: Creating the eBook
Print was one thing, but I also needed an ebook. This proved to be easier than I expected because I could reuse all the LaTeX sources I had already created.
I used Pandoc to convert from LaTeX to EPUB:
# Convert LaTeX to EPUB
pandoc 2025.tex -o Think_Weirder_2025.epub \
  --toc \
  --epub-cover-image=cover_optimized.jpg \
  --css=epub-style.css \
  --metadata title="Think Weirder" \
  --metadata author="Edited by Joe Stech"
Pandoc’s default table of contents showed only story titles. But I also wanted author names, like you see in print anthologies. EPUB is simply a zipped archive of XHTML files, so I wrote a small post-processing script:
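import re

def modify_toc(nav_content, authors):
    """Add author bylines to TOC entries."""
    # Matches TOC links, capturing the href and the link text. The exact
    # markup is an assumption about the generated navigation file.
    pattern = r'<a href="([^"]+)">([^<]+)</a>'

    def add_author(match):
        href, title = match.group(1), match.group(2)
        # extract_id_from_href is a helper defined elsewhere in the script.
        chapter_id = extract_id_from_href(href)
        if chapter_id in authors:
            author = authors[chapter_id]
            # "toc-author" is an illustrative class name for the byline.
            return f'<a href="{href}">{title}</a><br/>\n' \
                   f'<span class="toc-author">{author}</span>'
        return match.group(0)

    return re.sub(pattern, add_author, nav_content)
The script unzips the EPUB, finds the navigation file, adds the author’s byline, and rezips everything. The eBook table of contents now matches the print version.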
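The unzip-and-rezip step can be handled entirely by Python’s standard zipfile module. What follows is a sketch under stated assumptions, not the exact script: the function name rewrite_epub_nav is hypothetical, and it assumes the navigation file’s name ends in nav.xhtml, which is worth confirming by listing the contents of your own EPUB. The one detail that is easy to get wrong is that the EPUB container format expects the mimetype entry to come first in the archive and to be stored uncompressed.

```python
import zipfile


def rewrite_epub_nav(src_path, dst_path, transform, nav_suffix="nav.xhtml"):
    """Copy an EPUB, passing the navigation file's text through transform().

    nav_suffix is an assumption about how the navigation file is named.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        # Put "mimetype" first; the sort is stable, so every other entry
        # keeps its original order.
        names = sorted(src.namelist(), key=lambda n: n != "mimetype")
        for name in names:
            data = src.read(name)
            if name.endswith(nav_suffix):
                data = transform(data.decode("utf-8")).encode("utf-8")
            # "mimetype" must be stored without compression.
            if name == "mimetype":
                compression = zipfile.ZIP_STORED
            else:
                compression = zipfile.ZIP_DEFLATED
            dst.writestr(name, data, compress_type=compression)


# Example call (the output file name and authors mapping are hypothetical):
# rewrite_epub_nav(
#     "Think_Weirder_2025.epub",
#     "Think_Weirder_2025_with_bylines.epub",
#     lambda nav: modify_toc(nav, authors),
# )
```

Writing to a new file instead of modifying the archive in place keeps the Pandoc output untouched, so the post-processing step can be rerun safely after every rebuild.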
What I learned
The whole process took longer than I expected: several months of late-night work. However, the simple software I wrote made it a viable one-person project, and inspired me to go through the entire process again next year.
Staying organized was important. When hundreds of stories are involved, it’s easy to forget the details, so using se.py to save metadata in the moment, which I could slice and dice later, was very important.
Reproducible builds were a lifesaver. I made changes to the book layout until just a week before publication. Because I could rebuild the entire book in seconds, and everything was backed up in Git, I could experiment freely without worrying about breaking things.
Simple file formats put me at ease. When something went wrong, I could always open a YAML file or look at the LaTeX source and understand what was happening. I never got to a point where tools were black boxes.
I didn’t need to understand everything in advance. I learned the details of LaTeX as I went (admittedly I still don’t really understand LaTeX). The same goes for Pandoc. I got something basic working first, then gradually improved it.
Can you do this too?
If you’re thinking about publishing a book – whether it’s an anthology, a novel, or a collection of technical writing – I think this approach is worth considering. There is something inspiring about having a detailed understanding of each step of the production process. If you have any questions feel free to reach out, I love talking about this hobby! You can email me at joe@thinkweirder.com.
And if you enjoy concept-driven science fiction that’s heavy on innovative ideas, check out Think Weirder!