Making a simple Static Site Generator
Static Site Generators can be very simple. You take your markdown files, you convert them to HTML and you stick them in a template. Sure, there are some extras that are nice to have, and some of them can also be easily scripted. In this article I will explain how I generate this site using a Makefile and a shell script.
TL;DR
I show you how I build this site using a script called from a Makefile, that converts markdown files to HTML with some additional processing.
Converting Markdown to HTML
If you're feeling adventurous, implement it from scratch. That's not what I did here; I felt this was a wheel not worth reinventing. There are several tools you can use for this. I chose cmark-gfm (GitHub's fork of cmark).
The syntax is as simple as:
cmark-gfm file.md > file.html
It can also read from stdin in case you need to do some preprocessing:
my-preprocessing-function | cmark-gfm > file.html
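The my-preprocessing-function above is only a placeholder. As a minimal sketch, here is what such a filter could look like. The name drop_draft_lines and the "DRAFT:" marker are made up for illustration; this site doesn't use them.

```shell
#!/bin/sh
# Hypothetical preprocessing filter (not part of this site's script):
# drops every line that starts with "DRAFT:" and passes the rest through.
# Reads stdin, writes stdout, so it can sit in front of cmark-gfm in a pipe.
drop_draft_lines() {
    # grep exits with status 1 when no line survives; ignore that case
    grep -v '^DRAFT:' || true
}

# Intended use (needs cmark-gfm installed):
#   drop_draft_lines < file.md | cmark-gfm > file.html
```

Anything that reads stdin and writes stdout fits in that slot, which is what makes the pipe form handy.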
Putting it in a Makefile
Makefiles are great since make only rebuilds a file if its source was updated after the last build. Check out the great tutorial in the references [1] to learn more. Let's start putting one together with what we've learned so far. What we want is to take all the markdown files in the current directory and convert them to HTML files with the same name but a different extension. We start by listing the files that we will be working with:
MD := $(wildcard *.md)
HTML := $(MD:%.md=%.html)
Here we're first using a wildcard to match all .md files in the current
directory and put them in the $(MD) variable. Then we use a Substitution
Reference [2] to replace all the .md extensions in $(MD) with .html, and save
it in the $(HTML) variable, which now contains all our targets.
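If the substitution reference looks cryptic, it may help to see a rough shell equivalent. This is only a sketch to illustrate the idea; md_to_html is a name I made up, and the Makefile doesn't need it.

```shell
#!/bin/sh
# Print the .html name for each .md file name given as an argument,
# roughly what $(MD:%.md=%.html) does inside make.
# Difference from make: a name that does not end in .md would still get
# ".html" appended here, while make leaves non-matching words untouched.
md_to_html() {
    for md in "$@"; do
        # ${md%.md} strips a trailing ".md"; then ".html" is appended
        printf '%s\n' "${md%.md}.html"
    done
}

md_to_html index.md about.md
```

Run as is, it prints index.html and about.html on separate lines.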
Now we need a rule to build our targets:
all: $(HTML)
%.html: %.md
	cmark-gfm $< > $@
Here we're saying that the .html files should be built from the corresponding
.md files using cmark-gfm. The special variables $< and $@ mean the name
of the first prerequisite (.md file) and the name of the target (.html file),
respectively [3].
The "all" target (or whatever you decide to name it) must be specified and
depend on the files in the $(HTML) variable.
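To make the "only rebuild when the source changed" part concrete, here is a simplified shell sketch of the check make performs for a single target and a single prerequisite. The function name needs_rebuild is hypothetical, and real make handles many prerequisites plus a lot of other details.

```shell
#!/bin/sh
# Succeed (exit status 0) when TARGET has to be rebuilt from SOURCE:
# either TARGET does not exist yet, or SOURCE was modified more recently.
# "test -nt" is assumed to be available; dash and bash both support it.
needs_rebuild() {
    target="$1"
    source_file="$2"
    [ ! -e "$target" ] || [ "$source_file" -nt "$target" ]
}

# Usage sketch (cmark-gfm must be installed for the conversion itself):
#   needs_rebuild page.html page.md && cmark-gfm page.md > page.html
```

The pattern rule above gives you this for every .md file at once, without writing the loop yourself.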
Preprocessing
That's cool and all, but our HTML files are still lacking some important stuff,
like hmm... I don't know, a <html> tag maybe? I suppose you could do
something as ugly as echoing/cating some template HTML to prepend/append to
the generated one, with all your headers and stuff, but let's do something
slightly less ugly. Oh wait, never mind, that's almost exactly what I did.
Let's put it in a shell script, we'll get back to the Makefile soon. Here is the
main thing (we'll get to the functions later):
cat <<-EOF
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<link rel="stylesheet" type="text/css" href="/css/style.css"/>
<title>~jlucas/$(gettitle "$input")</title>
</head>
<body>
<header>
$(cat "${srcdir}/templates/header.html")
</header>
<main>
$(preprocess "$input" | cmark-gfm --unsafe -e strikethrough -e table -e footnotes)
</main>
<footer>
$(cat "${srcdir}/templates/footer.html")
</footer>
</body>
EOF
Yes, a lot of it is hardcoded. Yes, I could've used a real template engine. But this is specific for this site and I don't plan on changing this structure for specific files, so who cares?
With that out of the way, let's break it down. In the middle of this hardcoded
mess, you can find some $(command substitutions). The shell runs each one and
pastes its output into the here-document. They are used to insert the
page title in the <head>, the navigation bar in the <header>, the converted
HTML in <main> and the footer in the <footer>.
The header and footer are pretty simple. I just do a little catception to
include them from static manually written files.
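If you haven't mixed here-documents and command substitution before, this small sketch shows the behaviour the template relies on. The demo_title helper is a hypothetical stand-in for gettitle.

```shell
#!/bin/sh
# Hypothetical helper standing in for gettitle; it prints a fixed title.
demo_title() {
    printf '%s\n' "Hello"
}

# Unquoted delimiter: $(...) is expanded by the shell before cat sees it.
render_expanded() {
cat <<EOF
<title>$(demo_title)</title>
EOF
}

# Quoted delimiter: the body is passed through literally, no expansion.
render_literal() {
cat <<'EOF'
<title>$(demo_title)</title>
EOF
}

render_expanded
render_literal
```

The first call prints the title filled in, the second prints the $(demo_title) text untouched. The script uses the unquoted form, with <<- so that leading tabs in the body are stripped.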
As for the title, I'm taking it from the heading on the first line of the markdown file. Here's the function:
gettitle() {
    sed -n '1s/^#* *//p' "$1"
}
The sed expression only looks at the first line: it removes any leading '#' symbols and the spaces after them, then prints what's left.
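Here is a quick way to try the function on its own, using a throwaway file. The file contents are made up for the example.

```shell
#!/bin/sh
gettitle() {
    sed -n '1s/^#* *//p' "$1"
}

# Create a temporary markdown file to try it on.
tmpfile="$(mktemp)"
printf '%s\n' '# Making a simple SSG' '' 'Body text with a # in it.' > "$tmpfile"

gettitle "$tmpfile"    # prints: Making a simple SSG
rm -f "$tmpfile"
```

Only line 1 is considered, so a '#' anywhere else in the file is left alone. If the first line is empty, the title comes out empty too.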
The final piece we're missing here is the preprocess function. This will
depend on the use case and might not always be needed. All I use it for at the
time of writing is to auto generate a list with all posts. For that I make it
search each line of the input file for the expression "@POSTS@" and replace it
with the actual list. Here it is:
preprocess() {
    while IFS= read -r line; do
        case "$line" in
            '@POSTS@') listposts ;;
            *) printf '%s\n' "$line" ;;
        esac
    done < "$1"
}
Basically it reads each line of the markdown file, and either prints it as is or
lists the posts. The "IFS=" is needed to prevent read from stripping leading
and trailing whitespace, and "-r" keeps it from interpreting backslashes. And
what's that listposts thing? Oh, right, it's yet another function. I promise
it's the last one. Here it is:
listposts() {
    find "${srcdir}/posts" -type f -regextype posix-extended \
        -regex '.*/[0-9]{8}-.*\.md' -printf '%f\n' |
    sort -r |
    while read -r file; do
        filedate="$(date -d "${file%%-*}" +%F)"
        printf '+ [%s] [%s](%s)\n' \
            "$filedate" \
            "$(gettitle "${srcdir}/posts/${file}")" \
            "/posts/${file%md}html"
    done
}
I decided to name my posts with a prepended date, in the format YYYYMMDD-title-goes-here.md. That way I avoid having to store some metadata elsewhere. Then I just list all the files in the posts dir, sort them latest first, and write the date, title and link in a markdown list. Note that -regextype and -printf are GNU find extensions and -d is a GNU date option, so this function needs the GNU tools.
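The two parameter expansions in listposts do most of the work, so here they are in isolation. The helper names post_date and post_link are made up for this sketch, and the sed call is a stand-in for date so the example has no GNU dependency. Unlike date, it doesn't check that the date is valid.

```shell
#!/bin/sh
# Split a post file name of the form YYYYMMDD-title-goes-here.md into the
# pieces used for a list entry.
post_date() {
    # ${1%%-*} removes everything from the first "-" on, leaving YYYYMMDD;
    # sed then inserts the dashes (assumption: stands in for `date -d ... +%F`)
    printf '%s\n' "${1%%-*}" | sed 's/^\(....\)\(..\)\(..\)$/\1-\2-\3/'
}

post_link() {
    # ${1%md} drops the trailing "md" so "html" lands right after the dot
    printf '/posts/%s\n' "${1%md}html"
}

post_date 20240131-hello-world.md    # prints: 2024-01-31
post_link 20240131-hello-world.md    # prints: /posts/20240131-hello-world.html
```

Because the date comes first in the file name, a plain reverse sort is enough to get the newest post on top.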
Then this preprocessed markdown gets fed to cmark-gfm with some additional
options to enable some nice extensions and allow raw HTML.
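To watch the preprocessing step on its own, before cmark-gfm gets involved, you can swap in a stub for listposts. The stub and the sample input below are made up for the example.

```shell
#!/bin/sh
# Stub standing in for the real listposts, so no posts directory is needed.
listposts() {
    printf '%s\n' '+ [2024-01-31] [Hello](/posts/20240131-hello.html)'
}

preprocess() {
    while IFS= read -r line; do
        case "$line" in
            '@POSTS@') listposts ;;
            *) printf '%s\n' "$line" ;;
        esac
    done < "$1"
}

tmpfile="$(mktemp)"
printf '%s\n' '# Posts' '@POSTS@' '    indented code stays indented' > "$tmpfile"
preprocess "$tmpfile"
rm -f "$tmpfile"
```

The placeholder line is replaced by the list entry and the indented line keeps its four spaces. One caveat with this kind of loop: a last line without a trailing newline is silently skipped, so make sure your markdown files end with one.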
Putting it all together
So far we have an outdated Makefile and a shell script that takes a markdown
file, does some preprocessing to it and spits out the resulting HTML to stdout.
We'll name the shell script src/generate.sh. Let's update that Makefile.
Previously we were calling cmark-gfm directly in the Makefile. That's now
handled by the script, so just replace cmark-gfm with src/generate.sh. And
that's all there is to it, unless of course you want to go a little bit further
with organizing the directory structure. You do? Cool, let's do that.
Since I wanted to have the HTML and markdown in separate directories while
mirroring the directory structure, I had to make some changes to the way I get
the list of targets, as well as the rule to build them. Instead of a wildcard
like we had before, we can invoke a shell with the find command to list
markdown files recursively. Also I keep the markdown files in the src directory,
so that also needs to be handled when we do the substitution for the HTML
targets:
MD := $(shell find src -type f -name '*.md')
HTML := $(MD:src/%.md=%.html)
And finally the rule to build our HTML should look like this:
$(HTML): %.html: src/%.md
	@mkdir -p $(@D)
	src/generate.sh $< > $@
Notice I added a mkdir command to create directories if needed. The $(@D)
variable evaluates to the directory containing $@. Also I had to use Static
Pattern Rules [4] to allow the pattern matching to work properly with
subdirectories.
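If it helps, this is the same source-to-target mapping written out in shell. It's only a sketch to show what the substitution and $(@D) compute; target_for is a hypothetical name and the Makefile doesn't use it.

```shell
#!/bin/sh
# Map a source path under src/ to its output path, mirroring the directory
# structure: src/posts/foo.md -> posts/foo.html
target_for() {
    rel="${1#src/}"                  # strip the leading "src/"
    printf '%s\n' "${rel%.md}.html"  # swap the extension
}

target="$(target_for src/posts/foo.md)"
printf '%s\n' "$target"               # prints: posts/foo.html
printf '%s\n' "$(dirname "$target")"  # prints: posts (the directory part)
```

For a top-level page such as src/index.md the directory part is ".", so the mkdir -p in the rule is harmless there.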
Now add some more dependencies to have certain files rebuild when scripts or templates change, and we're all set (check out the complete Makefile below).
Wrapping up
Wow, that came out a bit longer than I was expecting. Still, the point is that if you don't need all the complexity of modern SSGs, you might be better off making your own thing. All we did here was convert markdown to HTML with a little {pre,post}processing and the help of a Makefile. Oh, and you might want to sprinkle some nice CSS on top. But not JS. Don't do that. That's bad, mkay?
I'm leaving the complete Makefile and script here for convenience, though you can also view the source on Codeberg (or even here and here since I don't keep the source in a separate repo).
Makefile
MD := $(shell find src -type f -name '*.md')
HTML := $(MD:src/%.md=%.html)
TEMPLATES := $(wildcard src/templates/*.html)
SCRIPTS := $(wildcard src/*.sh)
.PHONY: all clean
all: $(HTML)
# Rebuild post lists when posts are updated
index.html posts/index.html: $(wildcard src/posts/*.md)
$(HTML): %.html: src/%.md $(TEMPLATES) $(SCRIPTS)
	@mkdir -p $(@D)
	src/generate.sh $< > $@
clean:
	rm -f $(HTML)
src/generate.sh
#!/bin/sh
srcdir="$(dirname "$0")"
die() {
    printf '%s\n' "$1" >&2
    exit "${2:-1}"
}
gettitle() {
    sed -n '1s/^#* *//p' "$1"
}
listposts() {
    find "${srcdir}/posts" -type f -regextype posix-extended \
        -regex '.*/[0-9]{8}-.*\.md' -printf '%f\n' |
    sort -r |
    while read -r file; do
        filedate="$(date -d "${file%%-*}" +%F)"
        printf '+ [%s] [%s](%s)\n' \
            "$filedate" \
            "$(gettitle "${srcdir}/posts/${file}")" \
            "/posts/${file%md}html"
    done
}
preprocess() {
    while IFS= read -r line; do
        case "$line" in
            '@POSTS@') listposts ;;
            *) printf '%s\n' "$line" ;;
        esac
    done < "$1"
}
input="$1"
echo "$input" | grep -sq '\.md$' || die "Usage: $0 INPUT_FILE.md"
cat <<-EOF
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<link rel="stylesheet" type="text/css" href="/css/style.css"/>
<title>~jlucas/$(gettitle "$input")</title>
</head>
<body>
<header>
$(cat "${srcdir}/templates/header.html")
</header>
<main>
$(preprocess "$input" | cmark-gfm --unsafe -e strikethrough -e table -e footnotes)
</main>
<footer>
$(cat "${srcdir}/templates/footer.html")
</footer>
</body>
EOF