HTML to PDF

When you search for HTML to PDF you see two common answers:

Well maybe I don't have the capacity to remember all the LaTeX commands because I've still got all of the Minecraft recipes taking space in my brain. But I can remember most of the HTML tags and CSS rules to make a decent document.

chrome --print-to-pdf

For printing HTML documents to PDF, Chrome's --print-to-pdf option is absolutely rubbish. It decides for you which margins, background graphics, and paper size you get.

pandoc

pandoc is a fantastic Swiss Army knife for converting documents, and it produces PDFs via LaTeX. That is fine if you want the default LaTeX style, but for more exotic document layouts or formats I've found it really lacking. You can inject LaTeX directives in a weird YAML header, but I could never get it to do what I needed. The usual advice is "You can always drop to raw LaTeX in a pandoc document". But if I knew how to do what I wanted in LaTeX, I wouldn't need to convert an HTML document to PDF in the first place.

CSS styles

The first thing you'll need is a stylesheet which turns your HTML into pages. Sadly, I have not found a way for the stylesheet to break the content across pages automagically, so you need a div element for each page in the PDF. Manually breaking pages is annoying, but it's less effort than learning LaTeX.

/* Reset CSS */
*, *::before, *::after {
  box-sizing: border-box;
}

* {
  margin: 0;
}

/*
  Typographic tweaks!
  3. Add accessible line-height
  4. Improve text rendering
*/
body {
  line-height: 1.5;
  -webkit-font-smoothing: antialiased;
}

img, picture, video, canvas, svg {
  display: block;
  max-width: 100%;
}

input, button, textarea, select {
  font: inherit;
}

p, h1, h2, h3, h4, h5, h6 {
  overflow-wrap: break-word;
}

body {
  font-family: sans-serif;
  font-size: 10pt;
  background-color: darkcyan;
  display: flex;
  flex-direction: column;
  flex-wrap: wrap;
  align-content: center;
}

div.page {
  width: 210mm;
  height: 297mm;
  padding: 2rem;
  margin: 2rem;
  background-color: white;
  overflow: hidden;
  background-repeat: repeat;
  background-clip: border-box;
}

div.content {
  border: 1px solid greenyellow;
  width: 100%;
  height: calc(100% - 2rem);
  overflow: hidden;
}

footer {
  border-top: solid black thin;
  width: 100%;
  height: 2rem;
  display: grid;
  grid-template-columns: 3fr 1fr;
  grid-template-areas:
    "page-number page-number"
    "charity-number document-version";
}

@media print {
  div.page {
    margin: 0;
  }

  .noprint {
    display: none;
  }

  div.content {
    border: none;
  }
}

The CSS sets up an HTML document which has div.page elements as pages, and they need a div.content for the actual content, and a footer element for the page's footer. The start of the CSS is from Josh Comeau's CSS reset. I set the body background to darkcyan and leave the div.page background as white so I can see each page as I'm iterating, and the @media print rules change the page style to remove the margins so it all lines up correctly when in the print preview.

The overall HTML document should look something like this. These documents are for a charity, but you can change the footer.

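<!DOCTYPE html>
<html>
  <head>
    <title></title>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    <div class="page">
      <div class="content"></div>
      <footer>
        <span style="grid-area: page-number"></span>
        <span style="grid-area: charity-number"></span>
        <span style="grid-area: document-version"></span>
      </footer>
    </div>
    <div class="page">
      <div class="content"></div>
      <footer>
        <span style="grid-area: page-number"></span>
        <span style="grid-area: charity-number"></span>
        <span style="grid-area: document-version"></span>
      </footer>
    </div>
  </body>
</html>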
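Since every page repeats the same div.page, div.content, and footer structure, it can be less tedious to generate the skeleton than to copy and paste it. Here is a minimal sketch in Python, not part of my actual setup. The function names, the "Page N" footer text, and the charity number and version arguments are all made up for illustration. You still decide by hand which content goes on which page.

```python
from html import escape


def render_page(content_html: str, page_number: int,
                charity_number: str, version: str) -> str:
    """Wrap one page's worth of HTML in the div.page / div.content / footer structure."""
    return (
        '<div class="page">\n'
        f'  <div class="content">{content_html}</div>\n'
        "  <footer>\n"
        # "Page N" is an assumed footer text, change it to suit your document.
        f'    <span style="grid-area: page-number">Page {page_number}</span>\n'
        f'    <span style="grid-area: charity-number">{escape(charity_number)}</span>\n'
        f'    <span style="grid-area: document-version">{escape(version)}</span>\n'
        "  </footer>\n"
        "</div>"
    )


def render_document(title: str, pages: list[str], charity_number: str,
                    version: str, stylesheet: str = "style.css") -> str:
    """Build the whole HTML document from a list of per-page HTML fragments.

    The fragments in `pages` are inserted as-is (they are trusted HTML),
    one fragment per div.page, numbered from 1.
    """
    body = "\n".join(
        render_page(fragment, number, charity_number, version)
        for number, fragment in enumerate(pages, start=1)
    )
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<title>{escape(title)}</title>\n"
        f'<link rel="stylesheet" href="{escape(stylesheet)}">\n'
        "</head>\n<body>\n"
        f"{body}\n"
        "</body>\n</html>\n"
    )
```

Calling render_document with a list of two fragments gives a document with two div.page elements, each with its own numbered footer.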

Printing the document

We can still use Chrome to print the PDF, but there is a better way than the --print-to-pdf command-line option. With Selenium WebDriver, you can specify the paper size, scale, margins, and whether background graphics are included. See the script below, but the gist is that you start WebDriver, load the page, set the print options, then print the page to PDF, which gives you the PDF as base64. Decode the base64 and write it to a file. I set the orientation to portrait, page size to A4, margins to 0 because the CSS includes appropriate margins, scale to 1, and shrink_to_fit to False.
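One detail worth spelling out: the stylesheet sizes div.page in millimetres (210mm by 297mm), while the page size in the print options is, as far as I can tell, given in centimetres. That is why the script uses 21.0 and 29.7. The sketch below keeps the settings from this paragraph as plain data and derives the page size from the CSS values. The helper names are hypothetical, and the dictionary keys simply mirror the PrintOptions attribute names the script uses.

```python
# Page size as written in the stylesheet (div.page), in millimetres.
PAGE_WIDTH_MM = 210
PAGE_HEIGHT_MM = 297


def mm_to_cm(mm: float) -> float:
    """Convert millimetres to centimetres."""
    return mm / 10


def print_settings() -> dict:
    """Collect the print settings described above as plain data."""
    return {
        "orientation": "portrait",
        "page_width": mm_to_cm(PAGE_WIDTH_MM),    # 210 mm is 21.0 cm
        "page_height": mm_to_cm(PAGE_HEIGHT_MM),  # 297 mm is 29.7 cm
        # Margins are zero because the CSS padding already provides them.
        "margin_top": 0,
        "margin_bottom": 0,
        "margin_left": 0,
        "margin_right": 0,
        "scale": 1.0,
        "background": True,
        "shrink_to_fit": False,
    }
```

With a PrintOptions object called options, the settings could be applied in a loop with setattr(options, name, value) for each item in print_settings().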

import base64
import sys

import selenium.webdriver as webdriver
from selenium.webdriver.common.print_page_options import PrintOptions

# Usage: python3 render.py <url> <output name without extension>
driver = webdriver.Chrome()
driver.get(sys.argv[1])

# A4 portrait; page size is given in centimetres.
options = PrintOptions()
options.orientation = "portrait"
options.page_height = 29.7
options.page_width = 21.0

# No printer margins: the CSS padding on div.page provides them.
options.margin_top = 0
options.margin_bottom = 0
options.margin_left = 0
options.margin_right = 0

options.scale = 1.0
options.background = True
options.shrink_to_fit = False

# print_page returns the PDF as a base64 string.
pdf = driver.print_page(options)

# The output/ directory must already exist.
with open(f"output/{sys.argv[2]}.pdf", 'wb') as f:
    f.write(base64.b64decode(pdf))

driver.quit()
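The decode-and-write step at the end of the script is easy to pull out and make a bit more defensive. The following is a sketch only: save_pdf is a made-up helper, not something Selenium provides. It checks that the decoded bytes start with the %PDF- signature and creates the output directory if it is missing.

```python
import base64
from pathlib import Path


def save_pdf(pdf_base64: str, destination: Path) -> Path:
    """Decode a base64-encoded PDF and write it to destination."""
    data = base64.b64decode(pdf_base64)

    # Every PDF file starts with the "%PDF-" signature.
    if not data.startswith(b"%PDF-"):
        raise ValueError("decoded data does not look like a PDF")

    # Create output/ (or whatever the parent directory is) if needed.
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_bytes(data)
    return destination
```

In the script, the last few lines would then become something like save_pdf(pdf, Path("output") / f"{sys.argv[2]}.pdf").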

Install the selenium package for Python and run the script with the right arguments. Chrome will flash briefly and your PDF will be created.

$ uv venv --python 3.13
$ source .venv/bin/activate.fish
(venv) $ uv pip install selenium
(venv) $ python3 render.py "http://localhost:8080/document.html" "output-filename"
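The URL in that last command points at localhost:8080, so something needs to be serving the HTML file and stylesheet. Any static file server will do. As one assumption-laden sketch, here is how to do it with Python's built-in http.server module. serve_directory is a hypothetical helper name, and binding to 127.0.0.1 is my choice so the files are only reachable from your own machine.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory: str, port: int = 8080) -> ThreadingHTTPServer:
    """Serve the files in directory over HTTP from a background thread.

    Returns the server so the caller can stop it with shutdown()
    and server_close() once the PDF has been written.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Daemon thread: it will not keep the program alive after the main script ends.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

If you only need a server while you iterate on the document, running python3 -m http.server 8080 from the directory that contains document.html does much the same thing.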