Converting HTML to XSL-FO: best approach / tooling to preserve basic layout?
13:41 02 Feb 2026

I need to generate PDFs and my current pipeline expects XSL-FO (rendered by an FO engine), but my input content is HTML.

I’m trying to understand the right way to convert HTML → XSL-FO, ideally preserving common formatting like:

  • headings (h1-h6)

  • paragraphs, bold/italic

  • lists (ul/ol)

  • tables

  • basic CSS (margins/padding, font sizes, alignment)

What I’m looking for

  1. Is there a reliable conversion approach (library/tool) for HTML → XSL-FO?

  2. If direct conversion is not recommended, what’s the best practice pipeline to go from HTML to PDF when I have existing FO-based infrastructure?

  3. How do people handle CSS, especially for tables and spacing, during the conversion?
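To make the CSS question concrete, here is a minimal sketch of the kind of mapping I have in mind for inline `style` attributes. It relies on the fact that many XSL-FO property names were borrowed from CSS, so a small whitelist can be passed through unchanged. The function name `style_to_fo_attrs`, the whitelist contents, and the choice to turn vertical margins into `space-before`/`space-after` are my own assumptions, not an established convention, and the naive split on `;` would break on values that themselves contain semicolons.

```python
# Hypothetical whitelist: CSS properties whose names also exist as XSL-FO properties.
PASS_THROUGH = {
    "font-size", "font-weight", "font-style", "text-align", "color",
    "margin-left", "margin-right",
    "padding-top", "padding-right", "padding-bottom", "padding-left",
}

# Assumed renames: vertical margins expressed as FO block spacing.
RENAMED = {
    "margin-top": "space-before",
    "margin-bottom": "space-after",
}


def style_to_fo_attrs(style):
    """Turn an inline CSS style string into a dict of FO attributes.

    Declarations outside the whitelist, and malformed ones, are dropped.
    """
    attrs = {}
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        name = name.strip().lower()
        value = value.strip()
        if not sep or not name or not value:
            continue  # skip empty or malformed declarations
        if name in PASS_THROUGH:
            attrs[name] = value
        elif name in RENAMED:
            attrs[RENAMED[name]] = value
    return attrs


if __name__ == "__main__":
    print(style_to_fo_attrs("font-size: 12pt; margin-top: 6pt; float: left"))
```

Anything outside the whitelist (floats, positioning, shorthand properties) is silently discarded here, which is exactly the part I am unsure how people handle in practice.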

Context / constraints

  • Input HTML may be user-generated, so it can be messy.

  • I can restrict the HTML/CSS subset if needed.

  • I can run conversion server-side (Java/Python/Node are all possible).

  • Output is XSL-FO XML, then rendered to PDF by an FO engine.

What I tried

  • Searching for “HTML to XSL-FO” mostly returns outdated references or partial converters.

  • I’m unsure whether I should:

    • convert HTML → well-formed XHTML → transform to FO (XSLT?)

    • use a dedicated converter

    • avoid FO and use an HTML-to-PDF renderer instead
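For reference, this is roughly what I imagine the first option looking like if done in code instead of XSLT: a tolerant HTML parser feeding a small tag-to-FO mapping. It is only a sketch using the Python standard library. The class `HtmlToFo`, the function `html_to_fo`, the font sizes, and the A4 page master are all placeholders I made up, and it covers only a few tags (no lists or tables yet).

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name of an FO element."""
    return f"{{{FO_NS}}}{tag}"


# Placeholder mappings; the sizes and spacing values are arbitrary.
BLOCK_TAGS = {
    "h1": {"font-size": "24pt", "font-weight": "bold", "space-after": "12pt"},
    "h2": {"font-size": "18pt", "font-weight": "bold", "space-after": "10pt"},
    "p": {"space-after": "8pt"},
}
INLINE_TAGS = {
    "b": {"font-weight": "bold"},
    "strong": {"font-weight": "bold"},
    "i": {"font-style": "italic"},
    "em": {"font-style": "italic"},
}


def append_text(element, text):
    """Append text after the element's last child, or to the element itself."""
    if len(element):
        element[-1].tail = (element[-1].tail or "") + text
    else:
        element.text = (element.text or "") + text


class HtmlToFo(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.flow = ET.Element(fo("flow"), {"flow-name": "xsl-region-body"})
        self.stack = [(None, self.flow)]  # (html tag, fo element) pairs

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1][1]
        if tag in BLOCK_TAGS:
            element = ET.SubElement(parent, fo("block"), BLOCK_TAGS[tag])
        elif tag in INLINE_TAGS:
            element = ET.SubElement(parent, fo("inline"), INLINE_TAGS[tag])
        else:
            return  # unknown tags are dropped, their text is kept
        self.stack.append((tag, element))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1][1]
        if parent is self.flow:
            if not data.strip():
                return  # ignore whitespace between top-level elements
            # fo:flow cannot hold bare text, so wrap it in a block.
            parent = ET.SubElement(self.flow, fo("block"))
        append_text(parent, data)


def html_to_fo(html):
    """Convert an HTML fragment into a complete XSL-FO document string."""
    parser = HtmlToFo()
    parser.feed(html)
    parser.close()
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4", "page-width": "210mm",
        "page-height": "297mm", "margin": "20mm",
    })
    ET.SubElement(page, fo("region-body"))
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    sequence.append(parser.flow)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(html_to_fo("<h1>Title</h1><p>Hello <b>world</b></p>"))
```

An unclosed `<p>` simply ends up as a nested `fo:block` in this sketch, and empty input would produce an `fo:flow` with no blocks, so real user-generated HTML would need more cleanup than this.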

xslt