What is PDF/A, and why does it matter?

PDF/A is the version of PDF built to last. It is a stricter, ISO-standardized subset of normal PDF: everything the file needs in order to look right, from fonts to color, is sealed inside it, so the document opens the same way today, next year, and decades from now. This guide explains it in plain language, with analogies and examples.

The simplest analogy: a time capsule

Think of a normal PDF as a recipe card that says "use the flour from the shop on the corner." It works perfectly today, while that shop is open and stocks that flour. But hand the card to someone in twenty years, after the shop has closed, and the recipe can no longer be reproduced faithfully.

PDF/A is the same recipe with every ingredient sealed in the box: the flour, the spices, the exact measuring cups. It does not rely on anything from the outside world. That is the whole idea, a document that is self-contained and therefore reproducible forever. It is the difference between a photo of a meal and a vacuum-sealed kit that anyone can cook, identically, on any day in the future.

What goes wrong with an ordinary PDF

A regular PDF can reference a font instead of carrying it. If the reader does not have that exact font, it substitutes another one, and your carefully laid-out document shifts, reflows, or loses characters. The example below shows the same line rendered on a machine that is missing the font.

An ordinary PDF

Trusts the reader to already have the right fonts.

Intended: Café Façade · Año 2050
Rendered: Caf□ Fa□ade · A□o 2050
Font missing · characters substituted

The same file as PDF/A

Carries its fonts and colors inside, sealed.

Café Façade · Año 2050
Identical on any device, in any year

What turns a PDF into a PDF/A

PDF/A adds a set of rules whose single goal is to remove every dependency on the outside world. Everything required to render the page is embedded; anything that could change or phone home is banned.

  • Fonts must be embedded: the file carries every glyph it uses, so nothing is ever substituted.
  • Color must be device-independent: an ICC profile and output intent pin down the exact colors.
  • Metadata in XMP: title, author and dates are stored in a standard, machine-readable form.
  • No external references: no links to files or fonts that live somewhere else.
  • No JavaScript, audio or video: nothing that executes or depends on a plugin.
  • No encryption: an archive must stay openable without a password or key.
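
As a rough illustration of the "banned" side of these rules, the following Python sketch scans raw PDF bytes for a few name objects that usually signal a forbidden feature. The marker list and the function name `quick_pdfa_red_flags` are assumptions made for this example. It is a heuristic only: names hidden inside compressed object streams are missed, embedded fonts and color profiles are not checked at all, and a real validator such as veraPDF is still required.

```python
import re

# Name objects that commonly signal a feature PDF/A forbids.
# Illustrative only: this is not the full rule set of ISO 19005.
FORBIDDEN_MARKERS = {
    b"/Encrypt": "encryption",
    b"/JavaScript": "JavaScript action",
    b"/Launch": "launch action (runs an external program)",
    b"/Sound": "audio",
    b"/Movie": "video",
}


def quick_pdfa_red_flags(pdf_bytes: bytes) -> list[str]:
    """Return human-readable reasons the bytes are unlikely to be PDF/A."""
    found = []
    for marker, reason in FORBIDDEN_MARKERS.items():
        # Match the name only when it is not the prefix of a longer name,
        # so "/EncryptMetadata" does not count as "/Encrypt".
        if re.search(re.escape(marker) + rb"(?![A-Za-z0-9])", pdf_bytes):
            found.append(reason)
    return found
```

An empty result does not mean the file conforms; it only means none of these particular markers were visible in the uncompressed bytes.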

Why it matters: documents outlive their software

Records often must stay readable for 5 to 30 years or more. Fonts get removed, renderers change, operating systems are retired. An ordinary PDF can quietly degrade over that span; a PDF/A is built to render identically the whole time.

As the years pass, an ordinary file drifts further from its original appearance as the fonts and software it relied on disappear. The PDF/A stays faithful, because it never depended on anything that could disappear.

The parts: PDF/A-1, 2, 3 and 4

PDF/A grew in parts over time. Each newer part is based on a newer version of PDF and relaxes a few restrictions while keeping the archival guarantee.

PDF/A-1 (2005)

Based on PDF 1.4. The strictest and most universally supported. No transparency, no layers, no embedded files.

PDF/A-2 (2011)

Based on PDF 1.7. Adds JPEG2000, transparency, layers, and embedding other PDF/A files. The common default today.

PDF/A-3 (2012)

Like A-2, but lets you embed any file type. This is what e-invoices (Factur-X, ZUGFeRD) use to attach the source XML.

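PDF/A-4 (2020)

Based on PDF 2.0, the modern foundation. Simplifies the conformance model for new archival systems: the a, b and u levels are dropped in favor of a base PDF/A-4 plus the variants PDF/A-4e and PDF/A-4f.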

The levels: b, a and u

On top of a part, a conformance level says how much the file guarantees. You combine them, for example PDF/A-2b or PDF/A-2a. Level u exists from PDF/A-2 onward; PDF/A-1 defines only a and b.

Level | Name       | What it guarantees
b     | Basic      | The document looks the same everywhere. Visual fidelity only.
u     | Unicode    | Everything in b, plus every character maps to Unicode, so text is reliably searchable and copyable.
a     | Accessible | Everything in u, plus a tagged logical structure (headings, lists, tables, alt text) for assistive technology and PDF/UA.

rust-pdf produces and validates PDF/A-1b, 2b, 2a, 3b and 3a, with the accessible a levels building a full tagged structure tree.
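
To make the part-plus-level naming concrete, here is a small Python sketch that parses a label such as "PDF/A-2b" and rejects combinations the standard does not define. The names `parse_conformance`, `Conformance` and `level_covers` are hypothetical helpers for this example, not part of rust-pdf. The table of valid levels reflects the standard itself: PDF/A-1 has no level u, and PDF/A-4 uses an unlettered base level plus e and f.

```python
import re
from typing import NamedTuple

# Levels each part defines. PDF/A-1 has no "u"; PDF/A-4 replaces
# a/b/u with an unlettered base level plus "e" and "f".
LEVELS_BY_PART = {
    1: {"a", "b"},
    2: {"a", "b", "u"},
    3: {"a", "b", "u"},
    4: {"", "e", "f"},
}

# Levels stack in parts 1 to 3: a includes u, and u includes b.
LEVEL_RANK = {"b": 0, "u": 1, "a": 2}


class Conformance(NamedTuple):
    part: int
    level: str  # "" means PDF/A-4 without a letter


def parse_conformance(label: str) -> Conformance:
    """Parse labels such as "PDF/A-2b" and reject undefined combinations."""
    match = re.fullmatch(r"PDF/A-(\d)([a-z]?)", label.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a PDF/A label: {label!r}")
    part, level = int(match.group(1)), match.group(2).lower()
    if level not in LEVELS_BY_PART.get(part, set()):
        raise ValueError(f"PDF/A-{part} defines no level {level!r}")
    return Conformance(part, level)


def level_covers(actual: str, required: str) -> bool:
    """True if level `actual` guarantees everything level `required` does."""
    return LEVEL_RANK[actual] >= LEVEL_RANK[required]
```

For example, `parse_conformance("PDF/A-2b")` yields part 2 and level "b", while `parse_conformance("PDF/A-1u")` raises `ValueError`. `level_covers("a", "u")` is true because the accessible level includes the Unicode guarantee.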

Where PDF/A is used in the real world

Anywhere a document must remain trustworthy long after it was created. In many of these sectors, long-term retention is not optional but required by law or regulation.

E-invoicing

Factur-X and ZUGFeRD embed the invoice XML inside a PDF/A-3, so one file is both human-readable and machine-readable.

Legal & courts

Contracts and case filings must be archived in a form that cannot silently change and stays readable for decades.

Banking & finance

Statements and disclosures are retained for years under regulation, and must render identically on audit.

Healthcare

Patient records and reports are kept long-term and must stay faithful, with no external dependencies.

Government & archives

National archives and public administrations mandate PDF/A for digital preservation of official records.

Accessible records

PDF/A-2a and 3a carry a tagged structure, which also helps meet accessibility duties such as PDF/UA.

Ordinary PDF vs PDF/A at a glance

Feature                       | Ordinary PDF            | PDF/A
Fonts                         | May be referenced       | Always embedded
Color                         | Can be device-dependent | ICC + output intent
External links to content     | Allowed                 | Forbidden
JavaScript / audio / video    | Allowed                 | Forbidden
Encryption                    | Allowed                 | Forbidden
Metadata                      | Optional                | Required XMP
Opens identically in 30 years | Not guaranteed          | By design
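
The required XMP metadata is also where a file declares which PDF/A part and level it claims to follow, through the `pdfaid:part` and `pdfaid:conformance` properties. The Python sketch below reads that declaration from an XMP packet that has already been extracted from the PDF as text. The function name `read_pdfa_claim` and the sample packet are assumptions for illustration. The result is only the file's own claim and says nothing about whether it actually conforms.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"  # PDF/A identification schema


def read_pdfa_claim(xmp_packet: str):
    """Return (part, conformance) declared in an XMP packet, or None."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties as attributes or as child elements.
        part = (desc.get(f"{{{PDFAID_NS}}}part")
                or desc.findtext(f"{{{PDFAID_NS}}}part"))
        conf = (desc.get(f"{{{PDFAID_NS}}}conformance")
                or desc.findtext(f"{{{PDFAID_NS}}}conformance"))
        if part:
            return part.strip(), (conf or "").strip().upper()
    return None


# Hypothetical packet, trimmed to the identification properties.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
   <pdfaid:part>2</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

print(read_pdfa_claim(SAMPLE))  # ('2', 'B')
```

A file that returns `None` here makes no PDF/A claim at all. One that returns a value should still be run through a validator before it is archived.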

How to create a valid PDF/A with rust-pdf

One method call embeds the sRGB ICC profile, adds the output intent, writes the XMP metadata and the document ID, and enforces the rules. The output is checked by veraPDF, the reference open-source validator.

Python

# pip install rustpdf
import rustpdf

with rustpdf.Document() as doc:
    doc.pdfa(rustpdf.PdfaLevel.A2B).set_info(title="Q3 Report")
    f = doc.add_font_file("Roboto-Regular.ttf")   # embedded & subset
    doc.add_page()
    doc.show_text(f, 22, 72, 760, "Archival report")
    doc.save("report_pdfa.pdf")                    # validates as PDF/A-2b

C#

// dotnet add package RustPdf
using RustPdf;

using var doc = new Document();
doc.Pdfa(PdfaLevel.A2a).Tagged().SetInfo(title: "Q3 Report"); // accessible
int f = doc.AddFontFile("Roboto-Regular.ttf");
doc.AddPage();
doc.ShowText(f, 22, 72, 760, "Archival report", headingLevel: 1);
byte[] bytes = doc.ToBytes();                                 // PDF/A-2a

Go

// go get github.com/rustpdf/rustpdf-go@latest
doc, _ := rustpdf.New()
defer doc.Close()
doc.PdfaLevel(rustpdf.A2b)
doc.SetInfo(rustpdf.Info{Title: "Q3 Report"})
f, _ := doc.AddFontFile("Roboto-Regular.ttf")
doc.AddPage()
doc.ShowText(f, 22, 72, 760, "Archival report", 0)
data, _ := doc.ToBytes()                            // PDF/A-2b

Node.js

// npm install rustpdf
const { Document, PdfaLevel } = require("rustpdf");

const doc = new Document();
doc.pdfa(PdfaLevel.A2b).setInfo({ title: "Q3 Report" });
const f = doc.addFontFile("Roboto-Regular.ttf");
doc.addPage();
doc.showText(f, 22, 72, 760, "Archival report");
const bytes = doc.toBytes();                         // PDF/A-2b

Conformance validated by: veraPDF, qpdf, mutool

Need archival plus attachments (e-invoices) or accessibility? Use A3B/A3A for embedded files, or pair any level with tagging. Full details in the documentation.

PDF/A FAQ

What is PDF/A?

PDF/A is an ISO-standardized version of PDF (ISO 19005) designed for long-term archiving. It is a stricter subset of normal PDF: everything needed to display the document, such as fonts and color information, must be embedded inside the file, and anything that could change over time or depend on the outside world (references to external content, JavaScript, audio, video, encryption) is forbidden. The result is a self-contained file that will open and look the same decades from now.

Why is PDF/A important?

Many laws and regulations require records to be kept readable for 5 to 30 years or longer: invoices, contracts, bank statements, medical and court records, and government archives. A normal PDF can render differently or break as fonts and software change. PDF/A removes those external dependencies so the document stays faithful and verifiable over the long term, which is why it is mandated for e-invoicing, public-sector archiving and regulated industries.

What is the difference between PDF/A-1, PDF/A-2, PDF/A-3 and PDF/A-4?

These are the parts of the standard. PDF/A-1 (2005) is based on PDF 1.4 and is the strictest. PDF/A-2 (2011) is based on PDF 1.7 and adds JPEG2000, transparency, and embedding other PDF/A files. PDF/A-3 (2012) additionally allows embedding any file type, which is what e-invoice formats like Factur-X and ZUGFeRD use to attach the source XML. PDF/A-4 (2020) is based on PDF 2.0.

What do the conformance levels a, b and u mean?

Level b (basic) guarantees the document looks the same everywhere. Level a (accessible) adds a tagged logical structure and Unicode mapping, so the document is also accessible to assistive technology and reliably searchable. Level u guarantees Unicode mapping for all text without requiring full tagging. So PDF/A-2b is visually faithful, while PDF/A-2a is visually faithful and accessible.

How do I create a valid PDF/A file?

Generate the document with a library that embeds all fonts, attaches an ICC color profile and output intent, writes XMP metadata, and avoids forbidden features. With rust-pdf you call one method, for example doc.pdfa(PdfaLevel.A2B), in Python, C#, Go, Node, Java, PHP, Ruby, Delphi or Swift. The output is validated with veraPDF for levels 1b, 2b, 2a, 3b and 3a.

Generate archival PDF/A in your language

One core, the same archival output across nine languages. Prototype for free, license the corporate features when you ship.