What is XMP metadata in PDF?

XMP (Extensible Metadata Platform) is an XML packet embedded inside the PDF file itself, describing the document: title, author, dates, rights, and arbitrary custom schemas. It is standardised and machine-readable, so any document management system, validator, or archiving tool can read it without parsing the page content. This guide explains what XMP is, why it matters, and how it underpins standards like PDF/A, ZUGFeRD, and PDF/UA.

The simplest analogy: a catalog card inside the book

Think of a library catalog card: it records the title, author, subject, call number, and publication date so the librarian can find and classify the book without opening every page. In a traditional library the card lives in a separate drawer and can get lost or fall out of date.

XMP metadata is that catalog card, but stored inside the book's own cover so it can never be separated. When you send a PDF, the metadata travels with it. Any tool that receives the file, whether a DMS, a validator, an archiving system, or a search engine crawler, can read the title, the author, the creation date, and the conformance identifiers without rendering a single page.

The /Info dictionary and XMP: old and new

PDFs have always had a place for metadata. The older mechanism is the /Info dictionary. The modern, extensible mechanism is XMP. Both can be present in the same file. Archival PDFs must carry the XMP packet, and wherever /Info is also present the two must agree.

The /Info dictionary

A flat key-value store referenced from the PDF trailer. Defined in PDF 1.0, still supported everywhere.

/Title         Annual Report 2026
/Author        Finance Team
/CreationDate  D:20260115...
/Producer      rust-pdf

Fixed schema, no namespaces, no custom fields. Cannot hold conformance identifiers for PDF/A, ZUGFeRD, or PDF/UA.
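
The /CreationDate value uses PDF's own date syntax (D:YYYYMMDDHHmmSS followed by an optional time zone), which differs from the ISO 8601 dates used in XMP. The following is a minimal sketch of converting one into the other with the Python standard library. The function name and the sample date strings are hypothetical, and less common variants (such as a Z followed by an explicit offset) are not handled.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY, then optional MM DD HH mm SS, then optional Z or +HH'mm' / -HH'mm'
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string into a datetime (naive if no time zone is given)."""
    m = _PDF_DATE.match(value.strip())
    if m is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, zulu, sign, tz_h, tz_m = m.groups()

    tz = None
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(tz_h), minutes=int(tz_m or 0))
        tz = timezone(-offset if sign == "-" else offset)

    # Missing components fall back to the earliest valid value.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


# Hypothetical sample value; isoformat() gives the ISO 8601 shape XMP expects.
xmp_date = parse_pdf_date("D:20260115093000+01'00'").isoformat()
```

Normalising both sides to the same datetime representation is one way to compare a /Info date with its XMP counterpart without tripping over formatting differences.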

XMP metadata

A full XML packet embedded as a stream. Standardised by ISO 16684, extensible by design.

<dc:title>Annual Report 2026</dc:title>
<dc:creator>Finance Team</dc:creator>
<pdfaid:part>2</pdfaid:part>
<pdfaid:conformance>B</pdfaid:conformance>

Open, extensible namespaces (Dublin Core, PDF/A, ZUGFeRD, PDF/UA, and your own). Machine-readable by any XML tooling.
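
To illustrate "machine-readable by any XML tooling", here is a minimal sketch that pulls an XMP packet out of raw bytes and reads a few properties with Python's standard XML parser. It rests on several assumptions: the metadata stream is stored uncompressed, the packet uses the conventional x:xmpmeta wrapper, and the properties are written as elements rather than attributes. The SAMPLE bytes and the read_xmp helper are hypothetical. Note that in a full packet dc:title is a language alternative (rdf:Alt) and dc:creator an ordered list (rdf:Seq), which the simplified fragment in this guide leaves out.

```python
import re
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pdfaid": "http://www.aiim.org/pdfa/ns/id/",
}

# Hypothetical bytes standing in for a PDF with an uncompressed metadata stream.
SAMPLE = b"""%PDF-1.7
stream
<?xpacket begin="\xef\xbb\xbf" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Annual Report 2026</rdf:li></rdf:Alt></dc:title>
   <dc:creator><rdf:Seq><rdf:li>Finance Team</rdf:li></rdf:Seq></dc:creator>
   <pdfaid:part>2</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
endstream
"""


def read_xmp(data: bytes) -> dict:
    """Find the first x:xmpmeta element in the bytes and read selected properties."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", data, re.DOTALL)
    if match is None:
        return {}  # no packet, or the stream is compressed
    root = ET.fromstring(match.group(0))

    result = {}
    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    if title is not None:
        result["title"] = title.text
    result["creators"] = [
        li.text for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)
    ]
    for name in ("part", "conformance"):
        element = root.find(f".//pdfaid:{name}", NS)
        if element is not None:
            result[f"pdfaid:{name}"] = element.text
    return result


metadata = read_xmp(SAMPLE)
```

A byte scan like this is only a quick inspection technique. A production reader would locate the metadata stream through the PDF object structure and decode any filters first.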

Modern PDFs use XMP. PDF/A requires the XMP packet, and when an /Info dictionary is also present the two must be consistent: the same title and author must appear in both places, and neither may contradict the other.

Where XMP carries conformance identifiers

XMP is how a PDF announces which standard it follows. Validators check the XMP schemas first. If the identifier is missing or wrong, the file fails conformance regardless of how its content is structured.

PDF/A

pdfaid:part / pdfaid:conformance

The pdfaid namespace tells validators the part and conformance level the file claims (for example 1b, 2b, 2a, 3b, 3a). Without it, a PDF that follows every other rule still fails PDF/A. See What is PDF/A?

ZUGFeRD / Factur-X

fx:DocumentType / fx:ConformanceLevel

The Factur-X fx schema records the document type, version, and profile (MINIMUM through EXTENDED). Accounting systems read this to know how to parse the embedded XML. See What is ZUGFeRD / Factur-X?

PDF/UA

pdfuaid:part

The pdfuaid identifier declares accessibility conformance (PDF/UA-1). Assistive technologies and validators use it to tell that the file claims to meet the structure tree and tagging requirements, which they can then check. See What is PDF/UA?

Document management

dc: / xmp: namespaces

DMS platforms index XMP fields (title, author, subject, keywords, dates) to power full-text search, filtering, and classification without parsing page content.

Digital preservation

xmpMM: / xmpRights:

Archiving systems record provenance, rights, and modification history in XMP so a document's origin and lineage remain readable for decades, independent of the application that created it.

Custom schemas

any namespace

XMP is open by design. Any organisation can define a namespace for its own metadata, from workflow status to approval signatures, and embed it in a standard-compliant packet.
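
The identifiers described above can be read with a few lines of XML handling. The following is a sketch under stated assumptions, not a validator: it only reports what a packet declares. The declared_standards function and the SAMPLE packet are hypothetical, and the Factur-X namespace URI used here should be checked against the version of the specification you target. Simple properties may be serialised either as child elements or as attributes of rdf:Description, so the helper looks in both places.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID = "http://www.aiim.org/pdfa/ns/id/"
PDFUAID = "http://www.aiim.org/pdfua/ns/id/"
# Assumed Factur-X namespace; verify against your target specification.
FX = "urn:factur-x:pdfa:CrossIndustryDocument:invoice:1p0#"

# Hypothetical packet declaring all three standards.
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
   <pdfaid:part>3</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
  </rdf:Description>
  <rdf:Description rdf:about="" xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/"
    pdfuaid:part="1"/>
  <rdf:Description rdf:about=""
    xmlns:fx="urn:factur-x:pdfa:CrossIndustryDocument:invoice:1p0#">
   <fx:DocumentType>INVOICE</fx:DocumentType>
   <fx:ConformanceLevel>EXTENDED</fx:ConformanceLevel>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def _prop(root, namespace, name):
    """Return a simple property stored as an attribute or as a child element."""
    qualified = f"{{{namespace}}}{name}"
    for description in root.iter(f"{{{RDF}}}Description"):
        if qualified in description.attrib:
            return description.attrib[qualified].strip()
        child = description.find(qualified)
        if child is not None and child.text:
            return child.text.strip()
    return None


def declared_standards(xmp: str) -> list[str]:
    """List the standards an XMP packet claims; it does not verify the claims."""
    root = ET.fromstring(xmp)
    found = []
    part = _prop(root, PDFAID, "part")
    if part:
        level = _prop(root, PDFAID, "conformance") or ""
        found.append(f"PDF/A-{part}{level.lower()}")
    ua_part = _prop(root, PDFUAID, "part")
    if ua_part:
        found.append(f"PDF/UA-{ua_part}")
    profile = _prop(root, FX, "ConformanceLevel")
    if profile:
        found.append(f"Factur-X {profile}")
    return found


claims = declared_standards(SAMPLE)
```

A declaration is only a claim. Whether the file actually meets the declared standard is a question for a validator such as veraPDF.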

Keeping /Info and XMP in sync

PDF/A-1 (ISO 19005-1) section 6.7.3 is strict: every /Info field that has an XMP counterpart must carry an equivalent value in the XMP packet. If the title in /Info says "Draft" and the XMP says "Final", the file fails conformance. This is a common authoring mistake when metadata is set in two separate places by two separate code paths.
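
The consistency rule can be sketched as a comparison over a field mapping. The example below assumes both metadata sources have already been extracted into plain dictionaries of strings. The find_mismatches function and the sample values are hypothetical, and the mapping follows the commonly used /Info-to-XMP crosswalk. Date fields are left out because they need format normalisation before they can be compared.

```python
# /Info key -> XMP property, for the text fields that have a counterpart.
INFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Creator": "xmp:CreatorTool",
    "Producer": "pdf:Producer",
}


def find_mismatches(info: dict[str, str], xmp: dict[str, str]) -> list[str]:
    """Report fields that are present in both sources but carry different values."""
    problems = []
    for info_key, xmp_key in INFO_TO_XMP.items():
        if info_key not in info or xmp_key not in xmp:
            continue  # only fields present on both sides are compared here
        if info[info_key].strip() != xmp[xmp_key].strip():
            problems.append(
                f"{info_key}: /Info has {info[info_key]!r} "
                f"but {xmp_key} has {xmp[xmp_key]!r}"
            )
    return problems


# Hypothetical values reproducing the "Draft" vs "Final" mistake.
info_fields = {"Title": "Draft", "Author": "Finance Team"}
xmp_fields = {"dc:title": "Final", "dc:creator": "Finance Team"}
problems = find_mismatches(info_fields, xmp_fields)
```

A check like this is useful in a test suite for pipelines that write the two metadata sources from separate code paths. It complements a full validator and does not replace one.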

rust-pdf builds both from a single source

Every call to set_info updates a shared entry list. When to_bytes runs, rust-pdf generates both the /Info dictionary and the XMP packet from that one list. They are guaranteed to agree. There is no separate "XMP step" that can drift from the PDF trailer metadata.

For cases where you supply a fully custom XMP packet via set_xmp, you take responsibility for keeping the namespaces and values consistent with any /Info fields you also set. The veraPDF validator will catch any mismatch.
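
For readers who have not assembled a packet by hand, the following is a minimal sketch of what a custom XMP packet can look like when built with the Python standard library. The build_xmp_packet function is hypothetical and not part of rust-pdf. It covers only a title, a single creator, and the PDF/A identifier. A real packet for PDF/A, ZUGFeRD, or PDF/UA would need the additional schemas those standards require.

```python
from xml.sax.saxutils import escape


def build_xmp_packet(title: str, author: str, part: int, conformance: str) -> bytes:
    """Return a UTF-8 XMP packet with Dublin Core fields and a PDF/A identifier."""
    body = f"""<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">{escape(title)}</rdf:li></rdf:Alt></dc:title>
   <dc:creator><rdf:Seq><rdf:li>{escape(author)}</rdf:li></rdf:Seq></dc:creator>
   <pdfaid:part>{part}</pdfaid:part>
   <pdfaid:conformance>{escape(conformance)}</pdfaid:conformance>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>"""
    # escape() protects &, < and > in user-supplied text; UTF-8 is the usual encoding.
    return body.encode("utf-8")


packet = build_xmp_packet("Annual Report 2026", "Finance Team", 2, "B")
```

The resulting bytes could be written to a file such as metadata.xmp and passed to set_xmp. The title and author in the packet should then match whatever is set through set_info.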

How to set XMP metadata with rust-pdf

Set standard fields, or embed a full custom XMP packet.

Python

# pip install rustpdf
import rustpdf

ed = rustpdf.EditableDoc.load(open("document.pdf", "rb").read())
ed.set_info("Title", "Annual Report 2026")
ed.set_info("Author", "Finance Team")
ed.set_xmp(open("metadata.xmp", "rb").read())   # full custom XMP packet
ed.save("document_meta.pdf")

C# / .NET

// dotnet add package RustPdf
using RustPdf;

using var ed = EditableDoc.Load(File.ReadAllBytes("document.pdf"));
ed.SetInfo("Title", "Annual Report 2026");
ed.SetInfo("Author", "Finance Team");
ed.SetXmp(File.ReadAllBytes("metadata.xmp"));   // full custom XMP packet
ed.Save("document_meta.pdf");

Go

// go get github.com/rustpdf/rustpdf-go@latest
// mustRead is a local helper (not shown), presumably returning the file's bytes.
ed, _ := rustpdf.Load(mustRead("document.pdf"))
defer ed.Close()
ed.SetInfo("Title", "Annual Report 2026")
ed.SetInfo("Author", "Finance Team")
ed.SetXMP(mustRead("metadata.xmp"))   // full custom XMP packet
ed.Save("document_meta.pdf")

Node.js

// npm install rustpdf
const { EditableDoc } = require("rustpdf");
const fs = require("fs");

const ed = EditableDoc.load(fs.readFileSync("document.pdf"));
ed.setInfo("Title", "Annual Report 2026");
ed.setInfo("Author", "Finance Team");
ed.setXmp(fs.readFileSync("metadata.xmp"));   // full custom XMP packet
ed.save("document_meta.pdf");

Conformance validated by: veraPDF, qpdf

When you also enable PDF/A via pdfa(), rust-pdf automatically generates both the /Info dictionary and the XMP conformance block from the same entry list, so they are always in sync. Full details in the documentation.

XMP metadata FAQ

What is XMP metadata?

XMP (Extensible Metadata Platform) is an XML metadata packet embedded directly inside a PDF file. It describes the document in a machine-readable way: title, author, creation and modification dates, subject, keywords, and arbitrary custom schemas. Because it is embedded in the file itself, the metadata travels with the document and can be read by any conforming PDF reader or document management system without opening the page content.

How is XMP different from the Info dictionary?

The /Info dictionary is an older PDF structure that stores basic fields (Title, Author, CreationDate, and so on) as key-value pairs referenced from the PDF trailer. XMP is a modern XML-based alternative that is extensible, richer, and standardised. Modern and archival PDFs use XMP. PDF/A requires the XMP packet and, when /Info is also present, requires the two to agree. rust-pdf builds both from a single source, so they cannot drift apart.

Why does PDF/A require XMP?

PDF/A is the archival PDF standard, and it requires XMP so that metadata is stored in a standardised, open, machine-readable format that remains accessible in the future. The pdfaid conformance identifier in the XMP tells any reader or validator which part and conformance level the file claims. PDF/A-1 (ISO 19005-1) section 6.7.3 also requires the XMP and the /Info dictionary to be kept in sync, so there is no ambiguity between the two metadata sources.

How do XMP and standards like ZUGFeRD / PDF/UA relate?

XMP is the declaration layer these PDF standards use to announce conformance. PDF/A writes the pdfaid identifier, ZUGFeRD and Factur-X add the fx schema with document type, version and profile, and PDF/UA adds the pdfuaid identifier. Validators and consuming systems (veraPDF, Acrobat, a DMS) read these identifiers to know which rules to apply. Without the correct XMP identifiers, a file fails conformance even if all the structural rules are met.

How do I set PDF metadata in code?

With rust-pdf, call set_info on an EditableDoc to set individual fields like Title or Author, or call set_xmp with a full custom XMP packet in bytes to embed your own XML. Both methods are available across all nine supported languages: Python, C#/.NET, Go, Node.js, Java, PHP, Ruby, Delphi, and Swift.

Set and embed XMP metadata in your language

One Rust core, nine language bindings, and a single source of truth for your metadata. Prototype for free, license the corporate features when you ship.