The discovery came six years after deployment: thirty-eight thousand archived invoices declared PDF/A-1b in their XMP metadata, and a sample run through veraPDF failed nearly all of them — device-dependent RGB with no OutputIntent, two fonts never embedded, annotation flags that ISO 19005-1 forbids. The generating application had written the conformance identifier without ever validating against the standard, and every downstream system trusted the label. Self-declared compliance is worth exactly what it costs to write. The durable alternative is preflight at the moment a file enters the archive, and losLab PDF Library (PDFlibPas) makes that practical for Delphi and C++Builder systems by building PDF/A and PDF/UA validators into the library itself, with no external service in the loop.
Two standards that fail files for opposite reasons
ISO 19005 (PDF/A) is a reproduction contract: a conforming file must render identically decades from now, on software that has never seen the originating system. Its rules therefore attack external dependencies: every font embedded, color anchored to an embedded ICC OutputIntent or expressed in device-independent spaces, no encryption in any part, no JavaScript, XMP metadata consistent with the document information dictionary.
ISO 14289 (PDF/UA) is a semantics contract: assistive technology must be able to traverse the document meaningfully. Its rules live in a different layer entirely — a complete structure tree, alternate text on figures, a document title configured for display, heading levels that never skip, table header relationships that hold up off screen.
The two are orthogonal, and the dangerous files sit in the middle. An archive-perfect document can be unreadable to a screen reader; a flawlessly tagged document can reference a desktop font that will not exist in ten years. Pipelines that need both — public-sector publishing is the canonical case — must run both checks and route the findings to different owners, because unembedded fonts are a generation-code defect while missing alternate text belongs to whoever maintains the templates.
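A minimal sketch of that routing, assuming each finding arrives as one line of text. The type, the function names, and above all the keywords are hypothetical: the wording of real findings is not documented here, so the match list would need tuning against actual output.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed under Free Pascal
uses
  SysUtils;

type
  TFindingOwner = (foGenerationCode, foTemplates, foTriage);

// True when Needle occurs anywhere in Haystack, ignoring case.
function Mentions(const Haystack, Needle: string): Boolean;
begin
  Result := Pos(LowerCase(Needle), LowerCase(Haystack)) > 0;
end;

// Maps one finding line to the team that can fix it. The keywords are
// assumptions about the finding text, not a documented format.
function RouteFinding(const Finding: string): TFindingOwner;
begin
  if Mentions(Finding, 'font') or Mentions(Finding, 'OutputIntent') then
    Result := foGenerationCode
  else if Mentions(Finding, 'alternate text') or
          Mentions(Finding, 'heading') or
          Mentions(Finding, 'structure tree') then
    Result := foTemplates
  else
    Result := foTriage; // unknown wording: a human decides
end;
```

The fallback bucket matters more than the keyword lists: a finding nobody recognizes should land in front of a person rather than be silently assigned.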
Part choice matters as much as the verdict. PDF/A-1, frozen on PDF 1.4, rejects transparency and JPEG2000 — features modern reporting output uses freely. PDF/A-2 (ISO 19005-2, built on ISO 32000-1) accepts both and is the sensible default for new archives. PDF/A-3 additionally allows embedded files of arbitrary type, which regulated e-invoicing formats depend on. A team still standardizing on PDF/A-1b in 2026 is usually inheriting a requirement written fifteen years ago, and the cheapest fix is often renegotiating the target part rather than stripping transparency out of every chart.
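The part boundaries above reduce to a small decision rule. The sketch below is illustrative only: the function name and its three boolean inputs are hypothetical, not part of PDFlibPas, and it encodes just the constraints named in this section rather than the full standards.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed under Free Pascal

// Returns the lowest PDF/A part (1, 2 or 3) that can hold a document with
// the given features, using only the rules stated in the text:
//   part 1 rejects transparency and JPEG2000,
//   only part 3 allows embedded files of arbitrary type.
function LowestPdfAPart(UsesTransparency, UsesJpeg2000,
  EmbedsArbitraryFiles: Boolean): Integer;
begin
  if EmbedsArbitraryFiles then
    Result := 3
  else if UsesTransparency or UsesJpeg2000 then
    Result := 2
  else
    Result := 1;
end;
```

A chart-heavy report with transparency and no attachments yields 2, which lines up with treating PDF/A-2 as the default for new archives.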
Structured findings at ingestion time
The flat-API entry point is CheckFileCompliance, with the test selector 1 for PDF/A and 2 for PDF/UA. It returns a string-list handle whose items are individual findings, which makes it the right shape for an automated gate:
function GateArchiveUpload(Pdf: TPDFlib; const FileName: string): Boolean;
var
  ListId, I: Integer;
begin
  ListId := Pdf.CheckFileCompliance(FileName, '', 1, 0); // 1 = PDF/A
  if ListId = 0 then
  begin
    // 0 means "no findings" OR "file unreadable" -- disambiguate before passing
    Result := Pdf.LastErrorCode = 0;
    Exit;
  end;
  Result := False; // a non-zero handle means at least one finding
  try
    // LogFinding is the application's own logging routine
    for I := 0 to Pdf.GetStringListCount(ListId) - 1 do
      LogFinding(FileName, Pdf.GetStringListItem(ListId, I));
  finally
    Pdf.ReleaseStringList(ListId); // release the handle even if logging raises
  end;
end;
Two implementation details decide whether this works unattended. First, CheckFileCompliance returns 0 both when the file is fully compliant and when the file could not be opened at all — internally, an empty result list produces 0 either way — so a gate that reads 0 as a pass will wave corrupt uploads straight into the archive. Disambiguate with LastErrorCode, as above. Second, the checker runs on the library's streaming reader rather than the full document model: it opens the file directly with read sharing and never needs LoadFromFile, which is why it handles multi-gigabyte files without building object trees — but it will fail while another process still holds the file open for writing, which is precisely the state of an upload in progress. Gate after the transfer completes, not during.
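One way to gate after the transfer completes is to probe the file before handing it to the checker. The sketch below uses only the standard RTL, not PDFlibPas. Both function names are hypothetical, and it assumes Windows sharing semantics, where an exclusive open fails while any other process still holds the file.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed under Free Pascal
uses
  SysUtils, Classes;

// True when FileName can be opened with exclusive sharing, which on
// Windows means no uploader or other process still has it open.
// A missing file also yields False.
function IsFileQuiescent(const FileName: string): Boolean;
begin
  try
    TFileStream.Create(FileName, fmOpenRead or fmShareExclusive).Free;
    Result := True;
  except
    on EFOpenError do
      Result := False;
  end;
end;

// Polls until the file is quiescent or the attempts run out.
function WaitForUpload(const FileName: string;
  Attempts, DelayMs: Integer): Boolean;
var
  I: Integer;
begin
  Result := False;
  for I := 1 to Attempts do
  begin
    if IsFileQuiescent(FileName) then
    begin
      Result := True;
      Exit;
    end;
    Sleep(DelayMs);
  end;
end;
```

A gate would call WaitForUpload first and run the compliance check only when it returns True. Where the upload mechanism can signal completion itself, for example by renaming the file into the watched folder, that signal is more reliable than polling.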
Throughput follows from the same design. Because each check opens its input read-only and shares the file for reading, a corpus audit parallelizes naturally across worker threads or processes, one TPDFlib instance per worker. The string-list handles are the one resource to stay disciplined about: every non-zero handle from CheckFileCompliance remains allocated until ReleaseStringList, and a long-running gate that leaks them degrades slowly rather than failing loudly.
Reports for humans, diffs for build gates
Finding lists are the right shape for a gate and the wrong shape for an email to the template team. CreatePreflightReport renders the same analysis as a readable report, CreatePreflightReportEx adds a report-format selector, and SavePreflightReport writes the report to disk so it can ship inside the document package — a contractual requirement in plenty of archival delivery agreements.
The underrated member of the family is ComparePreflightReports. Compliance results are a regression surface like any other: a template tweak, a new corporate font, or a library upgrade can each introduce findings that were absent last release. Keep golden reports for a corpus of representative documents under version control, regenerate them after every change, and let ComparePreflightReports compute the delta. An empty diff becomes a release artifact; a surprise finding fails the build instead of failing the audit.
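The signature of ComparePreflightReports is not shown here, so the sketch below does not call it. It illustrates the same regression idea with plain RTL string lists, assuming the golden and current results are available as text with one finding per line. The function name is hypothetical.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed under Free Pascal
uses
  SysUtils, Classes;

// Returns the lines of CurrentText that do not occur in BaselineText,
// one per line. An empty result means no new findings appeared.
// TStringList compares case-insensitively by default.
function NewFindings(const BaselineText, CurrentText: string): string;
var
  Baseline, Current, Added: TStringList;
  I: Integer;
begin
  Baseline := nil;
  Current := nil;
  Added := nil;
  try
    Baseline := TStringList.Create;
    Current := TStringList.Create;
    Added := TStringList.Create;
    Baseline.Text := BaselineText;
    Current.Text := CurrentText;
    for I := 0 to Current.Count - 1 do
      if Baseline.IndexOf(Current[I]) < 0 then
        Added.Add(Current[I]);
    Result := Trim(Added.Text);
  finally
    Added.Free;
    Current.Free;
    Baseline.Free;
  end;
end;
```

A build step would fail when the result is non-empty and print it as the reason. Findings that disappeared since the baseline are deliberately ignored here; a stricter gate could report those too, so that the golden file is refreshed on purpose rather than by drift.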
Generating output that passes on the first run
Preflight earns its keep on inbound files; for documents your own code produces, repairing findings after generation is backwards. PDFlibPas carries a generation-side mode for each standard, and the two compose:
procedure GenerateStatement;
var
  Pdf: TPDFlib;
  Diag: WideString;
begin
  Pdf := TPDFlib.Create;
  try
    Pdf.NewDocument;
    Pdf.SetPDFAMode(1);
    Pdf.LoadOutputIntentProfile('sRGB-IEC61966-2.1.icc', 'RGB');
    Pdf.SetPDFUAMode('en-US');
    Pdf.SetInformation(1, 'Quarterly Statement'); // /Title: required for PDF/UA
    // ... draw tagged content here ...
    Diag := Pdf.GetPDFUADiagnostics;
    if Diag <> '' then
      Writeln('fix before shipping: ', Diag);
    Pdf.SaveToFile('statement.pdf');
    // the preflight that counts runs on the saved file:
    Writeln(Pdf.CreatePreflightReport('statement.pdf', '', 1, 0));
  finally
    Pdf.Free;
  end;
end;
The trap hides at save time. Several conformance repairs — forcing the print flag on annotations, writing the default AFRelationship for PDF/A-3 embedded files, normalizing tab order and form-field descriptions for PDF/UA — are applied while the document is serialized, not when the mode is enabled. The in-memory document is not byte-identical to what reaches disk, so the only preflight verdict that counts is the one computed from the saved file. Validate statement.pdf; never infer compliance from the document object still in memory.
Invoice scenarios that embed machine-readable XML beside the visual document — the ZUGFeRD and Factur-X pattern, built on PDF/A-3 — should set the attachment relationship explicitly through SetPDFA3DefaultAFRelationship, because ISO 19005-3 requires every embedded file to declare its role relative to the document.
Independent referees: veraPDF and Acrobat
Never let a producer grade its own homework. The PDFlibPas checkers give fast, structured, in-process verdicts; the release gate for archival batches should still include an independent validator. veraPDF is the community-maintained reference implementation for PDF/A validation and the tool most archives name in their acceptance criteria; Acrobat's preflight profiles make a useful third opinion when two tools disagree. Record the validator name and version next to every stored report — a claim of passing veraPDF means little without knowing which veraPDF.
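Recording provenance can be as simple as one delimited line stored beside each report. The sketch below is an assumption about how such a record might look; the function name, the field order, and the separator are all hypothetical.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed under Free Pascal
uses
  SysUtils;

// Builds one sidecar line: validator, exact version, profile, verdict.
// The pipe separator is an arbitrary choice for this sketch.
function ProvenanceLine(const Validator, Version, Profile: string;
  Passed: Boolean): string;
const
  Verdict: array[Boolean] of string = ('FAIL', 'PASS');
begin
  Result := Format('%s|%s|%s|%s',
    [Validator, Version, Profile, Verdict[Passed]]);
end;
```

The version argument should be whatever string the validator itself reports at run time, captured by the pipeline rather than typed into configuration, so the record cannot drift from the tool that actually ran.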
When validators do disagree, and around the edges of the standards they occasionally will, shrink the file to a minimal sample and rule against the standard text rather than against either tool. The exercise takes an hour and usually surfaces either a tool bug worth reporting or a misread clause worth documenting in the team's compliance notes.
Encrypted inputs deserve a special rule. Both checkers accept a password argument, but for PDF/A the encryption dictionary itself is already a conformance violation: ISO 19005-1 forbids it outright, and parts 2 and 3 keep the prohibition. An encrypted submission can therefore be rejected before any deeper analysis runs. Auditing what an encryption dictionary actually permits is a separate exercise, covered in PDF encryption and permissions auditing.
PDF/UA findings nearly always trace back to how the structure tree was authored; the tagging techniques are covered in building tagged PDF structure trees in Delphi. Archives that also require digital signatures should pair this gate with the workflow in PAdES signing and validation. The full preflight API reference lives on the losLab PDF Library for Delphi product page.