Technical article

PDFlibPas: Large-PDF merge and split with direct access in Delphi

losLab PDF Library provides a PDF engine, supplied with source code, for Delphi and C++Builder teams. It can be used in desktop, server, DLL, ActiveX, and Dylib workflows, and offers PDF/A and PDF/UA checks, PAdES signing, and multiple renderers without an external PDF service.

This article is for teams assembling statements, packets, evidence bundles, or page extracts from large customer PDFs. It treats large-PDF merge and split with direct access as production document engineering rather than as a single component call.

The practical risk is that merge and split tools often preserve pages but lose bookmarks, named destinations, metadata, page labels, or error evidence when files become large. The work therefore needs an explicit contract, observable diagnostics, and regression samples that resemble real customer files.

Architecture decisions

Define what must follow the page. Settle the following decisions before implementation starts:

  • page-range syntax, validation behavior, and empty-range handling
  • bookmark, destination, page-label, annotation, and form preservation rules
  • metadata ownership when multiple source documents are merged
  • temporary storage, rollback, progress, and cancellation policy for large files
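
The first decision above (page-range syntax, validation behavior, and empty-range handling) can be pinned down in a small parser that is testable without any PDF engine. The sketch below assumes a comma-separated syntax such as '1-3,7,10-12' with 1-based inclusive pages, and assumes that an empty specification is an error rather than "all pages". The names TPageRange, ERangeSpecError, TryParsePage, and ParsePageRanges are hypothetical and are not part of PDFlibPas.

```pascal
program PageRangeSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  TPageRange = record
    FirstPage, LastPage: Integer; // 1-based, inclusive
  end;
  TPageRanges = array of TPageRange;

  ERangeSpecError = class(Exception);

// Accepts plain decimal digits only, so '', '-3', '$10' and '0x10' all fail.
function TryParsePage(const S: string; out Page: Integer): Boolean;
var
  C: Char;
begin
  Page := 0;
  if (S = '') or (Length(S) > 9) then
    Exit(False);
  for C in S do
    if (C < '0') or (C > '9') then
      Exit(False);
  Page := StrToInt(S);
  Result := True;
end;

// Parses the whole specification or raises; it never clamps or skips items.
function ParsePageRanges(const Spec: string; PageCount: Integer): TPageRanges;
var
  Rest, Item: string;
  CommaPos, DashPos: Integer;
  R: TPageRange;
  Valid, Done: Boolean;
begin
  Result := nil;
  if Trim(Spec) = '' then
    raise ERangeSpecError.Create('Empty page-range specification');
  Rest := Spec;
  Done := False;
  repeat
    // Take the next comma-separated item; the last item ends the loop.
    CommaPos := Pos(',', Rest);
    if CommaPos = 0 then
    begin
      Item := Trim(Rest);
      Done := True;
    end
    else
    begin
      Item := Trim(Copy(Rest, 1, CommaPos - 1));
      Rest := Copy(Rest, CommaPos + 1, MaxInt);
    end;
    R.FirstPage := 0;
    R.LastPage := 0;
    DashPos := Pos('-', Item);
    if DashPos = 0 then
    begin
      Valid := TryParsePage(Item, R.FirstPage);
      R.LastPage := R.FirstPage;
    end
    else
      Valid := TryParsePage(Trim(Copy(Item, 1, DashPos - 1)), R.FirstPage)
        and TryParsePage(Trim(Copy(Item, DashPos + 1, MaxInt)), R.LastPage);
    if (not Valid) or (R.FirstPage < 1) or (R.LastPage < R.FirstPage)
      or (R.LastPage > PageCount) then
      raise ERangeSpecError.CreateFmt(
        'Invalid page range "%s" for a %d-page file', [Item, PageCount]);
    SetLength(Result, Length(Result) + 1);
    Result[High(Result)] := R;
  until Done;
end;

var
  Ranges: TPageRanges;
  R: TPageRange;
begin
  Ranges := ParsePageRanges('1-3, 7, 10-12', 20);
  for R in Ranges do
    Writeln(R.FirstPage, '-', R.LastPage);
  try
    ParsePageRanges('5-2', 20); // reversed range is rejected
  except
    on E: ERangeSpecError do
      Writeln('Rejected: ', E.Message);
  end;
end.
```

Raising on every malformed item, instead of clamping to the page count, keeps the contract explicit: a job either runs with exactly the requested pages or fails before anything is written.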

Implementation flow

Plan ranges and retained structures up front. The order below keeps the workflow reviewable for Delphi and C++Builder teams.

  1. validate all input files, page ranges, and output destinations before writing
  2. create a page mapping that records source file, source page, and output page
  3. copy or rebuild supporting structures according to the assembly profile
  4. write to a temporary output and validate the result before atomic replacement
  5. save the page map and warnings with the completed job
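
Step 2 above can be sketched independently of the PDF engine. The example below builds a page map from already validated selections and records the source file, source page, and output page for each entry. The types TSourceSelection and TPageMapEntry and the functions Sel and BuildPageMap are hypothetical names for this illustration and are not part of PDFlibPas.

```pascal
program PageMapSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  TSourceSelection = record
    FileName: string;
    FirstPage, LastPage: Integer; // 1-based, inclusive, assumed validated
  end;

  TPageMapEntry = record
    SourceFile: string;
    SourcePage: Integer;
    OutputPage: Integer; // 1-based position in the assembled output
  end;
  TPageMap = array of TPageMapEntry;

// Small constructor so selections can be written inline.
function Sel(const FileName: string; FirstPage, LastPage: Integer): TSourceSelection;
begin
  Result.FileName := FileName;
  Result.FirstPage := FirstPage;
  Result.LastPage := LastPage;
end;

// Output pages are numbered in the order the selections are listed.
function BuildPageMap(const Selections: array of TSourceSelection): TPageMap;
var
  Total, Next, I, P: Integer;
begin
  Total := 0;
  for I := 0 to High(Selections) do
    Inc(Total, Selections[I].LastPage - Selections[I].FirstPage + 1);
  SetLength(Result, Total);
  Next := 0;
  for I := 0 to High(Selections) do
    for P := Selections[I].FirstPage to Selections[I].LastPage do
    begin
      Result[Next].SourceFile := Selections[I].FileName;
      Result[Next].SourcePage := P;
      Result[Next].OutputPage := Next + 1;
      Inc(Next);
    end;
end;

var
  Map: TPageMap;
  E: TPageMapEntry;
begin
  Map := BuildPageMap([Sel('statement.pdf', 1, 2), Sel('terms.pdf', 5, 5)]);
  for E in Map do
    Writeln(E.OutputPage, ' <- ', E.SourceFile, ' p.', E.SourcePage);
end.
```

Because the map is built before any page is copied, it can serve both as the plan for the merge and as the record saved in step 5.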

Validation evidence

Support needs merge and split evidence. Keep these fields with the output or support record.

  • input file list, hashes, sizes, page counts, selected ranges, and output page count
  • page map plus retained or dropped bookmark and destination counts
  • temporary path, cancellation point, rollback result, and elapsed time
  • warnings for damaged pages, unsupported structures, or signature implications
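
One way to keep these fields together is a single record that is filled during the job and written next to the output. The following is a minimal sketch under stated assumptions: the record layout, the key=value text format, and the names TInputEvidence, TJobEvidence, and EvidenceToText are all hypothetical, and the hash is expected to be computed by the caller.

```pascal
program MergeEvidenceSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  TInputEvidence = record
    FileName: string;
    Sha256Hex: string;      // computed by the caller, not by this sketch
    SizeBytes: Int64;
    PageCount: Integer;
    SelectedRanges: string; // as requested, for example '1-3,7'
  end;

  TJobEvidence = record
    Inputs: array of TInputEvidence;
    OutputPageCount: Integer;
    BookmarksRetained: Integer;
    BookmarksDropped: Integer;
    TempPath: string;
    RollbackResult: string;
    ElapsedMs: Int64;
    Warnings: array of string;
  end;

// Renders one line per input, one summary line, and one line per warning.
function EvidenceToText(const Job: TJobEvidence): string;
var
  I: Integer;
begin
  Result := '';
  for I := 0 to High(Job.Inputs) do
    Result := Result + Format('input=%s sha256=%s bytes=%d pages=%d ranges=%s',
      [Job.Inputs[I].FileName, Job.Inputs[I].Sha256Hex, Job.Inputs[I].SizeBytes,
       Job.Inputs[I].PageCount, Job.Inputs[I].SelectedRanges]) + sLineBreak;
  Result := Result + Format(
    'output_pages=%d bookmarks_kept=%d bookmarks_dropped=%d temp=%s rollback=%s elapsed_ms=%d',
    [Job.OutputPageCount, Job.BookmarksRetained, Job.BookmarksDropped,
     Job.TempPath, Job.RollbackResult, Job.ElapsedMs]) + sLineBreak;
  for I := 0 to High(Job.Warnings) do
    Result := Result + 'warning=' + Job.Warnings[I] + sLineBreak;
end;

var
  Job: TJobEvidence;
begin
  // Illustrative placeholder values only.
  SetLength(Job.Inputs, 1);
  Job.Inputs[0].FileName := 'statement.pdf';
  Job.Inputs[0].Sha256Hex := '<sha256 of statement.pdf>';
  Job.Inputs[0].SizeBytes := 0;
  Job.Inputs[0].PageCount := 12;
  Job.Inputs[0].SelectedRanges := '1-3';
  Job.OutputPageCount := 3;
  Job.BookmarksRetained := 2;
  Job.BookmarksDropped := 1;
  Job.TempPath := 'out.pdf.tmp';
  Job.RollbackResult := 'not needed';
  Job.ElapsedMs := 0;
  SetLength(Job.Warnings, 1);
  Job.Warnings[0] := 'source is signed; extraction may affect signature trust';
  Write(EvidenceToText(Job));
end.
```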

Pages are not the only content being moved

Large-document assembly should consider bookmarks, destinations, annotations, forms, attachments, metadata, page labels, and signatures. Direct access helps performance, but product policy decides which structures are preserved, rebuilt, or dropped.

Regression files worth keeping

Keep more than successful samples. A useful regression set for large-PDF merge and split with direct access contains normal files, boundary files, and intentional failure files so that behavior stays stable across releases.

  • signed source documents may lose signature trust when pages are extracted
  • bookmarks can point to pages that are removed or reordered
  • forms with shared field names can collide after merge
  • large output files need atomic replacement to avoid partial delivery

Engineering review notes for large-PDF merge and split with direct access

Use these review notes to make sure the feature has moved beyond a demo and can be defended during release, support, and customer escalation.

  • Decision: page-range syntax, validation behavior, and empty-range handling.
    Implementation pressure point: create a page mapping that records source file, source page, and output page.
    Acceptance evidence: temporary path, cancellation point, rollback result, and elapsed time.
    Regression trigger: large output files need atomic replacement to avoid partial delivery.
  • Decision: bookmark, destination, page-label, annotation, and form preservation rules.
    Implementation pressure point: copy or rebuild supporting structures according to the assembly profile.
    Acceptance evidence: warnings for damaged pages, unsupported structures, or signature implications.
    Regression trigger: signed source documents may lose signature trust when pages are extracted.
  • Decision: metadata ownership when multiple source documents are merged.
    Implementation pressure point: write to a temporary output and validate the result before atomic replacement.
    Acceptance evidence: input file list, hashes, sizes, page counts, selected ranges, and output page count.
    Regression trigger: bookmarks can point to pages that are removed or reordered.
  • Decision: temporary storage, rollback, progress, and cancellation policy for large files.
    Implementation pressure point: save the page map and warnings with the completed job.
    Acceptance evidence: page map plus retained or dropped bookmark and destination counts.
    Regression trigger: forms with shared field names can collide after merge.
  • Decision: page-range syntax, validation behavior, and empty-range handling.
    Implementation pressure point: validate all input files, page ranges, and output destinations before writing.
    Acceptance evidence: temporary path, cancellation point, rollback result, and elapsed time.
    Regression trigger: large output files need atomic replacement to avoid partial delivery.
  • Decision: bookmark, destination, page-label, annotation, and form preservation rules.
    Implementation pressure point: create a page mapping that records source file, source page, and output page.
    Acceptance evidence: warnings for damaged pages, unsupported structures, or signature implications.
    Regression trigger: signed source documents may lose signature trust when pages are extracted.
  • Decision: metadata ownership when multiple source documents are merged.
    Implementation pressure point: copy or rebuild supporting structures according to the assembly profile.
    Acceptance evidence: input file list, hashes, sizes, page counts, selected ranges, and output page count.
    Regression trigger: bookmarks can point to pages that are removed or reordered.

Edge cases

  • signed source documents may lose signature trust when pages are extracted
  • bookmarks can point to pages that are removed or reordered
  • forms with shared field names can collide after merge
  • large output files need atomic replacement to avoid partial delivery
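
The bookmark case can be made concrete with a small remapping routine. The sketch below assumes a simplified model in which each bookmark targets a single source page and a bookmark whose target page was removed is dropped and counted. Whether to drop, retarget, or rebuild such bookmarks is a product policy decision. TBookmark, Bm, and RemapBookmarks are hypothetical names and do not reflect how PDFlibPas represents outlines.

```pascal
program BookmarkRemapSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  TBookmark = record
    Title: string;
    TargetPage: Integer; // source page on input, output page after remapping
  end;
  TBookmarks = array of TBookmark;

function Bm(const Title: string; TargetPage: Integer): TBookmark;
begin
  Result.Title := Title;
  Result.TargetPage := TargetPage;
end;

// KeptSourcePages[J] is the source page that becomes output page J + 1.
// Bookmarks whose target page is not kept are dropped and counted.
function RemapBookmarks(const Source: TBookmarks;
  const KeptSourcePages: array of Integer; out Dropped: Integer): TBookmarks;
var
  I, J, Kept, NewPage: Integer;
begin
  Dropped := 0;
  Kept := 0;
  SetLength(Result, Length(Source));
  for I := 0 to High(Source) do
  begin
    NewPage := 0;
    for J := 0 to High(KeptSourcePages) do
      if KeptSourcePages[J] = Source[I].TargetPage then
      begin
        NewPage := J + 1;
        Break;
      end;
    if NewPage = 0 then
      Inc(Dropped)
    else
    begin
      Result[Kept].Title := Source[I].Title;
      Result[Kept].TargetPage := NewPage;
      Inc(Kept);
    end;
  end;
  SetLength(Result, Kept);
end;

var
  Marks, Remapped: TBookmarks;
  B: TBookmark;
  DroppedCount: Integer;
begin
  SetLength(Marks, 3);
  Marks[0] := Bm('Introduction', 1);
  Marks[1] := Bm('Fees', 4);
  Marks[2] := Bm('Appendix', 9);
  // Pages 4 and 1 are kept, in that order; page 9 is removed.
  Remapped := RemapBookmarks(Marks, [4, 1], DroppedCount);
  for B in Remapped do
    Writeln(B.Title, ' -> output page ', B.TargetPage);
  Writeln('Dropped: ', DroppedCount);
end.
```

The retained and dropped counts returned here are the same figures listed under validation evidence, so the remapping step can feed the support record directly.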

Delphi / C++Builder notes

PDFlibPas should sit behind a small service boundary that receives files, streams, profiles, and credentials, then returns output paths, warnings, metrics, and validation status. Important terms include merge, split, direct access, page range, bookmark, page map.

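Delphi code example

The following Delphi sketch shows a practical service boundary for this topic. Keeping policy checks, logging, and validation outside the narrow product call makes the workflow easier to test.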

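// Service-boundary sketch. RequireSortedInputList and VerifyMergedPageRanges
// are application-side helpers that are not shown here.
procedure MergeLargePdfSet(const ListFile, OutputFile: string);
var
  Pdf: TPDFlib;
begin
  Pdf := TPDFlib.Create;
  try
    // Policy check before the product call.
    RequireSortedInputList(ListFile);
    // Narrow product call; confirm the exact signature and result code
    // in the product documentation.
    Pdf.MergeFileListFast(ListFile, OutputFile);
    // Validation after the product call.
    VerifyMergedPageRanges(OutputFile);
  finally
    Pdf.Free;
  end;
end;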

Production checklist

  • Run the workflow on an empty file, a normal customer file, and a worst-case file
  • Open the generated PDF with the target viewer, validator, printer, or downstream application
  • Log product version, profile version, input hash, output path, elapsed time, and warning count
  • Keep passwords, certificates, temporary files, and customer data under explicit retention rules
  • Add regression documents when a customer file exposes a new edge case

Product documentation

PDFlibPas