Technical article

PDFlibPas: merging and splitting large PDF files with direct access

losLab PDF Library gives Delphi and C++Builder teams a PDF engine with available source code for desktop, server, DLL, ActiveX, and Dylib targets, with built-in PDF/A and PDF/UA checks, PAdES signing, and selectable renderers, without an external PDF service.

This article is written for teams assembling statements, packets, evidence bundles, or page extracts from large customer PDFs. It treats large-PDF merge and split with direct access as production-grade document engineering, not as an isolated component call.

The practical risk is that merge and split tools often preserve pages but lose bookmarks, named destinations, metadata, page labels, or error evidence when files become large. The workflow therefore needs a written contract, observable diagnostics, and realistic regression files.

Architecture decisions

Define what must follow the page. Settle each of the following decisions in writing before implementation starts.

  • page-range syntax, validation behavior, and empty-range handling
  • bookmark, destination, page-label, annotation, and form preservation rules
  • metadata ownership when multiple source documents are merged
  • temporary storage, rollback, progress, and cancellation policy for large files
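
The first decision, page-range syntax and empty-range handling, can be pinned down with a small parser that has no dependency on the PDF library. The following is a minimal sketch, assuming a comma-separated syntax such as "1-3, 7" with 1-based page numbers. The names ParsePageRanges and TPageArray are hypothetical, and rejecting empty, reversed, and out-of-bounds ranges is one possible policy rather than behavior defined by PDFlibPas.

```pascal
program PageRangeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF FPC}SysUtils{$ELSE}System.SysUtils{$ENDIF};

type
  TPageArray = array of Integer;

// Hypothetical helper: expands a range string such as '1-3, 7' into
// individual 1-based page numbers, in the order they were written.
// Raises EConvertError for empty tokens, non-numeric input, reversed
// ranges, and pages outside 1..PageCount (assumed policy).
function ParsePageRanges(const Spec: string; PageCount: Integer): TPageArray;
var
  Rest, Token: string;
  CommaPos, DashPos, First, Last, P, N: Integer;
begin
  Result := nil;
  N := 0;
  Rest := Spec + ',';  // trailing comma so every token ends with a delimiter
  while Rest <> '' do
  begin
    CommaPos := Pos(',', Rest);
    Token := Trim(Copy(Rest, 1, CommaPos - 1));
    Delete(Rest, 1, CommaPos);
    DashPos := Pos('-', Token);
    if DashPos > 0 then
    begin
      // StrToInt raises EConvertError when either side is empty or not a number
      First := StrToInt(Trim(Copy(Token, 1, DashPos - 1)));
      Last := StrToInt(Trim(Copy(Token, DashPos + 1, MaxInt)));
    end
    else
    begin
      First := StrToInt(Token);
      Last := First;
    end;
    if (First < 1) or (Last < First) or (Last > PageCount) then
      raise EConvertError.CreateFmt('Invalid range "%s" for %d pages',
        [Token, PageCount]);
    SetLength(Result, N + Last - First + 1);
    for P := First to Last do
    begin
      Result[N] := P;
      Inc(N);
    end;
  end;
end;

var
  Pages: TPageArray;
begin
  Pages := ParsePageRanges('1-3, 7', 10);
  Writeln('selected pages: ', Length(Pages));
end.
```

Because the parser raises before any output is produced, it fits the rule of validating all ranges before writing. Whether duplicate or overlapping ranges should be accepted is left open here and belongs in the written contract.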

Implementation flow

Plan ranges and retained structures up front. The order below keeps the workflow reviewable for Delphi and C++Builder teams.

  1. validate all input files, page ranges, and output destinations before writing
  2. create a page mapping that records source file, source page, and output page
  3. copy or rebuild supporting structures according to the assembly profile
  4. write to a temporary output and validate the result before atomic replacement
  5. save the page map and warnings with the completed job
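
Steps 2 and 5 above can be illustrated without touching the PDF library at all. The following is a minimal sketch, assuming the page map is kept as an in-memory array and saved as CSV text next to the job output. TPageMapEntry, AppendRange, and PageMapToCsv are hypothetical names, and the CSV layout is an assumption, not a PDFlibPas format.

```pascal
program PageMapSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF FPC}SysUtils{$ELSE}System.SysUtils{$ENDIF};

type
  // Hypothetical record: one row per output page
  TPageMapEntry = record
    SourceFile: string;
    SourcePage: Integer;  // 1-based page in the source file
    OutputPage: Integer;  // 1-based page in the assembled output
  end;
  TPageMap = array of TPageMapEntry;

// Appends pages FirstPage..LastPage of SourceFile to the map, assigning
// consecutive output page numbers after whatever is already mapped.
procedure AppendRange(var Map: TPageMap; const SourceFile: string;
  FirstPage, LastPage: Integer);
var
  P, N: Integer;
begin
  for P := FirstPage to LastPage do
  begin
    N := Length(Map);
    SetLength(Map, N + 1);
    Map[N].SourceFile := SourceFile;
    Map[N].SourcePage := P;
    Map[N].OutputPage := N + 1;
  end;
end;

// Renders the map as CSV text. File names containing commas would need
// quoting, which this sketch does not attempt.
function PageMapToCsv(const Map: TPageMap): string;
var
  I: Integer;
begin
  Result := 'output_page,source_file,source_page' + sLineBreak;
  for I := 0 to High(Map) do
    Result := Result + Format('%d,%s,%d', [Map[I].OutputPage,
      Map[I].SourceFile, Map[I].SourcePage]) + sLineBreak;
end;

var
  Map: TPageMap;
begin
  Map := nil;
  AppendRange(Map, 'jan.pdf', 1, 2);
  AppendRange(Map, 'feb.pdf', 3, 3);
  Write(PageMapToCsv(Map));
end.
```

Building the map before any pages are copied means the same structure can drive the merge, the post-merge page count check, and the support record.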

Validation evidence

Merge and split evidence for support. Keep these fields together with the output or the support record.

  • input file list, hashes, sizes, page counts, selected ranges, and output page count
  • page map plus retained or dropped bookmark and destination counts
  • temporary path, cancellation point, rollback result, and elapsed time
  • warnings for damaged pages, unsupported structures, or signature implications
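
The retained and dropped bookmark counts in the list above can be derived from the page selection alone. The following is a minimal sketch, assuming bookmark targets have already been resolved to 1-based source page numbers by some earlier step. OutputPageFor and CountBookmarks are hypothetical helpers, not PDFlibPas calls.

```pascal
program BookmarkCountSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Returns the 1-based output page for SourcePage, or 0 when the page
// was not kept. KeptPages lists the source pages in output order.
function OutputPageFor(const KeptPages: array of Integer;
  SourcePage: Integer): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to High(KeptPages) do
    if KeptPages[I] = SourcePage then
    begin
      Result := I + 1;
      Exit;
    end;
end;

// Counts bookmarks whose target page survives the extraction (Retained)
// and bookmarks whose target page was removed (Dropped).
procedure CountBookmarks(const KeptPages, BookmarkTargets: array of Integer;
  out Retained, Dropped: Integer);
var
  I: Integer;
begin
  Retained := 0;
  Dropped := 0;
  for I := 0 to High(BookmarkTargets) do
    if OutputPageFor(KeptPages, BookmarkTargets[I]) > 0 then
      Inc(Retained)
    else
      Inc(Dropped);
end;

var
  Retained, Dropped: Integer;
begin
  // Extract source pages 5..7; bookmarks point at source pages 1, 5, and 7.
  CountBookmarks([5, 6, 7], [1, 5, 7], Retained, Dropped);
  Writeln('retained=', Retained, ' dropped=', Dropped);
end.
```

In this example the bookmark that targets page 1 is counted as dropped, and the two that target pages 5 and 7 are counted as retained. The linear search is fine for a sketch; a very large page map would want a lookup table.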

Pages are not the only content being moved

Large-document assembly should consider bookmarks, destinations, annotations, forms, attachments, metadata, page labels, and signatures. Direct access helps performance, but product policy decides which structures are preserved, rebuilt, or dropped.

Regression files worth keeping

Keep more than successful samples. A useful regression set for large-PDF merge and split with direct access contains normal files, boundary files, and intentional failure files so the behavior is stable across releases.

  • signed source documents may lose signature trust when pages are extracted
  • bookmarks can point to pages that are removed or reordered
  • forms with shared field names can collide after merge
  • large output files need atomic replacement to avoid partial delivery

Technical review notes for large-PDF merge and split with direct access

Use these review notes to make sure the feature has moved past demo level and can be defended during delivery, support, and customer escalation.

  • Decision: page-range syntax, validation behavior, and empty-range handling. Implementation pressure point: create a page mapping that records source file, source page, and output page. Acceptance evidence: temporary path, cancellation point, rollback result, and elapsed time. Regression trigger: large output files need atomic replacement to avoid partial delivery
  • Decision: bookmark, destination, page-label, annotation, and form preservation rules. Implementation pressure point: copy or rebuild supporting structures according to the assembly profile. Acceptance evidence: warnings for damaged pages, unsupported structures, or signature implications. Regression trigger: signed source documents may lose signature trust when pages are extracted
  • Decision: metadata ownership when multiple source documents are merged. Implementation pressure point: write to a temporary output and validate the result before atomic replacement. Acceptance evidence: input file list, hashes, sizes, page counts, selected ranges, and output page count. Regression trigger: bookmarks can point to pages that are removed or reordered
  • Decision: temporary storage, rollback, progress, and cancellation policy for large files. Implementation pressure point: save the page map and warnings with the completed job. Acceptance evidence: page map plus retained or dropped bookmark and destination counts. Regression trigger: forms with shared field names can collide after merge
  • Decision: page-range syntax, validation behavior, and empty-range handling. Implementation pressure point: validate all input files, page ranges, and output destinations before writing. Acceptance evidence: temporary path, cancellation point, rollback result, and elapsed time. Regression trigger: large output files need atomic replacement to avoid partial delivery
  • Decision: bookmark, destination, page-label, annotation, and form preservation rules. Implementation pressure point: create a page mapping that records source file, source page, and output page. Acceptance evidence: warnings for damaged pages, unsupported structures, or signature implications. Regression trigger: signed source documents may lose signature trust when pages are extracted
  • Decision: metadata ownership when multiple source documents are merged. Implementation pressure point: copy or rebuild supporting structures according to the assembly profile. Acceptance evidence: input file list, hashes, sizes, page counts, selected ranges, and output page count. Regression trigger: bookmarks can point to pages that are removed or reordered

Edge cases

  • signed source documents may lose signature trust when pages are extracted
  • bookmarks can point to pages that are removed or reordered
  • forms with shared field names can collide after merge
  • large output files need atomic replacement to avoid partial delivery
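
The form-field collision case in the list above is cheap to detect before a merge is attempted. The following is a minimal sketch, assuming the field names of each source document have already been collected into plain string lists and that an exact, case-sensitive match counts as a collision. CollidingFieldNames is a hypothetical helper, and the field names in the demo are made up for illustration.

```pascal
program FieldCollisionSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Returns the field names that occur in both lists, joined with ', '.
// An empty result means no collision was found under exact matching.
function CollidingFieldNames(const FieldsA, FieldsB: array of string): string;
var
  I, J: Integer;
begin
  Result := '';
  for I := 0 to High(FieldsA) do
    for J := 0 to High(FieldsB) do
      if FieldsA[I] = FieldsB[J] then
      begin
        if Result <> '' then
          Result := Result + ', ';
        Result := Result + FieldsA[I];
        Break;  // report each name from FieldsA at most once
      end;
end;

begin
  // Illustrative field names only
  Writeln('colliding fields: ',
    CollidingFieldNames(['CustomerName', 'Total'], ['Total', 'Signature']));
end.
```

What to do with a collision (rename, flatten, or reject the job) is a policy decision that belongs in the assembly profile; the sketch only produces the warning text.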

Delphi / C++Builder notes

PDFlibPas should sit behind a small service boundary that receives files, streams, profiles, and credentials, then returns output paths, warnings, metrics, and validation status. Important terms include merge, split, direct access, page range, bookmark, page map.

Delphi code example

The following Delphi sketch shows a practical service boundary for this topic. Keep policy checks, logging, and validation outside the narrow product call so the workflow stays testable.

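procedure MergeLargePdfSet(const ListFile, OutputFile: string);
var
  Pdf: TPDFlib;
begin
  Pdf := TPDFlib.Create;
  try
    // Application-level policy check, kept outside the product call
    RequireSortedInputList(ListFile);
    // The narrow product call
    Pdf.MergeFileListFast(ListFile, OutputFile);
    // Application-level validation of the merged result
    VerifyMergedPageRanges(OutputFile);
  finally
    Pdf.Free;
  end;
end;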

Production checklist

  • Run the workflow on an empty file, a normal customer file, and a worst-case file
  • Open the generated PDF with the appropriate viewer, validator, printer, or downstream application
  • Log product version, profile version, input hash, output path, elapsed time, and warning count
  • Keep passwords, certificates, temporary files, and customer data under clear retention rules
  • Add regression documents when a customer file reveals a new edge case

Product documentation

PDFlibPas

More code examples

// Assumes Handle came from an earlier Lib.DAOpenFileReadOnly call on the source file
PageRef := Lib.DAFindPage(Handle, 250);          // page number -> page reference
if PageRef <> 0 then
begin
  Text := Lib.DAExtractPageText(Handle, PageRef, 0);
  Lib.DARenderPageToFile(Handle, PageRef, 5, 150, 'page250.png');
end;

// Build a named file list, then merge it into one output file
Lib.AddToFileList('Statements', 'jan.pdf');
Lib.AddToFileList('Statements', 'feb.pdf');
Lib.AddToFileList('Statements', 'mar.pdf');
Lib.MergeFileList('Statements', 'q1-statements.pdf');

// Verify the result the cheap way: direct access again
Handle := Lib.DAOpenFileReadOnly('q1-statements.pdf', '');
Writeln('merged pages: ', Lib.DAGetPageCount(Handle));
Lib.DACloseFile(Handle);