Technical article

PDFium: PreflightReportCli batch validation in Delphi

Integrate PDFium VCL Component workflows into Delphi and C++Builder applications, or PDFium LCL Component workflows into Lazarus/FPC, with source-code components for viewing, rendering, forms, printing, preflight reports, and standards-oriented validation.

This article is written for teams integrating command-line PDF validation into build, intake, archive, or customer-processing pipelines. It treats PreflightReportCli batch validation as production-grade document engineering, not as an isolated component call.

The practical risk is that batch validation is only useful if exit codes, timeouts, report paths, severity thresholds, and retry behavior are stable enough for automation. The workflow therefore needs a written contract, observable diagnostics, and realistic regression files.

Architecture decisions

Design the CLI as a contract for automation. Write down the following decisions before the tool is wired into a scheduler or CI system:

  • profile selection, fail-on severity, report format, and output directory layout
  • exit-code mapping for pass, warning, failure, timeout, bad input, and internal error
  • parallelism, maximum file size, quarantine path, and retry rules
  • how structured reports are retained with tickets, builds, or customer jobs
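The decisions above can be pinned down in one place so that the CLI and its callers share a single definition. The following is a minimal sketch, not the PreflightReportCli implementation: every type, field, and constant name is hypothetical, and only the exit-code values 0, 1, and 2 follow the batch program at the end of this article. The remaining values are assumptions chosen for illustration.

```pascal
type
  // Hypothetical severity scale; align it with the levels your reports emit.
  TSeverity = (svNone, svInfo, svWarning, svError);

  // Hypothetical report formats.
  TReportFormat = (rfJson, rfHtml);
  TReportFormats = set of TReportFormat;

  // One record holds every decision a scheduler depends on.
  TBatchContract = record
    ProfileName: string;        // validation profile to apply
    FailOn: TSeverity;          // lowest severity that fails the job
    Formats: TReportFormats;    // report formats to write
    OutputDir: string;          // root of the report directory layout
    QuarantineDir: string;      // where rejected inputs are moved
    MaxParallelJobs: Integer;
    MaxFileSizeBytes: Int64;    // 0 means "no limit" in this sketch
    MaxRetries: Integer;
    TimeoutMs: Cardinal;
  end;

const
  // Exit codes are part of the contract. Values above 2 are assumptions.
  ExitPass          = 0;
  ExitFailure       = 1;  // validation verdict: the document has issues
  ExitInternalError = 2;  // tool failure, not a verdict
  ExitWarning       = 3;
  ExitTimeout       = 4;
  ExitBadInput      = 5;

// Conservative defaults: fail only on errors, one job at a time, no retries.
function DefaultContract: TBatchContract;
begin
  Result := Default(TBatchContract);
  Result.FailOn := svError;
  Result.Formats := [rfJson, rfHtml];
  Result.MaxParallelJobs := 1;
  Result.MaxRetries := 0;
  Result.TimeoutMs := 60000;
end;
```

Keeping the contract in a record makes it easy to log alongside each batch, which in turn makes a changed default visible in the job history.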

Implementation flow

Map validation findings to process outcomes. The order below keeps the workflow auditable for Delphi and C++Builder teams.

  1. standardize command-line arguments before wiring the tool into schedulers
  2. run one file per isolated job so timeouts and failures do not poison the batch
  3. write human and machine reports to deterministic locations
  4. translate findings into pass, review, block, or quarantine states
  5. summarize the batch with counts by severity, profile, and outcome
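Step 2 can be sketched with plain Windows API calls: each file is validated in its own child process, and the parent enforces the timeout. This is an illustrative, Windows-only sketch. The function and type names are hypothetical, and the command line is passed in as-is because this article does not define the tool's arguments.

```pascal
uses
  Winapi.Windows, System.SysUtils;

type
  TJobEnd = (jeExited, jeTimedOut);

// Runs one command line as a child process and waits at most TimeoutMs.
// ExitCode is only meaningful when the result is jeExited.
function RunIsolatedJob(const CommandLine: string; TimeoutMs: DWORD;
  out ExitCode: DWORD): TJobEnd;
var
  StartInfo: TStartupInfo;
  ProcInfo: TProcessInformation;
  Cmd: string;
begin
  ExitCode := 0;
  Cmd := CommandLine;
  UniqueString(Cmd);   // CreateProcess may write to the command-line buffer
  FillChar(StartInfo, SizeOf(StartInfo), 0);
  StartInfo.cb := SizeOf(StartInfo);
  FillChar(ProcInfo, SizeOf(ProcInfo), 0);
  if not CreateProcess(nil, PChar(Cmd), nil, nil, False, CREATE_NO_WINDOW,
    nil, nil, StartInfo, ProcInfo) then
    RaiseLastOSError;  // the job could not start: a tool failure
  try
    if WaitForSingleObject(ProcInfo.hProcess, TimeoutMs) = WAIT_TIMEOUT then
    begin
      TerminateProcess(ProcInfo.hProcess, 1);       // stop the stuck job
      WaitForSingleObject(ProcInfo.hProcess, 5000); // give the kill time to land
      Exit(jeTimedOut);
    end;
    if not GetExitCodeProcess(ProcInfo.hProcess, ExitCode) then
      RaiseLastOSError;
    Result := jeExited;
  finally
    CloseHandle(ProcInfo.hThread);
    CloseHandle(ProcInfo.hProcess);
  end;
end;
```

Because a timeout is reported separately from the exit code, the caller can record it as its own outcome instead of confusing it with a validation failure.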

Validation evidence

Collect batch evidence that makes failures actionable. Keep these fields together with the output or the support material.

  • command arguments, profile version, file count, elapsed time, and exit code
  • per-file report path, status, issue count, highest severity, and failure reason
  • timeout, retry, quarantine, and cleanup actions
  • machine-readable summary attached to the automated job record
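The per-file fields listed above can be serialized as one JSON object per file. The sketch below uses the Delphi RTL unit System.JSON (in Lazarus/FPC the fpjson unit plays a similar role). The record and its JSON key names are hypothetical and only illustrate the shape of such a record.

```pascal
uses
  System.SysUtils, System.JSON;

type
  // Hypothetical per-file evidence record; field names are illustrative.
  TFileEvidence = record
    FileName: string;
    ReportPath: string;
    Status: string;           // for example 'pass', 'review', 'block'
    IssueCount: Integer;
    HighestSeverity: string;
    FailureReason: string;    // empty when the tool itself did not fail
    ElapsedMs: Int64;
    ExitCode: Integer;
  end;

// Serializes one evidence record as a single-line JSON object.
function EvidenceToJson(const E: TFileEvidence): string;
var
  Obj: TJSONObject;
begin
  Obj := TJSONObject.Create;
  try
    Obj.AddPair('file', E.FileName);
    Obj.AddPair('reportPath', E.ReportPath);
    Obj.AddPair('status', E.Status);
    Obj.AddPair('issueCount', TJSONNumber.Create(E.IssueCount));
    Obj.AddPair('highestSeverity', E.HighestSeverity);
    Obj.AddPair('failureReason', E.FailureReason);
    Obj.AddPair('elapsedMs', TJSONNumber.Create(E.ElapsedMs));
    Obj.AddPair('exitCode', TJSONNumber.Create(E.ExitCode));
    Result := Obj.ToJSON;   // the object owns its pairs and frees them
  finally
    Obj.Free;
  end;
end;
```

Writing one such line per file produces a summary that can be appended to while the batch is still running and attached to the job record afterwards.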

Exit codes should reflect business policy

A preflight CLI is a bridge between PDF analysis and automation. It should produce predictable reports for people, structured output for machines, and clear exit behavior for schedulers and CI systems.
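One way to keep exit behavior tied to policy is to separate the verdict (what the report found) from the outcome (what the pipeline should do about it). The sketch below is an assumption-laden illustration: the severity scale, the two pipeline kinds, and the exit-code value for a review outcome are hypothetical, while 0 and 1 follow the batch program at the end of this article.

```pascal
type
  // Hypothetical severity scale and pipeline kinds.
  TSeverity = (svNone, svInfo, svWarning, svError);
  TPipeline = (plCI, plIntake);
  TOutcome  = (ocPass, ocReview, ocBlock);

// Policy: errors always block; warnings block CI but only flag intake jobs.
function OutcomeFor(Highest: TSeverity; Pipeline: TPipeline): TOutcome;
begin
  case Highest of
    svError:
      Result := ocBlock;
    svWarning:
      begin
        if Pipeline = plCI then
          Result := ocBlock
        else
          Result := ocReview;
      end;
  else
    Result := ocPass;
  end;
end;

// The scheduler only ever sees the outcome, never the raw severity.
function ExitCodeFor(Outcome: TOutcome): Integer;
begin
  case Outcome of
    ocBlock:  Result := 1;
    ocReview: Result := 3;   // assumed value for "passed with warnings"
  else
    Result := 0;
  end;
end;
```

With this split, a change in business policy touches only OutcomeFor, and the exit codes that schedulers depend on stay unchanged.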

Regression files worth keeping

Keep more than successful samples. A useful PreflightReportCli batch validation regression set contains normal files, boundary files, and intentional failure files so that behavior stays stable across releases. The set should confirm the following behaviors:

  • one corrupt file should not prevent reports for all remaining files
  • warnings may need to fail CI but only flag intake jobs for manual review
  • parallel processing should not write reports with colliding names
  • operators need a stable difference between validation failure and tool failure
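Report-name collisions typically appear when two inputs share a base name but come from different directories. One possible remedy, sketched below, is to derive the report name from the base name plus a short hash of the full input path. The function name is hypothetical, and the 12-character tag length is an arbitrary choice. THashSHA2 comes from the Delphi RTL unit System.Hash.

```pascal
uses
  System.SysUtils, System.Hash;

// Builds a deterministic report path: the same input always maps to the
// same report file, and inputs from different folders get different tags.
// Ext includes the leading dot, for example '.json'.
function ReportPathFor(const ReportDir, InputFile, Ext: string): string;
var
  FullName, Tag: string;
begin
  FullName := ExpandFileName(InputFile);
  // The first 12 hex digits of the SHA-256 of the full path form the tag.
  Tag := Copy(THashSHA2.GetHashString(FullName), 1, 12);
  Result := IncludeTrailingPathDelimiter(ReportDir) +
    ChangeFileExt(ExtractFileName(FullName), '') + '.' + Tag + Ext;
end;
```

A regression test can then feed two same-named files from different folders through a parallel run and check that both reports exist afterwards.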

Technical review notes for PreflightReportCli batch validation

Use these review notes to confirm that the feature has moved past demo level and can be defended during delivery, support, and customer escalation.

  • Decision: profile selection, fail-on severity, report format, and output directory layout. Implementation pressure point: run one file per isolated job so timeouts and failures do not poison the batch. Acceptance evidence: timeout, retry, quarantine, and cleanup actions. Regression trigger: operators need a stable difference between validation failure and tool failure.
  • Decision: exit-code mapping for pass, warning, failure, timeout, bad input, and internal error. Implementation pressure point: write human and machine reports to deterministic locations. Acceptance evidence: machine-readable summary attached to the automated job record. Regression trigger: one corrupt file should not prevent reports for all remaining files.
  • Decision: parallelism, maximum file size, quarantine path, and retry rules. Implementation pressure point: translate findings into pass, review, block, or quarantine states. Acceptance evidence: command arguments, profile version, file count, elapsed time, and exit code. Regression trigger: warnings may need to fail CI but only flag intake jobs for manual review.
  • Decision: how structured reports are retained with tickets, builds, or customer jobs. Implementation pressure point: summarize the batch with counts by severity, profile, and outcome. Acceptance evidence: per-file report path, status, issue count, highest severity, and failure reason. Regression trigger: parallel processing should not write reports with colliding names.
  • Decision: profile selection, fail-on severity, report format, and output directory layout. Implementation pressure point: standardize command-line arguments before wiring the tool into schedulers. Acceptance evidence: timeout, retry, quarantine, and cleanup actions. Regression trigger: operators need a stable difference between validation failure and tool failure.
  • Decision: exit-code mapping for pass, warning, failure, timeout, bad input, and internal error. Implementation pressure point: run one file per isolated job so timeouts and failures do not poison the batch. Acceptance evidence: machine-readable summary attached to the automated job record. Regression trigger: one corrupt file should not prevent reports for all remaining files.

Delphi / C++Builder notes

The PDFium Component should sit behind a small service boundary that receives files, streams, profiles, and credentials, then returns output paths, warnings, metrics, and validation status.

Delphi code example

The following Delphi sketch shows a practical service boundary for this topic. Keep policy checks, logging, and validation outside the narrow product call so that the workflow can be tested.

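// Sketch only: PdfView, InspectDocument, WritePreflightJson, and
// WriteBatchSummary are application-level placeholders defined elsewhere.
procedure RunPdfiumBatch(const Files: TArray<string>; const OutputDir: string);
var
  FileName: string;
begin
  for FileName in Files do
  try
    PdfView.LoadFromFile(FileName);
    try
      WritePreflightJson(OutputDir, FileName, InspectDocument(PdfView));
    finally
      PdfView.Close;   // release the document even if inspection raises
    end;
  except
    on E: Exception do // one corrupt file must not stop the remaining files
      WriteLn(ErrOutput, FileName + ': ' + E.Message);
  end;
  WriteBatchSummary(OutputDir);
end;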

Production checklist

  • Run the workflow on an empty file, a normal customer file, and a worst-case file
  • Open the generated PDF file with the appropriate viewer, validator, printer, or downstream application
  • Log product version, profile version, input hash, output path, elapsed time, and warning count
  • Keep passwords, certificates, temporary files, and customer data under clear retention rules
  • Add regression documents when a customer file reveals a new edge case

Product documentation

PDFium Component

More code examples

procedure RunPreflightBatch(const InputDir, ReportDir: string;
  out FilesWithFindings, ToolFailures: Integer);
var
  SR: TSearchRec;
  Pdf: TPdf;
  Report: TPdfPreflightReport;
  InDir, ReportBase: string;
begin
  FilesWithFindings := 0;
  ToolFailures := 0;
  // Normalize the directory so 'C:\in' and 'C:\in\' behave the same
  InDir := IncludeTrailingPathDelimiter(InputDir);
  if FindFirst(InDir + '*.pdf', faAnyFile, SR) = 0 then
  try
    repeat
      Pdf := TPdf.Create(nil);   // fresh instance per file: no state bleed
      try
        try
          Pdf.FileName := InDir + SR.Name;
          Pdf.Active := True;
          if not Pdf.Active then  // load failures are silent, not raised
            raise EPdfError.Create('Cannot open ' + SR.Name);
          Report := BuildPdfPreflightReport(Pdf, [ppsPdfA, ppsPdfUa]);
          // One base name per input keeps the JSON and HTML reports paired
          ReportBase := IncludeTrailingPathDelimiter(ReportDir) +
            ChangeFileExt(SR.Name, '');
          Report.SaveJsonToFile(ReportBase + '.json');
          Report.SaveHtmlToFile(ReportBase + '.html');
          if Report.TotalIssueCount > 0 then
            Inc(FilesWithFindings);
        except
          on E: Exception do
          begin
            Inc(ToolFailures);   // exit-code-2 territory, not a validation verdict
            WriteLn(ErrOutput, SR.Name + ': ' + E.Message);
          end;
        end;
      finally
        Pdf.Free;
      end;
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
end;
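// Program body: maps the batch counters to the process exit code.
var
  Findings, Failures: Integer;
begin
  if ParamCount < 2 then
  begin
    WriteLn(ErrOutput, 'Usage: <input-dir> <report-dir>');
    Halt(2);   // a bad invocation is a tool failure, not a validation verdict
  end;
  RunPreflightBatch(ParamStr(1), ParamStr(2), Findings, Failures);
  if Failures > 0 then
    Halt(2)
  else if Findings > 0 then
    Halt(1);
  // falling through exits with 0: no file had findings or tool failures
end.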