HotPDF Component: Delphi での Unicode text shaping for complex scripts

HotPDF は Delphi/C++Builder アプリケーション向けのネイティブ VCL PDF ライブラリです。外部 PDF ランタイムを配置せずに、PDF 作成、編集、フォーム、注釈、暗号化、デジタル署名、Unicode フォント、標準対応出力、プリフライトレポートを扱えます。

この記事は developers producing multilingual invoices, certificates, labels, or reports from Delphi 向けです。Unicode text shaping for complex scripts を単なるコンポーネント呼び出しではなく、本番向けのドキュメントエンジニアリングとして扱います。

実務上のリスクは text can appear plausible in a sample PDF while ligatures, bidirectional order, fallback fonts, or copy-and-search behavior fail for real customer names です。そのため、明確な契約、観測可能な診断、実際の顧客ファイルに近い回帰サンプルが必要です。

アーキテクチャ上の判断

Make the text pipeline locale-aware. font fallback order for Arabic, Hebrew, Indic, CJK, and mixed Latin text / normalization rules for copied text, database values, and template placeholders

font fallback order for Arabic, Hebrew, Indic, CJK, and mixed Latin text
normalization rules for copied text, database values, and template placeholders
right-to-left paragraph handling and mixed-direction number policy
whether text must remain searchable, selectable, and accessible after output

実装フロー

Resolve fonts and shaping before pagination. The order below keeps the workflow reviewable for Delphi and C++Builder teams.

normalize source text and record the locale used for formatting
select fonts that contain the required glyphs before measuring layout
shape and position text before page breaks are finalized
embed or subset fonts according to licensing and PDF standard requirements
verify visual output and extracted text with multilingual regression samples

検証エビデンス

Proof that text is readable and extractable. Keep these fields with the output or support record.

font selected for every script range and fallback reason when it changed
glyph coverage warnings, embedding mode, and subset identifier
extracted Unicode text compared with the original application value
viewer screenshots for representative right-to-left and combining-mark cases

Visual output is not enough

Complex-script support involves character normalization, shaping, glyph positioning, embedding, ToUnicode maps, and reading order. A PDF that only looks right in one viewer can still fail search, selection, accessibility, or downstream extraction.

Regression files worth keeping

Keep more than successful samples. A useful Unicode text shaping for complex scripts regression set contains normal files, boundary files, and intentional failure files so the behavior is stable across releases.

database collation can alter composed characters before the PDF layer sees them
font substitution on a developer machine can hide missing embedded fonts
line breaks in bidirectional text can reorder punctuation and numbers
search may fail when ToUnicode data is missing even if the page renders correctly
normalize source text and record the locale used for formatting
select fonts that contain the required glyphs before measuring layout

Engineering review notes for Unicode text shaping for complex scripts

Use these review notes to make sure the feature has moved beyond a demo and can be defended during release, support, and customer escalation.

Decision: font fallback order for Arabic, Hebrew, Indic, CJK, and mixed Latin text. Implementation pressure point: select fonts that contain the required glyphs before measuring layout. Acceptance evidence: extracted Unicode text compared with the original application value. Regression trigger: search may fail when ToUnicode data is missing even if the page renders correctly
Decision: normalization rules for copied text, database values, and template placeholders. Implementation pressure point: shape and position text before page breaks are finalized. Acceptance evidence: viewer screenshots for representative right-to-left and combining-mark cases. Regression trigger: database collation can alter composed characters before the PDF layer sees them
Decision: right-to-left paragraph handling and mixed-direction number policy. Implementation pressure point: embed or subset fonts according to licensing and PDF standard requirements. Acceptance evidence: font selected for every script range and fallback reason when it changed. Regression trigger: font substitution on a developer machine can hide missing embedded fonts
Decision: whether text must remain searchable, selectable, and accessible after output. Implementation pressure point: verify visual output and extracted text with multilingual regression samples. Acceptance evidence: glyph coverage warnings, embedding mode, and subset identifier. Regression trigger: line breaks in bidirectional text can reorder punctuation and numbers

境界ケース

database collation can alter composed characters before the PDF layer sees them
font substitution on a developer machine can hide missing embedded fonts
line breaks in bidirectional text can reorder punctuation and numbers
search may fail when ToUnicode data is missing even if the page renders correctly

Delphi / C++Builder notes

HotPDF Component should sit behind a small service boundary that receives files, streams, profiles, and credentials, then returns output paths, warnings, metrics, and validation status. Important terms include Unicode, text shaping, font embedding, ToUnicode, bidirectional text, fallback font.

Delphi コード例

次の Delphi スケッチは、このテーマに対する実用的なサービス境界を示します。ポリシー確認、ログ記録、検証を製品呼び出しの狭い部分の外側に置くと、ワークフローをテストしやすくなります。

procedure DrawShapedRun(Pdf: THotPDF; const Text: UnicodeString; const Script: TScriptProfile);
begin
  Pdf.CurrentPage.SetFont(Script.FontName, [], Script.Size, 0, Script.Vertical);
  if Script.RequiresReorder then
    Pdf.CurrentPage.TextOut(Script.X, Script.Y, 0, ShapeUnicodeRun(Text, Script))
  else
    Pdf.CurrentPage.TextOut(Script.X, Script.Y, 0, Text);
  RecordGlyphCoverage(Script.FontName, Text);
end;

本番チェックリスト

Run the workflow on an empty file, a normal customer file, and a worst-case file
Open the generated PDF with the target viewer, validator, printer, or downstream application
Log product version, profile version, input hash, output path, elapsed time, and warning count
Keep passwords, certificates, temporary files, and customer data under explicit retention rules
Add regression documents when a customer file exposes a new edge case

Product documentation

HotPDF Component