Pass the Arabic phrase يوضح ملف PDF to TextOut and open the result. The letters run the wrong way, and each one sits in its isolated form with a visible gap before the next, as if someone typed English backwards and hit space between every character. No exception fired. No warning printed. The output is simply wrong, and it is wrong because two separate transformations that Arabic depends on never happened. Knowing what those two transformations are, and which call performs them, is most of what complex-script PDF output comes down to.
HotPDF is a native VCL PDF component for Delphi and C++Builder, and it does the right-to-left work for you through a distinct call. It also stops short in a few specific places that you want to know about before you commit to a locale, so the honest boundaries get their own section near the end.
Why a correct string still prints wrong
Unicode keeps text in logical order, the order you type it and read it aloud. A renderer has to put glyphs down in visual order. For left-to-right scripts those orders coincide and nobody thinks about it. For Arabic and Hebrew they do not, and when a single line mixes directions, say an Arabic sentence carrying the Latin token "PDF" or a price written in digits, the Unicode Bidirectional Algorithm (UAX #9) decides exactly how the left-to-right fragments nest inside the right-to-left line. That is the first transformation, reordering, and skipping it is what flips the line.
The second is contextual shaping. An Arabic letter is drawn differently depending on where it falls in a word: initial, medial, final, or standing alone. The codepoint stays the same throughout; only the glyph changes. A pipeline that hands each codepoint straight to its default glyph produces exactly the disconnected, isolated-form output from the opening paragraph. Hebrew skips this step, since its letters do not join, but it still needs the reordering. Arabic needs both, and that is why Arabic, not Hebrew, is the string you test with.
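To make the codepoint-versus-glyph distinction concrete, the sketch below lists the four shapes of ARABIC LETTER BEH using Unicode's Arabic Presentation Forms-B compatibility codepoints. Those codepoints are only a convenient way to name the shapes: a shaping engine normally picks glyphs inside the font and leaves the string alone, and nothing here is HotPDF API. The type and function names are made up for illustration.

```pascal
program BehForms;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Position of a joining letter within its word (hypothetical type)
  TJoinPos = (jpIsolated, jpFinal, jpInitial, jpMedial);

const
  BEH = $0628; // U+0628 ARABIC LETTER BEH: the only value stored in the string
  JoinPosName: array[TJoinPos] of string =
    ('isolated', 'final', 'initial', 'medial');

// Presentation Forms-B lists BEH's four shapes consecutively from U+FE8F,
// in the order isolated, final, initial, medial.
function BehPresentationForm(Pos: TJoinPos): Integer;
begin
  Result := $FE8F + Ord(Pos);
end;

var
  P: TJoinPos;
begin
  // Same logical codepoint on every line; only the shape differs.
  for P := Low(TJoinPos) to High(TJoinPos) do
    WriteLn('U+', IntToHex(BEH, 4), ' ', JoinPosName[P], ' form: U+',
      IntToHex(BehPresentationForm(P), 4));
end.
```

A naive pipeline is effectively stuck on the first row for every letter, which is the disconnected look described above.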
On the desktop none of this is your problem. When a VCL form paints Arabic into a TEdit, the operating system's text stack quietly reorders and shapes it, which is precisely why the string that looks perfect on screen comes out broken in a naive PDF. A content stream does not store editable text. It stores positioned glyphs, so whoever emits the stream inherits the shaping job that the OS used to handle. RtLTextOut is the call that takes that job back.
RtLTextOut does the reordering and joining
HotPDF keeps the Latin path and the complex-script path as two different methods. TextOut prints what you give it in the order you give it. RtLTextOut reorders and runs contextual analysis first, then prints. Which script rules it applies comes from the charset argument to SetFont: 178 means Arabic and 177 means Hebrew, the same values Windows defines as ARABIC_CHARSET and HEBREW_CHARSET.
// Arabic: pass logical order; RtLTextOut reorders and joins
Pdf.CurrentPage.SetFont('Arial Unicode MS', [], 12, 178);
Pdf.CurrentPage.RtLTextOut(400, 700, 0, 'يوضح ملف PDF');
// Hebrew: reordering only, no contextual joining
Pdf.CurrentPage.SetFont('Arial Unicode MS', [], 12, 177);
Pdf.CurrentPage.RtLTextOut(400, 660, 0, 'קובץ PDF זה');
One mistake eats more debugging hours than any other here. RtLTextOut reverses the string itself, so if you feed it text you already reversed by hand (usually a workaround left over from an earlier attempt with plain TextOut), it reverses again and you are back where you started. The cruel part is that double-reversed text can look right for a single all-Arabic test string and then fall apart the moment a line contains a Latin word or a number, because those embedded runs no longer follow UAX #9. Pass logical order, always, and let the call sort it out.
That same mixed-direction behavior trips up reviewers more than it trips up code. Inside a right-to-left line, digits and embedded Latin words still read left to right. Someone who has not worked with bidirectional layout will look at a rendered invoice, see the account number reading the "wrong" way relative to the Arabic around it, and write it up as a bug. It is the spec-correct result. A short note in your acceptance criteria, written before the first native-speaker pass, saves that round trip.
Glyph coverage is decided before shaping even runs
Shaping picks glyphs out of a font. If the font does not carry them, there is nothing to pick. This is the deployment failure that wastes an afternoon: the report is flawless on the developer's machine, where Arial Unicode MS happens to be installed, and comes out as a row of blank boxes on the customer's server, where Windows substituted some font with no Arabic at all. The cure is to stop trusting whatever fonts a given machine has and register one you ship with the application.
// Ship a known font instead of relying on installed system fonts
Pdf.RegisterUnicodeTTF('C:\Fonts\NotoSansArabic.ttf');
Pdf.CurrentPage.SetFont('NotoSansArabic', [], 12);
// Audit coverage for the codepoints your data actually uses
GID := Pdf.GetUnicodeGlyphForCodepoint($0628); // U+0628 ARABIC LETTER BEH
LogGlyphAudit($0628, GID);
Two boundaries come with this. A font registered through RegisterUnicodeTTF gets embedded, and HotPDF's embedded Unicode handling needs the document at PDF 1.5 or later. That only bites if something downstream insists on PDF 1.4, but when it does, the failure is silent. The other boundary is legal: TrueType files carry embedding-permission bits, and a face that draws beautifully on screen can still be licensed in a way that forbids shipping it inside customer documents. Check before you ship, not after a complaint.
The audit call in the second half of that snippet, GetUnicodeGlyphForCodepoint, is your early-warning system. Walk the codepoint ranges your data actually uses when the service starts up and log what glyph IDs come back. A coverage gap then shows up as a line in a startup log during rollout, rather than as missing characters in an invoice that already reached a customer.
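A startup audit along those lines might look like the sketch below. It takes the lookup as a function parameter so the example compiles without HotPDF; in real code that parameter would wrap Pdf.GetUnicodeGlyphForCodepoint. Two things are assumptions rather than documented behavior: that the lookup returns an integer glyph ID, and that glyph ID 0 (the TrueType .notdef slot) signals a missing glyph. Check both against the HotPDF reference. TGlyphLookup, AuditRange, and FakeLookup are hypothetical names.

```pascal
program GlyphAudit;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical adapter: real code would wrap
  // Pdf.GetUnicodeGlyphForCodepoint for the registered font.
  TGlyphLookup = function(CodePoint: Integer): Integer;

// Logs every codepoint in [First..Last] that has no glyph and
// returns how many were missing.
// ASSUMPTION: glyph ID 0 (.notdef in TrueType) means "not covered".
function AuditRange(Lookup: TGlyphLookup; First, Last: Integer): Integer;
var
  CP: Integer;
begin
  Result := 0;
  for CP := First to Last do
    if Lookup(CP) = 0 then
    begin
      Inc(Result);
      WriteLn('MISSING U+', IntToHex(CP, 4));
    end;
end;

// Stand-in for this demo only: pretends the font covers
// U+0621..U+063A and nothing else.
function FakeLookup(CodePoint: Integer): Integer;
begin
  if (CodePoint >= $0621) and (CodePoint <= $063A) then
    Result := CodePoint - $0620 // any non-zero glyph ID
  else
    Result := 0;
end;

begin
  // Arabic letters hamza..ghain, then tatweel..yeh
  WriteLn(AuditRange(FakeLookup, $0621, $063A), ' gap(s) in U+0621..U+063A');
  WriteLn(AuditRange(FakeLookup, $0640, $064A), ' gap(s) in U+0640..U+064A');
end.
```

Returning a count as well as logging lets the caller decide whether a gap should only warn or should stop the service from starting.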
Text that is Unicode but not right-to-left (CJK strings, Vietnamese with its stacked diacritics, mixed European text) goes through the ordinary path. TextOut takes a WideString and draws it through the registered font with no bidirectional analysis at all. It pays to keep the two paths physically separate in report code, one routine for RTL runs and one for everything else, so the locale logic is visible at the call site instead of hidden behind a flag that someone will eventually forget to set.
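One way to keep that split visible is a small classifier that decides, per string, which routine should draw it. The sketch below is a coarse heuristic, not a bidirectional analysis: it looks for the first UTF-16 unit in the main Hebrew or Arabic blocks and reports which path applies. The block ranges are standard Unicode; TTextPath and ChooseTextPath are hypothetical names, and the code does not call HotPDF itself.

```pascal
program RouteText;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Which drawing routine a string should go to (hypothetical type)
  TTextPath = (tpPlain, tpHebrew, tpArabic);

const
  PathName: array[TTextPath] of string =
    ('TextOut', 'RtLTextOut with charset 177', 'RtLTextOut with charset 178');

// The first Hebrew or Arabic character decides the path.
// Anything else (Latin, CJK, Vietnamese, ...) stays on the plain path.
function ChooseTextPath(const S: UnicodeString): TTextPath;
var
  I: Integer;
  C: Word;
begin
  Result := tpPlain;
  for I := 1 to Length(S) do
  begin
    C := Ord(S[I]);
    if ((C >= $0590) and (C <= $05FF)) or       // Hebrew
       ((C >= $FB1D) and (C <= $FB4F)) then     // Hebrew presentation forms
    begin
      Result := tpHebrew;
      Exit;
    end;
    if ((C >= $0600) and (C <= $06FF)) or       // Arabic
       ((C >= $0750) and (C <= $077F)) or       // Arabic Supplement
       ((C >= $08A0) and (C <= $08FF)) or       // Arabic Extended-A
       ((C >= $FB50) and (C <= $FDFF)) or       // Arabic Presentation Forms-A
       ((C >= $FE70) and (C <= $FEFC)) then     // Arabic Presentation Forms-B
    begin
      Result := tpArabic;
      Exit;
    end;
  end;
end;

begin
  WriteLn(PathName[ChooseTextPath('Invoice 42')]);
  // Hebrew qof, vav, bet, final tsadi followed by a Latin token
  WriteLn(PathName[ChooseTextPath(#$05E7#$05D5#$05D1#$05E5' PDF')]);
  // Arabic meem, lam, feh followed by a Latin token
  WriteLn(PathName[ChooseTextPath(#$0645#$0644#$0641' PDF')]);
end.
```

The report code then has exactly one place where a string is handed to TextOut and one where it is handed to RtLTextOut, which is the separation argued for above.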
Reading order belongs to the document, not the glyphs
Getting every glyph right still leaves one thing undone. ISO 32000-1 §12.2 defines a viewer preference called /Direction that states the document's overall reading order. It touches no glyphs. What it does is tell a viewer how to arrange two-up spreads, which side a facing-page layout should start from, and which way the reading UI should lean. None of that shows on a single page, which is exactly why it gets forgotten.
// Declare right-to-left reading order at the document level
Pdf.Direction := RightToLeft; // adds vpDirection to ViewerPreferences
Setting Direction is the whole job: the property setter adds vpDirection to the document's ViewerPreferences, so one line carries the preference into the file. The failure mode is leaving it out, which feels harmless because the single-page proof you are staring at looks identical either way. Then someone prints a duplex booklet, the spreads come out mirrored, and the cause is a missing one-liner from weeks earlier.
Where HotPDF's shaping ends
An honest map of the limits is worth a week of evaluation. RtLTextOut covers bidirectional reordering and Arabic contextual joining on its own. What it does not do automatically is general OpenType feature application. Optional ligatures and similar typographic features go through GetSingleSubstituteGlyph(GID, 'liga'), which resolves a single substitution at a time, glyph ID first and the feature tag second, and returns the input glyph unchanged when the feature does not apply. That is enough to drive a known, finite ligature list you maintain yourself. It is not a full GSUB engine. For scripts that need more than reordering and joining, Indic scripts with their reordering vowel signs being the standard example, run a real pilot on genuine customer strings before you promise the locale. Arabic working is not evidence that Devanagari will.
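Driving a maintained list through single substitutions could be sketched as follows. The substitution call is passed in as a function so the example compiles without HotPDF; in real code it would wrap GetSingleSubstituteGlyph. The parameter types are assumptions (integer glyph IDs, a string feature tag), and the only behavior relied on is the one stated above: the input glyph comes back unchanged when the feature does not apply. All names and glyph IDs in the block are hypothetical.

```pascal
program LigatureList;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical adapter: real code would wrap
  // Pdf.GetSingleSubstituteGlyph(GID, Feature).
  TSubstitute = function(GID: Integer; const Feature: string): Integer;

// True when GID is on the list you maintain yourself.
function IsListed(GID: Integer; const Allowed: array of Integer): Boolean;
var
  I: Integer;
begin
  Result := False;
  for I := Low(Allowed) to High(Allowed) do
    if Allowed[I] = GID then
    begin
      Result := True;
      Exit;
    end;
end;

// Substitutes listed glyphs in place, one at a time, and returns
// how many entries actually changed.
function ApplyListedFeature(Subst: TSubstitute; const Feature: string;
  const Allowed: array of Integer; var Glyphs: array of Integer): Integer;
var
  I, NewGID: Integer;
begin
  Result := 0;
  for I := Low(Glyphs) to High(Glyphs) do
    if IsListed(Glyphs[I], Allowed) then
    begin
      NewGID := Subst(Glyphs[I], Feature);
      if NewGID <> Glyphs[I] then // unchanged means the feature did not apply
      begin
        Glyphs[I] := NewGID;
        Inc(Result);
      end;
    end;
end;

// Stand-in for this demo only: glyph 10 has a 'liga' substitute, 110.
function FakeSubst(GID: Integer; const Feature: string): Integer;
begin
  if (Feature = 'liga') and (GID = 10) then
    Result := 110
  else
    Result := GID;
end;

var
  Run: array[0..2] of Integer;
begin
  Run[0] := 10; Run[1] := 11; Run[2] := 12;
  WriteLn(ApplyListedFeature(FakeSubst, 'liga', [10, 12], Run),
    ' glyph(s) substituted');
end.
```

Logging the returned count per run gives a cheap signal when a font update changes which substitutions exist.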
Verify end to end, because a page can look correct and still be useless to everything downstream. Three checks find most problems. Copy the text back out of Acrobat and compare the codepoints against your source string. Run the viewer's in-document search for a word that you can see on the page. And open the output on a machine that lacks your development fonts, since that is the environment most likely to expose a substitution. None of that replaces a native reader looking at one real document, which catches things no synthetic corpus will. Get that review on the calendar before the format ships.
Pick test strings on purpose instead of recycling whatever a translator sent last year. A workable minimum per locale: a pure-script sentence, a sentence with embedded Latin brand names, a line carrying digits and currency, and names with diacritics or combining marks. Real customer names break assumptions that filler text leaves untouched, so let the regression set grow by one string every time a support case turns up a pattern you had not seen.
Font registration, subsetting, and the everyday text-drawing API are covered in the article on report output, fonts, and images with HotPDF. When the same documents also have to meet accessibility profiles, the language tagging and structure rules in the PDF/A and PDF/UA validation article sit on top of the shaping work here.
The right-to-left and Unicode font APIs described above ship with the HotPDF Component for Delphi and C++Builder; the product page links the full text-output reference.