A designer picks a font with a single-story a for headings, or a slashed zero for tables, or a set of swash capitals for a cover. Those glyphs are in the font already. They are simply not the default. The default a maps from the character through the cmap table to one glyph, and the alternate is a separate glyph elsewhere in the font, reachable only through a substitution rule. Producing that alternate in a PDF means reading the rule and emitting the substitute glyph in the content stream. This article is about reading those rules, the single-substitution kind, in Object Pascal with no native shaping library underneath.
The scope is narrow on purpose. Stylistic sets and alternates are single-glyph-in, single-glyph-out substitutions. They are the part of OpenType layout you can resolve with a small, deterministic table walk, which makes them a good fit for a Pascal engine that wants to stay free of C dependencies.
Why pure Delphi rather than HarfBuzz
HarfBuzz is the obvious answer to "shape this text", and for full bidirectional, Indic, or Arabic shaping it is the right answer. It is also a C library. Binding it into a Delphi or C++Builder product means shipping a native object for every target platform and architecture, matching its calling convention, tracking its release cadence, and reading its licence terms against your own. None of that is hard in isolation. All of it is friction that never goes away, and it buys nothing when the actual requirement is "give me the ss01 form of this letter".
Single substitution does not need a shaping engine. It needs a parser for a handful of GSUB subtable formats and a binary search or two. Writing that in Pascal keeps the whole toolchain inside one compiler. The honest limit is that this approach handles glyph substitution lookups and nothing else. It is not bidi resolution, it is not Indic reordering, and it is not automatic contextual shaping. Where those are needed, they are needed, and a single-substitution query will not stand in for them.
The GSUB hierarchy, top to bottom
The Glyph Substitution table is organised as a chain of indirections, and a substitution query walks the chain from the top. At the top is the ScriptList. A script tag such as latn selects an entry, and the special tag DFLT is the default script that applies when no more specific script matches. The script entry points at a LangSys, the language system, with a default LangSys for the common case and optional named ones for languages that need different behaviour. Turkish is the usual example, where the dotted and dotless i demand their own handling.
The LangSys names a set of feature indices. Each index points into the FeatureList, where a feature record carries a four-byte tag, ss01 among them, and a list of lookup indices. Those indices finally point into the LookupList, where the actual substitution subtables live. So resolving ss01 means: find the script, find its LangSys, find the feature whose tag is ss01, collect the lookups it names, and apply them. HotPDF defaults to the DFLT script and the default LangSys, which is where most Latin text designs register their stylistic features, and it exposes a way to override the script tag when a font wires its features under a specific script instead.
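The walk described above can be sketched directly against the raw table bytes. This is a minimal illustration, not HotPDF's internal code: the names U16, TagMatches, and FindFeatureLookups are hypothetical, the whole GSUB table is assumed to be loaded into a TBytes, and offsets are not bounds-checked, which a production parser would need to do. Only the default LangSys is consulted, and the required-feature index is ignored.

```pascal
uses SysUtils;

type
  TLookupIndices = array of Word;

// OpenType stores every integer big-endian.
function U16(const D: TBytes; P: Integer): Word;
begin
  Result := (Word(D[P]) shl 8) or D[P + 1];
end;

// Compares the four tag bytes at offset P with a four-character tag.
function TagMatches(const D: TBytes; P: Integer; const Tag: AnsiString): Boolean;
begin
  Result := (Length(Tag) = 4) and
    (D[P] = Ord(Tag[1])) and (D[P + 1] = Ord(Tag[2])) and
    (D[P + 2] = Ord(Tag[3])) and (D[P + 3] = Ord(Tag[4]));
end;

// Hypothetical helper: script -> default LangSys -> feature -> lookup indices.
// Returns an empty array on any miss.
function FindFeatureLookups(const Gsub: TBytes;
  const ScriptTag, FeatureTag: AnsiString): TLookupIndices;
var
  ScriptList, FeatureList, Script, LangSys, Feature, Rec: Integer;
  I, J, FeatIdx: Integer;
begin
  Result := nil;
  ScriptList := U16(Gsub, 4);   // header: version(4), then three Offset16
  FeatureList := U16(Gsub, 6);

  // 1. ScriptList: count, then records of tag(4) + offset(2).
  Script := 0;
  for I := 0 to Integer(U16(Gsub, ScriptList)) - 1 do
  begin
    Rec := ScriptList + 2 + I * 6;
    if TagMatches(Gsub, Rec, ScriptTag) then
    begin
      Script := ScriptList + U16(Gsub, Rec + 4);
      Break;
    end;
  end;
  if Script = 0 then Exit;

  // 2. Script table: first field is the default LangSys offset (0 = none).
  if U16(Gsub, Script) = 0 then Exit;
  LangSys := Script + U16(Gsub, Script);

  // 3. LangSys: reserved(2), requiredFeatureIndex(2), count(2), indices.
  for I := 0 to Integer(U16(Gsub, LangSys + 4)) - 1 do
  begin
    FeatIdx := U16(Gsub, LangSys + 6 + I * 2);
    if FeatIdx >= U16(Gsub, FeatureList) then Continue;
    Rec := FeatureList + 2 + FeatIdx * 6;
    if TagMatches(Gsub, Rec, FeatureTag) then
    begin
      // 4. Feature table: featureParams(2), lookup count(2), lookup indices.
      Feature := FeatureList + U16(Gsub, Rec + 4);
      SetLength(Result, U16(Gsub, Feature + 2));
      for J := 0 to High(Result) do
        Result[J] := U16(Gsub, Feature + 4 + J * 2);
      Exit;
    end;
  end;
end;
```

An empty result covers every miss at once (no such script, no default LangSys, no such feature), so a caller can treat "no lookups" as "leave the glyph alone".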
Coverage tables decide who participates
Every substitution subtable begins with the same question: does this input glyph take part in this rule, and if so, where does it sit in the rule's own indexing. That question is answered by a Coverage table, and the answer is a coverage index, a small ordinal that the rest of the subtable uses to look up what the glyph becomes.
Coverage comes in two formats. Format 1 is a list of glyph ids sorted in ascending order. You find a glyph with a binary search, and its position in the list is its coverage index. Format 2 is a list of range records, each a start glyph, an end glyph, and the coverage index that the start glyph maps to. A glyph inside a range gets its coverage index by adding its distance from the range's start glyph to that start index. Format 1 is compact when the participating glyphs are scattered, Format 2 when they fall into contiguous runs. Both are sorted, so both are searched in logarithmic time, and both return either a coverage index or a clean "not covered" that lets the engine leave the glyph alone.
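Both searches fit in one short function. The sketch below is an illustration under assumptions, not the engine's own routine: CoverageIndex, U16, and the NotCovered sentinel are hypothetical names, the table bytes are assumed to be in a TBytes, and offsets are trusted rather than validated.

```pascal
uses SysUtils;

const
  NotCovered = -1;   // hypothetical sentinel for "glyph does not participate"

// OpenType stores every integer big-endian.
function U16(const D: TBytes; P: Integer): Word;
begin
  Result := (Word(D[P]) shl 8) or D[P + 1];
end;

// Cov is the byte offset of a Coverage table inside D.
// Returns the coverage index of Glyph, or NotCovered.
function CoverageIndex(const D: TBytes; Cov: Integer; Glyph: Word): Integer;
var
  Lo, Hi, Mid, Rec: Integer;
  G: Word;
begin
  Result := NotCovered;
  Lo := 0;
  Hi := Integer(U16(D, Cov + 2)) - 1;   // glyphCount or rangeCount
  case U16(D, Cov) of
    1: // Format 1: sorted glyph ids; the position is the coverage index.
      while Lo <= Hi do
      begin
        Mid := (Lo + Hi) div 2;
        G := U16(D, Cov + 4 + Mid * 2);
        if Glyph < G then
          Hi := Mid - 1
        else if Glyph > G then
          Lo := Mid + 1
        else
        begin
          Result := Mid;
          Exit;
        end;
      end;
    2: // Format 2: sorted ranges of start(2), end(2), startCoverageIndex(2).
      while Lo <= Hi do
      begin
        Mid := (Lo + Hi) div 2;
        Rec := Cov + 4 + Mid * 6;
        if Glyph < U16(D, Rec) then
          Hi := Mid - 1
        else if Glyph > U16(D, Rec + 2) then
          Lo := Mid + 1
        else
        begin
          Result := Integer(U16(D, Rec + 4)) + (Glyph - U16(D, Rec));
          Exit;
        end;
      end;
  end;
end;
```

An unknown format number falls through the case statement and reports NotCovered, which keeps a malformed or future-format table from producing a bogus index.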
Single Substitution, the two formats
Single Substitution is LookupType 1, and it maps one glyph to exactly one replacement. It also has two formats, and the split is a space optimisation. Format 1 stores a single signed delta. The output glyph id is the input glyph id plus that delta, modulo 65536. This is how a font encodes a substitution where every participating glyph sits at the same fixed offset from its alternate, for instance a block of lining figures placed a constant distance from the matching oldstyle figures. The Coverage table says which glyphs qualify, and the one delta serves all of them.
Format 2 stores an explicit array of substitute glyph ids. The coverage index from the Coverage table is the index into that array, so the glyph at coverage index 0 becomes the first array entry, the glyph at coverage index 1 the second, and so on. Format 2 is used when the alternates are not at a uniform offset, which is the common case for hand-built stylistic sets. The query is the same from the caller's side either way. Take the input glyph, run it through Coverage, and if it is covered, apply the delta or read the array slot.
var
  Pdf: THotPDF;
  BaseGID, AltGID: Word;
begin
  Pdf := THotPDF.Create(nil);
  try
    Pdf.BeginDoc;
    Pdf.RegisterUnicodeTTF('C:\Fonts\MyStylisticFace.ttf');
    Pdf.SetFont('My Stylistic Face', 12, []);
    // Default glyph for 'a' through the font's cmap.
    BaseGID := Pdf.GetUnicodeGlyphForCodepoint(Ord('a'));
    // Stylistic Set 1: resolve the alternate via GSUB LookupType 1.
    AltGID := Pdf.GetSingleSubstituteGlyph(BaseGID, 'ss01');
    // AltGID = BaseGID means the feature did not touch this glyph.
    if AltGID <> BaseGID then
      { emit AltGID in the content stream };
  finally
    Pdf.Free;
  end;
end;
The contract worth noticing is the pass-through. GetSingleSubstituteGlyph returns the input glyph id unchanged on every miss: no font, no GSUB table, no matching feature, no coverage hit. That means the call is safe to make unconditionally. You ask for the alternate, and if there is none, you get back exactly what you put in, so the calling code never needs to special-case a font that lacks the feature.
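To make the two subtable formats and the pass-through contract concrete, here is a minimal sketch of the step that runs once the coverage index is known. It is an illustration, not the body of GetSingleSubstituteGlyph: ApplySingleSubst and U16 are hypothetical names, the subtable bytes are assumed to be in a TBytes, and the coverage index is assumed to have been computed already, with a negative value meaning "not covered".

```pascal
uses SysUtils;

// OpenType stores every integer big-endian.
function U16(const D: TBytes; P: Integer): Word;
begin
  Result := (Word(D[P]) shl 8) or D[P + 1];
end;

// Sub is the byte offset of a SingleSubst subtable inside D.
// CovIdx is the glyph's coverage index, or a negative value on a miss.
function ApplySingleSubst(const D: TBytes; Sub: Integer; Glyph: Word;
  CovIdx: Integer): Word;
begin
  Result := Glyph;   // pass-through: every miss returns the input unchanged
  if CovIdx < 0 then Exit;
  case U16(D, Sub) of
    1: // Format 1: format(2), coverageOffset(2), deltaGlyphID(2).
       // Adding the delta's raw 16-bit pattern and masking to 16 bits
       // equals signed addition modulo 65536.
      Result := Word((Integer(Glyph) + U16(D, Sub + 4)) and $FFFF);
    2: // Format 2: format(2), coverageOffset(2), glyphCount(2), substitutes.
      if CovIdx < U16(D, Sub + 4) then
        Result := U16(D, Sub + 6 + CovIdx * 2);
  end;
end;
```

Setting Result to the input glyph before anything else is what makes the contract cheap to honour: each early exit, unknown format, or out-of-range index leaves the glyph as it came in.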
What the stylistic feature tags mean
The feature tag is the whole vocabulary of which alternate you are asking for, and the tags relevant to stylistic work are a short list. The headline pair is salt, stylistic alternates, the catch-all access to a glyph's alternate forms, and ss01 through ss20, the twenty numbered stylistic sets a font can define, each a named bundle of substitutions the designer groups together. A font might put a single-story a and a straight-leg R under ss03, for instance, so enabling that one set restyles both.
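Because the numbered sets follow a fixed naming pattern, the tag can be built from the set number rather than typed by hand. The helper below is a hypothetical convenience, not part of the HotPDF API; it returns an empty string for numbers outside 1 to 20 so the caller can reject them before querying.

```pascal
uses SysUtils;

// Hypothetical helper: 1 -> 'ss01', 20 -> 'ss20', anything else -> ''.
function StylisticSetTag(SetNumber: Integer): AnsiString;
begin
  if (SetNumber < 1) or (SetNumber > 20) then
    Result := ''
  else
    // '%.2d' pads the number with a leading zero to two digits.
    Result := AnsiString(Format('ss%.2d', [SetNumber]));
end;
```

A user-facing "Stylistic Set" dropdown can then map its selection straight to the tag passed to the substitution query.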
Around those sit several more tags that fonts commonly implement, at least in part, with single substitution. aalt is access-all-alternates, the union of every alternate a glyph has, usually presented as a glyph-palette feature. titl selects titling capitals cut for large sizes. subs and sups swap in true subscript and superscript figures rather than scaled-down defaults. ordn produces ordinal forms, the raised letters in 1st and 2nd. frac builds fractions, though full diagonal fractions also lean on ligature and contextual logic that goes past plain single substitution. For the single-glyph cases, the mechanism is identical to ss01: pass the tag to the substitution query and read back the alternate glyph.
// Try a stylistic-set feature, then fall back to plain alternates.
function ResolveAlternate(Pdf: THotPDF; BaseGID: Word;
  const PreferredTag: AnsiString): Word;
begin
  Result := Pdf.GetSingleSubstituteGlyph(BaseGID, PreferredTag);
  if Result = BaseGID then
    Result := Pdf.GetSingleSubstituteGlyph(BaseGID, 'salt');
  // Still BaseGID if neither feature covers this glyph.
end;
cmap format 12 and the supplementary planes
Before any substitution can run, a character has to become a glyph, and that is the cmap table's job. The substitution query starts from a glyph id, so the path is always character to glyph through cmap, then glyph to alternate through GSUB. The interesting part of cmap is its reach. A format 4 subtable covers the Basic Multilingual Plane, the first 65536 code points, and that is enough for most Latin text. It is not enough for code points from U+10000 upward, the supplementary planes, which is where mathematical alphanumerics, many symbols, and several living scripts are encoded.
Format 12 is the subtable that covers the full U+0000 to U+10FFFF range. It is a sorted list of groups, each group a start code point, an end code point, and a start glyph id, so a contiguous run of code points maps to a contiguous run of glyphs. HotPDF resolves code points with a hybrid strategy that matches how the data is shaped. Code points in the BMP are served from a direct array indexed by the code point, a single lookup with no search. Code points in the supplementary planes are served from a sparse table sorted by code point and searched with a binary search. The result is that GetUnicodeGlyphForCodepoint takes a full Cardinal and answers correctly across the whole range, returning glyph id 0, the .notdef glyph, for any code point the font does not map.
// Fragment: Pdf is assumed to be a live THotPDF with a Unicode font
// registered and selected, as in the first example.
var
  Pdf: THotPDF;
  Cp: Cardinal;
  GID, StyledGID: Word;
begin
  // A supplementary-plane code point: U+1D49C MATHEMATICAL SCRIPT CAPITAL A.
  Cp := $1D49C;
  GID := Pdf.GetUnicodeGlyphForCodepoint(Cp); // format 12 lookup
  if GID <> 0 then
    StyledGID := Pdf.GetSingleSubstituteGlyph(GID, 'ss01')
  else
    StyledGID := 0; // font has no glyph for this code point
end;
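The group search that format 12 implies can be sketched in a few lines. This is an illustration of the subtable's logic only, not HotPDF's hybrid BMP-array-plus-sparse-table structure: TCmapGroup, TCmapGroups, and GlyphFromGroups are hypothetical names, and the groups are assumed to be already parsed into memory and sorted by start code point, as the format requires.

```pascal
type
  // One format 12 group: a run of code points mapped to a run of glyphs.
  TCmapGroup = record
    StartCode, EndCode, StartGlyph: Cardinal;
  end;
  TCmapGroups = array of TCmapGroup;

// Binary search over the sorted groups. Returns 0 (.notdef) when no
// group contains the code point.
function GlyphFromGroups(const Groups: TCmapGroups; Cp: Cardinal): Word;
var
  Lo, Hi, Mid: Integer;
begin
  Result := 0;
  Lo := 0;
  Hi := High(Groups);   // -1 for an empty array, so the loop is skipped
  while Lo <= Hi do
  begin
    Mid := (Lo + Hi) div 2;
    if Cp < Groups[Mid].StartCode then
      Hi := Mid - 1
    else if Cp > Groups[Mid].EndCode then
      Lo := Mid + 1
    else
    begin
      // Glyph ids run in step with code points inside a group.
      Result := Word(Groups[Mid].StartGlyph + (Cp - Groups[Mid].StartCode));
      Exit;
    end;
  end;
end;
```

Returning 0 on a miss mirrors the .notdef convention described above, so the result can feed the same `GID <> 0` check before any substitution query runs.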
Where these queries stop
The single-substitution APIs answer one shape of question, and it is worth being clear about what they do not answer. LookupType 1 is one of eight substitution types. LookupType 3, alternate substitution, where one glyph offers a list of alternates to choose from, is a separate subtable type from the one-to-one mapping described here. The query does not handle LookupType 2 multiple substitution, where one glyph becomes several, nor LookupType 4 ligature substitution, where several glyphs become one. It does not handle the contextual and chaining-contextual types, LookupTypes 5 and 6, that fire only when a glyph appears in a particular neighbourhood, nor the extension and reverse-chaining types. A diagonal fraction, a Devanagari conjunct, or an Arabic initial-medial-final cascade is a sequence problem, and a per-glyph single-substitution lookup cannot express it.
It also does not perform automatic shaping. Nothing here inspects a run of text, decides which features to turn on, and applies them in the order the script requires. The caller chooses the feature tag and applies it glyph by glyph. That is exactly the right tool for stylistic sets and alternates, which are opt-in and local, and exactly the wrong tool for a script that needs reordering. Keeping the boundary sharp is what lets the substitution path stay small and predictable.
For the cases that do need sequence-level work, the complex-script story is taken up in our article on complex-script text shaping in Delphi. If your substitutions are part of a larger reporting job that also places images and other fonts on the page, the guide to report output with fonts and images covers how those pieces fit together. All of these run on the same engine, the HotPDF Component for Delphi and C++Builder, which carries the GSUB substitution queries alongside the font embedding, subsetting, and text APIs covered elsewhere on this blog.