How PDF Graphics Work: Content Streams and Operators

A PDF page does not store pixels, and it does not store a tree of shape objects the way SVG does. It stores a program. Every line, curve, fill, and placed image on the page is the result of executing a sequence of operators in a content stream, top to bottom, against a running graphics state. Understand that one fact and most of the format's behavior stops being surprising: why a fill needs a separate painting operator after the path is built, why colors and line widths leak from one shape into the next unless you bracket them, why the same drawing code can land in completely different places after a single coordinate transform. This is a tour of that execution model as defined in ISO 32000: the operators you meet when you open a content stream, and the rules that decide what shows up on the page.

The content stream is postfix bytecode

A content stream is a flat byte sequence of operands followed by operators. Operands come first and the operator that consumes them comes last, which is the reverse of a function call and the same discipline as a stack machine: push the numbers, then issue the verb. There is no expression syntax, no nested calls, no variables. A triangle outline is five lines of this:

100 100 m    % moveto: start a new subpath at (100, 100)
200 200 l    % lineto: add a segment to (200, 200)
300 100 l    % lineto: add a segment to (300, 100)
h            % closepath: connect back to the start
S            % stroke: paint the path outline

The operators are terse on purpose. A real page is thousands of these, usually compressed with FlateDecode. The cost of that compactness is that the stream carries no structure you can query: a viewer cannot ask "where is the heading on this page," it can only run the program and see what ink lands where. That is the root reason text extraction from arbitrary PDFs is hard.
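To make the "operands first, operator last" rule concrete, here is a minimal Python sketch that groups a simple, uncompressed content stream into instructions. The function name parse_content_stream is hypothetical, and the sketch assumes only numbers, names, and % comments appear; real streams also contain strings, arrays, dictionaries, and inline image data that this does not handle.

```python
def parse_content_stream(text):
    """Group a simple content stream into (operator, operands) pairs."""
    instructions = []
    operands = []
    for line in text.splitlines():
        code = line.split("%", 1)[0]          # drop a trailing % comment
        for token in code.split():
            try:
                operands.append(float(token))  # numeric operand: hold it
            except ValueError:
                if token.startswith("/"):
                    operands.append(token)     # name operand such as /Photo
                else:
                    # anything else is an operator: it consumes what was held
                    instructions.append((token, operands))
                    operands = []
    return instructions


triangle = """
100 100 m    % moveto
200 200 l    % lineto
300 100 l    % lineto
h            % closepath
S            % stroke
"""

for operator, operands in parse_content_stream(triangle):
    print(operator, operands)
```

Even this toy version shows why structure is hard to recover: the output is a flat list of verbs and numbers, with nothing that says which of them belong to one logical shape.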

The origin is bottom-left, and Y grows upward

Before any coordinate makes sense you have to know where (0, 0) is. By default PDF puts the origin at the bottom-left corner of the page, with X increasing to the right and Y increasing upward, measured in points at 72 points to the inch (ISO 32000-2 §8.3.2). On a US Letter page the top edge sits at y = 792, not at y = 0. Anyone arriving from screen graphics, where the origin is top-left and Y grows downward, gets this backwards on the first try and finds the line meant for the top of the page sitting near the bottom. The unit is also independent of the medium: 72 units is one inch whether the page renders to a phone screen or an imagesetter.
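The conversion from a top-left, Y-down layout to default PDF user space is a single subtraction. The following Python sketch illustrates it; the helper names are hypothetical, and it assumes all positions are already in points and the page origin has not been moved.

```python
POINTS_PER_INCH = 72


def inches_to_points(inches):
    """Convert a physical length in inches to PDF user-space units."""
    return inches * POINTS_PER_INCH


def top_left_to_pdf(x, y_from_top, page_height):
    """Convert a point measured from the top-left corner to PDF coordinates."""
    return x, page_height - y_from_top


def top_left_rect_to_pdf(x, y_from_top, width, height, page_height):
    """Convert a rectangle whose anchor is its top-left corner.

    PDF's re operator anchors a rectangle at its bottom-left corner,
    so the rectangle's own height has to be subtracted as well.
    """
    return x, page_height - y_from_top - height, width, height


letter_height = inches_to_points(11)           # 792 points
print(top_left_to_pdf(100, 92, letter_height))  # 92 points below the top edge
print(top_left_rect_to_pdf(100, 92, 200, 150, letter_height))
```

Forgetting the extra height term in the rectangle case is the usual second mistake after the flipped axis: the box appears, but one box-height too high.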

Most page-drawing libraries inherit this convention directly. In HotPDF, for example, TextOut and the path calls all measure from the bottom-left in points, so a value near the page height puts content at the top:

// HotPDF, Delphi: y measured from the bottom edge upward, in points
Pdf.CurrentPage.SetLineWidth(2.0);
Pdf.CurrentPage.MoveTo(100, 700);   // near the top of the page
Pdf.CurrentPage.LineTo(300, 700);
Pdf.CurrentPage.Stroke;             // emits the moveto/lineto/stroke operators

That call sequence compiles down to the m, l, and S operators shown earlier, preceded by a w operator that sets the line width. The library is a typist for the content stream, nothing more, and knowing what it emits is what lets you reason about the output when a shape lands somewhere you did not expect.

Build the path, then paint it

PDF separates path construction from path painting, and the separation is not pedantry. You first describe a shape with construction operators that add nothing visible, then issue a single painting operator that decides what to do with the accumulated path. The same triangle can be an outline, a solid fill, or both, depending only on the verb you end with.

The construction operators are few. m starts a new subpath at a point. l adds a straight segment. c adds a cubic Bezier curve from six operands, two control points and an endpoint. re is a shortcut that adds a whole rectangle from an x, y, width, height quadruple. h closes the current subpath back to its start. None of them put ink on the page; they only accumulate geometry.

200 250 m                    % start the subpath
300 350 400 450 500 250 c    % cubic Bezier: two control points, then endpoint
h                            % close the curve's subpath back to (200, 250)
100 500 150 200 re           % x y width height: a 150 x 200 rectangle as its own closed subpath
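PDF has no circle or arc operator, so round shapes have to be assembled from c curves. As an illustration of the construction operators, this Python sketch emits a circle as four cubic Bezier quarter-arcs using the standard control-point constant of roughly 0.5523 times the radius. The function name circle_path is hypothetical and the two-decimal formatting is an arbitrary choice.

```python
import math

# Distance of each control point from its arc endpoint, as a fraction of the
# radius, for a four-segment approximation of a circle.
KAPPA = 4.0 / 3.0 * (math.sqrt(2.0) - 1.0)   # about 0.5523


def circle_path(cx, cy, r):
    """Return path-construction operators for a counterclockwise circle."""
    k = KAPPA * r

    def fmt(*nums):
        return " ".join(f"{n:.2f}" for n in nums)

    return "\n".join([
        fmt(cx + r, cy) + " m",                                    # start at 3 o'clock
        fmt(cx + r, cy + k, cx + k, cy + r, cx, cy + r) + " c",    # to 12 o'clock
        fmt(cx - k, cy + r, cx - r, cy + k, cx - r, cy) + " c",    # to 9 o'clock
        fmt(cx - r, cy - k, cx - k, cy - r, cx, cy - r) + " c",    # to 6 o'clock
        fmt(cx + k, cy - r, cx + r, cy - k, cx + r, cy) + " c",    # back to the start
        "h",
    ])


print(circle_path(150, 400, 50))
```

The output is construction only. Nothing is visible until a painting operator follows it.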

The c operator spells out both control points; its shorthand variants v and y reuse the current point or the endpoint as one of the control points and are otherwise equivalent. Once the path exists, one painting operator finishes it. The vocabulary is small and worth memorizing, because every shape on every page ends with one of these:

S strokes the path outline using the current line width and stroke color.
f fills the interior using the current fill color and the nonzero winding rule.
f* fills using the even-odd rule, which matters for self-intersecting shapes and shapes with holes.
B fills and then strokes in one operation; b closes the path first.
n ends the path without painting it, which, after the clipping operators W or W*, is how a path becomes a clip region without leaving a visible mark.

The winding rule is the part people get wrong. Nonzero (f, B) counts the signed crossings of a ray from the test point and fills wherever the count is not zero, so a hole only stays empty if its subpath winds opposite to the outer one. Even-odd (f*, B*) toggles on every crossing regardless of direction. If a "donut" shape comes out solid, the inner circle is wound the same way as the outer one, and you either reverse it or switch to even-odd.
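The two rules can be compared directly with a small ray-casting sketch in Python. It is an illustration for straight-edged subpaths only, with hypothetical function names, and not how any particular renderer implements filling. Each subpath is a list of vertices, implicitly closed.

```python
def crossings(point, polygon):
    """Return (winding_number, crossing_count) for a ray going right from point."""
    px, py = point
    winding = 0
    count = 0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        if (y0 <= py) != (y1 <= py):              # edge straddles the ray's height
            x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if x_at > px:                         # crossing lies on the ray
                count += 1
                winding += 1 if y1 > y0 else -1   # upward edge +1, downward -1
    return winding, count


def is_filled(point, subpaths, rule):
    """Decide whether point is painted under 'nonzero' or 'evenodd'."""
    winding = 0
    count = 0
    for subpath in subpaths:
        w, c = crossings(point, subpath)
        winding += w
        count += c
    return winding != 0 if rule == "nonzero" else count % 2 == 1


outer = [(0, 0), (10, 0), (10, 10), (0, 10)]     # counterclockwise
inner_same = [(3, 3), (7, 3), (7, 7), (3, 7)]    # also counterclockwise
inner_reversed = list(reversed(inner_same))      # clockwise

hole = (5, 5)
print(is_filled(hole, [outer, inner_same], "nonzero"))      # hole gets filled
print(is_filled(hole, [outer, inner_same], "evenodd"))      # hole stays empty
print(is_filled(hole, [outer, inner_reversed], "nonzero"))  # hole stays empty
```

With both squares wound the same way, the winding count inside the hole is 2, which nonzero treats as inside. Reversing the inner square brings it to 0, and even-odd ignores direction entirely.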

Color is a mode, not a parameter

Color in a content stream is sticky. You set a color and it stays set until you set another one or restore an earlier state, which is why an unbracketed color change silently tints everything drawn after it. PDF also keeps fill color and stroke color as two independent settings, with lowercase operators for fill and uppercase for stroke. The device color spaces each have their own shorthand:

0.5 g                % DeviceGray fill, mid gray (0 = black, 1 = white)
0.2 0.6 0.8 rg       % DeviceRGB fill
0.8 0.2 0.1 RG       % DeviceRGB stroke (uppercase = stroke)
0.2 0.8 0.0 0.1 k    % DeviceCMYK fill

DeviceRGB suits screen output, DeviceCMYK is what print production expects, and DeviceGray is the smallest choice for monochrome content. The device spaces are convenient but uncalibrated: the same RGB triple can render differently on two monitors, which is the problem ICC-based color spaces and PDF/A output intents exist to solve. For color-critical work you select a calibrated space with cs (fill) or CS (stroke) and set components with sc and scn (SC and SCN for stroke), but for ordinary documents the device shorthands carry the load. A library wraps these in typed calls. HotPDF, for instance, takes a single TColor and emits the matching operators:

Pdf.CurrentPage.SetRGBFillColor(clRed);
Pdf.CurrentPage.Rectangle(100, 100, 200, 150);  // x, y, width, height
Pdf.CurrentPage.Fill;

Pdf.CurrentPage.SetRGBFillColor(RGB(0, 255, 0));
Pdf.CurrentPage.Circle(150, 400, 50);           // x, y, radius
Pdf.CurrentPage.Fill;
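What a wrapper has to do with an 8-bit color is small: divide each channel by 255 and pick the lowercase or uppercase operator. The Python sketch below shows that mapping under the assumption of 0 to 255 integer channels; the helper names are hypothetical and this is not a description of HotPDF's internals.

```python
def fmt(value):
    """Format a number compactly: at most three decimals, no trailing zeros."""
    return f"{value:.3f}".rstrip("0").rstrip(".")


def rgb_fill(r, g, b):
    """DeviceRGB fill color operator for 8-bit channel values."""
    return f"{fmt(r / 255)} {fmt(g / 255)} {fmt(b / 255)} rg"


def rgb_stroke(r, g, b):
    """DeviceRGB stroke color operator: same operands, uppercase verb."""
    return f"{fmt(r / 255)} {fmt(g / 255)} {fmt(b / 255)} RG"


print(rgb_fill(255, 0, 0))      # pure red fill
print(rgb_stroke(0, 128, 255))  # stroke color, independent of the fill
```

Because the two settings are independent, emitting only rg leaves whatever stroke color was last set in force, which is a common cause of outlines in an unexpected color.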

The graphics state and the q/Q stack

Everything that is not the path itself lives in the graphics state: current transformation matrix, fill and stroke colors, line width, dash pattern, clip region, alpha. The state is global and mutable, so the only safe way to make a local change is to save the whole thing, modify it, draw, and roll it back. That is what q and Q do. q pushes a copy of the current state onto a stack; Q pops it, discarding every change made since the matching q.

q                    % save the entire graphics state
2 0 0 2 100 100 cm   % concatenate a transform: scale 2x, translate to (100,100)
0.8 g                % gray fill, scoped to this block
% ... draw scaled, gray content ...
Q                    % restore: transform and color revert

Unbalanced q and Q are a common way a hand-built or stitched content stream goes wrong. A stray q with no matching Q leaves the stack deep when the page ends; an extra Q underflows it. Either way a viewer may keep an old clip or transform in force, and content disappears or lands in the wrong place. When graphics vanish for no reason the path can explain, audit the state stack first.
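A first-pass audit of the state stack can be automated. The Python sketch below counts q and Q tokens in a decompressed content stream and reports the first underflow or any leftover depth. The function name is hypothetical, and the tokenizing is deliberately naive: it assumes no literal strings or inline image data that happen to contain a bare q, Q, or % character.

```python
def check_q_balance(stream_text):
    """Report whether q/Q operators in a content stream are balanced."""
    depth = 0
    for line_no, line in enumerate(stream_text.splitlines(), start=1):
        code = line.split("%", 1)[0]           # ignore % comments
        for token in code.split():
            if token == "q":
                depth += 1                     # save: stack grows
            elif token == "Q":
                if depth == 0:
                    return f"line {line_no}: Q with no matching q"
                depth -= 1                     # restore: stack shrinks
    if depth:
        return f"{depth} unclosed q at end of stream"
    return "balanced"


good = "q\n2 0 0 2 100 100 cm\n0.8 g\nQ"
stray_save = "q\n0.8 g\nq\n1 0 0 RG\nQ"
extra_restore = "q\n0.8 g\nQ\nQ"

print(check_q_balance(good))
print(check_q_balance(stray_save))
print(check_q_balance(extra_restore))
```

Running a check like this on each stream before concatenating several into one page catches the stitching case, where each piece looks fine in isolation only if it is balanced on its own.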

The CTM transforms every coordinate

The current transformation matrix sits between the numbers in your operators and the actual page. Every coordinate is multiplied by the CTM before anything is drawn, so changing the matrix changes where and how all subsequent drawing appears without touching a single path coordinate. The cm operator concatenates a new matrix onto the current one, taking six operands that map to the affine matrix [a b c d e f]:

1 0 0 1 100 50 cm        % translate by (100, 50): e and f carry the offset
2 0 0 1.5 0 0 cm         % scale x by 2, y by 1.5: a and d are the scale factors
0.707 0.707 -0.707 0.707 0 0 cm   % rotate 45 degrees (cos/sin in a, b, c, d)

Two things trip people up. First, cm composes rather than replaces, so transforms accumulate and order matters: scaling then translating is not the same as translating then scaling. Second, rotation and scaling pivot around the current origin, not the center of your shape, so to rotate something in place you translate it to the origin, rotate, then translate back, all wrapped in q/Q. Because each new cm is applied to coordinates before the transforms already in force, the operators appear in the stream in the reverse of that order: translate to the pivot, rotate, then translate by the negated pivot. This same matrix is what places images, the last piece worth seeing.
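Both pitfalls are easy to verify numerically. The Python sketch below models the CTM as the six-number tuple (a, b, c, d, e, f) and implements cm as PDF defines it, with the new matrix multiplied in front of the current one. All function names are hypothetical helpers written for this illustration.

```python
import math

IDENTITY = (1, 0, 0, 1, 0, 0)


def cm(ctm, m):
    """Return the CTM after 'm cm': new CTM = m x ctm (row-vector convention)."""
    a, b, c, d, e, f = m
    A, B, C, D, E, F = ctm
    return (a * A + b * C, a * B + b * D,
            c * A + d * C, c * B + d * D,
            e * A + f * C + E, e * B + f * D + F)


def apply(ctm, x, y):
    """Map a user-space point through the CTM."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


def translate(tx, ty):
    return (1, 0, 0, 1, tx, ty)


def scale(sx, sy):
    return (sx, 0, 0, sy, 0, 0)


def rotate(degrees):
    t = math.radians(degrees)
    return (math.cos(t), math.sin(t), -math.sin(t), math.cos(t), 0, 0)


def rotate_about(ctm, degrees, cx, cy):
    """Operator order for rotating in place: translate, rotate, translate back."""
    for m in (translate(cx, cy), rotate(degrees), translate(-cx, -cy)):
        ctm = cm(ctm, m)
    return ctm


# Order matters: the same two cm operators in either order.
t_then_s = cm(cm(IDENTITY, translate(100, 50)), scale(2, 2))
s_then_t = cm(cm(IDENTITY, scale(2, 2)), translate(100, 50))
print(apply(t_then_s, 10, 10))   # scaled first, then offset by (100, 50)
print(apply(s_then_t, 10, 10))   # offset first, then the offset is scaled too

# Rotating about a pivot leaves the pivot where it was.
r = rotate_about(IDENTITY, 90, 200, 300)
print(apply(r, 200, 300))
```

The operator written last is the one that touches the coordinates first, which is why the translate-back step appears at the end of the stream even though it acts on the shape before the rotation does.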

Images and reusable content are XObjects

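Raster images normally do not live inline in the content stream; inline images exist (the BI, ID, and EI operators) but are meant for small ones. They are stored as image XObjects, external objects with their own dictionary describing width, height, bit depth, color space, and compression filter, and the content stream only references them by name. The dictionary of a JPEG-backed photo, shown here under the name Photo that the page's resources give it and without the stream data, looks like this: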

/Photo <<
  /Type /XObject
  /Subtype /Image
  /Width 640
  /Height 480
  /BitsPerComponent 8
  /ColorSpace /DeviceRGB
  /Filter /DCTDecode        % the image data is a JPEG stream
>>

An image XObject draws into the unit square: it always occupies the region from (0, 0) to (1, 1) in user space. You do not pass it a position or size. Instead you set the CTM so that unit square maps to the rectangle you want, then invoke it with Do. That is why placing an image is always a transform followed by an invocation, wrapped in a save/restore so the scale does not bleed into the next operation:

q
640 0 0 480 50 300 cm    % map the unit square to a 640x480 box at (50, 300)
/Photo Do                % paint the image XObject
Q
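The placement block is mechanical enough to generate. This Python sketch builds the four-line sequence from a resource name and a target rectangle, and adds a helper that derives the height from the pixel dimensions so the scale factors stay proportional. Both function names are hypothetical, and the resource name is assumed to be registered in the page's resources already.

```python
def num(value):
    """Format a number with at most two decimals and no trailing zeros."""
    return f"{value:.2f}".rstrip("0").rstrip(".")


def place_image(name, x, y, width, height):
    """Content-stream operators that paint an image XObject into a rectangle.

    The cm maps the unit square to (x, y, width, height); q/Q keep the
    transform from affecting anything drawn afterwards.
    """
    return "\n".join([
        "q",
        f"{num(width)} 0 0 {num(height)} {num(x)} {num(y)} cm",
        f"/{name} Do",
        "Q",
    ])


def fit_width(pixel_width, pixel_height, target_width):
    """Size in points that keeps the image's aspect ratio at a given width."""
    return target_width, target_width * pixel_height / pixel_width


w, h = fit_width(640, 480, 200)          # a 640x480 image shown 200 points wide
print(place_image("Photo", 50, 300, w, h))
```

Deriving one dimension from the other is the simple guard against a stretched image, since the pixel dimensions in the XObject dictionary play no part in how large the image is painted.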

The same Do mechanism drives form XObjects, which hold a reusable chunk of graphics, a logo or a repeated stamp, as their own content stream with a bounding box. Define it once, invoke it many times with a different CTM, and the bytes appear in the file only once. Most libraries hide this behind a single placement call: HotPDF registers a bitmap with AddImage and places it with ShowImage, taking an explicit x, y, width, and height instead of asking you to build the matrix by hand:

var
  Bmp: TBitmap;
  ImgIndex: Integer;
begin
  Bmp := TBitmap.Create;
  try
    Bmp.LoadFromFile('logo.bmp');
    ImgIndex := Pdf.AddImage(Bmp, icFlate);
    // x, y (bottom-left), width, height, rotation angle
    Pdf.CurrentPage.ShowImage(ImgIndex, 50, 300, 200, 150, 0);
  finally
    Bmp.Free;
  end;
end;

Under that one line the library writes the image XObject dictionary, sets the CTM to size and position the unit square, and emits Do. The model underneath is the one worth knowing, because it explains every odd result: a stretched image is a CTM with mismatched scale factors, a logo identical on forty pages is one form XObject invoked forty times, and an image that renders upside down is a sign flip in the matrix, not a corrupt file.

Where this leads

The graphics model is small once you see its shape. A content stream is postfix bytecode running against a mutable state; coordinates start at the bottom-left and pass through the CTM; paths are built silently and painted with one deliberate operator; color and line settings persist until you bracket them with q/Q; images and reusable graphics are XObjects placed by transforming a unit square. Almost every confusing rendering result reduces to one of those five rules. If you want to see how these graphics operators sit inside the larger object model, the page dictionaries and the cross-reference table that point to them, the technical overview of PDF file structure covers that layer, and building a simple PDF from scratch walks the bytes end to end. Text drawing lives in its own operator family and has its own pitfalls, covered in the companion piece on PDF text and font handling.

The Delphi drawing calls shown here, SetLineWidth, MoveTo, LineTo, Stroke, Rectangle, Circle, Fill, SetRGBFillColor, AddImage, and ShowImage, are part of the HotPDF Component for Delphi and C++Builder, which emits these content-stream operators for you.