A PDF is a plain-text container at heart. Open most PDF files in a hex editor and the top is readable: a version comment, then a run of numbered objects, then a small index and a pointer at the very bottom that tells a reader where to start. Strip away compression and the format is approachable enough that you can type a working document into a text editor and have a viewer open it. Doing that once teaches you more about how PDF holds together than any amount of reading the specification, because you have to wire the objects to each other by hand and the file refuses to open until you get the wiring right.
This walkthrough builds the smallest PDF that actually renders something: one page, the words "Hello, World!" in a built-in font, on US Letter paper. The finished file needs exactly five objects and a few lines of bookkeeping around them. We will write the objects first, then assemble the header, cross-reference table, and trailer that bind them into a file a reader will accept.
The five objects a viewer insists on
A reader does not scan a PDF top to bottom looking for content. It starts at the trailer, follows a reference to the document catalog, and walks a chain of objects from there. Every object on that chain has to exist or the open fails. For a one-page document the chain is short, and each link has a single job:
- Catalog is the root. It is the object the trailer points at, and its only required entry here is a reference to the page tree.
- Pages is the page tree node. It lists the pages in the document and reports how many there are.
- Page describes one physical page: its size, the resources it draws with, and which content stream paints it.
- Content stream holds the drawing operators, the postfix commands that place text and graphics on that page.
- Font declares the typeface the content stream refers to. Use one of the 14 standard fonts and you do not have to embed anything.
Each object is numbered and addressable. An indirect object is written as N 0 obj ... endobj, where N is the object number and the 0 is its generation number (always 0 in a file you write fresh). Anywhere else in the file you point at that object with a reference: 5 0 R means "object 5, generation 0." Those references are the wiring. In our numbering the catalog holds 2 0 R to reach the page tree, the page tree holds 3 0 R to reach down to the page, the page holds a reference back up to its parent, and so on. Get a number wrong and the reader follows a dangling pointer into nothing.
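A quick way to catch a mistyped object number before a viewer does is to compare the objects a draft defines with the objects it references. The following is a minimal sketch, not a real PDF parser: the function name find_dangling_refs is made up for this example, it assumes every generation number is 0, and its regular expressions can be fooled by text inside strings or streams.

```python
import re

def find_dangling_refs(pdf_text: str) -> set[int]:
    """Return object numbers that are referenced (N 0 R) but never defined (N 0 obj).

    Regex-based and approximate: assumes generation 0 everywhere and does not
    skip over string or stream contents.
    """
    defined = {int(n) for n in re.findall(r"(\d+)\s+0\s+obj\b", pdf_text)}
    referenced = {int(n) for n in re.findall(r"(\d+)\s+0\s+R\b", pdf_text)}
    return referenced - defined

# A deliberately incomplete draft: the page tree points at object 3,
# but object 3 is never written.
sample = """
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj
"""

print(find_dangling_refs(sample))  # {3}
```

An empty set means every reference in the draft has an object to land on; it says nothing about whether the object is the right kind.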
Names, dictionaries, and streams
Three pieces of syntax carry almost everything. A name starts with a slash: /Type, /Page, /F0. Names are case-sensitive identifiers, not strings, and PDF uses them for dictionary keys and for tagging what an object is. A dictionary is a set of key-value pairs wrapped in double angle brackets, where every key is a name: << /Type /Page /MediaBox [0 0 612 792] >>. Values can be numbers, names, strings, booleans, arrays in square brackets, references, or nested dictionaries. Most PDF objects are dictionaries.
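To see how little machinery the syntax needs, here is a small sketch that serializes Python values into PDF dictionary text. The Name and Ref classes and the to_pdf function are hypothetical helpers invented for this illustration, and the sketch deliberately leaves out strings and streams.

```python
class Name(str):
    """A PDF name such as /Page, stored without the leading slash."""

class Ref(int):
    """An indirect reference to an object number (generation 0 assumed)."""

def to_pdf(value) -> str:
    """Serialize a small subset of PDF value types to their text form."""
    if isinstance(value, Name):
        return "/" + value
    if isinstance(value, Ref):
        return f"{int(value)} 0 R"
    if isinstance(value, bool):          # must come before the int check
        return "true" if value else "false"
    if isinstance(value, dict):
        inner = " ".join(f"/{key} {to_pdf(val)}" for key, val in value.items())
        return f"<< {inner} >>"
    if isinstance(value, (list, tuple)):
        return "[" + " ".join(to_pdf(item) for item in value) + "]"
    if isinstance(value, (int, float)):
        return str(value)
    raise TypeError(f"unsupported value: {value!r}")

page = {"Type": Name("Page"), "MediaBox": [0, 0, 612, 792], "Parent": Ref(2)}
print(to_pdf(page))  # << /Type /Page /MediaBox [0 0 612 792] /Parent 2 0 R >>
```

Dictionary keys are always names, so the sketch adds the slash to keys itself and only asks for Name on values.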
A stream is a dictionary followed by a block of bytes between the keywords stream and endstream. That is where page-drawing operators live, and in real files where compressed images and embedded fonts live too. The stream dictionary describes the bytes: it must carry a /Length entry giving the exact byte count, and it carries a /Filter such as /FlateDecode when the data is compressed. We are going to lean on a tool to fill in /Length, because counting bytes by hand is the part of this exercise with no educational payoff and a high chance of an off-by-one that breaks the file.
Writing the objects
Here are the five objects in order. The coordinate detail to keep in mind before reading the content stream: PDF measures from the bottom-left corner of the page in points, where one point is 1/72 inch, and Y grows upward. A US Letter page is 612 by 792 points, so 50 700 sits near the top-left, not the bottom.
1 0 obj
<< /Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<< /Type /Pages
/Kids [3 0 R]
/Count 1
>>
endobj
3 0 obj
<< /Type /Page
/Parent 2 0 R
/MediaBox [0 0 612 792]
/Resources << /Font << /F0 4 0 R >> >>
/Contents 5 0 R
>>
endobj
4 0 obj
<< /Type /Font
/Subtype /Type1
/BaseFont /Helvetica
>>
endobj
5 0 obj
<< /Length 44 >>
stream
BT
/F0 36 Tf
50 700 Td
(Hello, World!) Tj
ET
endstream
endobj
Read the references and the structure falls out. Object 1, the catalog, points its /Pages entry at object 2. Object 2, the page tree, lists object 3 in /Kids and declares /Count 1. Object 3, the page, points /Parent back up to object 2 (the tree and the page reference each other, which is required), sizes itself with /MediaBox, exposes the font under the local name /F0 in its /Resources, and names object 5 as its content. Object 4 is the font: /BaseFont /Helvetica picks one of the 14 standard typefaces every conforming reader already has, so there is nothing to embed. Object 5 is the content stream.
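The page geometry in object 3 and the 50 700 position used in object 5 can be sanity-checked with a few lines of arithmetic. This is only an illustration; the helper name to_inches is invented here.

```python
POINTS_PER_INCH = 72                 # one PDF point is 1/72 inch
PAGE_WIDTH, PAGE_HEIGHT = 612, 792   # the /MediaBox of the page, in points

def to_inches(points: float) -> float:
    """Convert a length in PDF points to inches."""
    return points / POINTS_PER_INCH

print(to_inches(PAGE_WIDTH), to_inches(PAGE_HEIGHT))  # 8.5 11.0

# The text position is measured from the bottom-left corner, so a large Y
# value is close to the top edge of the page.
x, y = 50, 700
distance_from_top = PAGE_HEIGHT - y
print(distance_from_top, round(to_inches(distance_from_top), 2))  # 92 1.28
```

So the baseline of the greeting sits a little over an inch and a quarter below the top edge.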
What the content stream actually says
The stream body is a tiny program in PDF's page-description language, which is postfix: operands come first, then the operator that consumes them. Five lines do the work. BT and ET open and close a text object; everything that positions or shows text has to sit between them. /F0 36 Tf sets the current font to the resource named /F0 at 36 points (Tf is "set text font and size"). 50 700 Td offsets the text position by (50, 700); because the position starts at the origin when BT opens the text object, the text lands at (50, 700) in page coordinates. (Hello, World!) Tj paints the string at the current position; PDF writes a literal string as text in parentheses. Leave out BT/ET and a strict reader rejects the text operators; forget to set a font before Tj and there is no current font to draw with.
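The postfix rule is easy to see by grouping the stream into operations the way a reader would: collect operands until an operator arrives, then hand them over. The sketch below assumes only the five operators used here and literal strings without nested or escaped parentheses; parse_operations is a name made up for this example, not part of any library.

```python
import re

# A token is either a simple literal string "( ... )" or a run of non-space characters.
TOKEN = re.compile(r"\((?:[^()\\]|\\.)*\)|\S+")
OPERATORS = {"BT", "ET", "Tf", "Td", "Tj"}   # only the operators this page uses

def parse_operations(stream: str):
    """Group tokens into (operator, operands) pairs, postfix style."""
    operands, operations = [], []
    for token in TOKEN.findall(stream):
        if token in OPERATORS:
            operations.append((token, operands))  # operator consumes the pending operands
            operands = []
        else:
            operands.append(token)
    return operations

content = "BT\n/F0 36 Tf\n50 700 Td\n(Hello, World!) Tj\nET"
for operator, args in parse_operations(content):
    print(operator, args)
# BT []
# Tf ['/F0', '36']
# Td ['50', '700']
# Tj ['(Hello, World!)']
# ET []
```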
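The /Length 44 in the stream dictionary is the count of bytes in the stream data: everything after the line ending that follows stream, up to but not including the line ending that precedes endstream. It has to be exact. This is the value worth handing off to a tool rather than counting newlines by hand, especially since whether your editor writes line endings as LF or CRLF changes the total: the five lines above come to 44 bytes with LF and 48 with CRLF.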
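The count is simple to reproduce. This sketch joins the five content lines with a chosen line ending and measures the result; it assumes the line ending before endstream is left out of the count, and stream_length is a name invented for the example.

```python
lines = ["BT", "/F0 36 Tf", "50 700 Td", "(Hello, World!) Tj", "ET"]

def stream_length(lines, eol: str) -> int:
    """Byte count of the stream data when its lines are joined with the given line ending."""
    return len(eol.join(lines).encode("latin-1"))

print(stream_length(lines, "\n"))    # 44
print(stream_length(lines, "\r\n"))  # 48
```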
Header, xref, and trailer
The objects are the content. Three structural pieces turn them into a file. The first is the header, the very first line, naming the format and version:
%PDF-1.7
The % begins a comment in PDF syntax, but a reader treats this particular comment as the format signature and reads the version from it. A real writer follows it immediately with a second comment line containing at least four bytes with values of 128 or higher, a hint to file-transfer tools that the file is binary and must not be mangled as text.
At the end of the file comes the cross-reference table, the index that makes random access possible. It records the byte offset of every object from the start of the file, so a reader can seek straight to object 3 without parsing objects 1 and 2 first. The table is rigid: entries are fixed-width, 20 bytes each including the line ending, formatted as a 10-digit offset, a space, a 5-digit generation, a space, a keyword (n for in-use, f for free), and a two-byte terminator (a space plus a line feed, or a carriage return plus a line feed). A table for our six entries (object 0 is always the free-list head) looks like this:
xref
0 6
0000000000 65535 f
0000000009 00000 n
0000000058 00000 n
0000000115 00000 n
0000000241 00000 n
0000000311 00000 n
trailer
<< /Size 6
/Root 1 0 R
>>
startxref
405
%%EOF
Those offsets are the brittle part of writing PDF by hand. Each one is the exact byte position where the corresponding N 0 obj begins, and every offset shifts the moment you add a character anywhere above it. The values shown assume LF line endings and a header consisting of the %PDF-1.7 line alone. The trailer sits at the end of the file but is the first thing a reader consults: /Root 1 0 R names the catalog, /Size 6 gives the number of cross-reference entries (objects 0 through 5), and startxref 405 gives the byte offset of the word xref itself. A reader opens the file, jumps to the end, reads startxref, seeks to the cross-reference table, and from there reaches the catalog and everything below it. %%EOF marks the end of the file.
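Because every offset is just "how many bytes have been written so far," a short script can do the bookkeeping. The following is a minimal sketch under stated assumptions: build_pdf is a hypothetical helper, each dictionary is written on one or two lines rather than in the layout shown earlier (so its offsets will differ from the table above), LF line endings are used, and the optional binary comment line is omitted.

```python
def build_pdf(objects: list[bytes]) -> bytes:
    """Assemble header, numbered objects, xref table, and trailer.

    objects[i] is the body of object i+1, without the 'N 0 obj' wrapper.
    """
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))                 # byte position of "N 0 obj"
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_position = len(out)                     # byte position of the word "xref"
    size = len(objects) + 1                      # entries 0..N, including the free head
    out += b"xref\n0 %d\n" % size
    out += b"0000000000 65535 f \n"              # object 0: head of the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset      # exactly 20 bytes per entry
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % size
    out += b"startxref\n%d\n%%%%EOF\n" % xref_position
    return bytes(out)

content = b"BT\n/F0 36 Tf\n50 700 Td\n(Hello, World!) Tj\nET"
objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792]\n"
    b"/Resources << /Font << /F0 4 0 R >> >> /Contents 5 0 R >>",
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
]

pdf = build_pdf(objects)
print(len(pdf), "bytes")
```

Writing pdf to disk in binary mode, for example with open("hello.pdf", "wb"), should give a file a viewer can open, though that is worth confirming in the viewer you care about. The design point is that offsets are recorded at write time instead of being computed afterwards, which is essentially what a PDF library does.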
Let a tool fix the byte counts
The offsets above are illustrative; in practice they will be wrong by the time you finish typing, because they depend on the exact byte layout of your file. Rather than recompute them, write the structure with placeholder values and let a utility rebuild the cross-reference table and stream lengths. The free, cross-platform pdftk does this in one pass:
pdftk hello-draft.pdf output hello.pdf
It parses your objects, recalculates every byte offset, fills in the correct /Length values where it can, writes a valid xref table and trailer, and emits hello.pdf. Open that in any viewer and you get one page with "Hello, World!" in 36-point Helvetica near the top. qpdf does the same job, and many viewers will also repair a slightly malformed file on the fly. The point of leaning on a tool here is not laziness; it is that the offset arithmetic is the one part of the format with zero conceptual content and the highest error rate, so automating it lets the structure stay the thing you are learning.
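To check a draft or a repaired file yourself, you can retrace the reader's first steps: find startxref, seek to the table, and confirm that each in-use entry lands on the matching N G obj. This is a rough sketch, not a validator. The name check_xref is invented for the example, and it assumes a classic xref table with a single subsection rather than a cross-reference stream.

```python
import re

def check_xref(pdf: bytes) -> list[int]:
    """Return the object numbers whose xref offset does not point at 'N G obj'."""
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", pdf)
    if match is None:
        raise ValueError("no startxref / %%EOF found at the end of the file")
    position = int(match.group(1))
    if pdf[position:position + 4] != b"xref":
        raise ValueError("startxref does not point at the xref keyword")

    lines = pdf[position:].splitlines()
    first, count = (int(field) for field in lines[1].split())  # subsection header "0 6"
    wrong = []
    for index in range(count):
        offset, generation, kind = lines[2 + index].split()
        if kind != b"n":
            continue                                            # skip free entries
        expected = b"%d %d obj" % (first + index, int(generation))
        if not pdf.startswith(expected, int(offset)):
            wrong.append(first + index)
    return wrong

# A tiny one-object file, built only to exercise the check.
body = b"%PDF-1.7\n1 0 obj\n<< >>\nendobj\n"
tail = (b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
        b"trailer\n<< /Size 2 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % len(body))
sample = body + tail

print(check_xref(sample))                                         # []
print(check_xref(sample.replace(b"0000000009", b"0000000010")))   # [1]
```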
Why this scales to real documents
Nothing about a hundred-page report changes the shape you just built. The catalog still sits at the root, the page tree still gathers the pages, and each page still points at its resources and a content stream. What grows is breadth, not the spine: the page tree branches so a reader can skip whole subtrees, content streams carry hundreds of operators instead of five, fonts get embedded as their own stream objects with width tables and encodings, and images arrive as streams with image-specific filters. Modern files also tend to pack many objects into compressed object streams and replace the plain xref table with a cross-reference stream, which is why opening a real PDF in a text editor usually shows a wall of binary. The model underneath is identical to the one in your handmade file. For the wider object graph and how the catalog, page tree, and resource dictionaries relate across a larger document, the in-depth tour of PDF document structure picks up where this leaves off, and the file-structure overview covers incremental updates and how the trailer chains across revisions.
From hand-writing to a library
Typing objects by hand is a learning exercise, not a production technique. The instant you need real fonts, wrapped text, images, or more than a trivial page, the byte bookkeeping that pdftk patched for you becomes the whole job, and you want a library that owns it. The same five objects still get written, but a library computes every offset, manages the font and resource dictionaries, and compresses the content streams without you tracking a single byte. In Delphi and C++Builder, the HotPDF Component reduces this entire file to a handful of calls: set the document up, call BeginDoc, SetFont and TextOut to place the same greeting, then EndDoc to write a correct catalog, page tree, xref, and trailer. Understanding the objects underneath is what lets you reason about the output when a document does not render the way you expected.