Hardening a Delphi PDF Signer Against Malicious PKCS#12

When you sign a PDF, you usually think of the signing key as something you control. It lives in a .pfx file you generated, protected by a password you chose. The code that reads that file feels like plumbing, not a boundary. That intuition is wrong the moment the certificate stops being yours. A desktop tool that lets a user pick any .pfx, a server that accepts an uploaded credential, a batch signer fed certificates over the network: each hands attacker-influenced bytes to a parser before a single signature byte is produced. A PKCS#12 reader is attack surface, in the same sense that an image decoder or a font loader is.

This article walks through two real defects that lived in HotPDF's PKCS#12 reader, both in the path that imports a signing credential. Neither is exotic. Both come from the same root cause that hits almost every binary parser written in a language with fixed-width integers: a length or a count from the file is trusted one step further than it should be. One leads to an out-of-bounds read, the other to a process that hangs until you kill it.

Where the bytes travel

Importing a .pfx to sign a document is not one operation but a short pipeline, and each stage parses something an attacker may have written. The container is a PKCS#12 structure as defined in RFC 7292: an AuthenticatedSafe holding nested SafeBags, one of which is the shrouded (encrypted) bag that carries the private key. Reading it means walking ASN.1, deriving a key from the password, decrypting, then handing the recovered RSA key to the code that builds the signature.

In HotPDF those stages map to distinct units. The PKCS#12 container logic lives in HPDFPFX. Every tag, length, and value it touches is decoded by the ASN.1 reader in HPDFASN1. Key derivation and the PBES2 decryption sit in HPDFCrypt alongside PBKDF2HMACSHA256. When the key is recovered, HPDFRSA and the CMS SignedData builder in HPDFCMS turn it into the detached signature embedded in the PDF. The public entry point that drives the whole chain is one call.

// Drives the full pipeline: load the placeholder PDF, parse the PFX,
// derive the key, build CMS SignedData, write the signed output.
if THotPDF.SignPDFWithPFX('Prepared.pdf', 'Signed.pdf',
     'signer.pfx', 'p@ssw0rd') then
begin
  // signature embedded
end
else
begin
  // signing did not complete
end;

Every byte of signer.pfx flows through HPDFASN1 and HPDFPFX before any cryptography happens. If those two units are not careful about what the file claims, the cryptography downstream never gets the chance to matter.

Defect one: an ASN.1 length that wraps past the guard

ASN.1 in DER and BER encodes every element as a tag, a length, and that many content bytes. The length is the field you must trust but verify, because it tells the parser how far to read, and it was written by whoever produced the file. X.690 §8.1.3 defines two encodings. The short form packs a length of 0 to 127 into a single byte. The long form, used for anything larger, spends one lead byte whose low seven bits give the count of length bytes that follow, then that many big-endian bytes carry the actual value. Four length bytes can therefore declare a content size approaching four gigabytes.
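
The two encodings can be sketched as a small decoder. This is a minimal illustration, not the code in HPDFASN1: DecodeLength is a hypothetical helper, and refusing indefinite lengths and anything wider than four length bytes is an assumption made for the sketch. It returns the length as Int64 so that a four-byte value is never squeezed into a signed 32-bit integer while it is being decoded.

```pascal
program Asn1LengthSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper, not part of HotPDF. Decodes one X.690 length field
// starting at Data[Pos] and advances Pos past it. Returns False when the
// input is truncated, uses the indefinite form, or spends more than four
// length bytes (an assumed limit for this sketch).
function DecodeLength(const Data: array of Byte; var Pos: Integer;
  out Len: Int64): Boolean;
var
  Lead, Count, I: Integer;
begin
  Result := False;
  Len := 0;
  if (Pos < 0) or (Pos >= Length(Data)) then
    Exit;
  Lead := Data[Pos];
  Inc(Pos);
  if Lead < $80 then
  begin
    // Short form: the byte itself is the length, 0..127.
    Len := Lead;
    Result := True;
    Exit;
  end;
  // Long form: the low seven bits give the number of length bytes.
  Count := Lead and $7F;
  if (Count = 0) or (Count > 4) then
    Exit;
  if Count > Length(Data) - Pos then
    Exit; // the length bytes themselves are missing
  for I := 1 to Count do
  begin
    Len := (Len shl 8) or Data[Pos]; // big-endian accumulation
    Inc(Pos);
  end;
  Result := True;
end;

var
  P: Integer;
  L: Int64;
begin
  P := 0;
  if DecodeLength([$82, $01, $00], P, L) then
    Writeln('long form: ', L);   // two length bytes, $0100 = 256
  P := 0;
  if DecodeLength([$84, $FF, $FF, $FF, $FF], P, L) then
    Writeln('forged: ', L);      // 4294967295, just under four gigabytes
end.
```

Decoding the length only says what the file claims. Whether that many bytes actually exist is a separate check, and that check is where the defect lived.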

After decoding such a value, the parser has to check that the content actually fits inside the buffer before it trusts it. The natural check is to confirm that the current position plus the content length does not run past the end of the data. Written in the obvious way, with the position, the content length, and the total all held in 32-bit signed integers, that guard is broken:

// The trap: signed 32-bit arithmetic. With ContentLen near MaxInt,
// Pos + ContentLen overflows to a NEGATIVE value, so the comparison
// is false and a forged ~2 GB length sails straight through.
if Pos + ContentLen > Total then
  raise EHPDFASN1Error.Create('content overruns buffer');

The problem is the addition, not the comparison. When ContentLen is close to MaxInt (2147483647), Pos + ContentLen exceeds the signed 32-bit range. With overflow checking disabled ({$Q-}, the usual setting for release builds), the sum silently wraps around to a negative number. A negative sum is never greater than Total, so the guard reports that everything is fine and lets the parser proceed with a content length of roughly two gigabytes that the buffer does not contain. What happens next is the damage: the reader allocates a buffer for the claimed length with SetLength, then copies into it with a Move that reads from the source. The source has only a few hundred bytes left, so the copy reads far past the end of the input, an out-of-bounds read that at best crashes and at worst leaks adjacent process memory into the parse.
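
The wraparound is easy to reproduce with concrete numbers. The sketch below uses made-up values (a 512-byte buffer, a parser 64 bytes in, and a forged length just under MaxInt), with overflow and range checking switched off. It stores the sum in a 32-bit variable first, because compilers can differ in how wide they compute the intermediate, and the stored value is what a 32-bit guard effectively sees.

```pascal
program WrapDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}
{$Q-} // overflow checking off: the add wraps instead of raising
{$R-} // range checking off: storing the result truncates silently

var
  Pos, ContentLen, Total, Sum: Integer;
begin
  Total := 512;              // illustrative buffer size
  Pos := 64;                 // illustrative read position
  ContentLen := 2147483600;  // forged length, just under MaxInt

  // 64 + 2147483600 = 2147483664, which does not fit in 32 signed bits.
  // Reduced modulo 2^32 it becomes 2147483664 - 4294967296.
  Sum := Pos + ContentLen;
  Writeln('32-bit sum: ', Sum);  // prints -2147483632

  if Sum > Total then
    Writeln('guard fired')
  else
    Writeln('guard silent: forged length accepted');
end.
```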

A robust guard widens the intermediate sum before the comparison, so the addition cannot overflow the type it is computed in. The fix promotes both operands to Int64:

// Correct: both operands widened to Int64 before the add, so the sum
// cannot wrap. A forged 2 GB length now fails the bounds check.
if ContentLen < 0 then
  raise EHPDFASN1Error.Create('negative content length after decoding');
if Int64(Pos) + Int64(ContentLen) > Int64(Total) then
  raise EHPDFASN1Error.Create('content overruns buffer');

An Int64 holds the sum of two 32-bit values without loss, so the comparison sees the real number and rejects the forged length. The separate non-negative check on ContentLen closes the matching case where a decoded value lands negative on its own. In HotPDF this guard lives in HPDFASN1ParseNode, the function that produces the node every other helper builds on. Because HPDFASN1Content sizes its SetLength and Move directly from the node's content length, a node that passed a bad guard would have poisoned every read taken from it. Fixing the bound at the point of decode is what makes the helpers above it safe.
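
Widening to Int64 is the approach described above. For comparison, the same bound can be written with no addition at all, by comparing the claimed length against the bytes that remain. The sketch below is an alternative formulation, not the HotPDF code, and ContentFits is a hypothetical name. Once Pos is known to lie in 0..Total, the subtraction Total - Pos cannot overflow.

```pascal
program SubtractGuardSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper. True only when ContentLen bytes starting at Pos lie
// inside a buffer of Total bytes. The first three tests establish
// 0 <= Pos <= Total, so Total - Pos stays within 0..Total. Short-circuit
// evaluation (the default) skips the subtraction when they fail.
function ContentFits(Pos, ContentLen, Total: Integer): Boolean;
begin
  Result := (Pos >= 0) and (ContentLen >= 0) and (Pos <= Total) and
            (ContentLen <= Total - Pos);
end;

procedure Report(const Title: string; Fits: Boolean);
begin
  if Fits then
    Writeln(Title, ': accepted')
  else
    Writeln(Title, ': rejected');
end;

begin
  Report('honest length', ContentFits(64, 100, 512));
  Report('exact fit', ContentFits(64, 448, 512));
  Report('one byte too many', ContentFits(64, 449, 512));
  Report('forged ~2 GB length', ContentFits(64, 2147483600, 512));
  Report('negative length', ContentFits(64, -1, 512));
end.
```

Either form works. The widened sum reads closer to the original intent, while the subtraction form also holds up in code where no wider integer type is available.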

Defect two: a PBKDF2 iteration count used as a weapon

The second flaw is not a memory error; it is the file telling your CPU how hard to work. A modern PKCS#12 file typically protects its key material with PBES2, the password-based scheme from PKCS#5, specified in RFC 8018. PBES2 runs a key derivation function, here PBKDF2 with HMAC-SHA-256, then a cipher, here AES-256-CBC. PBKDF2 takes an iteration count, and that count is a parameter carried in the file. Its whole purpose is to be slow: more iterations means each password guess costs more, which is good against an offline attacker. RFC 8018 §4.2 recommends a minimum count and notes that a larger one raises the cost of an attack, but it sets no ceiling.

That openness is fine when you generated the file. It is a weapon when the attacker did. The iteration count is an attacker-controlled work factor, and an attacker-controlled work factor is an algorithmic-complexity denial of service. A forged .pfx can encode an iteration count in the billions; the parser dutifully reads it and calls PBKDF2 for that many rounds of HMAC-SHA-256, and the process disappears into a loop that will not return for minutes or hours on one supplied file. On a signing server that handles one credential per request, a single crafted upload stalls a worker.

Before the count makes the CPU spin, it creates a wraparound problem of its own. The iteration value lives in the file as an ASN.1 INTEGER, which has no fixed width, while the field PBKDF2 ultimately consumes is a 32-bit Integer. Decode the INTEGER straight into that field and a large value truncates: a value crafted to land on the sign bit comes back negative, and one just past 2^32 comes back as some unrelated small number, so even the size of the work is no longer what the file appeared to ask for. The fix reads the value at full width and bounds it before narrowing:

// Read the iteration count as Int64 first, then clamp to a sane band
// BEFORE it is narrowed into the 32-bit Iterations field PBKDF2 uses.
LIter := HPDFASN1ToInteger(Data, Node);          // returns Int64
if (LIter < 1) or (LIter > 100000000) then
  raise EHPDFPFXError.CreateFmt(
    'PBKDF2 iteration count %d is outside the accepted range 1..100000000',
    [LIter]);
Iterations := Integer(LIter);                    // safe: already bounded

Reading into an Int64 means the decoded value is the real one, not a truncated ghost of it. The lower bound rejects zero and negative counts, which are nonsensical for a key derivation. The upper bound, one hundred million, sits well above any legitimate PKCS#12 file, which typically uses somewhere between a few thousand and a few hundred thousand iterations, while capping the worst case to a bounded, survivable amount of work. Only after the value has passed that band is it narrowed to the 32-bit field, so the truncation can no longer surprise anyone. In HotPDF this clamp lives in ParsePBES2Params, where the PBKDF2 parameters are decoded on the way to PBKDF2HMACSHA256.
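
To make the truncation concrete, the sketch below narrows two forged counts the unsafe way and then passes them through a bounded conversion. It is an illustration rather than the body of ParsePBES2Params: CheckedIterations and MaxIterations are hypothetical names, and ERangeError stands in for whatever exception the real parser raises. The ceiling is the one hundred million quoted above.

```pascal
program IterationClampSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}
{$R-} // range checking off, so the unchecked casts below truncate silently

uses
  SysUtils;

const
  MaxIterations = 100000000; // ceiling quoted in the article

// Hypothetical helper: validate at full width, then narrow.
function CheckedIterations(Value: Int64): Integer;
begin
  if (Value < 1) or (Value > MaxIterations) then
    raise ERangeError.CreateFmt('iteration count %d out of range', [Value]);
  Result := Integer(Value); // safe: 1..MaxIterations fits in 32 bits
end;

var
  Forged: Int64;
begin
  // Unsafe narrowing: the low 32 bits are all that survive.
  Forged := 4294967297;                 // 2^32 + 1
  Writeln(Integer(Forged));             // prints 1
  Forged := 2147483648;                 // 2^31, lands on the sign bit
  Writeln(Integer(Forged));             // prints -2147483648

  // Bounded narrowing: the same value is rejected before it is cast.
  try
    Writeln(CheckedIterations(Forged));
  except
    on E: ERangeError do
      Writeln('rejected: ', E.Message);
  end;

  Writeln(CheckedIterations(2048));     // an ordinary count passes: 2048
end.
```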

Why both fixes are the same fix

The two defects look different, one a buffer overrun and one a hung process, but they are the same mistake. In each case a number from an untrusted file was carried into a fixed-width type one step too early, before it had been checked against reality. The length was added in 32 bits before the bounds test; the iteration count was narrowed to 32 bits before the range test. Both yield to the same discipline: decode at full width, check against the real limit, and only then narrow. The intermediate Int64 is not a style choice; it is what lets the guard see the value the attacker actually wrote. A bound that overflows is not a bound, and a count with no ceiling is not a parameter, it is a remote throttle on your own CPU.

Practical guidance for a signing pipeline

The narrow lesson is to validate untrusted certificate input the way you would validate any untrusted upload. Cap the size of a .pfx you accept, since a legitimate one is kilobytes, not megabytes. Treat a parse failure as routine rejected input, not an error worth a stack trace to the user. If you sign on a server, run the import where a stalled worker cannot take the service down with it, and put a timeout around the operation so an unexpectedly expensive file is bounded by wall-clock as well as by the iteration cap.
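
The size cap is the cheapest of these checks to add, because it runs before any parsing. The sketch below is one possible shape for it. PfxSizeAcceptable is a hypothetical helper, and the 64 KB limit is an assumption chosen for illustration, not a HotPDF setting; pick a cap that suits the certificate chains you expect.

```pascal
program PfxSizeCapSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

const
  MaxPfxBytes = 64 * 1024; // assumed cap, adjust to your deployment

// Hypothetical pre-flight check: True only when the file exists, can be
// opened, is non-empty, and does not exceed MaxPfxBytes.
function PfxSizeAcceptable(const FileName: string): Boolean;
var
  Stream: TFileStream;
begin
  Result := False;
  if not FileExists(FileName) then
    Exit;
  try
    Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
    try
      Result := (Stream.Size > 0) and (Stream.Size <= MaxPfxBytes);
    finally
      Stream.Free;
    end;
  except
    on EStreamError do
      Result := False; // unreadable file: routine rejection, no stack trace
  end;
end;

begin
  if PfxSizeAcceptable('signer.pfx') then
    Writeln('size ok: pass the file on to the signing call')
  else
    Writeln('rejected before parsing');
end.
```

A check like this would sit directly in front of the THotPDF.SignPDFWithPFX call shown earlier. It does not replace the wall-clock timeout, which depends on how your server isolates its workers and is not sketched here.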

The broader lesson reaches past certificates. Parser hardening is not a one-time audit of one unit, it is a property of every place your library reads bytes it did not write. A PDF library parses a great deal from untrusted sources: fonts embedded in a document, images in half a dozen codecs, stream filters, and, on the signing path, certificates. Each of those is attack surface, and each deserves the same suspicion of every length and every count. HotPDF builds the import and signing path on the hardened HPDFASN1, HPDFPFX, HPDFCrypt, and HPDFCMS units described here, so that the credential you hand it, wherever it came from, is parsed defensively before it is ever trusted.

The signing workflow these checks protect is covered end to end in our walkthrough of PAdES digital signatures in Delphi, and the same defensive posture applied to document encryption, including the AES-256 key path that shares this codebase, is described in the article on AES-256 encryption and security. All of it ships as part of the HotPDF Component for Delphi and C++Builder, alongside the loading, editing, encryption, and signing APIs covered elsewhere on this blog.