Right-click any workbook your application ships, open Properties, and read the Details tab. If the Title says "Q1 template — DO NOT USE" and the Author is an analyst who left the company in 2019, your generator has been copying template metadata into every customer delivery. Nobody sees it in Excel's grid, but Windows Search indexes it, SharePoint surfaces it as the document title, and a records-management system will happily file four thousand statements under the wrong author. Metadata bugs are invisible at the point of creation and embarrassing at the point of discovery.
HotXLS exposes document properties as plain workbook-level properties on both of its engines — the BIFF facade for .xls and the OOXML facade for .xlsx — so fixing this class of bug is mostly a matter of knowing which fields exist, where they physically live, and the one gating behavior that decides whether they get written at all.
Where Excel actually stores document properties
The two file formats keep metadata in entirely different places, which is why a library needs two implementations and why half-finished tools often stamp one format correctly and ignore the other. A BIFF workbook stores its properties in OLE compound-file property-set streams (SummaryInformation for fields such as title and author, DocumentSummaryInformation for fields such as company and manager), plus the in-stream WRITEACCESS record that holds the name of the user who last saved the file. An OOXML workbook stores them as XML parts inside the zip package: docProps/core.xml carries the core properties, most of them Dublin Core elements (title, creator, subject, keywords, dates), and docProps/app.xml carries application-level fields such as company and generating application, as specified in ECMA-376 (extended properties in Part 1, core properties in Part 2).
HotXLS flattens both storages into direct properties of the workbook object. You never touch a property-set stream or an XML part; you assign strings and dates, and the correct container is produced for the format being saved.
Stamping generated workbooks from the business record
On the XLSX side, TXLSXWorkbook exposes Title, Subject, Author, Keywords, Description, Category, LastModifiedBy, Company, Application, and AppVersion as strings, plus Created and Modified as TDateTime values where zero means unset. The generation rule that prevents the opening-paragraph bug is simple: assign every field on every run, sourcing values from the business record rather than trusting whatever the template carried.
var
  Book: TXLSXWorkbook;
  Stamp: TDateTime;
begin
  Book := TXLSXWorkbook.Create;
  try
    if Book.Open('statement-template.xlsx') <> 1 then
      raise Exception.Create('Template not available');
    // Read the clock once so Created and Modified are identical at
    // generation time; two separate Now calls can differ.
    Stamp := Now;
    // Overwrite every field: anything left untouched is
    // inherited from whoever designed the template.
    Book.Title := 'Account Statement 2026-06 / ACME Corp';
    Book.Subject := 'Monthly account statement';
    Book.Author := 'Billing Service 4.2';
    Book.LastModifiedBy := 'Billing Service 4.2';
    Book.Company := 'Northwind Financial';
    Book.Application := 'Billing Service';
    Book.Category := 'Customer Delivery';
    Book.Keywords := 'statement;billing;2026-06;acct-10024';
    Book.Description := 'Generated document - manual edits are not retained';
    Book.Created := Stamp;
    Book.Modified := Stamp;
    Book.SaveAs('statement-10024.xlsx');
  finally
    Book.Free;
  end;
end;
The Keywords field deserves more thought than it usually gets. Search infrastructure — Windows Search, SharePoint, most DMS products — indexes it verbatim, so a semicolon-separated convention carrying the account number and period turns every delivered workbook into a findable record without any database round trip. Keep personal data out of it, though: properties travel with every copy of the file, far beyond the access controls of the system that generated it.
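A convention like this is easier to keep consistent when one helper builds the string. The following is a minimal sketch, not part of HotXLS: BuildKeywords is a hypothetical name, and the choice to replace embedded semicolons with spaces is an assumption about how you want stray separators handled.

```pascal
program KeywordsDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: joins keyword parts with ';'. Each part is trimmed,
// empty parts are dropped, and embedded semicolons are replaced with spaces
// so that one part can never be split into two tags by an indexer.
function BuildKeywords(const Parts: array of string): string;
var
  I: Integer;
  Part: string;
begin
  Result := '';
  for I := Low(Parts) to High(Parts) do
  begin
    Part := Trim(StringReplace(Parts[I], ';', ' ', [rfReplaceAll]));
    if Part = '' then
      Continue;
    if Result <> '' then
      Result := Result + ';';
    Result := Result + Part;
  end;
end;

begin
  // prints: statement;billing;2026-06;acct-10024
  Writeln(BuildKeywords(['statement', 'billing', '2026-06', 'acct-10024']));
end.
```

The result is an ordinary string, so it can be assigned to the Keywords property as-is.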
The timestamp pair has its own semantics worth fixing in policy. Created should mark the moment your pipeline generated the document and then never change. Modified is the field Excel itself updates whenever a recipient saves the file, so if the pipeline stamps both fields with the same instant, a Modified value later than Created after delivery is strong evidence that someone saved the workbook downstream, which is useful in disputes about whose numbers a forwarded spreadsheet contains. Because the unset state is the literal value zero rather than an exception or a null, audit code must compare against zero explicitly before formatting; an unset TDateTime otherwise prints as a confidently wrong 30 December 1899 in your logs.
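Both timestamp rules fit in two small helpers. This is a sketch under stated assumptions rather than library code: FormatPropDate and EditedAfterGeneration are hypothetical names, the check assumes the generator stamped Created and Modified with the same instant, and the two-second tolerance is an arbitrary allowance for timestamps stored at whole-second resolution.

```pascal
program PropDates;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, DateUtils;

// Formats a property timestamp, treating zero as "unset" instead of
// letting it print as 1899-12-30.
function FormatPropDate(const Value: TDateTime): string;
begin
  if Value = 0 then
    Result := '(unset)'
  else
    Result := FormatDateTime('yyyy-mm-dd', Value);
end;

// True when both stamps are set and Modified is later than Created by more
// than ToleranceSec seconds. Assumes both were identical at generation.
function EditedAfterGeneration(const Created, Modified: TDateTime;
  ToleranceSec: Integer = 2): Boolean;
begin
  Result := (Created <> 0) and (Modified <> 0) and
    (Modified > Created) and
    (SecondsBetween(Modified, Created) > ToleranceSec);
end;

var
  Generated: TDateTime;
begin
  Generated := EncodeDateTime(2026, 6, 11, 9, 0, 0, 0);
  Writeln(FormatPropDate(0));          // prints: (unset)
  Writeln(FormatPropDate(Generated));  // prints: 2026-06-11
  // Saved five minutes after generation: flagged as edited downstream.
  Writeln(EditedAfterGeneration(Generated, IncMinute(Generated, 5)));
  // Identical stamps: not flagged.
  Writeln(EditedAfterGeneration(Generated, Generated));
end.
```

An unset value on either side returns False by design: a missing timestamp is a separate finding and should not be reported as an edit.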
DocPropsTouched: the workbook that ships without docProps
The XLSX property writer is gated by a read-only flag, DocPropsTouched. A workbook in which no property was ever assigned does not produce docProps parts at all — HotXLS deliberately avoids writing an empty metadata skeleton. That is tidy behavior, but it has two consequences worth engineering around.
First, intake code on the consuming side must not assume core.xml exists in every package; tools that hard-require it will reject perfectly valid minimal files. Second, if your compliance posture requires that every outbound document carry at least a generator identity, the requirement translates to code: assign Application and Author unconditionally in the save path, because an untouched workbook satisfies the format specification while silently failing your policy.
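On the intake side, the presence of the core properties part can be checked without any spreadsheet library, because an .xlsx file is a zip package. The sketch below assumes Delphi's own System.Zip unit (TZipFile) rather than any HotXLS call; PackageHasPart is a hypothetical helper name, and the part name is passed with forward slashes on the assumption that TZipFile.IndexOf compares against the stored entry name.

```pascal
program DocPropsProbe;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Zip;

// Hypothetical helper: True when the zip package contains the named part.
// Raises an exception if the file is missing or is not a valid zip archive.
function PackageHasPart(const FileName, PartName: string): Boolean;
var
  Zip: TZipFile;
begin
  Zip := TZipFile.Create;
  try
    Zip.Open(FileName, zmRead);
    Result := Zip.IndexOf(PartName) >= 0;
  finally
    Zip.Free;
  end;
end;

begin
  if ParamCount < 1 then
  begin
    Writeln('usage: DocPropsProbe <file.xlsx>');
    Halt(2);
  end;
  if PackageHasPart(ParamStr(1), 'docProps/core.xml') then
    Writeln('core properties present')
  else
    // A valid minimal package: accept it, but record the policy gap.
    Writeln('no docProps/core.xml: valid file, no metadata recorded');
end.
```

Treating the missing part as a reportable condition instead of a parse failure keeps valid minimal files flowing while still surfacing the documents that fail your own stamping policy.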
The legacy XLS surface and the Comments trap
The BIFF facade carries the older, smaller field set: Title, Subject, Author, Keywords, Comments, Company, and Manager, plus LastSavedBy — an alias of UserName — which writes the WRITEACCESS record Excel displays when a file is locked by another user.
var
  Legacy: IXLSWorkbook; // reference-counted interface: no manual Free
begin
  Legacy := TXLSWorkbook.Create;
  if Legacy.Open('archive-1999.xls') <= 0 then
    raise Exception.Create('Cannot open archive file');
  Legacy.Title := 'FY1999 ledger (migrated copy)';
  Legacy.Author := 'Archive Migration Batch';
  Legacy.Company := 'Northwind Financial';
  Legacy.Comments := 'Migrated 2026-06-11; source retained in cold storage';
  Legacy.LastSavedBy := 'migration-svc'; // BIFF WRITEACCESS record
  Legacy.SaveAs('archive-1999-stamped.xls');
end;
One naming collision causes regular confusion: the document-level Comments property here is the free-text remark shown in the file's property dialog. It has nothing to do with cell comments, which are drawing-layer objects attached to ranges through an entirely different API. Code reviews catch "we already write Comments" claims that turn out to reference the wrong feature surprisingly often — the two share a name and nothing else.
Reading metadata at intake — and the probe gap
Reading is symmetric: after Open, the same properties are populated from the file, which makes metadata audits of incoming workbooks a short loop.
var
  Book: TXLSXWorkbook;
  CreatedText: string;
begin
  Book := TXLSXWorkbook.Create;
  try
    if Book.Open(FileName) = 1 then
    begin
      // Zero means unset; formatting it directly would print 1899-12-30.
      if Book.Created = 0 then
        CreatedText := '(not recorded)'
      else
        CreatedText := FormatDateTime('yyyy-mm-dd', Book.Created);
      Writeln(Format('%s | title="%s" author="%s" created=%s',
        [ExtractFileName(FileName), Book.Title, Book.Author, CreatedText]));
    end;
  finally
    Book.Free;
  end;
end;
Plan for one limitation: there is no properties-only probe. GetSheetNames can list sheets without loading a workbook, but reading Title or Author requires a full Open, so metadata triage of a large archive pays full parse cost per file.
On the BIFF side that cost can be trimmed for read-only audits by setting _DisableGraphics to true before opening, which skips parsing of the drawing layer entirely. That is appropriate when the loop only reads properties and cell statistics, and inappropriate the moment the same instance might save, since the skipped drawing content would be lost.
If sheet structure alone can pre-filter the set (skipping single-sheet exports, say), the cheap techniques in our article on sheet listing and lightweight inspection reduce the number of files that need the expensive pass. For bulk stamping jobs where thousands of outputs are generated rather than inspected, the write-side throughput patterns in our article on streaming writes for batch jobs apply unchanged, since property assignment adds nothing measurable to save time.
Frequently asked questions
Do document properties survive format conversion?
Within one facade, yes: open an XLSX, modify it, save it, and the property set round-trips. Across formats, treat properties as part of your conversion checklist. The two facades' field sets do not match one-to-one (the BIFF facade has Manager and names the free-text remark Comments; the OOXML facade has Category, Description, and the Created and Modified timestamps), so a converter should map fields explicitly rather than assume parity.
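An explicit mapping can be sketched with plain records standing in for the two field sets. Everything here is illustrative: TLegacyProps, TPackageProps, and MapLegacyToPackage are hypothetical names, not HotXLS types, and folding Manager into the description is one possible policy for a field with no direct counterpart.

```pascal
program PropMapping;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical carrier for the BIFF-side field set.
  TLegacyProps = record
    Title, Subject, Author, Keywords, Comments, Company, Manager: string;
  end;

  // Hypothetical carrier for the OOXML-side field set.
  TPackageProps = record
    Title, Subject, Author, Keywords, Description, Category, Company: string;
    Created, Modified: TDateTime;
  end;

// Maps every target field explicitly. Fields with no source (Category and
// the timestamps) are supplied by the caller instead of being left to chance.
function MapLegacyToPackage(const Src: TLegacyProps; const Category: string;
  const Stamp: TDateTime): TPackageProps;
begin
  Result.Title := Src.Title;
  Result.Subject := Src.Subject;
  Result.Author := Src.Author;
  Result.Keywords := Src.Keywords;
  Result.Company := Src.Company;
  Result.Description := Src.Comments; // same remark, different name
  if Src.Manager <> '' then
  begin
    // No Manager field on the target side: keep it visible in the remark.
    if Result.Description <> '' then
      Result.Description := Result.Description + ' | ';
    Result.Description := Result.Description + 'Manager: ' + Src.Manager;
  end;
  Result.Category := Category;
  Result.Created := Stamp;
  Result.Modified := Stamp;
end;

var
  Src: TLegacyProps;
  Dst: TPackageProps;
begin
  Src.Title := 'FY1999 ledger (migrated copy)';
  Src.Comments := 'Migrated 2026-06-11';
  Src.Manager := 'J. Doe';
  Dst := MapLegacyToPackage(Src, 'Archive', EncodeDate(2026, 6, 11));
  Writeln(Dst.Description); // prints: Migrated 2026-06-11 | Manager: J. Doe
end.
```

Writing the mapping out field by field means a newly added property shows up as an obvious gap in one function instead of silently vanishing during conversion.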
Can metadata leak information I did not intend to ship?
Yes, and template inheritance is the main vector — author names, internal project labels in keywords, draft titles. The defense is the overwrite-everything discipline shown above. Verify with the same Properties dialog your customers can open, or by unzipping an .xlsx and reading docProps/core.xml directly.
Which fields drive search results in SharePoint and Windows Search?
Title, Author, Keywords (exposed as Tags), and Comments or Description carry most of the indexing weight. A meaningful Title alone — distinct per document, carrying the period and account — does more for findability than any folder-naming convention layered on top.
Document properties are the cheapest professional polish a generated workbook can carry, and the most commonly shipped defect when nobody owns them. Both property surfaces described here are part of the HotXLS Component, which writes them natively for XLS and XLSX without Excel automation.