Lab 2.1

Token-count three documents.

24 min · pairs

Three pre-loaded documents: a short customer email, an email thread with quoted history, and a Letter of Credit. For each one, estimate the token count first (characters ÷ 4), verify it in a token counter, then log which context window it fits. The middle doc is the trap.

How to run Lab 2.1: For each of the three pre-loaded documents, write your estimate first (use the auto-fill or do chars ÷ 4 in your head). Then open a token counter, such as OpenAI's tokenizer or the Tokenizer Playground, copy the doc, paste it in, and log the actual number. Counts differ between tokenizers, so use the same counter for all three documents. The "Off by" indicator shows how close your estimate was.


The middle doc is the one most people get wrong — quoted history quietly stacks up.

Document name

Paste the document text

294 characters

Your estimate (tokens)

Rule of thumb: characters ÷ 4, or words × 1.3.
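If you would rather script the estimate than do it in your head, the two rules of thumb can be sketched as below. The function names are made up for this example, and rounding up is an assumption (the lab does not say which way to round); integer arithmetic is used so that words × 1.3 does not pick up floating-point error.

```python
def estimate_from_chars(text: str) -> int:
    """Rule of thumb: characters ÷ 4, rounded up (rounding direction is an assumption)."""
    return (len(text) + 3) // 4


def estimate_from_words(text: str) -> int:
    """Rule of thumb: words × 1.3, rounded up, computed as words × 13 ÷ 10."""
    words = len(text.split())  # whitespace-separated words
    return (words * 13 + 9) // 10


if __name__ == "__main__":
    # Illustrative sentence only, not one of the lab documents.
    sample = "Hi team, could you confirm the delivery date for the order? Thanks."
    print("chars / 4  :", estimate_from_chars(sample))
    print("words x 1.3:", estimate_from_words(sample))
```

The two rules will usually disagree a little. Either is fine for the estimate column, as long as you use the same one for all three documents.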

Actual (from the counter)

Which window does it fit?

What surprised you? (one line)

Your log

Document	Chars	Estimate (tokens)	Actual (tokens)	Fits (window)	Surprise
Doc A — short customer email	294	—	—	—	—
Doc B — email thread with quoted history	2,243	—	—	—	—
Doc C — Letter of Credit	3,113	—	—	—	—
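The log can also be filled in programmatically. The sketch below takes the character counts from the table, applies characters ÷ 4, and leaves the actual count empty until you have read it off the counter. Two things here are assumptions, not part of the lab: the window sizes in `WINDOWS` are placeholders, and "off by" is taken to mean actual minus estimate.

```python
from typing import Optional

# Hypothetical window sizes in tokens; replace with the windows used in your lab.
WINDOWS = {"small": 512, "medium": 4096, "large": 32768}


def estimate_tokens(chars: int) -> int:
    """Characters ÷ 4, rounded up."""
    return (chars + 3) // 4


def smallest_window(tokens: int, windows: dict = WINDOWS) -> Optional[str]:
    """Return the name of the smallest window that holds `tokens`, or None."""
    for name, size in sorted(windows.items(), key=lambda kv: kv[1]):
        if tokens <= size:
            return name
    return None


def log_row(name: str, chars: int, actual: Optional[int] = None) -> dict:
    """Build one log row; `actual` stays None until read from the counter."""
    estimate = estimate_tokens(chars)
    off_by = None if actual is None else actual - estimate  # assumed definition
    basis = estimate if actual is None else actual  # prefer the measured count
    return {
        "document": name,
        "chars": chars,
        "estimate": estimate,
        "actual": actual,
        "off_by": off_by,
        "fits": smallest_window(basis),
    }


if __name__ == "__main__":
    # Character counts come from the log table; actual counts are not filled in.
    for doc, chars in [("Doc A", 294), ("Doc B", 2243), ("Doc C", 3113)]:
        print(log_row(doc, chars))
```

Once you have a measured count, pass it as `actual` so that the fit is decided by the measured number instead of the estimate.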

Stretch: Find one document that's too big for a small window. What would you cut to make it fit — the quoted history, the appendix, the boilerplate? Note it in the surprise column.
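For the stretch task, cutting quoted history can be sketched as follows. This assumes the thread marks quoted lines with a leading ">" and introduces them with an "On ... wrote:" line, which is a common email convention but may not match your document. The function name and the sample thread are invented for illustration.

```python
import re

# Assumed attribution line, e.g. "On Tue, Dana wrote:"
ATTRIBUTION = re.compile(r"^On .+ wrote:\s*$")


def strip_quoted_history(thread: str) -> str:
    """Drop '>'-quoted lines and their attribution lines; keep the new text."""
    kept = []
    for line in thread.splitlines():
        stripped = line.strip()
        if stripped.startswith(">"):
            continue  # quoted line from an earlier message
        if ATTRIBUTION.match(stripped):
            continue  # "On ... wrote:" header that introduces a quote
        kept.append(line)
    return "\n".join(kept).strip()


def estimate_tokens(text: str) -> int:
    """Characters ÷ 4, rounded up."""
    return (len(text) + 3) // 4


if __name__ == "__main__":
    # Made-up thread, not one of the lab documents.
    thread = (
        "Yes, Friday works.\n"
        "\n"
        "On Tue, Dana wrote:\n"
        "> Can we ship Friday?\n"
        ">> Earlier: when is the order due?\n"
    )
    trimmed = strip_quoted_history(thread)
    print("before:", estimate_tokens(thread), "after:", estimate_tokens(trimmed))
```

Re-run the trimmed text through the same counter before logging it, since the estimate only tells you roughly how much you saved.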