This started while training a finance team to use AI on real document work. The first job everyone wanted help with was the same: the stack of invoices and receipts that has to be keyed in one by one. You photograph them, hand them to the AI, and yes, it reads almost every character.
But when you take what comes back and actually use it, you end up fixing it anyway. On one invoice the grand total lands in the before-VAT field. On another a 0 becomes an O. Dates come back in a format the accounting software rejects. You end up retyping most of the invoice. So where was the time saved?
That's when it clicks that we were measuring the wrong thing. The right question isn't "can the AI read the invoice?" It's "does this give me data I can import straight into my accounting software, with the numbers matching the source?" This piece walks through it in order: why "readable" isn't enough, how to extract data that's genuinely import-ready, and how to put it to work on real accounting jobs.
Part 1: Why "readable" isn't enough
The older tools we call OCR did one thing: turn characters in an image into text. You got a block of text back, and a person still had to work out which piece was the document number and which was the total, then type each into the right field.
Today's AI goes further. It can tell you this number is the before-VAT amount and that one is the tax ID. But that very ability lulls you into trusting it too much, because where it slips isn't where it can't read. It's where it reads something wrong, convincingly.
The three common slips
- Look-alike characters. 0 versus O, 1 versus l, a 1,050 that becomes 1,850. Your eye slides right past it, but in the system it's a wrong number.
- Right value, wrong field. The grand total lands in the before-VAT field, or the invoice date and due date get swapped. The number isn't wrong, but its meaning is.
- Format the software won't accept. A date comes back as 01/07/26 when the system wants 2026-07-01, or a tax ID is missing a digit. It can't be imported, so you fix it by hand.
All three share one thing: they don't make the AI look dumb, they make it look smart enough that you stop checking. So the goal of this job isn't "get the AI to read." It's data you can import as-is and trace back to the source.
Part 2: Extracting data that's actually import-ready
The approach that works splits into three steps, each answering one of the slips above: extract into fields, shape to match, then check before import.
Step 1: Extract into fields, not text
Don't ask the AI "what does this invoice say," or you get a block of text to sort out yourself. Tell it exactly which fields you want (document number, date, seller tax ID, before-VAT amount, VAT, and total) and have it fill each one. The rule that matters: if a field isn't found, leave it blank, never guess. A blank a human can see and fill is far safer than a number the AI invented.
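As a minimal sketch of "fields, not text," the target can be written down as a small record where every field starts out blank. The class name, field names, and the `from_extraction` helper are assumptions made for illustration; your real list comes from what your accounting software's import asks for.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class InvoiceFields:
    # Every field defaults to None: "not found" stays blank, never guessed.
    document_number: Optional[str] = None
    invoice_date: Optional[str] = None
    seller_tax_id: Optional[str] = None
    before_vat: Optional[str] = None
    vat: Optional[str] = None
    total: Optional[str] = None


def from_extraction(raw: dict) -> InvoiceFields:
    """Keep only the known fields and treat empty strings as missing."""
    known = {f.name for f in fields(InvoiceFields)}
    cleaned = {}
    for key, value in raw.items():
        if key in known and value is not None and str(value).strip() != "":
            cleaned[key] = str(value).strip()
    return InvoiceFields(**cleaned)


def missing_fields(inv: InvoiceFields) -> list[str]:
    """List the blanks a human needs to look at."""
    return [f.name for f in fields(inv) if getattr(inv, f.name) is None]
```

Anything the AI returns outside the agreed field list is dropped, and anything it could not find shows up in `missing_fields` instead of being filled with a plausible-looking value.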
Step 2: Shape it to match what the software accepts
Correct values aren't enough yet. They have to be shaped for import. Normalize every date to one format across the batch, make tax IDs a full 13 digits, keep before-VAT and VAT in separate fields, and round the way the system rounds. This step turns "correct data" into import-ready data, which is a different thing.
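The shaping rules can be sketched as three small functions. This assumes source dates are day-first with a two-digit year (so 01/07/26 means 1 July 2026), that the target system wants ISO dates, and that it rounds half-up to two decimals. All three are assumptions to confirm against your own software, and the function names are hypothetical.

```python
import re
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP


def normalize_date(text: str) -> str:
    """Convert DD/MM/YY to YYYY-MM-DD; raises ValueError on any other shape."""
    return datetime.strptime(text.strip(), "%d/%m/%y").strftime("%Y-%m-%d")


def normalize_tax_id(text: str) -> str:
    """Strip separators and insist on exactly 13 digits."""
    digits = re.sub(r"\D", "", text)
    if len(digits) != 13:
        raise ValueError(f"tax ID must have 13 digits, got {len(digits)}")
    return digits


def normalize_amount(text: str) -> Decimal:
    """Parse an amount like '1,050.00' and round half-up to two decimals.

    A look-alike character such as the letter O makes Decimal raise
    decimal.InvalidOperation, so the bad value is never silently accepted.
    """
    value = Decimal(text.replace(",", "").strip())
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Each function raises rather than repairs, so a value that cannot be shaped goes to a person instead of into the import file.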
Step 3: Check before importing (vouch)
This is what separates the method from just handing invoices to an AI and importing whatever comes out. Before import, every invoice passes a check:
- Check the numbers add up on their own. Before-VAT plus VAT should equal the total. If it doesn't, a field was extracted wrong.
- Match the key fields to the source image. Document number and total, at least, should match what's actually on the invoice.
- Surface anything the AI wasn't sure about. Where a stamp or handwriting sits over a number, flag it, don't let it slide through silently.
The heart of this check is a single rule: numbers that don't add up get raised to a human, not imported first and fixed later. Once a bad row is in the system, hunting for it among hundreds costs far more than catching it at the door.
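A minimal sketch of that gate, assuming each invoice is a dict with `Decimal` amounts and an optional `low_confidence` list of field names reported by whatever extractor you use. Both the dict shape and that key are hypothetical. Matching fields against the source image still needs a person or a second pass, so it is not shown here.

```python
from decimal import Decimal


def check_invoice(inv: dict) -> list[str]:
    """Return the reasons this invoice needs a human; an empty list means it passes."""
    reasons = []
    amounts = [inv.get("before_vat"), inv.get("vat"), inv.get("total")]
    if any(a is None for a in amounts):
        reasons.append("missing amount field")
    else:
        before_vat, vat, total = amounts
        if before_vat + vat != total:
            reasons.append(f"{before_vat} + {vat} != {total}")
    # Hypothetical key: fields the extractor itself marked as uncertain.
    for name in inv.get("low_confidence", []):
        reasons.append(f"low confidence: {name}")
    return reasons


def route(invoices: list[dict]) -> tuple[list[dict], list[tuple[dict, list[str]]]]:
    """Split a batch into rows safe to import and rows raised to a human."""
    importable, review = [], []
    for inv in invoices:
        reasons = check_invoice(inv)
        if reasons:
            review.append((inv, reasons))
        else:
            importable.append(inv)
    return importable, review
```

An invoice whose grand total landed in the before-VAT field (1070.00 + 70.00 against a total of 1070.00) fails the sum and ends up in the review pile with the reason attached.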
Anyone can follow these three steps by hand. What we put the work into is the tooling that makes all three fast and repeatable across hundreds of documents, built on one principle: the interface stays fixed while the extraction engine inside can be swapped. However much the AI models change, the numeric check keeps doing its job.
Part 3: Putting it to work on real accounting jobs
Where this helps
- High-volume purchase invoices that get keyed in every month-end
- Expense receipts that need extracting and categorizing in one pass
- Bills with stamps or handwriting, exactly where the AI should flag low confidence and route to a human
- Three-way matching of purchase order, goods receipt, and invoice before you set up payment
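For the three-way matching case, the comparison itself is simple once all three documents are in fields. The sketch below compares a single line item using strict equality. The dict keys are hypothetical, and real purchase-to-pay systems often allow a tolerance on quantity or price, which this does not model.

```python
from decimal import Decimal


def three_way_match(po: dict, receipt: dict, invoice: dict) -> list[str]:
    """Compare one line across purchase order, goods receipt, and invoice.

    Returns the mismatches found; an empty list means the line is clear for payment setup.
    """
    mismatches = []
    if invoice["po_number"] != po["po_number"]:
        mismatches.append("invoice references a different PO")
    if receipt["quantity"] != po["quantity"]:
        mismatches.append("received quantity differs from ordered quantity")
    if invoice["quantity"] != receipt["quantity"]:
        mismatches.append("invoiced quantity differs from received quantity")
    if invoice["unit_price"] != po["unit_price"]:
        mismatches.append("invoiced unit price differs from PO price")
    return mismatches
```

As with the sum check, any mismatch goes to a person before payment is set up rather than being corrected afterwards.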
The one rule to remember
If you take one thing from this article, take this: the goal isn't for AI to read the bill, it's a row your accounting software accepts as-is and you can trace back to the source. Everything else is just detail on how to make that rule real.
Where to start
Don't build automation on day one. Try it by hand on ten invoices first.
- Open the accounting software you use and note which fields its import asks for. That list is your schema.
- Have the AI extract invoices into exactly those fields. Anything it can't find, leave blank.
- Set one simple check to begin with: does before-VAT plus VAT equal the total?
- Set aside any invoice that fails the check for a human to look at; import the rest.
- Count how many of the ten needed a manual fix. That number tells you whether it's worth scaling.
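The ten-invoice trial above can be tallied with a short script once the extracted rows are in hand. This is a sketch under stated assumptions: the four-column schema is hypothetical (use the list from your own import screen), amounts arrive as plain strings without thousands separators, and the CSV text stands in for whatever import file your software expects.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

FIELDS = ["document_number", "before_vat", "vat", "total"]  # hypothetical schema


def run_trial(rows: list[dict]) -> tuple[str, list[dict]]:
    """Return (CSV text of passing rows, rows set aside for a human)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    set_aside = []
    for row in rows:
        try:
            before_vat, vat, total = (
                Decimal(row[k]) for k in ("before_vat", "vat", "total")
            )
        except (KeyError, TypeError, InvalidOperation):
            # Blank, missing, or unparseable amount: a human looks at it.
            set_aside.append(row)
            continue
        if before_vat + vat == total:
            writer.writerow({k: row.get(k, "") for k in FIELDS})
        else:
            set_aside.append(row)
    return out.getvalue(), set_aside
```

Dividing `len(set_aside)` by the number of rows gives the share that failed the check automatically. Add to it any passing rows you still had to correct after comparing them with the source, and you have the manual-fix count the last step asks for.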
Once you can see which pile sails through and which needs eyes, you'll know how to scale it to a whole month. We teach this to in-house finance teams hands-on from the first week; details are on the AI training for accounting teams page.
- From Statement and Books to a Reconciliation You Can Check: let AI match transactions, then classify what didn't match
- Filing financial statements online with DBD e-Filing: the steps after closing the books, from preparing the file to a successful submission
- Written from real work: a training program for in-house finance teams using Claude Cowork to extract documents and check line items (productize.life/services-en)
- Three-way matching is a standard internal control in the purchase-to-pay cycle.
- Tax invoice and e-Tax invoice formats reference Thailand's Revenue Department e-Tax Invoice system (etax.rd.go.th).