Document parsing that actually works.

OCR, tables, handwriting — one endpoint. Benchmarked against 20+ models so you don't have to.

500 free calls/month. No credit card.



Input a scan. Get structured data.

Layout detection finds regions first. Then a vision-language model reads each one separately. Not a single OCR pass over the whole page.

Input: scanned invoice (PDF)
INVOICE #2847
Acme Corp → Client Ltd.
March 1, 2026
------------------------------
  Item          Qty   Price
  API calls     10k   $49.00
  Priority      1     $29.00
  Support       1     $19.00
------------------------------
  Total              $97.00
Payment due within 14 days.
Bank: PL61 1090 1014 0000 0712 1981 2874
Handwritten note: "Paid 3/14 -- thx!"

Output: structured markdown + metadata
# Invoice #2847
Acme Corp → Client Ltd.
March 1, 2026
| Item | Qty | Price |
| --- | --- | --- |
| API calls | 10k | $49.00 |
| Priority | 1 | $29.00 |
| Support | 1 | $19.00 |
| Total | | $97.00 |
Payment due within 14 days.
Bank: PL61 1090 1014 0000 0712 1981 2874
Handwritten: "Paid 3/14 -- thx!"
regions: 4 (title, table, text, handwriting)
confidence: 0.94 – 0.99
time: 1240ms

Two-stage pipeline.

Most OCR tools run one model over the entire page. We run two. First, a layout model detects what's where. Then a vision-language model reads each region separately. Tables stay tables. Handwriting stays handwriting.

01 Upload
POST a PDF, PNG, JPG, TIFF, or HEIC. Up to 20 MB per file.

02 Detect regions
Layout model finds text blocks, tables, formulas, figures, handwriting. Each gets a bounding box.

03 Read and return
VLM reads each region with the right context. You get Markdown + JSON with confidence scores.
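
The three steps above can be sketched as a two-stage loop. This is only an illustration of the idea, not the service's actual implementation: `Region`, `detect_regions`, `read_region`, and `parse_page` are hypothetical names, the bounding-box convention is assumed, and both models are replaced by stubs that return canned values.

```python
from dataclasses import dataclass


@dataclass
class Region:
    kind: str                          # e.g. "title", "table", "text", "handwriting"
    bbox: tuple[int, int, int, int]    # (x0, y0, x1, y1) in page pixels (assumed convention)


def detect_regions(page: bytes) -> list[Region]:
    """Stage 1 stub: a real layout model would return the boxes it finds."""
    return [
        Region("table", (40, 120, 560, 300)),
        Region("title", (40, 30, 560, 70)),
        Region("handwriting", (40, 340, 400, 380)),
    ]


def read_region(page: bytes, region: Region) -> str:
    """Stage 2 stub: a real VLM would read the crop, prompted by region kind."""
    prompts = {
        "table": "Transcribe as a Markdown table.",
        "handwriting": "Transcribe the handwriting verbatim.",
    }
    prompt = prompts.get(region.kind, "Transcribe the text.")
    return f"[{region.kind} @ {region.bbox}] {prompt}"


def parse_page(page: bytes) -> list[str]:
    # Top-to-bottom, then left-to-right: a simple stand-in for reading order.
    regions = sorted(detect_regions(page), key=lambda r: (r.bbox[1], r.bbox[0]))
    return [read_region(page, r) for r in regions]
```

The point of the split is that each region is read with instructions suited to its type, instead of one generic pass over the whole page.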

One endpoint.

curl -X POST https://api.hardparse.com/v1/parse \
  -H "Authorization: Bearer hp_your_key" \
  -F "file=@invoice.pdf"
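
The same call can be made from Python with only the standard library. The URL, the bearer header, and the `file` field name come from the curl command above; the helper names `build_parse_request` and `parse_file` are illustrative, and because the response format is not documented here, the sketch returns the raw response body without interpreting it.

```python
import urllib.request
import uuid

API_URL = "https://api.hardparse.com/v1/parse"  # taken from the curl example


def build_parse_request(data: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one form field named "file"."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        API_URL,
        data=head + data + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def parse_file(path: str, api_key: str) -> bytes:
    """Upload a local file and return the raw response body (format not assumed)."""
    with open(path, "rb") as f:
        req = build_parse_request(f.read(), path.rsplit("/", 1)[-1], api_key)
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Calling `parse_file("invoice.pdf", "hp_your_key")` would send the request; splitting out `build_parse_request` keeps the request construction checkable without network access.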

What it handles.

Scanned documents

Phone photos, faxes, 300 dpi scans. The kind of files your users actually upload.

Tables

Rows, columns, cells. Output is a Markdown table you can parse or render.
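
As a rough sketch of what "parse" can mean here, the snippet below turns a pipe-delimited Markdown table into a list of row dictionaries. The sample reuses the invoice example from earlier on this page; the exact Markdown the service emits is an assumption, `parse_markdown_table` is a hypothetical helper, and escaped pipes inside cells are not handled.

```python
SAMPLE_TABLE = """
| Item | Qty | Price |
| --- | --- | --- |
| API calls | 10k | $49.00 |
| Priority | 1 | $29.00 |
| Support | 1 | $19.00 |
| Total | | $97.00 |
"""


def parse_markdown_table(md: str) -> list[dict[str, str]]:
    """Convert a simple pipe-delimited Markdown table into a list of row dicts."""
    rows = []
    for line in md.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        # Drop only the outermost pipes so empty edge cells survive.
        inner = line.removeprefix("|").removesuffix("|")
        cells = [c.strip() for c in inner.split("|")]
        # Skip the header separator row, e.g. | --- | :-: |
        if all(c and set(c) <= set("-:") for c in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, r)) for r in body]


rows = parse_markdown_table(SAMPLE_TABLE)
```

Each entry in `rows` maps column names to cell text, so `rows[0]["Price"]` is the string `"$49.00"`; converting to numbers is left to the caller.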

Handwriting

Notes, filled-in forms, margin annotations. No templates.

Math

LaTeX output. Fractions, integrals, matrices.

Bounding boxes

Every region: type, coordinates, confidence score. Reading order preserved.
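
One way to use per-region confidence is to flag low-scoring regions for human review. The JSON schema is not shown on this page, so the field names `regions`, `type`, `bbox`, and `confidence` below are assumptions that mirror "type, coordinates, confidence score", and the sample values are made up for illustration.

```python
import json

# Hypothetical response fragment; field names and values are illustrative only.
SAMPLE = json.loads("""
{"regions": [
  {"type": "title",       "bbox": [40, 30, 560, 70],   "confidence": 0.99},
  {"type": "table",       "bbox": [40, 120, 560, 300], "confidence": 0.97},
  {"type": "handwriting", "bbox": [40, 340, 400, 380], "confidence": 0.81}
]}
""")


def needs_review(regions: list[dict], threshold: float = 0.9) -> list[dict]:
    """Return the regions whose confidence falls below the threshold."""
    return [r for r in regions if r["confidence"] < threshold]


flagged = needs_review(SAMPLE["regions"])
for r in flagged:
    print(f'{r["type"]} at {r["bbox"]}: confidence {r["confidence"]:.2f}')
```

The threshold of 0.9 is arbitrary; a sensible value depends on how costly a misread is for your documents.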

11 languages

Including EN, DE, FR, ES, PL, ZH, JA, and KO. Same endpoint.


Pricing

Pro
$49/mo
10,000 calls/month.
  • Everything in Free
  • Priority queue
  • Webhooks
  • Email support

Get your API key

500 free calls/month. No credit card.
