A free, keyless OCR API backed by open-weight VLMs. Markdown or compilable LaTeX out. Three lines of Python or TypeScript — no signup, no key, no BS. The first of many endpoints on CodeSOTA.
# pip install hardparse
from hardparse import parse
md = parse("invoice.pdf")
print(md) # → Markdown with tables, formulas, layout
Most OCR tools flatten documents into sloppy text. Hardparse preserves structure — so the Markdown you get is the Markdown you'd write.
Tables become real Markdown tables. Formulas become LaTeX. Handwriting becomes readable text. Column order preserved.
Markdown for RAG and LLM pipelines. A compilable standalone .tex for papers. Same API, swap one flag.
Open-weight models. Benchmarked publicly on CodeSOTA. No black box — you can audit, self-host, or fork.
Ship without waiting for procurement. The API is anonymous, rate-limited by IP, and the SDKs carry zero dependencies. Email hi@hardparse.com for higher limits.
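If you do hit the per-IP limit, back off and retry. A minimal sketch in plain Python — the assumption that the API signals limits with HTTP 429, the `retry_with_backoff` helper, and the `fake_parse` stand-in are ours, not part of the SDK:

```python
import time

def retry_with_backoff(fn, max_attempts=4, base_delay=1.0):
    """Retry fn() with exponential backoff; re-raise after max_attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RuntimeError:  # stand-in for a rate-limit error
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Simulated parse call that fails twice, then succeeds
calls = {"n": 0}
def fake_parse():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "# Invoice\n| item | qty |"

md = retry_with_backoff(fake_parse, base_delay=0.01)
print(md.splitlines()[0])  # → # Invoice
```

Swap `fake_parse` for a real `parse("invoice.pdf")` call and the same wrapper applies.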
# pip install hardparse
from hardparse import parse, parse_latex
# Markdown out
md = parse("invoice.pdf")
# Or LaTeX — compilable standalone document
tex = parse_latex("paper.pdf")
with open("out.tex", "w") as f:
    f.write(tex)
# $ xelatex out.tex → out.pdf
// npm i @codesota/ocr
import { parse, parseLatex } from "@codesota/ocr";
import { writeFile } from "node:fs/promises";
// Markdown out
const md = await parse(file); // File | Blob | path
// Or LaTeX — compilable standalone document
const tex = await parseLatex("paper.pdf");
await writeFile("out.tex", tex);
// $ xelatex out.tex → out.pdf
# Markdown out (default)
curl -F "file=@invoice.pdf" \
https://hardparse.com/v1/parse
# LaTeX out — compilable standalone .tex
curl -F "file=@paper.pdf" \
"https://hardparse.com/v1/parse?format=latex"
Zero deps. Runs in Node 18+, browsers, Python 3.9+. Source on GitHub.
Same pipeline, same models. No install. Supports PDF, images, and scans up to 200 MB.
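The 200 MB cap can be checked client-side before uploading anything. A small sketch — the `MAX_BYTES` constant and `check_upload_size` helper are illustrative, not part of the SDK:

```python
import os

MAX_BYTES = 200 * 1024 * 1024  # 200 MB upload cap

def check_upload_size(path: str) -> int:
    """Return the file size in bytes, or raise if it exceeds the cap."""
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        raise ValueError(f"{path}: {size} bytes exceeds the 200 MB limit")
    return size

# Demo with a tiny stand-in file
with open("sample.pdf", "wb") as f:
    f.write(b"%PDF-1.7 demo")
print(check_upload_size("sample.pdf"))  # → 13
```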
Three stages, one request. We detect the layout, recognize the content, and reconstruct it as structured Markdown.
A document layout model finds tables, formulas, text blocks, figures, and their reading order — even across multi-column PDFs.
Each region goes to a vision-language model tuned for its type — table → OTSL, formula → LaTeX, text → plain. Concurrent on GPU.
Results merge back in the original reading order. You get a single Markdown string plus per-page structure for downstream LLMs or RAG.
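The three stages above can be sketched as plain functions. The region types and stub recognizers below are illustrative only — the real pipeline runs vision-language models concurrently on GPU:

```python
def detect_layout(page):
    """Stage 1: find typed regions in reading order (stubbed)."""
    return [
        {"type": "text", "content": "Quarterly results"},
        {"type": "table", "content": [["item", "qty"], ["widgets", "42"]]},
        {"type": "formula", "content": "E = mc^2"},
    ]

def recognize(region):
    """Stage 2: route each region to a type-specific recognizer."""
    if region["type"] == "table":
        rows = region["content"]
        lines = [rows[0]] + [["---"] * len(rows[0])] + rows[1:]
        return "\n".join("| " + " | ".join(r) + " |" for r in lines)
    if region["type"] == "formula":
        return f"${region['content']}$"
    return region["content"]

def parse_page(page):
    """Stage 3: merge results back in the original reading order."""
    return "\n\n".join(recognize(r) for r in detect_layout(page))

print(parse_page(None))
```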
Compared to the alternatives: vendor APIs charge per page and give you less structure. Open models match or beat them at a fraction of the cost.
| Service | Price | Quality |
|---|---|---|
| Hardparse ● live | $19/mo flat · 100/day free | 🟢 GPU VLM |
| Google Document AI | $0.10–1.50/page | 🟢 strong |
| Azure Doc Intelligence | $0.50–10/1K pages | 🟢 strong |
| Tesseract | free | 🔴 breaks on tables |
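Back-of-envelope math on the table's numbers — not a quote, just where flat pricing starts to win against the low end of per-page pricing:

```python
flat_monthly = 19.00  # Hardparse flat rate
per_page = 0.10       # low end of the per-page range above

# Pages per month at which flat pricing breaks even
break_even = flat_monthly / per_page
print(break_even)  # → 190.0

# A 2,000-page/month RAG ingest at per-page rates:
print(2000 * per_page)  # → 200.0, vs. 19.00 flat
```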
Our OCR model selection isn't vibes. CodeSOTA tracks 164+ models across 97 real-world benchmarks — with reproducible scores, cost, and licensing. Dig in before you trust us.
Browse the benchmarks
Yes. 5 pages/month free, no credit card. After that it's $19/mo flat for unlimited. I'll raise that once I have real traction.
Yes — free, no key. POST /v1/parse or use the Python / TypeScript SDK. 100 requests/day per IP. See the API section above.
Not yet. If you need on-prem for compliance reasons, email me — I'll figure something out.
I needed proper OCR to feed my own RAG pipeline — something that preserved tables, formulas, and layout instead of flattening everything into sloppy text. Existing vendors either lost structure or got expensive fast. So I built this. Now you can use it too.
One endpoint per AI task. Backed by benchmarks you can audit. TTS, STT, and segmentation are next.