Most OCR gives up on smudged handwriting, crumpled receipts, and columns that don't sit still. Hardparse retries the hard parts with more context until the output is clean — and shows you the reasoning it used to get there.
Every token gets a confidence score. Anything under 90% triggers a second pass with a widened context window — the surrounding lines, document type, domain lexicon, and visual similarity to training glyphs. If it still won't resolve, we flag it for human-in-the-loop with the top three candidates and the reasoning behind them. You see every step of the argument, not just the verdict.
Stream tokens as they resolve, or get the final structured JSON. Every response includes the agent trace and per-token confidences. Retries are transparent — pay only for tokens emitted, not for retries.
# Parse a tough doc, stream tokens as they resolve curl https://api.hardparse.ai/v1/parse \ -H "Authorization: Bearer $HARDPARSE_KEY" \ -F file=@consult-note.heic \ -F schema="medical.note" \ -F retry="aggressive" \ -F stream=true > event: token > data: {"t":"worsening","conf":0.96,"pass":2} > event: resolved > data: {"id":"t2","from":"wkening","to":"worsening","delta":+26} > event: complete > data: {"conf":0.978,"retries":3,"ms":1842}
Most docs parse cleanly on the first try. Hardparse exists for the rest — the ones that make your confidence interval wince. 5,000 free pages, no credit card.