One OCR'd page of a PDF.
confidence is Tesseract's mean per-word confidence for this page, 0-100. Low confidence (<60) typically indicates a rough render; try increasing scale or providing a better-matched lang.
Character count of text.
Tesseract's mean confidence for this page, 0-100.
True when OCR returned no text (blank page or render failure).
1-indexed page number.
OCR'd text content.
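
The fields described above could be modeled roughly as follows. This is a minimal sketch, not the library's actual type: the class name `OcrPage`, the attribute names `page`, `char_count`, `empty`, and `low_confidence`, and the choice to treat whitespace-only text as empty are all assumptions made for illustration. Only the 0-100 confidence range and the <60 threshold come from the description above.

```python
from dataclasses import dataclass

# From the description: confidence below 60 typically indicates a rough render.
LOW_CONFIDENCE_THRESHOLD = 60.0


@dataclass(frozen=True)
class OcrPage:
    """Hypothetical stand-in for one OCR'd page of a PDF (names are assumed)."""

    page: int          # 1-indexed page number
    text: str          # OCR'd text content
    confidence: float  # Tesseract's mean per-word confidence, 0-100

    @property
    def char_count(self) -> int:
        """Character count of text."""
        return len(self.text)

    @property
    def empty(self) -> bool:
        """True when OCR returned no text (blank page or render failure).

        Assumption: whitespace-only output is also treated as empty.
        """
        return not self.text.strip()

    @property
    def low_confidence(self) -> bool:
        """True when the page may benefit from a higher scale or a different lang."""
        return self.confidence < LOW_CONFIDENCE_THRESHOLD


if __name__ == "__main__":
    result = OcrPage(page=1, text="hello", confidence=42.0)
    if result.low_confidence and not result.empty:
        print(f"page {result.page}: consider re-running with a larger scale")
```

A caller might check `empty` first, since a blank page and a failed render both return no text, and only then use `low_confidence` to decide whether a re-run is worthwhile.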