One page of extracted PDF text.
is_empty = true indicates either a blank page or, more commonly, an image-only page that needs OCR. This flag drives the has_text aggregate on the envelope.
Character count of text (post-extraction, post-trim).
True when no text could be extracted (likely needs OCR).
1-indexed page number.
Extracted text content (may be empty for image-only pages).
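The field descriptions above can be sketched as a small Python model. This is a minimal illustration, not the actual schema: the names page_number, char_count, PageText, and make_page are hypothetical (only is_empty, text, and has_text appear in the source), and the rule that has_text is true when at least one page is non-empty is an assumption about how the aggregate is derived.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageText:
    """One page of extracted PDF text (hypothetical model)."""

    page_number: int  # hypothetical field name; 1-indexed
    text: str         # extracted text, may be empty for image-only pages

    @property
    def char_count(self) -> int:
        # Hypothetical field name: character count of the trimmed text.
        return len(self.text)

    @property
    def is_empty(self) -> bool:
        # True when no text could be extracted (page likely needs OCR).
        return self.char_count == 0


def make_page(page_number: int, raw_text: str) -> PageText:
    # Trim after extraction so whitespace-only pages count as empty.
    return PageText(page_number=page_number, text=raw_text.strip())


def has_text(pages: List[PageText]) -> bool:
    # Assumed aggregate: true when at least one page is non-empty.
    return any(not page.is_empty for page in pages)
```

Under these assumptions, a document whose pages all report is_empty = true yields has_text = false, which a caller could treat as a signal to route the file to OCR.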