Function readPdf

readPdf(vault: Vault, args: ReadPdfArgs): Promise<ReadPdfResult>
Extract text from a PDF page-by-page, with optional page-range slicing and metadata.

Image-only (scanned) PDFs return has_text: false; agents should detect this and route the file through ocrPdf for OCR. pdfjs-dist, an optional dependency, is loaded lazily, so markdown-only users pay no cost. Out-of-range values in the pages argument are clamped rather than causing an error, matching Array.prototype.slice semantics.
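The clamping behavior can be pictured with a small sketch. clampPageRange is a hypothetical helper, not part of the documented API, and it assumes pages is a 1-based, inclusive [start, end] pair, as the example on this page suggests. The actual implementation may differ.

```typescript
// Hypothetical helper for illustration only; not part of the documented API.
// Assumes `pages` is a 1-based, inclusive [start, end] pair.
function clampPageRange(
  pages: [number, number],
  totalPageCount: number
): [number, number] | null {
  const [start, end] = pages;
  // Pull both ends back inside the document instead of throwing.
  const first = Math.max(1, Math.floor(start));
  const last = Math.min(totalPageCount, Math.floor(end));
  // A range that ends up empty (start past the last page) yields null.
  return first <= last ? [first, last] : null;
}

console.log(clampPageRange([1, 5], 3));  // value: [1, 3]
console.log(clampPageRange([0, 2], 10)); // value: [1, 2]
console.log(clampPageRange([8, 12], 5)); // value: null
```

Under this reading, a caller can request a generous range without first checking the document length, then compare the result against total_page_count to see how much was actually returned.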
Parameters
- vault: Vault
  The vault that contains the PDF.
- args: ReadPdfArgs
  A ReadPdfArgs object. path is required; pages and include_metadata are optional.
Returns Promise<ReadPdfResult>
A ReadPdfResult with per-page text, the joined full text, metadata, and the original total_page_count (the page count before any pages slicing).
Throws
If path is empty, the file is missing or excluded, or pdfjs-dist is not installed.

Throws
If path resolves outside the vault.
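Lazy loading of an optional dependency, together with the "not installed" error, could look roughly like the sketch below. lazyOptional is a hypothetical wrapper written for this illustration. The loader is injected so the sketch runs without pdfjs-dist present, and the code in tools/media.ts may be organized differently.

```typescript
// Hypothetical wrapper for illustration; not taken from tools/media.ts.
// Returns a getter that runs `loader` on first use and caches the result.
function lazyOptional<T>(
  name: string,
  loader: () => Promise<T>
): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = loader().catch((err: unknown) => {
        // Drop the failed attempt so a later call can retry.
        cached = undefined;
        const reason = err instanceof Error ? err.message : String(err);
        throw new Error(`Optional dependency "${name}" is not installed: ${reason}`);
      });
    }
    return cached;
  };
}

// Assumed usage (commented out so the sketch compiles without the package):
// const getPdfjs = lazyOptional("pdfjs-dist", () => import("pdfjs-dist"));
```

With a wrapper like this, nothing is imported until the first PDF read, which is the property that keeps markdown-only usage free of the dependency.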
Example
```
// Read pages 1-5 of a long paper
const r = await readPdf(vault, {
  path: "Papers/2024-rag-survey.pdf",
  pages: [1, 5],
  include_metadata: true
});
if (!r.has_text) console.log("Scanned PDF — try ocrPdf()");
console.log(r.metadata?.title, r.full_text.slice(0, 200));
```
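The has_text check in the example above can be turned into a fallback routine. The sketch below is a minimal illustration: TextResult is a cut-down local shape rather than the real ReadPdfResult, and the read and ocr callbacks are stand-ins that would wrap readPdf and ocrPdf in real use. The signature and return type of ocrPdf are not documented on this page, so they are assumed here.

```typescript
// Minimal local shape for the sketch; the real ReadPdfResult has more fields.
interface TextResult {
  has_text: boolean;
  full_text: string;
}

// `read` and `ocr` are injected so the sketch runs without a vault.
// In real use they would wrap readPdf and ocrPdf (ocrPdf's signature is assumed).
async function extractText(
  path: string,
  read: (path: string) => Promise<TextResult>,
  ocr: (path: string) => Promise<string>
): Promise<string> {
  const result = await read(path);
  // Image-only PDFs report has_text: false, so fall back to OCR.
  return result.has_text ? result.full_text : ocr(path);
}
```

Keeping the two readers as parameters also makes the routing easy to test with stubs, without touching real PDF files.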
- Defined in tools/media.ts:538