enquire-mcp API reference - v3.9.0-rc.4

    Function readPdf

    • Extract text from a PDF page-by-page, with optional page-range slicing and metadata.

      Image-only or scanned PDFs return has_text: false; agents should detect this and route the file through ocrPdf for OCR. pdfjs-dist (an optional dependency) is lazy-loaded, so markdown-only users pay no cost. Out-of-range values in the pages argument are clamped rather than rejected with an error (matching Array.prototype.slice semantics).
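To make the clamping behaviour concrete, here is a minimal sketch of how a page range could be clamped to the pages that exist. It is not the library's implementation: clampPageRange is a hypothetical helper, and it assumes pages is a 1-based, inclusive [start, end] pair, as the usage example on this page suggests.

```typescript
// Hypothetical helper, not part of enquire-mcp. Assumes a 1-based,
// inclusive [start, end] page range.
function clampPageRange(
  range: [number, number],
  totalPages: number
): [number, number] | null {
  // Pull the start up to the first page and the end down to the last page.
  const start = Math.max(1, Math.trunc(range[0]));
  const end = Math.min(totalPages, Math.trunc(range[1]));
  // A range that lies entirely outside the document selects no pages.
  return start <= end ? [start, end] : null;
}

// Requesting pages 1-5 of a 3-page document yields pages 1-3.
const clamped = clampPageRange([1, 5], 3);
console.log(clamped);
```

Returning null for an empty selection is a design choice of this sketch; like Array.prototype.slice, a caller could equally treat it as an empty page list instead of an error.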

      Parameters

      Returns Promise<ReadPdfResult>

      A ReadPdfResult with per-page text, the joined full text, metadata, and the original total_page_count.

      Throws if path is empty, the file is missing or excluded, or pdfjs-dist is not installed.

      Throws if path resolves outside the vault.

      // Read pages 1-5 of a long paper
      const r = await readPdf(vault, {
        path: "Papers/2024-rag-survey.pdf",
        pages: [1, 5],
        include_metadata: true
      });
      if (!r.has_text) console.log("Scanned PDF — try ocrPdf()");
      console.log(r.metadata?.title, r.full_text.slice(0, 200));