PDFReader

Overview

The PDFReader class reads a PDF file and converts each page into a Document object. It is part of the Euler Graph Database and extends the BaseReader class.
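To illustrate the one-Document-per-page idea, here is a minimal sketch. It is not Euler's implementation: the `Document` and `BaseReader` classes are simplified stand-ins, `PageTextReader` is a hypothetical reader, and the `"page"` metadata key is an assumption. The input is a list of already-extracted page strings, so no PDF library is needed to run it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Stand-in for Euler's Document: page text plus a metadata dict."""
    text: str
    metadata: Dict = field(default_factory=dict)


class BaseReader(ABC):
    """Stand-in for Euler's BaseReader: subclasses implement load()."""

    @abstractmethod
    def load(self, *args, **kwargs) -> List[Document]:
        ...


class PageTextReader(BaseReader):
    """Hypothetical reader that wraps each page string in its own Document."""

    def load(self, pages: List[str]) -> List[Document]:
        documents = []
        # Number pages from 1 and record the number in the metadata (assumed key).
        for page_number, text in enumerate(pages, start=1):
            documents.append(Document(text=text, metadata={"page": page_number}))
        return documents


docs = PageTextReader().load(["First page text", "Second page text"])
for doc in docs:
    print(doc.metadata["page"], doc.text)
```

A real PDF reader would replace the list of strings with text extracted from each page of the file; the per-page loop would stay the same.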

Arguments

  • file (Path): Path to the PDF file to load.
  • extra_info (Optional[Dict]): Additional metadata to include with each document.
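Because extra_info is included with each document, one plausible approach is to copy it into every document's metadata rather than share a single dict. The sketch below shows that idea under stated assumptions: `merge_metadata` is a hypothetical helper, not part of Euler, and the `"page"` key and the rule that extra_info wins on key clashes are illustrative choices.

```python
from typing import Dict, Optional


def merge_metadata(page_metadata: Dict, extra_info: Optional[Dict] = None) -> Dict:
    """Return a new dict combining per-page metadata with caller-supplied extra_info."""
    merged = dict(page_metadata)   # copy so the input dict is not mutated
    if extra_info:
        merged.update(extra_info)  # assumption: extra_info wins on key clashes
    return merged


extra = {"author": "John Doe"}
page_one = merge_metadata({"page": 1}, extra)
page_two = merge_metadata({"page": 2}, extra)

# Changing one page's metadata leaves the other page and the original dict untouched.
page_one["reviewed"] = True
print(page_one)
print(page_two)
print(extra)
```

Copying per document avoids the case where editing the metadata of one page silently changes every other page.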

Example Usage

Here's an example demonstrating how to use the PDFReader class:

from pathlib import Path
from euler.reader.pdf_reader import PDFReader

# Initialize the PDFReader
reader = PDFReader()

# Load the documents from the PDF file with additional metadata
documents = reader.load(Path("example.pdf"), extra_info={"author": "John Doe"})

# Print the loaded documents
for doc in documents:
    print(doc.text)
    print(doc.metadata)
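The example above assumes that example.pdf exists. How PDFReader behaves when the file is missing is not documented here, so a caller may prefer to check the path first. The helper below is a hypothetical sketch of that check; `load_pdf_if_present` is not part of Euler, and it only assumes that the reader exposes `load(file, extra_info=...)` as shown in the example.

```python
from pathlib import Path
from typing import Any, Dict, List, Optional


def load_pdf_if_present(reader: Any, file: Path,
                        extra_info: Optional[Dict] = None) -> List[Any]:
    """Hypothetical helper: call reader.load() only when `file` is an existing file."""
    if not file.is_file():
        print(f"Skipping {file}: file not found")
        return []
    return reader.load(file, extra_info=extra_info)
```

With a PDFReader instance this would be called as `load_pdf_if_present(reader, Path("example.pdf"), {"author": "John Doe"})`, returning an empty list instead of attempting to load a missing file.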