PDFReader¶
Overview¶
The PDFReader class reads a PDF file and converts each page into a Document object. It is part of the Euler Graph Database and extends the BaseReader class.
Arguments¶
file (Path): Path to the PDF file to load.
extra_info (Optional[Dict]): Additional metadata to include with each document.
Example Usage¶
Here's an example demonstrating how to use the PDFReader class:
from pathlib import Path
from euler.reader.pdf_reader import PDFReader

# Initialize the PDFReader
reader = PDFReader()

# Load the documents from the PDF file with additional metadata
documents = reader.load(Path("example.pdf"), extra_info={"author": "John Doe"})

# Print the loaded documents
for doc in documents:
    print(doc.text)
    print(doc.metadata)
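
To make the page-to-Document behaviour described above more concrete, here is a minimal, standard-library-only sketch of how one Document per page might be assembled and how extra_info could be merged into each document's metadata. It is not the Euler implementation: the Document dataclass, the pages_to_documents helper, and the file_name and page_label metadata keys are all hypothetical stand-ins, and the page texts are passed in directly rather than extracted from a real PDF.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional


@dataclass
class Document:
    """Hypothetical stand-in for the library's Document class."""
    text: str
    metadata: Dict = field(default_factory=dict)


def pages_to_documents(
    file: Path,
    page_texts: List[str],
    extra_info: Optional[Dict] = None,
) -> List[Document]:
    """Build one Document per page (illustrative helper, not part of Euler)."""
    documents = []
    for page_number, text in enumerate(page_texts, start=1):
        # Assumed per-page metadata keys; the real reader may use different ones.
        metadata = {"file_name": file.name, "page_label": str(page_number)}
        if extra_info:
            # Caller-supplied metadata is attached to every page's document.
            metadata.update(extra_info)
        documents.append(Document(text=text, metadata=metadata))
    return documents


# Page texts are hard-coded here in place of real PDF text extraction.
docs = pages_to_documents(
    Path("example.pdf"),
    ["First page text", "Second page text"],
    extra_info={"author": "John Doe"},
)
for doc in docs:
    print(doc.text, doc.metadata)
```

In this sketch the same extra_info dictionary is copied into every page's metadata, which matches the description of extra_info as metadata included with each document; whether the actual reader also records page numbers or file names is an assumption.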