ReadXlsx

Overview

The ReadXlsx class reads Excel (.xlsx) files and converts each row into a Document object. It is part of the Euler Graph Database and extends the BaseReader class.

Arguments

  • file_path (str): Path to the Excel file to load.
  • source_column (Optional[str]): The column name to use as the source identifier in the metadata. If None, the file path will be used as the source.
  • sheet_name (Optional[str]): The name of the sheet to load. If None, the first sheet will be used.
  • encoding (Optional[str]): The encoding to use for reading the file. This parameter is not typically needed for Excel files.
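
To make the source_column fallback concrete, here is a minimal sketch of row-to-Document conversion in plain Python. It is not the ReadXlsx implementation: the Document stand-in, the rows_to_documents helper, the "column: value" text layout, and the "source" and "row" metadata keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Sequence


@dataclass
class Document:
    """Hypothetical stand-in for Euler's Document (text plus metadata)."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def rows_to_documents(
    header: Sequence[str],
    rows: Sequence[Sequence[Any]],
    file_path: str,
    source_column: Optional[str] = None,
) -> List[Document]:
    """Turn tabular rows into one Document per row (illustrative only)."""
    if source_column is not None and source_column not in header:
        raise ValueError(f"Source column '{source_column}' not found in header.")

    documents = []
    for row_index, row in enumerate(rows):
        record = dict(zip(header, row))
        # Assumed text layout: one "column: value" line per cell.
        text = "\n".join(f"{name}: {value}" for name, value in record.items())
        # Fall back to the file path when no source column is given.
        source = record[source_column] if source_column is not None else file_path
        documents.append(
            Document(text=text, metadata={"source": source, "row": row_index})
        )
    return documents


# Sample data standing in for the header and rows of a sheet.
header = ["Source", "Title"]
rows = [["wiki", "Graphs"], ["blog", "Readers"]]

with_column = rows_to_documents(header, rows, "example.xlsx", source_column="Source")
without_column = rows_to_documents(header, rows, "example.xlsx")
```

With source_column="Source", each document's assumed "source" entry comes from that row's cell; with source_column left as None, every document carries the file path instead.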

Example Usage

Here's an example demonstrating how to use the ReadXlsx class:

from euler.reader.xlsx_reader import ReadXlsx

# Initialize the ReadXlsx with file path, source column, and sheet name
reader = ReadXlsx(file_path="example.xlsx", source_column="Source", sheet_name="Sheet1")

# Load the documents
documents = reader.load()
for doc in documents:
    print(doc.text)
    print(doc.metadata)