ReadDataFrame

Overview

The ReadDataFrame class reads a pandas DataFrame and converts each row into a Document object. It is part of the Euler Graph Database and extends the BaseReader class.

Arguments

  • dataframe (pd.DataFrame): The pandas DataFrame to read from.
  • source_column (Optional[str]): The column in the DataFrame to use as the source metadata for each document.

Example Usage

Here's an example demonstrating how to use the ReadDataFrame class:

import pandas as pd
from euler.reader.read_dataframe import ReadDataFrame

Initialize the DataFrame

# Example set of data
data = {
    'Name': ['Prashant', 'John'],
    'Age': [28, 34],
    'Source': ['Employee Records', 'HR Database']
}
df = pd.DataFrame(data)

Create the ReadDataFrame Instance

reader = ReadDataFrame(dataframe=df, source_column="Source")

Load the Documents

documents = reader.load()
for doc in documents:
    print(doc.page_content)
    print(doc.metadata)

Functions

check_dataframe

The check_dataframe method validates that the dataframe attribute is a pandas DataFrame. It raises a TypeError if the value is not a DataFrame.
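The type check described above can be sketched as a standalone function. This is a minimal illustration and not the Euler source: the name check_dataframe_sketch and the error message text are assumptions made for this example, and only the behaviour (a TypeError for anything that is not a pandas DataFrame) comes from the description.

```python
import pandas as pd


def check_dataframe_sketch(value):
    """Return value unchanged if it is a pandas DataFrame, else raise TypeError.

    Hypothetical stand-in for the validation described above; the real
    check_dataframe method in Euler may be implemented differently.
    """
    if not isinstance(value, pd.DataFrame):
        raise TypeError(
            f"dataframe must be a pandas DataFrame, got {type(value).__name__}"
        )
    return value


# A real DataFrame passes through unchanged.
df = pd.DataFrame({"Name": ["Prashant", "John"], "Age": [28, 34]})
checked = check_dataframe_sketch(df)

# A plain dict is rejected, even though a DataFrame could be built from it.
try:
    check_dataframe_sketch({"Name": ["Prashant", "John"]})
except TypeError as exc:
    error_message = str(exc)
```

Rejecting near-miss inputs such as dicts or lists up front means the mistake surfaces when the reader is constructed, rather than later during load.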

load

The load method reads the DataFrame and converts each row into a Document object. It returns a list of Document objects. If the DataFrame is empty, it raises a ValueError.
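The row-to-Document conversion can be approximated with the sketch below. Several details are assumptions, since the description does not specify them: SketchDocument and load_sketch are hypothetical names, the page_content layout (one "column: value" line per non-source column) is a guess, and the "source" metadata key is assumed. Only the empty-DataFrame ValueError and the list return value are taken from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional

import pandas as pd


@dataclass
class SketchDocument:
    """Hypothetical stand-in for Euler's Document class."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_sketch(
    dataframe: pd.DataFrame, source_column: Optional[str] = None
) -> List[SketchDocument]:
    """Convert each DataFrame row into a SketchDocument (illustrative only)."""
    if dataframe.empty:
        raise ValueError("DataFrame is empty; there are no rows to convert")

    documents = []
    # to_dict(orient="records") yields one {column: value} dict per row.
    for row in dataframe.to_dict(orient="records"):
        # Assumed layout: every column except the source column becomes
        # one "column: value" line of the page content.
        content = "\n".join(
            f"{column}: {value}"
            for column, value in row.items()
            if column != source_column
        )
        # Assumed metadata key: "source".
        metadata = {}
        if source_column is not None:
            metadata["source"] = row[source_column]
        documents.append(SketchDocument(page_content=content, metadata=metadata))
    return documents


df = pd.DataFrame(
    {
        "Name": ["Prashant", "John"],
        "Age": [28, 34],
        "Source": ["Employee Records", "HR Database"],
    }
)
documents = load_sketch(df, source_column="Source")
```

When calling the real load method on data that may be empty, wrapping the call in a try/except ValueError block lets the caller handle that case explicitly.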

This documentation provides an overview, argument details, and a complete example of how to use the ReadDataFrame class for converting DataFrame rows into Document objects.