ReadDataFrame¶
Overview¶
The ReadDataFrame class reads a pandas DataFrame and converts each row into a Document object. It is part of the Euler Graph Database and extends the BaseReader class.
Arguments¶
dataframe (pd.DataFrame): The pandas DataFrame to read from.
source_column (Optional[str]): The column in the DataFrame to use as the source metadata for each document.
Example Usage¶
Here's an example demonstrating how to use the ReadDataFrame class:
import pandas as pd
from euler.reader.read_dataframe import ReadDataFrame
Initialize the DataFrame¶
# Example set of data
data = {
    'Name': ['Prashant', 'John'],
    'Age': [28, 34],
    'Source': ['Employee Records', 'HR Database']
}
df = pd.DataFrame(data)
Create the ReadDataFrame Instance¶
reader = ReadDataFrame(dataframe=df, source_column="Source")
Load the Documents¶
documents = reader.load()
for doc in documents:
    print(doc.page_content)
    print(doc.metadata)
Functions¶
check_dataframe¶
The check_dataframe method validates that the dataframe attribute is a pandas DataFrame. It raises a TypeError if the value is not a DataFrame.
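The check described above can be sketched roughly as follows. This is a standalone illustration, not Euler's actual code: the function here is a hypothetical free function rather than a method, and the error message wording is an assumption.

```python
import pandas as pd


def check_dataframe(value):
    """Hypothetical stand-in for the validation described above."""
    # Reject anything that is not a pandas DataFrame.
    if not isinstance(value, pd.DataFrame):
        raise TypeError(
            f"dataframe must be a pandas DataFrame, got {type(value).__name__}"
        )
    return value


# Passing a list of dicts instead of a DataFrame triggers the TypeError.
try:
    check_dataframe([{"Name": "John"}])
except TypeError as exc:
    message = str(exc)
    print(message)
```

Failing early with a TypeError keeps a wrongly typed input from surfacing later as a less obvious error while rows are being read.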
load¶
The load method reads the DataFrame and converts each row into a Document object. It returns a list of Document objects. If the DataFrame is empty, it raises a ValueError.
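To make the row-to-document conversion concrete, here is a minimal sketch of what such a load step could look like. Everything in it is an assumption made for illustration: the Document dataclass is a stand-in for Euler's own class, load_rows is a hypothetical helper, and both the "column: value" text layout and the "source" metadata key are guesses rather than the library's documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional

import pandas as pd


@dataclass
class Document:
    """Hypothetical stand-in for Euler's Document class."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_rows(dataframe: pd.DataFrame,
              source_column: Optional[str] = None) -> List[Document]:
    """Convert each row of the DataFrame into one Document (illustrative only)."""
    # Mirror the documented behavior: an empty DataFrame is an error.
    if dataframe.empty:
        raise ValueError("DataFrame is empty")

    documents = []
    for _, row in dataframe.iterrows():
        # Assumed layout: one "column: value" line per non-source column.
        content = "\n".join(
            f"{col}: {row[col]}"
            for col in dataframe.columns
            if col != source_column
        )
        # Assumed metadata key: "source", filled from source_column if given.
        metadata = {"source": row[source_column]} if source_column else {}
        documents.append(Document(page_content=content, metadata=metadata))
    return documents


df = pd.DataFrame({
    "Name": ["Prashant", "John"],
    "Age": [28, 34],
    "Source": ["Employee Records", "HR Database"],
})
docs = load_rows(df, source_column="Source")
for doc in docs:
    print(doc.page_content)
    print(doc.metadata)
```

The sketch produces one Document per row, so the returned list has the same length as the DataFrame. Whether the real class keeps the source column inside page_content is not stated here, so check the output of reader.load() before relying on either layout.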
This documentation provides an overview, argument details, and a complete example of how to use the ReadDataFrame class for converting DataFrame rows into Document objects.