Retrieve data from configured sources as pandas DataFrames
The get_df function retrieves data from a configured source and returns it as a pandas DataFrame. For database sources (PostgreSQL, ClickHouse), a table name must be specified.
Parameters:
source_name (str): Name of the data source as configured in preswald.toml, or a path to a file (CSV, Parquet, and JSON are supported).
table_name (Optional[str]): Required for database sources; specifies which table to retrieve.

Returns:
pd.DataFrame: Data from the specified source as a pandas DataFrame.

Note: connect must be called before get_df can be used.
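The ordering requirement above can be sketched as a small wrapper. This is a minimal illustration rather than part of Preswald: fetch_frame is a hypothetical helper, and connect and get_df are passed in as arguments so the sketch runs without the library installed. In a real app they would presumably be imported from the preswald package.

```python
from typing import Any, Callable, Optional


def fetch_frame(
    connect: Callable[[], Any],
    get_df: Callable[..., Any],
    source_name: str,
    table_name: Optional[str] = None,
) -> Any:
    """Hypothetical helper: call connect() first, then get_df().

    With Preswald installed, the two callables would come from
    `from preswald import connect, get_df` (assumed import path).
    """
    connect()  # must run before get_df can be used
    if table_name is None:
        # File-backed sources: the source name (or file path) is enough.
        return get_df(source_name)
    # Database sources: also pass the table to retrieve.
    return get_df(source_name, table_name)
```

In practice connect usually needs to run only once per script, so a wrapper like this mainly serves to make the call order explicit.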
For CSV sources, table_name is not required, since the entire CSV file is treated as a single table.
For database sources (PostgreSQL, ClickHouse), table_name is required.
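The table_name rule for file and database sources can be checked before calling get_df. The sketch below is only an illustration under stated assumptions: get_df_args and the DATABASE_TYPES labels are hypothetical names chosen for this example, not Preswald API, and Preswald's own error behavior may differ.

```python
from typing import Optional, Tuple

# Hypothetical labels for the database sources named on this page.
DATABASE_TYPES = {"postgres", "clickhouse"}


def get_df_args(
    source_type: str,
    source_name: str,
    table_name: Optional[str] = None,
) -> Tuple[str, ...]:
    """Build the positional arguments for a get_df call.

    Database sources must name a table; file sources (such as CSV)
    are treated as a single table, so only the source name is used.
    """
    if source_type.lower() in DATABASE_TYPES:
        if not table_name:
            raise ValueError(
                f"table_name is required for {source_type} source {source_name!r}"
            )
        return (source_name, table_name)
    # File source: the whole file is one table, so table_name is not needed.
    return (source_name,)
```

The resulting tuple can be unpacked into the call, for example get_df(*get_df_args("postgres", "warehouse", "orders")), where "warehouse" and "orders" are placeholder names.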
query(): For custom SQL queries against data sources