connect
Initialize and connect to configured data sources in preswald.toml
The connect function brings together all the data sources defined in preswald.toml. It initializes connections to the configured data sources and returns a DuckDB connection object that serves as the unified query interface.
Returns
duckdb.DuckDBPyConnection: the same connection object you would get from duckdb.connect(). Note that while you can use this object directly, that is not the supported path; prefer preswald.get_df and preswald.query.
Supported Sources
CSV
CSV sources can be configured with the following parameters:
The CSV source supports:
- Automatic type inference via DuckDB’s read_csv_auto
- Direct SQL querying capabilities
- Full DataFrame conversion
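As a rough illustration of what read_csv_auto does, the sketch below calls DuckDB directly, outside Preswald, and asks it to describe the column types it infers for a CSV file. The helper name infer_csv_schema is hypothetical, and the duckdb Python package is assumed to be installed. In a Preswald app you would normally go through preswald.get_df or preswald.query instead.

```python
def infer_csv_schema(csv_path):
    """Return (column_name, column_type) pairs that DuckDB infers for a CSV file."""
    import duckdb  # third-party dependency, assumed installed: pip install duckdb

    con = duckdb.connect()  # in-memory database, nothing is written to disk
    try:
        # Escape single quotes so the path is a valid SQL string literal.
        quoted_path = csv_path.replace("'", "''")
        rows = con.execute(
            f"DESCRIBE SELECT * FROM read_csv_auto('{quoted_path}')"
        ).fetchall()
    finally:
        con.close()
    # The first two columns of DESCRIBE output are the column name and its type.
    return [(row[0], row[1]) for row in rows]
```

Calling infer_csv_schema with the path of a CSV file returns one pair per column, which is a quick way to check whether the inferred types match what you expect before querying the source.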
Postgres
PostgreSQL sources require the following configuration:
Features:
- Uses DuckDB’s postgres_scanner extension
- Supports schema-qualified table names
- Enables direct SQL querying of PostgreSQL tables
- Provides DataFrame conversion for specific tables
ClickHouse
ClickHouse sources can be configured with these parameters:
Features:
- Utilizes DuckDB’s ClickHouse scanner extension
- Supports both HTTP and HTTPS connections
- Enables direct SQL querying
- Provides DataFrame conversion capabilities
Usage Example
connect only needs to be called once in the app, before any use of preswald.data functions. Here’s an example of how to add connect to your app:
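A minimal sketch of that pattern, assuming the preswald package is installed, that a source named sample_csv is defined in preswald.toml (the name is hypothetical), and that get_df accepts the source name as its first argument. The run_app wrapper is also hypothetical; a real app script may simply run these statements at the top level.

```python
def run_app(source_name="sample_csv"):
    """Connect once, then load one configured source as a DataFrame."""
    # Assumption: preswald exposes connect and get_df at the package level.
    from preswald import connect, get_df

    # Initialize every source listed in preswald.toml. This should happen
    # once, before any other data function is used.
    connect()

    # `source_name` is a hypothetical entry from preswald.toml.
    return get_df(source_name)
```

Keeping the connect call at the very start of the app means every later get_df or query call can rely on the sources already being registered.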
Key Features
- Unified Query Interface: Access all data sources through a single DuckDB connection
- Automatic Extension Loading: Required DuckDB extensions are automatically installed and loaded
- Secure Configuration: Supports separate secrets file for sensitive credentials
- Error Handling: Robust error handling for connection and query operations
- Logging: Comprehensive logging of connection and query operations
Best Practices
- Call connect() once at application startup
- Store credentials in secrets.toml, separate from the main configuration
- Use preswald.query() and preswald.get_df() instead of the direct DuckDB connection
- Check returned source names to verify successful connections
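To make the secrets.toml recommendation concrete, the sketch below shows one way a credentials file can be layered over a main configuration once both have been parsed into dictionaries. It is illustrative only: how Preswald combines the two files internally is not described here, and the merge_config helper, the orders_db source, and all key names are hypothetical.

```python
import copy


def merge_config(base, secrets):
    """Recursively overlay `secrets` onto a copy of `base`; inputs are not modified."""
    merged = copy.deepcopy(base)
    for key, value in secrets.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are tables: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the value from the secrets file wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical parsed contents of preswald.toml (no credentials).
base_config = {
    "data": {
        "orders_db": {"type": "postgres", "host": "localhost", "port": 5432, "user": "app"}
    }
}

# Hypothetical parsed contents of secrets.toml (credentials only).
secret_config = {"data": {"orders_db": {"password": "example-password"}}}

effective = merge_config(base_config, secret_config)
print(effective["data"]["orders_db"])
```

Because the password lives only in the second dictionary, the main configuration can be committed to version control while the secrets file stays out of it.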