# SQL Server

Connect SQL Server for dlt extraction.
VAI Flux supports extracting tables from Microsoft SQL Server into the bronze layer (Parquet in S3) using `dlt_extract`.

**Note:** SQL Server connections currently support SQL Authentication (username/password) only. Windows Authentication and Azure AD are not supported yet.
## Prerequisites
- ODBC Driver: Install Microsoft ODBC Driver 17 or 18 for SQL Server on every machine/container that runs extraction workers.
- Network access: Ensure the SQL Server host/port is reachable from the VAI Flux worker environment.
## Connection Fields
| Field | Required | Default | Notes |
|---|---|---|---|
| `host` | Yes | — | SQL Server hostname or IP |
| `port` | No | `1433` | TCP port |
| `database` | Yes | — | Database name |
| `username` | Yes | — | SQL login username |
| `password` | Yes | — | SQL login password |
| `encrypt` | No | `false` | Enable TLS encryption |
| `trust_server_certificate` | No | `false` | Skip cert validation (use only if needed) |
| `connection_timeout` | No | `30` | Connection timeout (seconds) |
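As an illustration only (not VAI Flux's actual internals), these fields map onto an ODBC connection string roughly as below; the function name and the driver default are assumptions:

```python
def build_odbc_conn_str(host, database, username, password, port=1433,
                        encrypt=False, trust_server_certificate=False,
                        driver="ODBC Driver 18 for SQL Server"):
    """Map the connection fields above onto an ODBC connection string.

    connection_timeout is typically passed separately (e.g. via the
    `timeout` keyword of pyodbc.connect) rather than in the string.
    """
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={host},{port};"
        f"DATABASE={database};"
        f"UID={username};PWD={password};"
        f"Encrypt={'yes' if encrypt else 'no'};"
        f"TrustServerCertificate={'yes' if trust_server_certificate else 'no'}"
    )
```

With `encrypt=true` and a self-signed certificate on the server, `trust_server_certificate` must also be enabled or the TLS handshake will fail.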
## Table Selection
Tables are specified manually using schema.table identifiers.
Examples:

```
dbo.orders
sales.customers
```

If you omit the schema (e.g. `orders`), it defaults to `dbo`.

Bracketed identifiers are supported for special characters:

```
[sales].[order items]
```
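A minimal sketch of how these identifiers can be normalized (illustrative helper, not part of VAI Flux's API): bare names get the `dbo` default, and bracketed parts may contain dots or spaces.

```python
import re

def parse_table(identifier, default_schema="dbo"):
    """Split 'schema.table' into (schema, table).

    Accepts bare names ('orders'), dotted names ('sales.customers'),
    and bracketed parts ('[sales].[order items]').
    """
    m = re.fullmatch(
        r"(?:\[([^\]]+)\]|([^.\[\]]+))"        # schema or bare table
        r"(?:\.(?:\[([^\]]+)\]|([^.\[\]]+)))?",  # optional .table
        identifier,
    )
    if not m:
        raise ValueError(f"invalid identifier: {identifier!r}")
    first = m.group(1) or m.group(2)
    second = m.group(3) or m.group(4)
    if second is None:
        return default_schema, first  # bare name: default schema
    return first, second
```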
## UI: Selecting Tables
When configuring a dlt_extract action in the web UI:
- Select your SQL Server connection
- Use the Tables picker to choose which tables to extract (searchable)
- If discovery fails, you can still type `schema.table` and press Enter
## Incremental Extraction
Incremental extraction is configured globally via:

- `cursor_column` (recommended: `datetime` or `int` columns)
- `start_date` (used as the initial cursor value on first run)
On subsequent runs, dlt resumes from saved state automatically.
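The cursor logic can be sketched as follows. This is an illustration of the behavior described above, not VAI Flux's actual query builder: on the first run the watermark is seeded from `start_date`, afterwards from saved state.

```python
def build_extract_query(table, cursor_column=None, last_value=None,
                        start_date=None):
    """Build a parameterized SELECT that resumes from the watermark.

    last_value is the saved cursor state from the previous run; when it
    is absent (first run), start_date seeds the watermark instead.
    """
    query, params = f"SELECT * FROM {table}", []
    watermark = last_value if last_value is not None else start_date
    if cursor_column and watermark is not None:
        query += f" WHERE {cursor_column} > ?"
        params.append(watermark)
    return query, params
```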
## Example: `dlt_extract` Action
```json
{
  "kind": "dlt_extract",
  "connection": { "kind": "by_slug", "slug": "my-sqlserver" },
  "source": "sqlserver",
  "source_config": {
    "tables": ["dbo.orders", "sales.customers"],
    "cursor_column": "updated_at",
    "start_date": "2026-01-01",
    "full_resync": false
  }
}
```

## Bronze Layer Data Model
Data extracted from SQL Server is organized by connection, not by action:
```
s3://bronze/tenant_id={id}/source=sqlserver/{schema.table}/year=YYYY/month=MM/day=DD/
```

Multiple actions extracting the same table from the same connection write to the same S3 path. This is by design: bronze represents the canonical raw data from each source.
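Following the path template above, the daily partition prefix for one table can be computed like this (illustrative helper, not part of the product API):

```python
from datetime import date

def bronze_path(tenant_id, schema_table, day):
    """Bronze S3 prefix for one table's daily partition.

    Keyed by connection source and table, not by action, so every
    action extracting the same table writes to the same prefix.
    """
    return (f"s3://bronze/tenant_id={tenant_id}/source=sqlserver/"
            f"{schema_table}/year={day:%Y}/month={day:%m}/day={day:%d}/")
```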
Key points:
- Avoid creating conflicting actions (e.g., one with `cursor_column` for incremental, another with `full_resync=true` on the same tables)
- If you need different extraction strategies for the same tables, create separate connections

**Best Practice:** One action per (connection, table-set) combination.
## Stored Procedures
You can extract result sets from read-only stored procedures alongside tables.
**Warning:** Only use this for stored procedures that return data (like parameterized views). SPs that perform INSERT, UPDATE, or other side effects should not be used here.
### Configuration
```json
{
  "kind": "dlt_extract",
  "source": "sqlserver",
  "source_config": {
    "tables": ["dbo.orders"],
    "stored_procedures": [
      {
        "name": "dbo.GetOrderHistory",
        "params": { "start_date": "2026-01-01", "customer_id": 123 },
        "resource_name": "order_history"
      }
    ]
  }
}
```

### Fields
| Field | Required | Description |
|---|---|---|
| `name` | Yes | Full SP name with schema (e.g., `dbo.GetOrderHistory`) |
| `params` | No | Parameters to pass to the SP |
| `resource_name` | Yes | Name for the resource in bronze (e.g., `order_history`) |
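One common way to run such a procedure over ODBC is a parameterized `EXEC` statement. The sketch below shows how `name` and `params` could be assembled into one (an assumption about the mechanism, not a documented VAI Flux internal):

```python
def build_exec(name, params=None):
    """Build a parameterized EXEC statement for a stored procedure.

    Returns the SQL text plus the positional argument list to bind,
    e.g. for cursor.execute(sql, args) with an ODBC driver.
    """
    params = params or {}
    placeholders = ", ".join(f"@{key}=?" for key in params)
    sql = f"EXEC {name}"
    if placeholders:
        sql += " " + placeholders
    return sql, list(params.values())
```

Because the values are bound as parameters rather than interpolated into the SQL text, this shape also avoids injection via `params`.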
### Limitations
- Single result set only: If the SP returns multiple result sets, only the first is captured
- Read-only only: No side-effect validation; use responsibly
- No incremental support: SP results always use `write_disposition='replace'`
- Parameters must be JSON-serializable: Complex types like binary are not supported
### Example: SP-Only Extraction
You can extract from stored procedures without any tables:
```json
{
  "source_config": {
    "stored_procedures": [
      { "name": "dbo.GetDailyStats", "resource_name": "daily_stats" },
      { "name": "reports.CustomerSummary", "params": { "year": 2026 }, "resource_name": "customer_summary" }
    ]
  }
}
```

## Progress Estimation
Before extracting each table, VAI Flux runs `SELECT COUNT(*)` to estimate row counts. This provides:
- Estimated rows: Approximate rows to be extracted (filtered by watermark for incremental)
- Estimated chunks: Based on `chunk_size` (default 100,000 rows per chunk)
This appears in the action run progress for visibility into long-running extractions.
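The chunk estimate follows directly from the row estimate, rounded up to whole chunks (illustrative arithmetic; the function name is an assumption):

```python
import math

def estimate_chunks(estimated_rows, chunk_size=100_000):
    """Chunk count shown in run progress: row estimate divided by
    chunk_size, rounded up to a whole number of chunks."""
    return math.ceil(estimated_rows / chunk_size)
```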
Progress estimation is not available for stored procedures (row count cannot be determined before execution).
## Troubleshooting
- Authentication failed: Verify `username`/`password` and that SQL Authentication is enabled on the server.
- Database not found / access denied: Confirm the `database` name and grant the user permissions.
- Permission denied: The user must be allowed to connect and run SELECT queries.
- TLS handshake errors: Toggle `encrypt` and/or `trust_server_certificate`, and verify the server certificate chain.
- Driver errors: Confirm the ODBC driver is installed and available to the worker runtime.