
Actions

Execute and monitor units of work on your connections

Actions are discrete units of work that execute on connections. They represent the "verbs" of VirtuousAI — the things you want to do with your connected services.

Concept

An action combines:

  • Template — What to do (e.g., dlt_extract, web_search)
  • Connection — Where to connect (credentials)
  • Config — How to do it (parameters)

Examples:

| Template | Connection | Config |
|---|---|---|
| dlt_extract | conn_shopify | { resources: ["orders"], incremental: true } |
| web_search | conn_rest_api | { query: "customer data" } |
| call_agent | — | { kind: "call_agent", agent: "data-analyst", question: "Summarize..." } |

Action Types

Actions fall into categories based on what they do:

| Type | Description | Examples |
|---|---|---|
| Data Syncs | Pull data from external services into VirtuousAI | dlt_extract, file_sync |
| Transformations | Process and transform data | duckdb_transform |
| Integrations | Push data or call external APIs | http_request |
| Research | Search and fetch web content | web_search, fetch_page |
| AI Operations | Agent delegation and AI-powered tasks | call_agent, generate_query, explain_schema |
| Output | Generate deliverables | create_flashboard |
| Control Flow | Workflow routing and orchestration | approval_gate, conditional_branch, call_automation |

ActionKind Reference

Every action has a kind that determines its behavior. Here is the complete reference:

| Kind | Execution | Description |
|---|---|---|
| dlt_extract | Async | Extract data from external sources (Shopify, Klaviyo, etc.) to the bronze layer |
| file_sync | Async | Sync files from remote storage (S3, Google Drive) |
| duckdb_transform | Async | SQL-based data transformation via DuckDB |
| http_request | Sync | Make arbitrary HTTP requests to external APIs |
| web_search | Sync | Search the web via the Tavily API; returns titles, URLs, and snippets |
| fetch_page | Sync | Extract readable content from web page URLs |
| generate_query | Sync | Generate and execute SQL queries against connected data schemas |
| explain_schema | Sync | Explain table structures and column meanings |
| call_agent | Async | Invoke an agent by slug for delegated work |
| agent_search | Sync | Agent-scoped web search (read-only variant) |
| agent_fetch_page | Sync | Agent-scoped page fetch (read-only variant) |
| create_flashboard | Async | Generate a persistent flashboard dashboard |
| approval_gate | Sync | Pause the workflow and wait for human approval |
| conditional_branch | Sync | Evaluate conditions and route execution flow |
| call_automation | Async | Invoke another automation as a sub-workflow |

Action Lifecycle

| Status | Description | Can Transition To |
|---|---|---|
| PENDING | Created, awaiting execution | RUNNING, CANCELLED, AWAITING_APPROVAL |
| AWAITING_APPROVAL | Requires human approval | RUNNING (approved), REJECTED |
| RUNNING | Actively executing | COMPLETED, FAILED, CANCELLED |
| COMPLETED | Finished successfully | — (terminal) |
| FAILED | Encountered an error | — (terminal; can retry) |
| CANCELLED | Manually cancelled | — (terminal) |
| REJECTED | Approval denied | — (terminal) |

Execution Model

VirtuousAI uses two execution modes:

| Mode | When Used | Characteristics |
|---|---|---|
| SYNC | Fast operations (under 30s) | Inline execution, immediate response |
| ASYNC_QUEUE | Long-running operations | SQS + Dramatiq workers, lease-based |

Lease-Based Ownership

For ASYNC_QUEUE operations:

  • Lease Duration: 90 seconds
  • Heartbeat Interval: Every 30 seconds
  • Watchdog Grace: 180 seconds before marking abandoned

If a worker crashes, the watchdog detects the stale lease and marks the run as failed, preventing zombie jobs.
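
The watchdog check can be sketched as follows. The `Run` shape and the in-memory sweep are illustrations, not the platform's implementation; only the 180-second grace window comes from the values above.

```typescript
// Grace window before a silent run is considered abandoned (documented: 180s).
const GRACE_MS = 180_000;

interface Run {
  id: string;
  status: "RUNNING" | "FAILED";
  lastHeartbeatAt: number; // epoch ms of the worker's last heartbeat
}

// A run is abandoned when it is still RUNNING but its worker has not
// heartbeated within the grace window.
function isAbandoned(run: Run, now: number): boolean {
  return run.status === "RUNNING" && now - run.lastHeartbeatAt > GRACE_MS;
}

// The watchdog marks abandoned runs as FAILED so they cannot linger as zombies.
function sweep(runs: Run[], now: number): Run[] {
  return runs.map((r) =>
    isAbandoned(r, now) ? { ...r, status: "FAILED" as const } : r
  );
}
```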

ActionRun Structure

Each execution creates an ActionRun — an immutable record:

{
  "id": "run_xyz789",
  "actionId": "action_abc123",
  "status": "COMPLETED",
  "startedAt": "2026-01-22T14:30:00Z",
  "completedAt": "2026-01-22T14:32:15Z",
  "result": {
    "recordsProcessed": 1250,
    "bytesTransferred": 2100000,
    "tables": ["orders", "order_line_items"]
  },
  "artifacts": [
    { "name": "orders.parquet", "size": 1500000 }
  ]
}

ActionRuns are immutable. Even if an action is deleted, historical runs are preserved for auditing.
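
A TypeScript shape for this record might look like the following. The field names come from the example above; the types themselves are illustrative, not an official SDK definition.

```typescript
// Illustrative types for the ActionRun record shown above.
type RunStatus =
  | "PENDING" | "AWAITING_APPROVAL" | "RUNNING"
  | "COMPLETED" | "FAILED" | "CANCELLED" | "REJECTED";

interface Artifact {
  name: string;
  size: number; // bytes
}

interface ActionRun {
  id: string;
  actionId: string;
  status: RunStatus;
  startedAt: string;    // ISO 8601
  completedAt?: string; // absent while the run is still in flight
  result?: Record<string, unknown>;
  artifacts?: Artifact[];
}

// Terminal states from the lifecycle table: no further transitions occur.
const TERMINAL: RunStatus[] = ["COMPLETED", "FAILED", "CANCELLED", "REJECTED"];
const isTerminal = (run: ActionRun): boolean => TERMINAL.includes(run.status);
```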

Creating Actions

An action is created by supplying a kind, a connection reference (where required), and a kind-specific definition. For example, a Shopify extraction:

{
  "kind": "dlt_extract",
  "connectionRef": { "slug": "shopify" },
  "definition": {
    "source": "shopify",
    "resources": ["orders", "products"],
    "incremental": true,
    "start_date": "2026-01-01"
  }
}

A silver staging transform:

{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "stg_shopify_customers",
    "sql": "SELECT id as customer_id, email, ... FROM {{ bronze_customers }} QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1",
    "inputs": [
      { "bronze_dataset": "bronze.shopify.customers" }
    ],
    "output_layer": "silver",
    "mode": "full_refresh"
  }
}

Delegating work to an agent:

{
  "kind": "call_agent",
  "definition": {
    "kind": "call_agent",
    "agent": "data-analyst",
    "question": "Summarize the key trends in this quarter's sales data"
  }
}
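
Submitting one of these payloads could look like the sketch below. The POST /api/v1/actions path and bearer-token auth are assumptions (only the log-stream endpoint appears on this page); check the OpenAPI reference for the actual request shape.

```typescript
// Hypothetical request builder for creating an action. The endpoint path
// and Authorization header are assumptions, not documented API surface.
interface CreateActionRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

function buildCreateActionRequest(
  baseUrl: string,
  token: string,
  payload: object
): CreateActionRequest {
  return {
    url: `${baseUrl}/api/v1/actions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(payload),
    },
  };
}
```

Usage: `const { url, init } = buildCreateActionRequest(base, token, payload)` followed by `fetch(url, init)`.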

DuckDB Transform Actions

The duckdb_transform action type enables SQL-based data transformations using DuckDB. It's the primary way to move data through the medallion architecture.

Definition Schema

| Field | Type | Required | Description |
|---|---|---|---|
| kind | "duckdb_transform" | Yes | Action type |
| transform_name | string | Yes | Output table name (e.g., stg_shopify_customers) |
| sql | string | Yes | DuckDB SQL with {{ variable }} placeholders |
| inputs | array | Yes | List of input datasets |
| output_layer | "silver" \| "gold" | No | Target layer (default: silver) |
| mode | "full_refresh" \| "incremental_merge" | No | Write mode (default: full_refresh) |
| merge_keys | array | No | Primary keys for incremental merge |

Input References

Inputs can reference bronze or silver datasets:

| Input Type | Format | Resolves To |
|---|---|---|
| Bronze | { "bronze_dataset": "bronze.shopify.customers" } | read_parquet('s3://.../*.parquet') |
| Silver | { "silver_dataset": "stg_shopify_orders" } | delta_scan('s3://...') |

Each {{ variable }} placeholder in your SQL is replaced with the appropriate DuckDB table function.

Write Modes

| Mode | Behavior | Use When |
|---|---|---|
| full_refresh | Replaces the entire table | Small tables, schema changes |
| incremental_merge | MERGE on merge_keys | Large tables, append-heavy workloads |

Example: Silver Staging Table

{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "stg_shopify_customers",
    "sql": "SELECT id as customer_id, email, first_name, last_name, CAST(total_spent AS DECIMAL(18,2)) as total_spent, created_at as created_at_utc FROM {{ bronze_customers }} QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1",
    "inputs": [{ "bronze_dataset": "bronze.shopify.customers" }],
    "output_layer": "silver",
    "mode": "full_refresh"
  }
}

Example: Gold Fact Table

{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "fct_order_lines",
    "sql": "SELECT li.line_item_id, o.order_id, o.customer_id, li.quantity, li.unit_price FROM {{ stg_shopify_order_line_items }} li JOIN {{ stg_shopify_orders }} o ON li.order_id = o.order_id",
    "inputs": [
      { "silver_dataset": "stg_shopify_order_line_items" },
      { "silver_dataset": "stg_shopify_orders" }
    ],
    "output_layer": "gold",
    "mode": "full_refresh"
  }
}

Learn more about the bronze/silver/gold architecture in Data Pipeline Concepts.

Connection Resolution

Actions resolve connections flexibly:

| Reference Type | Example | Resolution |
|---|---|---|
| By ID | { "id": "conn_abc123" } | Exact match |
| By Slug | { "slug": "shopify" } | LLM-friendly name lookup |
| By Provider | { "provider": "shopify" } | Uses the org's default for that provider |
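
The resolution rules can be sketched as a lookup. The in-memory store below is an illustration of the three reference types, not the platform's implementation.

```typescript
// Sketch of connection-reference resolution, mirroring the rules above.
interface Connection {
  id: string;
  slug: string;
  provider: string;
  isDefault: boolean; // org-level default for its provider
}

type ConnectionRef =
  | { id: string }
  | { slug: string }
  | { provider: string };

function resolveConnection(
  ref: ConnectionRef,
  connections: Connection[]
): Connection | undefined {
  if ("id" in ref) return connections.find((c) => c.id === ref.id);
  if ("slug" in ref) return connections.find((c) => c.slug === ref.slug);
  // By provider: fall back to the org's default connection for that provider.
  return connections.find((c) => c.provider === ref.provider && c.isDefault);
}
```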

Progress Tracking

For data extraction actions, the system provides detailed progress tracking with per-resource and per-slice visibility.

Progress Schema (v2)

Long-running extractions report progress in a structured format:

{
  "schema_version": 2,
  "resources_order": ["profiles", "events", "lists"],
  "completed_resources": ["profiles"],
  "failed_resources": [],
  "in_progress_resource": "events",
  "resource_cursors": {
    "events": {
      "slices_completed": 127,
      "slices_total": 384
    }
  }
}

| Field | Description |
|---|---|
| schema_version | Always 2 for new runs |
| resources_order | Ordered list of resources to extract |
| completed_resources | Resources that finished successfully |
| failed_resources | Resources that encountered errors |
| in_progress_resource | Currently extracting resource (or null) |
| resource_cursors | Slice-level progress for large resources |

Calculating Global Progress

To compute overall percentage:

// progress: a v2 progress object as shown above
const doneCount =
  progress.completed_resources.length + progress.failed_resources.length
const cursor = progress.in_progress_resource
  ? progress.resource_cursors[progress.in_progress_resource]
  : undefined
const activePercent = cursor?.slices_total
  ? cursor.slices_completed / cursor.slices_total
  : 0
const totalResources = progress.resources_order.length
const globalPercent = ((doneCount + activePercent) / totalResources) * 100

Partial Success

Extractions can complete with partial success when some resources succeed and others fail:

| Status | Condition | Action |
|---|---|---|
| COMPLETED | All resources succeeded | None needed |
| COMPLETED (partial) | Some succeeded, some failed | Review failed_resources |
| FAILED | All resources failed | Check error details, retry |

Partial success allows you to use successfully extracted data while investigating failures in specific resources.
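
Distinguishing the three outcomes from the v2 progress fields is a one-liner; a minimal sketch:

```typescript
// Classify a finished extraction using the v2 progress fields above.
function classifyOutcome(progress: {
  completed_resources: string[];
  failed_resources: string[];
}): "completed" | "partial" | "failed" {
  if (progress.failed_resources.length === 0) return "completed";
  if (progress.completed_resources.length === 0) return "failed";
  return "partial"; // some resources succeeded, some failed
}
```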

Streaming Execution

For long-running actions, subscribe to real-time progress via Server-Sent Events:

GET /api/v1/action-runs/{run_id}/logs/stream
Accept: text/event-stream

Events include:

  • extraction_started — Extraction beginning with resource list
  • resource_started — Individual resource extraction starting
  • resource_completed — Resource finished with row/file counts
  • slice_completed — Progress update for large resources
  • run_completed — Final success with summary
  • run_failed — Error details
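
A client might fold these events into a display state as sketched below. Only the event names are documented here; the payload fields (slices_completed, slices_total) are assumptions based on the progress schema above.

```typescript
// Build the documented stream URL for a run.
function streamUrl(baseUrl: string, runId: string): string {
  return `${baseUrl}/api/v1/action-runs/${runId}/logs/stream`;
}

// Assumed event payloads; verify against the real stream before relying on them.
type StreamEvent =
  | { type: "slice_completed"; slices_completed: number; slices_total: number }
  | { type: "run_completed" }
  | { type: "run_failed"; message?: string };

// Fold stream events into a simple progress/done state for display.
function applyEvent(
  state: { percent: number; done: boolean },
  ev: StreamEvent
): { percent: number; done: boolean } {
  switch (ev.type) {
    case "slice_completed":
      return {
        percent: (ev.slices_completed / ev.slices_total) * 100,
        done: false,
      };
    case "run_completed":
      return { percent: 100, done: true };
    case "run_failed":
      return { ...state, done: true };
  }
}
```

In a browser, wire this up with `new EventSource(streamUrl(base, runId))`, which sends the Accept: text/event-stream header automatically.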

Error Handling

When actions fail, the run includes detailed error information:

{
  "status": "FAILED",
  "error": {
    "code": "CONNECTION_ERROR",
    "message": "Failed to connect to Shopify API",
    "details": {
      "statusCode": 401,
      "shopifyError": "Invalid API key"
    },
    "retryable": true
  }
}

Error Codes

| Code | Description | Retryable |
|---|---|---|
| CONNECTION_ERROR | Failed to connect to external service | Usually yes |
| AUTH_ERROR | Authentication/authorization failed | No (fix credentials) |
| RATE_LIMITED | External API rate limit hit | Yes (with backoff) |
| DATA_ERROR | Invalid or corrupt data | No (fix source data) |
| TIMEOUT | Execution exceeded time limit | Sometimes |
| INTERNAL_ERROR | Unexpected system error | Yes |

Retry Behavior

Retryable errors are automatically retried with exponential backoff. Non-retryable errors require manual intervention.
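
The same policy can be applied client-side when driving actions yourself; a minimal sketch (the base delay, cap, and attempt count are illustrative, not the platform's actual schedule):

```typescript
// Sketch: retry with exponential backoff, but only for retryable errors.
interface ActionError {
  code: string;
  retryable: boolean;
}

// Delay doubles each attempt, capped; values here are illustrative.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(
  run: () => Promise<T>,
  maxAttempts = 5
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      const e = err as ActionError;
      // Non-retryable errors (AUTH_ERROR, DATA_ERROR) surface immediately.
      if (!e.retryable || attempt + 1 >= maxAttempts) throw err;
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt)));
    }
  }
}
```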

Rate Limiting

Data extraction actions use pre-emptive rate limiting to avoid hitting vendor API limits:

| Source | Strategy | Details |
|---|---|---|
| Amazon SP-API | 65s intervals | Reports API has strict 1/min sustained limits |
| Klaviyo | Per-endpoint buckets | 0.1s for most endpoints, 0.02s for events |

This prevents 429 errors by spacing requests to stay under vendor limits. See Data Sources Guide for details.
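
Pre-emptive spacing amounts to enforcing a minimum interval between requests. A minimal sketch, with the interval taken from the table above (the gate itself is illustrative):

```typescript
// Sketch: enforce a minimum interval between requests to one endpoint bucket.
class IntervalGate {
  private nextAllowedAt = 0; // epoch ms when the next request may be sent

  constructor(private intervalMs: number) {}

  // Returns how long the caller should wait before sending, and books the slot.
  reserve(nowMs: number): number {
    const waitMs = Math.max(0, this.nextAllowedAt - nowMs);
    this.nextAllowedAt = nowMs + waitMs + this.intervalMs;
    return waitMs;
  }
}

// e.g. Amazon SP-API reports: one request every 65 seconds
const spApiReports = new IntervalGate(65_000);
```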

Resource Ordering

When extracting multiple resources, VirtuousAI uses size-optimized ordering by default:

| Behavior | When |
|---|---|
| Size-optimized (XS→XL) | No resources specified, or defaults used |
| User-specified order | Resources explicitly listed in the definition |

This ensures fast resources complete first, providing partial results quickly. See Data Sources Guide for details.

Best Practices

  1. Use incremental syncs — When possible, sync only new/changed data to reduce execution time
  2. Set appropriate timeouts — Configure timeouts based on expected data volume
  3. Monitor runs — Set up alerts for failed actions, especially in automations
  4. Test with small datasets — Validate action configuration before running on full data
  5. Use streaming for long runs — Subscribe to SSE for real-time progress on lengthy operations

OpenAPI Reference

For detailed endpoint schemas, request/response formats, and authentication, see the OpenAPI specification.
