
Actions

Execute and monitor units of work on your connections

Actions are discrete units of work that execute on connections. They represent the "verbs" of VirtuousAI — the things you want to do with your connected services.

Concept

An action combines:

  • Template — What to do (e.g., dlt_extract, web_search)
  • Connection — Where to connect (credentials)
  • Config — How to do it (parameters)

Examples:

| Template | Connection | Config |
|---|---|---|
| dlt_extract | conn_shopify | { resources: ["orders"], incremental: true } |
| web_search | conn_rest_api | { query: "customer data" } |
| call_agent | — | { kind: "call_agent", agent: "data-analyst", question: "Summarize..." } |

Action Types

Actions fall into categories based on what they do:

| Type | Description | Examples |
|---|---|---|
| Data Syncs | Pull data from external services into VirtuousAI | dlt_extract, file_sync |
| Transformations | Process and transform data | duckdb_transform |
| Integrations | Push data or call external APIs | http_request |
| Research | Search and fetch web content | web_search, fetch_page |
| AI Operations | Agent delegation and AI-powered tasks | call_agent, generate_query, explain_schema |
| Output | Generate deliverables | create_flashboard |
| Control Flow | Workflow routing and orchestration | approval_gate, conditional_branch, call_automation |

ActionKind Reference

Every action has a kind that determines its behavior. Here is the complete reference:

| Kind | Execution | Description |
|---|---|---|
| dlt_extract | Async | Extract data from external sources (Shopify, Klaviyo, etc.) to the bronze layer |
| file_sync | Async | Sync files from remote storage (S3, Google Drive) |
| duckdb_transform | Async | SQL-based data transformation via DuckDB |
| http_request | Sync | Make arbitrary HTTP requests to external APIs |
| web_search | Sync | Search the web via the Tavily API; returns titles, URLs, and snippets |
| fetch_page | Sync | Extract readable content from web page URLs |
| generate_query | Sync | Generate and execute SQL queries against connected data schemas |
| explain_schema | Sync | Explain table structures and column meanings |
| call_agent | Async | Invoke an agent by slug for delegated work |
| agent_search | Sync | Agent-scoped web search (read-only variant) |
| agent_fetch_page | Sync | Agent-scoped page fetch (read-only variant) |
| create_flashboard | Async | Generate a persistent flashboard dashboard |
| approval_gate | Sync | Pause the workflow and wait for human approval |
| conditional_branch | Sync | Evaluate conditions and route execution flow |
| call_automation | Async | Invoke another automation as a sub-workflow |

Action Lifecycle

| Status | Description | Can Transition To |
|---|---|---|
| PENDING | Created, awaiting execution | RUNNING, CANCELLED, AWAITING_APPROVAL |
| AWAITING_APPROVAL | Requires human approval | RUNNING (approved), REJECTED |
| RUNNING | Actively executing | COMPLETED, FAILED, CANCELLED |
| COMPLETED | Finished successfully | — (terminal) |
| FAILED | Encountered an error | — (terminal; can retry) |
| CANCELLED | Manually cancelled | — (terminal) |
| REJECTED | Approval denied | — (terminal) |

Execution Model

VirtuousAI uses two execution modes:

| Mode | When Used | Characteristics |
|---|---|---|
| SYNC | Fast operations (under 30s) | Inline execution, immediate response |
| ASYNC_QUEUE | Long-running operations | SQS + Dramatiq workers, lease-based |

Lease-Based Ownership

For ASYNC_QUEUE operations:

  • Lease Duration: 90 seconds
  • Heartbeat Interval: Every 30 seconds
  • Watchdog Grace: 180 seconds before marking abandoned

If a worker crashes, the watchdog detects the stale lease and marks the run as failed, preventing zombie jobs.
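
The watchdog check can be sketched as follows. The `Run` shape and the in-memory sweep are illustrations, not the platform's implementation; only the 180-second grace window comes from the values above.

```typescript
// Grace window before a silent run is considered abandoned (documented: 180s).
const GRACE_MS = 180_000;

interface Run {
  id: string;
  status: "RUNNING" | "FAILED";
  lastHeartbeatAt: number; // epoch ms of the worker's last heartbeat
}

// A run is abandoned when it is still RUNNING but its worker has not
// heartbeated within the grace window.
function isAbandoned(run: Run, now: number): boolean {
  return run.status === "RUNNING" && now - run.lastHeartbeatAt > GRACE_MS;
}

// The watchdog marks abandoned runs as FAILED so they cannot linger as zombies.
function sweep(runs: Run[], now: number): Run[] {
  return runs.map((r) =>
    isAbandoned(r, now) ? { ...r, status: "FAILED" as const } : r
  );
}
```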

ActionRun Structure

Each execution creates an ActionRun — an immutable record:

{
  "id": "run_xyz789",
  "actionId": "action_abc123",
  "status": "COMPLETED",
  "startedAt": "2026-01-22T14:30:00Z",
  "completedAt": "2026-01-22T14:32:15Z",
  "result": {
    "recordsProcessed": 1250,
    "bytesTransferred": 2100000,
    "tables": ["orders", "order_line_items"]
  },
  "artifacts": [
    { "name": "orders.parquet", "size": 1500000 }
  ]
}

ActionRuns are immutable. Even if an action is deleted, historical runs are preserved for auditing.
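
A TypeScript shape for this record might look like the following. The field names come from the example above; the types themselves are illustrative, not an official SDK definition.

```typescript
// Illustrative types for the ActionRun record shown above.
type RunStatus =
  | "PENDING" | "AWAITING_APPROVAL" | "RUNNING"
  | "COMPLETED" | "FAILED" | "CANCELLED" | "REJECTED";

interface Artifact {
  name: string;
  size: number; // bytes
}

interface ActionRun {
  id: string;
  actionId: string;
  status: RunStatus;
  startedAt: string;    // ISO 8601
  completedAt?: string; // absent while the run is still in flight
  result?: Record<string, unknown>;
  artifacts?: Artifact[];
}

// Terminal states from the lifecycle table: no further transitions occur.
const TERMINAL: RunStatus[] = ["COMPLETED", "FAILED", "CANCELLED", "REJECTED"];
const isTerminal = (run: ActionRun): boolean => TERMINAL.includes(run.status);
```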

Creating Actions

An action is created by supplying a kind, a connection reference (where required), and a kind-specific definition. For example, a Shopify extraction:

{
  "kind": "dlt_extract",
  "connectionRef": { "slug": "shopify" },
  "definition": {
    "source": "shopify",
    "resources": ["orders", "products"],
    "incremental": true,
    "start_date": "2026-01-01"
  }
}

A silver staging transform:

{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "stg_shopify_customers",
    "sql": "SELECT id as customer_id, email, ... FROM {{ bronze_customers }} QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1",
    "inputs": [
      { "bronze_dataset": "bronze.shopify.customers" }
    ],
    "output_layer": "silver",
    "mode": "full_refresh"
  }
}

Delegating work to an agent:

{
  "kind": "call_agent",
  "definition": {
    "kind": "call_agent",
    "agent": "data-analyst",
    "question": "Summarize the key trends in this quarter's sales data"
  }
}
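
Submitting one of these payloads could look like the sketch below. The POST /api/v1/actions path and bearer-token auth are assumptions (only the log-stream endpoint appears on this page); check the OpenAPI reference for the actual request shape.

```typescript
// Hypothetical request builder for creating an action. The endpoint path
// and Authorization header are assumptions, not documented API surface.
interface CreateActionRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

function buildCreateActionRequest(
  baseUrl: string,
  token: string,
  payload: object
): CreateActionRequest {
  return {
    url: `${baseUrl}/api/v1/actions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(payload),
    },
  };
}
```

Usage: `const { url, init } = buildCreateActionRequest(base, token, payload)` followed by `fetch(url, init)`.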

DuckDB Transform Actions

The duckdb_transform action type enables SQL-based data transformations using DuckDB. It's the primary way to move data through the medallion architecture.

Definition Schema

| Field | Type | Required | Description |
|---|---|---|---|
| kind | "duckdb_transform" | Yes | Action type |
| transform_name | string | Yes | Output table name (e.g., stg_shopify_customers) |
| sql | string | Yes | DuckDB SQL with {{ variable }} placeholders |
| inputs | array | Yes | List of input datasets |
| output_layer | "silver" \| "gold" | No | Target layer (default: silver) |
| mode | "full_refresh" \| "incremental_merge" | No | Write mode (default: full_refresh) |
| merge_keys | array | No | Primary keys for incremental merge |

Input References

Inputs can reference bronze or silver datasets:

| Input Type | Format | Resolves To |
|---|---|---|
| Bronze | { "bronze_dataset": "bronze.shopify.customers" } | read_parquet('s3://.../*.parquet') |
| Silver | { "silver_dataset": "stg_shopify_orders" } | delta_scan('s3://...') |

Each {{ variable }} placeholder in your SQL is replaced with the appropriate DuckDB table function.

Write Modes

| Mode | Behavior | Use When |
|---|---|---|
| full_refresh | Replaces the entire table | Small tables, schema changes |
| incremental_merge | MERGE on merge_keys | Large tables, append-heavy workloads |

Example: Silver Staging Table

{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "stg_shopify_customers",
    "sql": "SELECT id as customer_id, email, first_name, last_name, CAST(total_spent AS DECIMAL(18,2)) as total_spent, created_at as created_at_utc FROM {{ bronze_customers }} QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1",
    "inputs": [{ "bronze_dataset": "bronze.shopify.customers" }],
    "output_layer": "silver",
    "mode": "full_refresh"
  }
}

Example: Gold Fact Table

{
  "kind": "duckdb_transform",
  "definition": {
    "kind": "duckdb_transform",
    "transform_name": "fct_order_lines",
    "sql": "SELECT li.line_item_id, o.order_id, o.customer_id, li.quantity, li.unit_price FROM {{ stg_shopify_order_line_items }} li JOIN {{ stg_shopify_orders }} o ON li.order_id = o.order_id",
    "inputs": [
      { "silver_dataset": "stg_shopify_order_line_items" },
      { "silver_dataset": "stg_shopify_orders" }
    ],
    "output_layer": "gold",
    "mode": "full_refresh"
  }
}

Learn more about the bronze/silver/gold architecture in Data Pipeline Concepts.

Connection Resolution

Actions resolve connections flexibly:

| Reference Type | Example | Resolution |
|---|---|---|
| By ID | { "id": "conn_abc123" } | Exact match |
| By Slug | { "slug": "shopify" } | LLM-friendly name lookup |
| By Provider | { "provider": "shopify" } | Uses the org's default for that provider |
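
The resolution rules can be sketched as a lookup. The in-memory store below is an illustration of the three reference types, not the platform's implementation.

```typescript
// Sketch of connection-reference resolution, mirroring the rules above.
interface Connection {
  id: string;
  slug: string;
  provider: string;
  isDefault: boolean; // org-level default for its provider
}

type ConnectionRef =
  | { id: string }
  | { slug: string }
  | { provider: string };

function resolveConnection(
  ref: ConnectionRef,
  connections: Connection[]
): Connection | undefined {
  if ("id" in ref) return connections.find((c) => c.id === ref.id);
  if ("slug" in ref) return connections.find((c) => c.slug === ref.slug);
  // By provider: fall back to the org's default connection for that provider.
  return connections.find((c) => c.provider === ref.provider && c.isDefault);
}
```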

Progress Tracking

For data extraction actions, the system provides detailed progress tracking with per-resource and per-slice visibility.

Progress Schema (v2)

Long-running extractions report progress in a structured format:

{
  "schema_version": 2,
  "resources_order": ["profiles", "events", "lists"],
  "completed_resources": ["profiles"],
  "failed_resources": [],
  "in_progress_resource": "events",
  "resource_cursors": {
    "events": {
      "slices_completed": 127,
      "slices_total": 384
    }
  }
}

| Field | Description |
|---|---|
| schema_version | Always 2 for new runs |
| resources_order | Ordered list of resources to extract |
| completed_resources | Resources that finished successfully |
| failed_resources | Resources that encountered errors |
| in_progress_resource | Currently extracting resource (or null) |
| resource_cursors | Slice-level progress for large resources |

Calculating Global Progress

To compute overall percentage:

// progress: a v2 progress object as shown above
const doneCount =
  progress.completed_resources.length + progress.failed_resources.length
const cursor = progress.in_progress_resource
  ? progress.resource_cursors[progress.in_progress_resource]
  : undefined
const activePercent = cursor?.slices_total
  ? cursor.slices_completed / cursor.slices_total
  : 0
const totalResources = progress.resources_order.length
const globalPercent = ((doneCount + activePercent) / totalResources) * 100

Partial Success

Extractions can complete with partial success when some resources succeed and others fail:

| Status | Condition | Action |
|---|---|---|
| COMPLETED | All resources succeeded | None needed |
| COMPLETED (partial) | Some succeeded, some failed | Review failed_resources |
| FAILED | All resources failed | Check error details, retry |

Partial success allows you to use successfully extracted data while investigating failures in specific resources.
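
Distinguishing the three outcomes from the v2 progress fields is a one-liner; a minimal sketch:

```typescript
// Classify a finished extraction using the v2 progress fields above.
function classifyOutcome(progress: {
  completed_resources: string[];
  failed_resources: string[];
}): "completed" | "partial" | "failed" {
  if (progress.failed_resources.length === 0) return "completed";
  if (progress.completed_resources.length === 0) return "failed";
  return "partial"; // some resources succeeded, some failed
}
```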

Streaming Execution

For long-running actions, subscribe to real-time progress via Server-Sent Events:

GET /api/v1/action-runs/{run_id}/logs/stream
Accept: text/event-stream

Events include:

  • extraction_started — Extraction beginning with resource list
  • resource_started — Individual resource extraction starting
  • resource_completed — Resource finished with row/file counts
  • slice_completed — Progress update for large resources
  • run_completed — Final success with summary
  • run_failed — Error details
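
A client might fold these events into a display state as sketched below. Only the event names are documented here; the payload fields (slices_completed, slices_total) are assumptions based on the progress schema above.

```typescript
// Build the documented stream URL for a run.
function streamUrl(baseUrl: string, runId: string): string {
  return `${baseUrl}/api/v1/action-runs/${runId}/logs/stream`;
}

// Assumed event payloads; verify against the real stream before relying on them.
type StreamEvent =
  | { type: "slice_completed"; slices_completed: number; slices_total: number }
  | { type: "run_completed" }
  | { type: "run_failed"; message?: string };

// Fold stream events into a simple progress/done state for display.
function applyEvent(
  state: { percent: number; done: boolean },
  ev: StreamEvent
): { percent: number; done: boolean } {
  switch (ev.type) {
    case "slice_completed":
      return {
        percent: (ev.slices_completed / ev.slices_total) * 100,
        done: false,
      };
    case "run_completed":
      return { percent: 100, done: true };
    case "run_failed":
      return { ...state, done: true };
  }
}
```

In a browser, wire this up with `new EventSource(streamUrl(base, runId))`, which sends the Accept: text/event-stream header automatically.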

Error Handling

When actions fail, the run includes detailed error information:

{
  "status": "FAILED",
  "error": {
    "code": "CONNECTION_ERROR",
    "message": "Failed to connect to Shopify API",
    "details": {
      "statusCode": 401,
      "shopifyError": "Invalid API key"
    },
    "retryable": true
  }
}

Error Codes

| Code | Description | Retryable |
|---|---|---|
| CONNECTION_ERROR | Failed to connect to external service | Usually yes |
| AUTH_ERROR | Authentication/authorization failed | No (fix credentials) |
| RATE_LIMITED | External API rate limit hit | Yes (with backoff) |
| DATA_ERROR | Invalid or corrupt data | No (fix source data) |
| TIMEOUT | Execution exceeded time limit | Sometimes |
| INTERNAL_ERROR | Unexpected system error | Yes |

Retry Behavior

Retryable errors are automatically retried with exponential backoff. Non-retryable errors require manual intervention.
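
The same policy can be applied client-side when driving actions yourself; a minimal sketch (the base delay, cap, and attempt count are illustrative, not the platform's actual schedule):

```typescript
// Sketch: retry with exponential backoff, but only for retryable errors.
interface ActionError {
  code: string;
  retryable: boolean;
}

// Delay doubles each attempt, capped; values here are illustrative.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(
  run: () => Promise<T>,
  maxAttempts = 5
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      const e = err as ActionError;
      // Non-retryable errors (AUTH_ERROR, DATA_ERROR) surface immediately.
      if (!e.retryable || attempt + 1 >= maxAttempts) throw err;
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt)));
    }
  }
}
```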

Rate Limiting

Data extraction actions use pre-emptive rate limiting to avoid hitting vendor API limits:

| Source | Strategy | Details |
|---|---|---|
| Amazon SP-API | 65s intervals | Reports API has strict 1/min sustained limits |
| Klaviyo | Per-endpoint buckets | 0.1s for most endpoints, 0.02s for events |

This prevents 429 errors by spacing requests to stay under vendor limits. See Data Sources Guide for details.
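
Pre-emptive spacing amounts to enforcing a minimum interval between requests. A minimal sketch, with the interval taken from the table above (the gate itself is illustrative):

```typescript
// Sketch: enforce a minimum interval between requests to one endpoint bucket.
class IntervalGate {
  private nextAllowedAt = 0; // epoch ms when the next request may be sent

  constructor(private intervalMs: number) {}

  // Returns how long the caller should wait before sending, and books the slot.
  reserve(nowMs: number): number {
    const waitMs = Math.max(0, this.nextAllowedAt - nowMs);
    this.nextAllowedAt = nowMs + waitMs + this.intervalMs;
    return waitMs;
  }
}

// e.g. Amazon SP-API reports: one request every 65 seconds
const spApiReports = new IntervalGate(65_000);
```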

Resource Ordering

When extracting multiple resources, VirtuousAI uses size-optimized ordering by default:

| Behavior | When |
|---|---|
| Size-optimized (XS→XL) | No resources specified, or defaults used |
| User-specified order | Resources explicitly listed in the definition |

This ensures fast resources complete first, providing partial results quickly. See Data Sources Guide for details.

Best Practices

  1. Use incremental syncs — When possible, sync only new/changed data to reduce execution time
  2. Set appropriate timeouts — Configure timeouts based on expected data volume
  3. Monitor runs — Set up alerts for failed actions, especially in automations
  4. Test with small datasets — Validate action configuration before running on full data
  5. Use streaming for long runs — Subscribe to SSE for real-time progress on lengthy operations

OpenAPI Reference

For detailed endpoint schemas, request/response formats, and authentication, see the OpenAPI specification.
