Webhooks let you trigger Python datasets in Definite docs from external systems with a simple HTTPS POST. Send JSON to our endpoint, and Definite will execute your doc’s Python datasets with access to your data. This enables real-time pipelines for events like meeting transcripts, signups, payments, orders, alerts, and more.

How it works

1. Create a doc with a Python dataset that processes webhook data.
2. Send a POST to the webhook endpoint with your doc ID and JSON payload.
3. Authenticate using your API key as a bearer token.
4. Execute: Definite spins up a sandbox, writes your payload as a file, and runs the Python dataset.
5. Respond: You get a response with per-dataset execution results.

Endpoint

POST /v4/webhook/docs/{doc_id}/execute

Where {doc_id} is the UUID of the doc containing your Python dataset(s).

Authentication

Authenticate with your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
API key format: {user_id}-{api_key_suffix}

Get your API key from the bottom-left user menu in the Definite app.

Request Body

{
  "data": {
    "event_type": "meeting_end",
    "transcript": [
      {"speaker": "Alice", "text": "Let's review Q1 results"}
    ],
    "summary": "Team reviewed Q1 results."
  },
  "environmentVariables": {
    "CUSTOM_VAR": "value"
  },
  "datasetKeys": ["process_webhook"]
}
| Field | Type | Description |
| --- | --- | --- |
| data | object | Arbitrary JSON payload. Written as a file in the sandbox, accessible via the WEBHOOK_DATA_FILE env var. |
| environmentVariables | object | Additional environment variables injected into the sandbox. |
| datasetKeys | string[] | Execute only these datasets. Default: all Python datasets in the doc. |
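
As an illustration of the fields above, here is a minimal sketch of a helper that assembles a request body on the sender side. The function name `build_webhook_body` is hypothetical and not part of any Definite SDK. Coercing environment variable values to strings is an assumption, since the table only says the field is an object.

```python
import json
from typing import Any, Optional


def build_webhook_body(
    data: dict[str, Any],
    environment_variables: Optional[dict[str, str]] = None,
    dataset_keys: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble a request body from the three documented fields.

    Hypothetical helper: optional fields are omitted when not provided,
    so the documented defaults apply (e.g. all Python datasets run).
    """
    if not isinstance(data, dict):
        raise TypeError("data must be a JSON object (a dict)")

    body: dict[str, Any] = {"data": data}
    if environment_variables:
        # Assumption: values are coerced to strings, as env vars usually are.
        body["environmentVariables"] = {
            str(k): str(v) for k, v in environment_variables.items()
        }
    if dataset_keys:
        body["datasetKeys"] = list(dataset_keys)

    # Fail early if the payload cannot be serialized to JSON.
    json.dumps(body)
    return body
```

Leaving out `dataset_keys` keeps the documented default, so every Python dataset in the doc runs.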

Data injection

Your data payload is written as a JSON file inside the execution sandbox. The file path is available via the WEBHOOK_DATA_FILE environment variable. This approach supports large payloads (e.g., full meeting transcripts) without hitting environment variable size limits.

Response

The response is wrapped in a standard v4 envelope:
{
  "success": true,
  "data": {
    "docId": "your-doc-uuid",
    "docName": "Webhook Processor",
    "results": {
      "process_webhook": {
        "success": true,
        "executionId": "uuid-string",
        "error": null
      }
    }
  },
  "meta": {
    "requestId": "uuid",
    "durationMs": 1234
  }
}
Each entry in results corresponds to a Python dataset that was executed:
| Field | Description |
| --- | --- |
| success | Whether the dataset executed without errors |
| executionId | UUID for retrieving execution logs |
| error | Error message if execution failed |
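
Because results are reported per dataset, a caller may want to check each entry rather than rely only on the top-level `success` flag. The sketch below assumes the response shape shown above. The helper names `failed_datasets` and `execution_ids` are hypothetical.

```python
from typing import Any


def failed_datasets(response_body: dict[str, Any]) -> dict[str, str]:
    """Map each failed dataset key to its error message."""
    results = (response_body.get("data") or {}).get("results") or {}
    failures: dict[str, str] = {}
    for key, result in results.items():
        if not result.get("success"):
            failures[key] = result.get("error") or "unknown error"
    return failures


def execution_ids(response_body: dict[str, Any]) -> dict[str, str]:
    """Map each dataset key to its executionId, for log lookups."""
    results = (response_body.get("data") or {}).get("results") or {}
    return {
        key: result["executionId"]
        for key, result in results.items()
        if result.get("executionId")
    }
```

Logging the output of `execution_ids` alongside any failures makes it easier to trace a specific run later.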

Reading webhook data in Python

import json, os

# Read the webhook payload from the file
with open(os.environ["WEBHOOK_DATA_FILE"]) as f:
    data = json.load(f)

event_type = data.get("event_type")
transcript = data.get("transcript", [])

print(f"Processing {event_type} with {len(transcript)} turns")

# Access additional environment variables
custom_var = os.environ.get("CUSTOM_VAR")
Both DEFINITE_API_KEY and DEFINITE_API_BASE_URL are automatically available in the sandbox, so you can use the Definite SDK to write data back to DuckLake.
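
To try dataset code outside the sandbox, you can imitate the file-based injection locally. The following is a rough sketch, not Definite's actual mechanism: it writes a payload to a temporary file, points WEBHOOK_DATA_FILE at it, runs a function, and restores the environment afterwards. The names `load_webhook_payload` and `run_with_payload` and the file name `webhook_data.json` are made up for illustration.

```python
import json
import os
import tempfile
from typing import Any, Callable


def load_webhook_payload() -> dict[str, Any]:
    """Read the payload the same way a Python dataset would."""
    with open(os.environ["WEBHOOK_DATA_FILE"], encoding="utf-8") as f:
        return json.load(f)


def run_with_payload(payload: dict[str, Any], func: Callable[[], Any]) -> Any:
    """Call func() with WEBHOOK_DATA_FILE pointing at a temp copy of payload."""
    previous = os.environ.get("WEBHOOK_DATA_FILE")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "webhook_data.json")  # hypothetical file name
        with open(path, "w", encoding="utf-8") as f:
            json.dump(payload, f)
        os.environ["WEBHOOK_DATA_FILE"] = path
        try:
            return func()
        finally:
            # Restore the environment so repeated local runs stay isolated.
            if previous is None:
                os.environ.pop("WEBHOOK_DATA_FILE", None)
            else:
                os.environ["WEBHOOK_DATA_FILE"] = previous
```

For example, `run_with_payload({"event_type": "meeting_end"}, load_webhook_payload)` returns the same dict that the dataset would read in the sandbox.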

Example: Pipeline doc for webhook processing

Create a doc with a Python dataset that processes incoming webhook data:
version: 1
schemaVersion: "2025-01"
kind: pipeline

metadata:
  name: "Transcript Processor"

datasets:
  process_webhook:
    engine: python
    code: |
      import json, os
      from definite_sdk import DefiniteClient
      import duckdb

      # Read webhook payload
      with open(os.environ["WEBHOOK_DATA_FILE"]) as f:
          data = json.load(f)

      # Set up DuckLake connection
      client = DefiniteClient(
          os.environ["DEFINITE_API_KEY"],
          api_url=os.environ["DEFINITE_API_BASE_URL"]
      )
      conn = duckdb.connect()
      conn.execute(client.attach_ducklake())

      # Write to DuckLake
      conn.execute("""
          INSERT INTO lake.transcripts.meetings
          VALUES (?, ?, ?, CURRENT_TIMESTAMP)
      """, [data["session_id"], data["title"], json.dumps(data["transcript"])])

      print(f"Ingested meeting: {data['title']}")
    timeoutMs: 120000

Examples

Basic webhook call

curl -X POST https://api.definite.app/v4/webhook/docs/YOUR_DOC_ID/execute \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "data": {
      "event_type": "user.created",
      "user_id": "123",
      "email": "user@example.com"
    }
  }'

With environment variables and dataset filter

curl -X POST https://api.definite.app/v4/webhook/docs/YOUR_DOC_ID/execute \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "data": {
      "event_type": "payment.succeeded",
      "amount": 99.99
    },
    "environmentVariables": {
      "STRIPE_KEY": "sk_test_123"
    },
    "datasetKeys": ["process_payment"]
  }'

Python client

import httpx

response = httpx.post(
    "https://api.definite.app/v4/webhook/docs/YOUR_DOC_ID/execute",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "data": {
            "session_id": "abc-123",
            "title": "Weekly Sync",
            "transcript": [
                {"speaker": "Alice", "text": "Let's review Q1"},
                {"speaker": "Bob", "text": "Revenue is up 20%"},
            ],
        }
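    },
    # httpx defaults to a 5-second timeout; allow for your dataset's timeoutMs
    timeout=130.0,
)
response.raise_for_status()
print(response.json())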

Common use cases

  • Meeting transcript ingestion: Receive webhooks from Read.ai, Otter.ai, or similar services
  • Event processing: Payments, signups, order lifecycle events
  • Streaming ingestion: Telemetry, IoT, monitoring alerts
  • Workflow automation: Trigger transformations or enrichment on external events
  • Third-party callbacks: Process responses from external integrations

Best practices

  • Use bearer auth in the Authorization header.
  • Use datasetKeys to target specific datasets when your doc has multiple.
  • Include an idempotency key (e.g., session_id in data) if your sender may retry.
  • Separate secrets from data: Pass secrets via environmentVariables, not in data.
  • Log executionId from responses to trace runs in Definite.
  • Set timeoutMs on your Python dataset (default is 6s, which is too short for most webhook processing).
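
As a sketch of the idempotency-key practice above: the helper below uses a field such as `session_id` when the sender provides one, and otherwise falls back to a hash of the payload. The function name `idempotency_key` and the hash fallback are assumptions for illustration, not something Definite provides.

```python
import hashlib
import json
from typing import Any


def idempotency_key(data: dict[str, Any], field: str = "session_id") -> str:
    """Return a stable key that identifies a webhook payload.

    Prefers an explicit field from the sender. Otherwise hashes a canonical
    JSON form, so key order in the payload does not change the result.
    """
    value = data.get(field)
    if value is not None:
        return str(value)
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Inside a dataset, the key could be stored alongside each inserted row and checked before writing, so that a retried delivery does not create a duplicate.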

Troubleshooting

| Status | Cause | Fix |
| --- | --- | --- |
| 401 | Invalid or missing API key | Check API key format and Authorization header |
| 404 | Doc not found, archived, or belongs to a different team | Verify doc ID and that your API key has access |
| 422 | Invalid request body | Validate JSON structure |
| 5xx | Server error | Retry with exponential backoff |
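
The "retry with exponential backoff" advice for 5xx responses can be sketched as follows. This is one possible approach, not a prescribed client: `post_with_retry` is a hypothetical helper, `send` is any callable you supply that performs the POST and returns `(status_code, parsed_body)`, and the attempt count, base delay, and jitter are arbitrary choices.

```python
import random
import time
from typing import Any, Callable


def post_with_retry(
    send: Callable[[], tuple[int, Any]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, Any]:
    """Call send() until it returns a non-5xx status or attempts run out.

    4xx responses (401, 404, 422) are returned immediately, because
    retrying cannot fix a bad key, doc ID, or request body.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status < 500 or attempt == max_attempts:
            return status, body
        # Delay doubles each attempt: base, 2*base, 4*base, ... plus jitter.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay / 2))

    raise AssertionError("unreachable")  # the loop always returns
```

A retried request may run the dataset more than once if the first attempt actually reached the sandbox, so this pairs naturally with an idempotency key in `data`.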