Agent Reference - Definite

A deep reference for AI agents working with the Definite MCP server. Covers querying, dashboards, Python automation pipelines, the Definite SDK, and Slack/email messaging.

Download as CLAUDE.md

# Add Definite MCP server
claude mcp add definite --transport http https://api.definite.app/v3/mcp/http --header "Authorization: YOUR_API_KEY"

# Download this page as your local CLAUDE.md
curl -o CLAUDE.md https://docs.definite.app/mcp/agent-reference.md

Setup first: See MCP Server for installation and API key setup.

Querying Data

SQL (DuckDB)

run_sql_query(sql="SELECT table_schema, table_name FROM information_schema.tables WHERE table_catalog = 'LAKE' LIMIT 100")

DuckDB rules:

Always include LIMIT (default to 100-2000)
Use information_schema for discovery (not SHOW TABLES)
DuckLake catalog is uppercase: WHERE table_catalog = 'LAKE'
Table names use three-part format: LAKE.SCHEMA.table
JSON: -> returns JSON, ->> returns VARCHAR. Cast for math: (data ->> 'price')::DOUBLE

Cube (Semantic Models)

run_cube_query(cube_query={
  "measures": ["sales.revenue"],
  "dimensions": ["sales.region"],
  "timeDimensions": [{"dimension": "sales.order_date", "dateRange": "last 90 days", "granularity": "month"}],
  "order": [["sales.revenue", "desc"]],
  "limit": 1000
})

Use search_cube_schema to discover available cubes, dimensions, and measures.

When to Use Which

SQL	Cube
Row-level detail	Aggregated KPIs
Ad-hoc joins, schema discovery	Time series with rollups
No Cube model exists	Pre-defined business logic exists

Creating Dashboards & Reports

Use create_doc with YAML content. Call get_doc_schema for the full specification.

Minimal Dashboard

version: 1
schemaVersion: "2025-01"
kind: dashboard

metadata:
  name: "Sales Dashboard"
  description: "Revenue by region"

layout:
  columns: 36
  tiles:
    - id: revenue_chart
      x: 0
      y: 0
      w: 18
      h: 10
      type: chart
      title: "Revenue by Region"
      source: revenue_data
    - id: total_kpi
      x: 18
      y: 0
      w: 9
      h: 6
      type: kpi
      source: total_revenue

datasets:
  revenue_data:
    engine: cube
    cube: sales
    measures:
      - sales.revenue
    dimensions:
      - sales.region
    viz:
      type: bar
  total_revenue:
    engine: sql
    sql: |
      SELECT SUM(amount) as total FROM LAKE.PUBLIC.sales
    viz:
      type: kpi

Document Kinds

Kind	Purpose
`dashboard`	Charts, tables, KPIs
`report`	Report-focused layout
`document`	Rich text with embedded data
`pipeline`	Code execution + scheduling (layout optional)

Grid Layout (36-column system)

Width	Columns	Use
Full	`w: 36`	Headers, wide charts
Half	`w: 18`	Side-by-side charts
Third	`w: 12`	Three-column layout
Quarter	`w: 9`	KPI row (4 across)

Dataset Engines

Engine	Use When
Cube	Semantic model exists; aggregations, time series, business KPIs
SQL	Custom queries, joins, schema discovery, no Cube model
Python	Transformations, API calls, combining datasets, ML

Viz Types

line, bar, area, pie, scatter, heatmap, gauge, kpi, map, funnel, sankey, table, code

Python Automation Pipelines

The full workflow: write a Python script, upload to Drive, create a pipeline doc, schedule it.

Step 1: Write Your Script

Use uv inline script metadata for dependencies:

#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()

result = sql.execute("SELECT * FROM LAKE.PUBLIC.users WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000")
rows = result.get("data", [])
print(f"Found {len(rows)} new users")

Both DEFINITE_API_KEY and DEFINITE_API_BASE_URL are auto-injected when the script runs on Definite.

Step 2: Upload to Drive

Use the get_drive_upload_url MCP tool, then upload with curl:

# 1. Call MCP tool: get_drive_upload_url(file_name="my_sync.py", folder="pipelines")
# 2. Upload using the returned signed URL:
curl -X PUT -T ./my_sync.py "SIGNED_UPLOAD_URL"

The file is now accessible at /home/user/drive/pipelines/my_sync.py when running on Definite.

Step 3: Create Pipeline Doc

Use create_doc with pipeline YAML:

version: 1
schemaVersion: "2025-01"
kind: pipeline

metadata:
  name: "Daily User Sync"

datasets:
  prepare_staging:
    engine: sql
    sql: |
      CREATE OR REPLACE TABLE LAKE.STAGING.daily_snapshot AS
      SELECT * FROM LAKE.PUBLIC.users
      WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'

  run_sync:
    engine: python
    command: "uv run /home/user/drive/pipelines/my_sync.py"
    dependsOn: [prepare_staging]
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 300000

automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"

Key Pipeline Fields

Field	Description
`command`	Shell command to run (e.g., `uv run /home/user/drive/script.py`)
`code`	Inline Python (alternative to `command` for short scripts)
`dependsOn`	Array of dataset IDs that must complete first
`fiEnvId`	Always set to `"<_THREAD_ID>"` for scheduled jobs. Ensures persistent sandbox with installed packages and drive access.
`timeoutMs`	Timeout in ms. Default is 6000 (6s), which is too short. Use 300000 (5min) to 1800000 (30min).
`inputs`	Array of dataset IDs available as DataFrame variables in Python

Automations

automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"     # Daily at 9am UTC
    destination:               # Optional: send snapshot
      type: slack
      channel_id: "C1234..."

Cron examples: "0 */4 * * *" (every 4h), "0 9 * * 1" (Mondays 9am), "0 0 1 * *" (1st of month)

Definite SDK

Install with uv add definite-sdk or use inline script dependencies as shown above.

from definite_sdk import DefiniteClient
import os

# In pipelines, env vars are auto-injected
# For local use, set DEFINITE_API_KEY env var or pass directly
client = DefiniteClient(
    os.environ["DEFINITE_API_KEY"],
    api_url=os.environ.get("DEFINITE_API_BASE_URL", "https://api.definite.app")
)

SDK source: github.com/luabase/definite_sdk

SQL & Cube Queries

sql_client = client.get_sql_client()
result = sql_client.execute("SELECT * FROM LAKE.PUBLIC.users LIMIT 10")

cube_query = {
    "measures": ["sales.revenue"],
    "timeDimensions": [{"dimension": "sales.date", "granularity": "month"}],
    "limit": 1000
}
result = sql_client.execute_cube_query(cube_query, integration_id="YOUR_CUBE_INTEGRATION_ID")

KV Store (Pipeline State)

state = client.get_kv_store('my_pipeline')  # Unique name per pipeline
last_sync = state.get('last_sync_at')
state['last_sync_at'] = '2026-01-31T00:00:00'
state.commit()  # Persist changes

Secret Store

secrets = client.secret_store()
secrets.set_secret("external_api_key", "sk-...")
key = secrets.get_secret("external_api_key")

DuckLake (Local DuckDB)

import duckdb
conn = duckdb.connect()
conn.execute(client.attach_ducklake())
df = conn.sql("SELECT * FROM lake.PUBLIC.users LIMIT 10").df()

Integration Store

integrations = list(client.integration_store().list_integrations(integration_type="slack"))
slack_id = integrations[0]["id"]

Slack & Email Messaging

Slack Messages (with Block Kit)

msg = client.get_message_client()

# Simple text message
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Hello from Definite!"
)

# Rich message with Slack Block Kit
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Daily report ready",  # Fallback for notifications
    blocks=[
        {"type": "header", "text": {"type": "plain_text", "text": "Daily Report"}},
        {"type": "section", "text": {"type": "mrkdwn", "text": "*New users today:* 42\n*Revenue:* $12,500"}},
        {"type": "actions", "elements": [
            {
                "type": "button",
                "text": {"type": "plain_text", "text": "View Dashboard"},
                "url": "https://app.definite.app/docs/DOC_ID",
                "style": "primary"
            }
        ]}
    ],
    thread_ts="1234567890.123456"  # Optional: reply to thread
)

Find your Slack integration ID: Use list_integrations(integration_type="slack") MCP tool.

Email Messages

msg = client.get_message_client()
msg.send_email_message(
    to_emails=["[email protected]"],
    subject="Daily Report: 42 new users",
    body="<h2>New Users</h2><p>42 users signed up today.</p>"  # HTML supported
)

Alert Workflows

Combine pipelines + SDK messaging for data-driven alerts.

Email Alert (Daily New Users)

#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()

result = sql.execute(
    "SELECT email, created_at FROM LAKE.PUBLIC.users "
    "WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000"
)
rows = result.get("data", [])

if rows:
    table_rows = "".join(
        f"<tr><td>{r['email']}</td><td>{r['created_at']}</td></tr>" for r in rows
    )
    body = (
        f"<h2>New Users: {len(rows)}</h2>"
        f"<table><tr><th>Email</th><th>Signed Up</th></tr>{table_rows}</table>"
    )
    client.get_message_client().send_email_message(
        to_emails=["[email protected]"],
        subject=f"Daily: {len(rows)} new users",
        body=body,
    )

Threshold Alert (Slack with KV State)

#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
state = client.get_kv_store('mrr_alert')

result = sql.execute(
    "SELECT SUM(amount) as mrr FROM LAKE.PUBLIC.subscriptions WHERE status = 'active' LIMIT 1"
)
current_mrr = float(result["data"][0]["mrr"])
previous_mrr = float(state.get('last_mrr', '0'))

if previous_mrr > 0:
    pct_change = (current_mrr - previous_mrr) / previous_mrr
    if abs(pct_change) >= 0.20:
        direction = "up" if pct_change > 0 else "down"
        client.get_message_client().send_slack_message(
            integration_id="YOUR_SLACK_INTEGRATION_ID",
            channel_id="C1234567890",
            text=f"MRR Alert: {direction} {abs(pct_change):.0%}",
            blocks=[
                {"type": "header", "text": {"type": "plain_text", "text": "MRR Alert"}},
                {
                    "type": "section",
                    "text": {
                        "type": "mrkdwn",
                        "text": f"MRR moved *{direction} {abs(pct_change):.0%}*\nFrom ${previous_mrr:,.0f} to ${current_mrr:,.0f}"
                    }
                },
            ],
        )

state['last_mrr'] = str(current_mrr)
state.commit()

Wrap in Pipeline Doc

Upload either script to Drive, then create a pipeline doc:

version: 1
schemaVersion: "2025-01"
kind: pipeline

metadata:
  name: "MRR Alert"

datasets:
  check_mrr:
    engine: python
    command: "uv run /home/user/drive/alerts/mrr_alert.py"
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 600000

automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"

HTML Output from Python Datasets

To render custom HTML (Plotly, D3, etc.) in a dashboard tile:

import os
with open(os.environ["DEFINITE_RESULT_TYPE_FILE"], "w") as f:
    f.write("html")
with open(os.environ["DEFINITE_RESULT_FILE"], "w") as f:
    f.write(html_content)

Pair with type: html tiles in your dashboard layout.

CLAUDE.md for Your Project

For the best experience with Claude Code, copy the Definite CLAUDE.md template into your project root alongside your .mcp.json. This gives Claude full context on all Definite MCP tools and patterns.

Home

Getting Started

Analyze & Build

Pushing to Destinations

Extracting from Data Sources

Data Modeling

Definite API

Custom Python Functions

Workspace

​Download as CLAUDE.md

​Querying Data

​SQL (DuckDB)

​Cube (Semantic Models)

​When to Use Which

​Creating Dashboards & Reports

​Minimal Dashboard

​Document Kinds

​Grid Layout (36-column system)

​Dataset Engines

​Viz Types

​Python Automation Pipelines

​Step 1: Write Your Script

​Step 2: Upload to Drive

​Step 3: Create Pipeline Doc

​Key Pipeline Fields

​Automations

​Definite SDK

​SQL & Cube Queries

​KV Store (Pipeline State)

​Secret Store

​DuckLake (Local DuckDB)

​Integration Store

​Slack & Email Messaging

​Slack Messages (with Block Kit)

​Email Messages

​Alert Workflows

​Email Alert (Daily New Users)

​Threshold Alert (Slack with KV State)

​Wrap in Pipeline Doc

​HTML Output from Python Datasets

​CLAUDE.md for Your Project

Download as CLAUDE.md

Querying Data

SQL (DuckDB)

Cube (Semantic Models)

When to Use Which

Creating Dashboards & Reports

Minimal Dashboard

Document Kinds

Grid Layout (36-column system)

Dataset Engines

Viz Types

Python Automation Pipelines

Step 1: Write Your Script

Step 2: Upload to Drive

Step 3: Create Pipeline Doc

Key Pipeline Fields

Automations

Definite SDK

SQL & Cube Queries

KV Store (Pipeline State)

Secret Store

DuckLake (Local DuckDB)

Integration Store

Slack & Email Messaging

Slack Messages (with Block Kit)

Email Messages

Alert Workflows

Email Alert (Daily New Users)

Threshold Alert (Slack with KV State)

Wrap in Pipeline Doc

HTML Output from Python Datasets

CLAUDE.md for Your Project