A deep reference for AI agents working with the Definite MCP server. Covers querying, dashboards, Python automation pipelines, the Definite SDK, and Slack/email messaging.


# Add Definite MCP server
claude mcp add definite --transport http https://api.definite.app/v3/mcp/http --header "Authorization: YOUR_API_KEY"

# Download this page as your local CLAUDE.md
curl -o CLAUDE.md https://docs.definite.app/mcp/agent-reference.md
Setup first: see MCP Server for installation and API key configuration.

Querying Data

SQL (DuckDB)

run_sql_query(sql="SELECT table_schema, table_name FROM information_schema.tables WHERE table_catalog = 'LAKE' LIMIT 100")
DuckDB rules:
  • Always include LIMIT (default to 100-2000)
  • Use information_schema for discovery (not SHOW TABLES)
  • DuckLake catalog is uppercase: WHERE table_catalog = 'LAKE'
  • Table names use three-part format: LAKE.SCHEMA.table
  • JSON: -> returns JSON, ->> returns VARCHAR. Cast for math: (data ->> 'price')::DOUBLE
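
For example, a query that follows these rules, extracting a JSON field and casting it for math (LAKE.PUBLIC.orders and its data column are hypothetical names for illustration):

run_sql_query(sql="SELECT data ->> 'sku' AS sku, (data ->> 'price')::DOUBLE AS price FROM LAKE.PUBLIC.orders LIMIT 500")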

Cube (Semantic Models)

run_cube_query(cube_query={
  "measures": ["sales.revenue"],
  "dimensions": ["sales.region"],
  "timeDimensions": [{"dimension": "sales.order_date", "dateRange": "last 90 days", "granularity": "month"}],
  "order": [["sales.revenue", "desc"]],
  "limit": 1000
})
Use search_cube_schema to discover available cubes, dimensions, and measures.

When to Use Which

| SQL | Cube |
| --- | --- |
| Row-level detail | Aggregated KPIs |
| Ad-hoc joins, schema discovery | Time series with rollups |
| No Cube model exists | Pre-defined business logic exists |

Creating Dashboards & Reports

Use create_doc with YAML content. Call get_doc_schema for the full specification.

Minimal Dashboard

version: 1
schemaVersion: "2025-01"
kind: dashboard

metadata:
  name: "Sales Dashboard"
  description: "Revenue by region"

layout:
  columns: 36
  tiles:
    - id: revenue_chart
      x: 0
      y: 0
      w: 18
      h: 10
      type: chart
      title: "Revenue by Region"
      source: revenue_data
    - id: total_kpi
      x: 18
      y: 0
      w: 9
      h: 6
      type: kpi
      source: total_revenue

datasets:
  revenue_data:
    engine: cube
    cube: sales
    measures:
      - sales.revenue
    dimensions:
      - sales.region
    viz:
      type: bar
  total_revenue:
    engine: sql
    sql: |
      SELECT SUM(amount) as total FROM LAKE.PUBLIC.sales
    viz:
      type: kpi

Document Kinds

| Kind | Purpose |
| --- | --- |
| dashboard | Charts, tables, KPIs |
| report | Report-focused layout |
| document | Rich text with embedded data |
| pipeline | Code execution + scheduling (layout optional) |

Grid Layout (36-column system)

| Width | Columns | Use |
| --- | --- | --- |
| Full | w: 36 | Headers, wide charts |
| Half | w: 18 | Side-by-side charts |
| Third | w: 12 | Three-column layout |
| Quarter | w: 9 | KPI row (4 across) |
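
For instance, a KPI row of four tiles across (a sketch; ids and sources are placeholders) could be laid out as:

layout:
  columns: 36
  tiles:
    - { id: kpi_1, x: 0,  y: 0, w: 9, h: 6, type: kpi, source: kpi_1_data }
    - { id: kpi_2, x: 9,  y: 0, w: 9, h: 6, type: kpi, source: kpi_2_data }
    - { id: kpi_3, x: 18, y: 0, w: 9, h: 6, type: kpi, source: kpi_3_data }
    - { id: kpi_4, x: 27, y: 0, w: 9, h: 6, type: kpi, source: kpi_4_data }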

Dataset Engines

| Engine | Use When |
| --- | --- |
| Cube | Semantic model exists; aggregations, time series, business KPIs |
| SQL | Custom queries, joins, schema discovery, no Cube model |
| Python | Transformations, API calls, combining datasets, ML |

Viz Types

line, bar, area, pie, scatter, heatmap, gauge, kpi, map, funnel, sankey, table, code

Python Automation Pipelines

The full workflow: write a Python script, upload to Drive, create a pipeline doc, schedule it.

Step 1: Write Your Script

Use uv inline script metadata for dependencies:
#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()

result = sql.execute("SELECT * FROM LAKE.PUBLIC.users WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000")
rows = result.get("data", [])
print(f"Found {len(rows)} new users")
Both DEFINITE_API_KEY and DEFINITE_API_BASE_URL are auto-injected when the script runs on Definite.

Step 2: Upload to Drive

Use the get_drive_upload_url MCP tool, then upload with curl:
# 1. Call MCP tool: get_drive_upload_url(file_name="my_sync.py", folder="pipelines")
# 2. Upload using the returned signed URL:
curl -X PUT -T ./my_sync.py "SIGNED_UPLOAD_URL"
The file is now accessible at /home/user/drive/pipelines/my_sync.py when running on Definite.
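
To upload from Python instead of curl, a minimal sketch with requests (assuming the signed URL accepts a plain PUT, as the curl example implies):

import requests

# SIGNED_UPLOAD_URL is the URL returned by get_drive_upload_url
with open("my_sync.py", "rb") as f:
    requests.put("SIGNED_UPLOAD_URL", data=f).raise_for_status()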

Step 3: Create Pipeline Doc

Use create_doc with pipeline YAML:
version: 1
schemaVersion: "2025-01"
kind: pipeline

metadata:
  name: "Daily User Sync"

datasets:
  prepare_staging:
    engine: sql
    sql: |
      CREATE OR REPLACE TABLE LAKE.STAGING.daily_snapshot AS
      SELECT * FROM LAKE.PUBLIC.users
      WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'

  run_sync:
    engine: python
    command: "uv run /home/user/drive/pipelines/my_sync.py"
    dependsOn: [prepare_staging]
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 300000

automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"

Key Pipeline Fields

| Field | Description |
| --- | --- |
| command | Shell command to run (e.g., uv run /home/user/drive/script.py) |
| code | Inline Python (an alternative to command for short scripts) |
| dependsOn | Array of dataset IDs that must complete first |
| fiEnvId | Always set to "<_THREAD_ID>" for scheduled jobs; ensures a persistent sandbox with installed packages and drive access |
| timeoutMs | Timeout in ms. The default of 6000 (6s) is too short for most jobs; use 300000 (5 min) to 1800000 (30 min) |
| inputs | Array of dataset IDs available as DataFrame variables in Python |
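
A sketch of how code and inputs work together, assuming each dataset ID listed in inputs is exposed to the inline Python as a DataFrame variable of the same name (table names are placeholders):

datasets:
  recent_orders:
    engine: sql
    sql: |
      SELECT * FROM LAKE.PUBLIC.orders
      WHERE created_at >= CURRENT_DATE - INTERVAL '7 day'
      LIMIT 2000
  summarize:
    engine: python
    code: |
      # recent_orders is available here as a DataFrame
      print(recent_orders.describe())
    inputs: [recent_orders]
    dependsOn: [recent_orders]
    timeoutMs: 300000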

Automations

automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"     # Daily at 9am UTC
    destination:               # Optional: send snapshot
      type: slack
      channel_id: "C1234..."
Cron examples: "0 */4 * * *" (every 4h), "0 9 * * 1" (Mondays 9am), "0 0 1 * *" (1st of month)

Definite SDK

Install with uv add definite-sdk or use inline script dependencies as shown above.
from definite_sdk import DefiniteClient
import os

# In pipelines, env vars are auto-injected
# For local use, set DEFINITE_API_KEY env var or pass directly
client = DefiniteClient(
    os.environ["DEFINITE_API_KEY"],
    api_url=os.environ.get("DEFINITE_API_BASE_URL", "https://api.definite.app")
)
SDK source: github.com/luabase/definite_sdk

SQL & Cube Queries

sql_client = client.get_sql_client()
result = sql_client.execute("SELECT * FROM LAKE.PUBLIC.users LIMIT 10")

cube_query = {
    "measures": ["sales.revenue"],
    "timeDimensions": [{"dimension": "sales.date", "granularity": "month"}],
    "limit": 1000
}
result = sql_client.execute_cube_query(cube_query, integration_id="YOUR_CUBE_INTEGRATION_ID")

KV Store (Pipeline State)

state = client.get_kv_store('my_pipeline')  # Unique name per pipeline
last_sync = state.get('last_sync_at')
state['last_sync_at'] = '2026-01-31T00:00:00'
state.commit()  # Persist changes

Secret Store

secrets = client.secret_store()
secrets.set_secret("external_api_key", "sk-...")
key = secrets.get_secret("external_api_key")
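
A common pattern is to store a third-party credential once, then read it inside a pipeline script (a sketch; the endpoint is hypothetical):

import requests

key = secrets.get_secret("external_api_key")
resp = requests.get(
    "https://api.example.com/v1/status",  # hypothetical external API
    headers={"Authorization": f"Bearer {key}"},
)
resp.raise_for_status()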

DuckLake (Local DuckDB)

import duckdb
conn = duckdb.connect()
conn.execute(client.attach_ducklake())
df = conn.sql("SELECT * FROM lake.PUBLIC.users LIMIT 10").df()

Integration Store

integrations = list(client.integration_store().list_integrations(integration_type="slack"))
slack_id = integrations[0]["id"]

Slack & Email Messaging

Slack Messages (with Block Kit)

msg = client.get_message_client()

# Simple text message
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Hello from Definite!"
)

# Rich message with Slack Block Kit
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Daily report ready",  # Fallback for notifications
    blocks=[
        {"type": "header", "text": {"type": "plain_text", "text": "Daily Report"}},
        {"type": "section", "text": {"type": "mrkdwn", "text": "*New users today:* 42\n*Revenue:* $12,500"}},
        {"type": "actions", "elements": [
            {
                "type": "button",
                "text": {"type": "plain_text", "text": "View Dashboard"},
                "url": "https://app.definite.app/docs/DOC_ID",
                "style": "primary"
            }
        ]}
    ],
    thread_ts="1234567890.123456"  # Optional: reply to thread
)
Find your Slack integration ID: Use list_integrations(integration_type="slack") MCP tool.

Email Messages

msg = client.get_message_client()
msg.send_email_message(
    to_emails=["[email protected]"],
    subject="Daily Report: 42 new users",
    body="<h2>New Users</h2><p>42 users signed up today.</p>"  # HTML supported
)

Alert Workflows

Combine pipelines + SDK messaging for data-driven alerts.

Email Alert (Daily New Users)

#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()

result = sql.execute(
    "SELECT email, created_at FROM LAKE.PUBLIC.users "
    "WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000"
)
rows = result.get("data", [])

if rows:
    table_rows = "".join(
        f"<tr><td>{r['email']}</td><td>{r['created_at']}</td></tr>" for r in rows
    )
    body = (
        f"<h2>New Users: {len(rows)}</h2>"
        f"<table><tr><th>Email</th><th>Signed Up</th></tr>{table_rows}</table>"
    )
    client.get_message_client().send_email_message(
        to_emails=["[email protected]"],
        subject=f"Daily: {len(rows)} new users",
        body=body,
    )

Threshold Alert (Slack with KV State)

#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
state = client.get_kv_store('mrr_alert')

result = sql.execute(
    "SELECT SUM(amount) as mrr FROM LAKE.PUBLIC.subscriptions WHERE status = 'active' LIMIT 1"
)
current_mrr = float(result["data"][0]["mrr"])
previous_mrr = float(state.get('last_mrr', '0'))

if previous_mrr > 0:
    pct_change = (current_mrr - previous_mrr) / previous_mrr
    if abs(pct_change) >= 0.20:
        direction = "up" if pct_change > 0 else "down"
        client.get_message_client().send_slack_message(
            integration_id="YOUR_SLACK_INTEGRATION_ID",
            channel_id="C1234567890",
            text=f"MRR Alert: {direction} {abs(pct_change):.0%}",
            blocks=[
                {"type": "header", "text": {"type": "plain_text", "text": "MRR Alert"}},
                {
                    "type": "section",
                    "text": {
                        "type": "mrkdwn",
                        "text": f"MRR moved *{direction} {abs(pct_change):.0%}*\nFrom ${previous_mrr:,.0f} to ${current_mrr:,.0f}"
                    }
                },
            ],
        )

state['last_mrr'] = str(current_mrr)
state.commit()

Wrap in Pipeline Doc

Upload either script to Drive, then create a pipeline doc:
version: 1
schemaVersion: "2025-01"
kind: pipeline

metadata:
  name: "MRR Alert"

datasets:
  check_mrr:
    engine: python
    command: "uv run /home/user/drive/alerts/mrr_alert.py"
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 600000

automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"

HTML Output from Python Datasets

To render custom HTML (Plotly, D3, etc.) in a dashboard tile:
import os

# html_content holds the HTML produced earlier in the script
html_content = "<h1>Hello from a Python dataset</h1>"

# Declare the result type as HTML, then write the payload
with open(os.environ["DEFINITE_RESULT_TYPE_FILE"], "w") as f:
    f.write("html")
with open(os.environ["DEFINITE_RESULT_FILE"], "w") as f:
    f.write(html_content)
Pair with type: html tiles in your dashboard layout.
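
For example, html_content for a Plotly chart can be produced with to_html (a sketch; assumes plotly is listed in the script's inline dependencies):

import plotly.express as px

fig = px.bar(x=["NA", "EMEA", "APAC"], y=[120, 90, 60], title="Revenue by Region")
html_content = fig.to_html(full_html=True, include_plotlyjs="cdn")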

CLAUDE.md for Your Project

For the best experience with Claude Code, copy the Definite CLAUDE.md template into your project root alongside your .mcp.json. This gives Claude full context on all Definite MCP tools and patterns.