A deep reference for AI agents working with the Definite MCP server. Covers querying, dashboards, Python automation pipelines, the Definite SDK, and Slack/email messaging.
```bash
# Add Definite MCP server
claude mcp add definite --transport http https://api.definite.app/v3/mcp/http --header "Authorization: YOUR_API_KEY"

# Download this page as your local CLAUDE.md
curl -o CLAUDE.md https://docs.definite.app/mcp/agent-reference.md
```

**Setup first:** See MCP Server for installation and API key setup.
## Querying Data

### SQL (DuckDB)

```python
run_sql_query(sql="SELECT table_schema, table_name FROM information_schema.tables WHERE table_catalog = 'LAKE' LIMIT 100")
```

DuckDB rules:

- Always include `LIMIT` (default to 100-2000)
- Use `information_schema` for discovery (not `SHOW TABLES`)
- DuckLake catalog is uppercase: `WHERE table_catalog = 'LAKE'`
- Table names use three-part format: `LAKE.SCHEMA.table`
- JSON: `->` returns JSON, `->>` returns VARCHAR. Cast for math: `(data ->> 'price')::DOUBLE`
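
For example, a single query applying these rules (the `events` table and its `data` JSON column are illustrative):

```python
run_sql_query(sql="SELECT id, (data ->> 'price')::DOUBLE AS price FROM LAKE.PUBLIC.events LIMIT 100")
```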
### Cube (Semantic Models)

```python
run_cube_query(cube_query={
    "measures": ["sales.revenue"],
    "dimensions": ["sales.region"],
    "timeDimensions": [{"dimension": "sales.order_date", "dateRange": "last 90 days", "granularity": "month"}],
    "order": [["sales.revenue", "desc"]],
    "limit": 1000
})
```

Use `search_cube_schema` to discover available cubes, dimensions, and measures.
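
A sketch of a discovery call (the parameter name here is an assumption; check the tool's listed schema for the exact signature):

```python
# Hypothetical parameter name; discover measures/dimensions before building a cube query
search_cube_schema(query="revenue")
```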
### When to Use Which

| SQL | Cube |
|---|---|
| Row-level detail | Aggregated KPIs |
| Ad-hoc joins, schema discovery | Time series with rollups |
| No Cube model exists | Pre-defined business logic exists |
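
The same question can often be answered either way; a sketch (table and cube names are illustrative):

```python
# SQL: row-level control with explicit aggregation
run_sql_query(sql="SELECT region, SUM(amount) AS revenue FROM LAKE.PUBLIC.sales GROUP BY region ORDER BY revenue DESC LIMIT 100")

# Cube: the same rollup via the semantic model's pre-defined measure
run_cube_query(cube_query={"measures": ["sales.revenue"], "dimensions": ["sales.region"], "limit": 100})
```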
## Creating Dashboards & Reports

Use `create_doc` with YAML content. Call `get_doc_schema` for the full specification.
### Minimal Dashboard

```yaml
version: 1
schemaVersion: "2025-01"
kind: dashboard
metadata:
  name: "Sales Dashboard"
  description: "Revenue by region"
layout:
  columns: 36
  tiles:
    - id: revenue_chart
      x: 0
      y: 0
      w: 18
      h: 10
      type: chart
      title: "Revenue by Region"
      source: revenue_data
    - id: total_kpi
      x: 18
      y: 0
      w: 9
      h: 6
      type: kpi
      source: total_revenue
datasets:
  revenue_data:
    engine: cube
    cube: sales
    measures:
      - sales.revenue
    dimensions:
      - sales.region
    viz:
      type: bar
  total_revenue:
    engine: sql
    sql: |
      SELECT SUM(amount) as total FROM LAKE.PUBLIC.sales
    viz:
      type: kpi
```
### Document Kinds

| Kind | Purpose |
|---|---|
| `dashboard` | Charts, tables, KPIs |
| `report` | Report-focused layout |
| `document` | Rich text with embedded data |
| `pipeline` | Code execution + scheduling (layout optional) |
### Grid Layout (36-column system)

| Width | Columns | Use |
|---|---|---|
| Full | `w: 36` | Headers, wide charts |
| Half | `w: 18` | Side-by-side charts |
| Third | `w: 12` | Three-column layout |
| Quarter | `w: 9` | KPI row (4 across) |
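
A sketch of a four-across KPI row using quarter-width tiles (ids and sources are illustrative):

```yaml
layout:
  columns: 36
  tiles:
    - { id: kpi_revenue, x: 0,  y: 0, w: 9, h: 6, type: kpi, source: revenue }
    - { id: kpi_users,   x: 9,  y: 0, w: 9, h: 6, type: kpi, source: users }
    - { id: kpi_orders,  x: 18, y: 0, w: 9, h: 6, type: kpi, source: orders }
    - { id: kpi_churn,   x: 27, y: 0, w: 9, h: 6, type: kpi, source: churn }
```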
### Dataset Engines

| Engine | Use When |
|---|---|
| Cube | Semantic model exists; aggregations, time series, business KPIs |
| SQL | Custom queries, joins, schema discovery, no Cube model |
| Python | Transformations, API calls, combining datasets, ML |
### Viz Types

`line`, `bar`, `area`, `pie`, `scatter`, `heatmap`, `gauge`, `kpi`, `map`, `funnel`, `sankey`, `table`, `code`
## Python Automation Pipelines

The full workflow: write a Python script, upload it to Drive, create a pipeline doc, and schedule it.
### Step 1: Write Your Script

Use uv inline script metadata for dependencies:

```python
#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
result = sql.execute("SELECT * FROM LAKE.PUBLIC.users WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000")
rows = result.get("data", [])
print(f"Found {len(rows)} new users")
```

Both `DEFINITE_API_KEY` and `DEFINITE_API_BASE_URL` are auto-injected when the script runs on Definite.
### Step 2: Upload to Drive

Use the `get_drive_upload_url` MCP tool, then upload with curl:

```bash
# 1. Call MCP tool: get_drive_upload_url(file_name="my_sync.py", folder="pipelines")
# 2. Upload using the returned signed URL:
curl -X PUT -T ./my_sync.py "SIGNED_UPLOAD_URL"
```

The file is now accessible at `/home/user/drive/pipelines/my_sync.py` when running on Definite.
### Step 3: Create Pipeline Doc

Use `create_doc` with pipeline YAML:

```yaml
version: 1
schemaVersion: "2025-01"
kind: pipeline
metadata:
  name: "Daily User Sync"
datasets:
  prepare_staging:
    engine: sql
    sql: |
      CREATE OR REPLACE TABLE LAKE.STAGING.daily_snapshot AS
      SELECT * FROM LAKE.PUBLIC.users
      WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'
  run_sync:
    engine: python
    command: "uv run /home/user/drive/pipelines/my_sync.py"
    dependsOn: [prepare_staging]
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 300000
automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"
```
### Key Pipeline Fields

| Field | Description |
|---|---|
| `command` | Shell command to run (e.g., `uv run /home/user/drive/script.py`) |
| `code` | Inline Python (alternative to `command` for short scripts) |
| `dependsOn` | Array of dataset IDs that must complete first |
| `fiEnvId` | Always set to `"<_THREAD_ID>"` for scheduled jobs; ensures a persistent sandbox with installed packages and Drive access. |
| `timeoutMs` | Timeout in ms. The default of 6000 (6s) is too short for most jobs; use 300000 (5 min) to 1800000 (30 min). |
| `inputs` | Array of dataset IDs available as DataFrame variables in Python |
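
A sketch of a short inline-code dataset consuming another dataset through `inputs` (assuming each input is exposed as a DataFrame variable named after its dataset ID):

```yaml
datasets:
  daily_users:
    engine: sql
    sql: |
      SELECT region, COUNT(*) AS n FROM LAKE.PUBLIC.users GROUP BY region LIMIT 1000
  summarize:
    engine: python
    inputs: [daily_users]
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 300000
    code: |
      # daily_users arrives as a DataFrame (variable naming is an assumption)
      print(daily_users.sort_values("n", ascending=False).head(5))
```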
### Automations

```yaml
automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *" # Daily at 9am UTC
    destination: # Optional: send a snapshot
      type: slack
      channel_id: "C1234..."
```

Cron examples: `"0 */4 * * *"` (every 4h), `"0 9 * * 1"` (Mondays 9am), `"0 0 1 * *"` (1st of month)
## Definite SDK

Install with `uv add definite-sdk` or use inline script dependencies as shown above.

```python
from definite_sdk import DefiniteClient
import os

# In pipelines, env vars are auto-injected.
# For local use, set the DEFINITE_API_KEY env var or pass the key directly.
client = DefiniteClient(
    os.environ["DEFINITE_API_KEY"],
    api_url=os.environ.get("DEFINITE_API_BASE_URL", "https://api.definite.app")
)
```

SDK source: github.com/luabase/definite_sdk
### SQL & Cube Queries

```python
sql_client = client.get_sql_client()
result = sql_client.execute("SELECT * FROM LAKE.PUBLIC.users LIMIT 10")

cube_query = {
    "measures": ["sales.revenue"],
    "timeDimensions": [{"dimension": "sales.date", "granularity": "month"}],
    "limit": 1000
}
result = sql_client.execute_cube_query(cube_query, integration_id="YOUR_CUBE_INTEGRATION_ID")
```
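
To avoid hardcoding the integration ID, you can likely look it up at runtime with the integration store (see below); the `integration_type` value `"cube"` is an assumption, mirroring the Slack lookup shown later on this page:

```python
# Assumption: Cube integrations are listed under integration_type="cube"
cubes = list(client.integration_store().list_integrations(integration_type="cube"))
result = sql_client.execute_cube_query(cube_query, integration_id=cubes[0]["id"])
```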
### KV Store (Pipeline State)

```python
state = client.get_kv_store('my_pipeline')  # Unique name per pipeline
last_sync = state.get('last_sync_at')
state['last_sync_at'] = '2026-01-31T00:00:00'
state.commit()  # Persist changes
```
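
A common use is incremental syncing: read a watermark, fetch only newer rows, then advance the watermark. A minimal sketch (the table name and seed timestamp are illustrative):

```python
from datetime import datetime, timezone

state = client.get_kv_store('user_sync')
since = state.get('last_sync_at', '1970-01-01T00:00:00')

# Fetch only rows created after the last successful sync (hypothetical table)
result = client.get_sql_client().execute(
    f"SELECT * FROM LAKE.PUBLIC.users WHERE created_at > '{since}' LIMIT 2000"
)

# Advance the watermark only after the query succeeds
state['last_sync_at'] = datetime.now(timezone.utc).isoformat()
state.commit()
```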
### Secret Store

```python
secrets = client.secret_store()
secrets.set_secret("external_api_key", "sk-...")
key = secrets.get_secret("external_api_key")
```
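
Secrets keep credentials out of scripts stored in Drive. A sketch of using one to call an external API (the endpoint is hypothetical; add `requests` to the script's inline dependencies):

```python
import requests

api_key = client.secret_store().get_secret("external_api_key")
resp = requests.get(
    "https://api.example.com/v1/stats",  # hypothetical external endpoint
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=30,
)
resp.raise_for_status()
data = resp.json()
```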
### DuckLake (Local DuckDB)

```python
import duckdb

conn = duckdb.connect()
conn.execute(client.attach_ducklake())  # attach_ducklake() returns the SQL that attaches the lake catalog
df = conn.sql("SELECT * FROM lake.PUBLIC.users LIMIT 10").df()
```
### Integration Store

```python
integrations = list(client.integration_store().list_integrations(integration_type="slack"))
slack_id = integrations[0]["id"]
```
## Slack & Email Messaging

### Slack Messages (with Block Kit)

```python
msg = client.get_message_client()

# Simple text message
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Hello from Definite!"
)

# Rich message with Slack Block Kit
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Daily report ready",  # Fallback for notifications
    blocks=[
        {"type": "header", "text": {"type": "plain_text", "text": "Daily Report"}},
        {"type": "section", "text": {"type": "mrkdwn", "text": "*New users today:* 42\n*Revenue:* $12,500"}},
        {"type": "actions", "elements": [
            {
                "type": "button",
                "text": {"type": "plain_text", "text": "View Dashboard"},
                "url": "https://app.definite.app/docs/DOC_ID",
                "style": "primary"
            }
        ]}
    ],
    thread_ts="1234567890.123456"  # Optional: reply to a thread
)
```

Find your Slack integration ID with the `list_integrations(integration_type="slack")` MCP tool.
### Email Messages

```python
msg = client.get_message_client()
msg.send_email_message(
    to_emails=["[email protected]"],
    subject="Daily Report: 42 new users",
    body="<h2>New Users</h2><p>42 users signed up today.</p>"  # HTML supported
)
```
## Alert Workflows

Combine pipelines + SDK messaging for data-driven alerts.
### Email Alert (Daily New Users)

```python
#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
result = sql.execute(
    "SELECT email, created_at FROM LAKE.PUBLIC.users "
    "WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000"
)
rows = result.get("data", [])

if rows:
    table_rows = "".join(
        f"<tr><td>{r['email']}</td><td>{r['created_at']}</td></tr>" for r in rows
    )
    body = (
        f"<h2>New Users: {len(rows)}</h2>"
        f"<table><tr><th>Email</th><th>Signed Up</th></tr>{table_rows}</table>"
    )
    client.get_message_client().send_email_message(
        to_emails=["[email protected]"],
        subject=f"Daily: {len(rows)} new users",
        body=body,
    )
```
### Threshold Alert (Slack with KV State)

```python
#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
state = client.get_kv_store('mrr_alert')

result = sql.execute(
    "SELECT SUM(amount) as mrr FROM LAKE.PUBLIC.subscriptions WHERE status = 'active' LIMIT 1"
)
current_mrr = float(result["data"][0]["mrr"])
previous_mrr = float(state.get('last_mrr', '0'))

if previous_mrr > 0:
    pct_change = (current_mrr - previous_mrr) / previous_mrr
    if abs(pct_change) >= 0.20:
        direction = "up" if pct_change > 0 else "down"
        client.get_message_client().send_slack_message(
            integration_id="YOUR_SLACK_INTEGRATION_ID",
            channel_id="C1234567890",
            text=f"MRR Alert: {direction} {abs(pct_change):.0%}",
            blocks=[
                {"type": "header", "text": {"type": "plain_text", "text": "MRR Alert"}},
                {
                    "type": "section",
                    "text": {
                        "type": "mrkdwn",
                        "text": f"MRR moved *{direction} {abs(pct_change):.0%}*\nFrom ${previous_mrr:,.0f} to ${current_mrr:,.0f}"
                    }
                },
            ],
        )

state['last_mrr'] = str(current_mrr)
state.commit()
```
### Wrap in Pipeline Doc

Upload either script to Drive, then create a pipeline doc:

```yaml
version: 1
schemaVersion: "2025-01"
kind: pipeline
metadata:
  name: "MRR Alert"
datasets:
  check_mrr:
    engine: python
    command: "uv run /home/user/drive/alerts/mrr_alert.py"
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 600000
automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"
```
## HTML Output from Python Datasets

To render custom HTML (Plotly, D3, etc.) in a dashboard tile:

```python
import os

with open(os.environ["DEFINITE_RESULT_TYPE_FILE"], "w") as f:
    f.write("html")
with open(os.environ["DEFINITE_RESULT_FILE"], "w") as f:
    f.write(html_content)
```

Pair with `type: html` tiles in your dashboard layout.
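
A minimal end-to-end sketch using Plotly (assuming `plotly` is added to the script's inline dependencies; the data is illustrative):

```python
import os
import plotly.graph_objects as go

# Build a chart and serialize it to a standalone HTML document
fig = go.Figure(go.Bar(x=["North", "South", "West"], y=[120, 95, 143]))
html_content = fig.to_html(full_html=True, include_plotlyjs="cdn")

# Hand the result to Definite via the env-var files shown above
with open(os.environ["DEFINITE_RESULT_TYPE_FILE"], "w") as f:
    f.write("html")
with open(os.environ["DEFINITE_RESULT_FILE"], "w") as f:
    f.write(html_content)
```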
## CLAUDE.md for Your Project

For the best experience with Claude Code, copy the Definite CLAUDE.md template into your project root alongside your `.mcp.json`. This gives Claude full context on all Definite MCP tools and patterns.