A deep reference for AI agents working with the Definite MCP server. Covers querying, dashboards, Python automation pipelines, the Definite SDK, and Slack/email messaging.
# Add Definite MCP server
claude mcp add definite --transport http https://api.definite.app/v3/mcp/http --header "Authorization: YOUR_API_KEY"
# Download this page as your local CLAUDE.md
curl -o CLAUDE.md https://docs.definite.app/mcp/agent-reference.md
Setup first: See MCP Server for installation and API key setup.
Querying Data
SQL (DuckDB)
run_sql_query(sql="SELECT table_schema, table_name FROM information_schema.tables WHERE table_catalog = 'LAKE' LIMIT 100")
DuckDB rules:
- Always include LIMIT (default to 100-2000)
- Use information_schema for discovery (not SHOW TABLES)
- DuckLake catalog is uppercase: WHERE table_catalog = 'LAKE'
- Table names use three-part format: LAKE.SCHEMA.table
- JSON: -> returns JSON, ->> returns VARCHAR. Cast for math: (data ->> 'price')::DOUBLE
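One way to enforce the LIMIT rule before handing SQL to run_sql_query — a minimal sketch; ensure_limit is a hypothetical helper, not part of the MCP toolset:

```python
import re

def ensure_limit(sql: str, default: int = 1000) -> str:
    """Append a LIMIT clause if the query doesn't already end with one."""
    if re.search(r"\blimit\s+\d+\s*;?\s*$", sql, re.IGNORECASE):
        return sql
    return f"{sql.rstrip().rstrip(';')} LIMIT {default}"

ensure_limit("SELECT * FROM LAKE.PUBLIC.users")
# → "SELECT * FROM LAKE.PUBLIC.users LIMIT 1000"
```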
Cube (Semantic Models)
run_cube_query(cube_query={
    "measures": ["sales.revenue"],
    "dimensions": ["sales.region"],
    "timeDimensions": [{"dimension": "sales.order_date", "dateRange": "last 90 days", "granularity": "month"}],
    "order": [["sales.revenue", "desc"]],
    "limit": 1000
})
Use search_cube_schema to discover available cubes, dimensions, and measures.
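The cube_query payload is plain JSON, so it can be assembled programmatically. A sketch of a hypothetical convenience wrapper (build_cube_query is not an SDK or MCP function):

```python
def build_cube_query(measures, dimensions=None, time_dimension=None,
                     date_range="last 90 days", granularity="month", limit=1000):
    """Assemble a Cube query payload shaped like the one passed to run_cube_query."""
    query = {"measures": measures, "limit": limit}
    if dimensions:
        query["dimensions"] = dimensions
    if time_dimension:
        query["timeDimensions"] = [{
            "dimension": time_dimension,
            "dateRange": date_range,
            "granularity": granularity,
        }]
    return query

q = build_cube_query(["sales.revenue"], dimensions=["sales.region"],
                     time_dimension="sales.order_date")
```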
When to Use Which
| SQL | Cube |
|---|---|
| Row-level detail | Aggregated KPIs |
| Ad-hoc joins, schema discovery | Time series with rollups |
| No Cube model exists | Pre-defined business logic exists |
Creating Dashboards & Reports
Use create_doc with YAML content. Call get_doc_schema for the full specification.
Minimal Dashboard
version: 1
schemaVersion: "2025-01"
kind: dashboard
metadata:
  name: "Sales Dashboard"
  description: "Revenue by region"
layout:
  columns: 36
  tiles:
    - id: revenue_chart
      x: 0
      y: 0
      w: 18
      h: 10
      type: chart
      title: "Revenue by Region"
      source: revenue_data
    - id: total_kpi
      x: 18
      y: 0
      w: 9
      h: 6
      type: kpi
      source: total_revenue
datasets:
  revenue_data:
    engine: cube
    cube: sales
    measures:
      - sales.revenue
    dimensions:
      - sales.region
    viz:
      type: bar
  total_revenue:
    engine: sql
    sql: |
      SELECT SUM(amount) as total FROM LAKE.PUBLIC.sales
    viz:
      type: kpi
Document Kinds
| Kind | Purpose |
|---|---|
| dashboard | Charts, tables, KPIs |
| report | Report-focused layout |
| document | Rich text with embedded data |
| pipeline | Code execution + scheduling (layout optional) |
Grid Layout (36-column system)
| Width | Columns | Use |
|---|---|---|
| Full | w: 36 | Headers, wide charts |
| Half | w: 18 | Side-by-side charts |
| Third | w: 12 | Three-column layout |
| Quarter | w: 9 | KPI row (4 across) |
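The grid arithmetic above is easy to compute when generating tiles. A sketch, assuming the 36-column system described here (kpi_row is an illustrative helper, not part of the schema):

```python
def kpi_row(ids, y=0, width=9, height=6, columns=36):
    """Lay out KPI tiles left-to-right across the 36-column grid, wrapping to new rows."""
    per_row = columns // width  # e.g. 36 // 9 = 4 KPIs across
    tiles = []
    for i, tile_id in enumerate(ids):
        tiles.append({
            "id": tile_id,
            "x": (i % per_row) * width,
            "y": y + (i // per_row) * height,
            "w": width,
            "h": height,
            "type": "kpi",
        })
    return tiles

row = kpi_row(["mrr", "users", "churn", "nrr"])
# x positions: 0, 9, 18, 27
```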
Dataset Engines
| Engine | Use When |
|---|---|
| Cube | Semantic model exists; aggregations, time series, business KPIs |
| SQL | Custom queries, joins, schema discovery, no Cube model |
| Python | Transformations, API calls, combining datasets, ML |
Viz Types
line, bar, area, pie, scatter, heatmap, gauge, kpi, map, funnel, sankey, table, code
Data Apps
For a complete guide to data apps including DuckDB WASM, Perspective.js, theming, and caching, see Data Apps. This section covers programmatic creation via the MCP server.
For custom interactive UI beyond YAML tiles, create a data app: a single-file HTML application stored in Drive and rendered inside a dashboard iframe.
When to Use Data Apps vs YAML Dashboards
| YAML Dashboard | Data App |
|---|---|
| Standard charts, tables, KPIs | Custom interactive UI (expandable rows, conditional formatting) |
| Multiple tiles in a grid layout | Single full-screen experience |
| Cube/SQL datasets with built-in viz | Custom JavaScript visualizations (D3, Chart.js, ECharts) |
Data Bridge API
The frontend auto-injects a window.Definite object into every data app iframe. You do not need to define it.
SQL queries return an array of row objects:
const rows = await Definite.query("SELECT region, SUM(revenue) as rev FROM LAKE.PUBLIC.sales GROUP BY 1 LIMIT 1000")
// [{ region: "North America", rev: 50000 }, { region: "Europe", rev: 32000 }, ...]
Cube queries also return an array of row objects:
const rows = await Definite.cubeQuery({
  measures: ["sales.revenue"],
  dimensions: ["sales.region"],
  timeDimensions: [{ dimension: "sales.order_date", granularity: "month", dateRange: "last 90 days" }],
  limit: 1000
})
Alternatively, query data upfront via run_sql_query or run_cube_query and embed it as const DATA = [...] for static snapshots.
Creating a Data App
Step 1: Write an HTML file. Use Tailwind CDN and Inter font for consistency with Definite. Include a loading state and error handling since bridge queries are async.
<script src="https://cdn.tailwindcss.com"></script>
<body class="bg-gray-50 p-6">
  <div id="app"><div id="loading">Loading...</div></div>
  <script>
    async function init() {
      try {
        const rows = await Definite.query("SELECT region, SUM(revenue) as rev FROM LAKE.PUBLIC.sales GROUP BY 1 LIMIT 1000")
        document.getElementById("loading").remove()
        // Build your UI with rows
      } catch (err) {
        document.getElementById("loading").textContent = "Error: " + err.message
      }
    }
    init()
  </script>
</body>
Step 2: Upload to Drive using get_drive_upload_url(file_name="my-app.html", folder="apps"), then curl -X PUT the signed URL. Same pattern as pipeline script uploads.
Step 3: Create a doc with a fullScreen HTML tile. No datasets needed since the app fetches its own data.
version: 1
schemaVersion: "2025-01"
kind: dashboard
metadata:
  name: "Sales Dashboard"
datasets: {}
layout:
  columns: 36
  tiles:
    - id: app
      x: 0
      y: 0
      w: 36
      h: 20
      type: html
      fullScreen: true
      driveFile: "apps/my-app.html"
fullScreen: true removes all padding, borders, and tile headers so the HTML fills the entire page edge-to-edge.
Python Automation Pipelines
The full workflow: write a Python script, upload to Drive, create a pipeline doc, schedule it.
Step 1: Write Your Script
Use uv inline script metadata for dependencies:
#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os
client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
result = sql.execute("SELECT * FROM LAKE.PUBLIC.users WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000")
rows = result.get("data", [])
print(f"Found {len(rows)} new users")
Both DEFINITE_API_KEY and DEFINITE_API_BASE_URL are auto-injected when the script runs on Definite.
Step 2: Upload to Drive
Use the get_drive_upload_url MCP tool, then upload with curl:
# 1. Call MCP tool: get_drive_upload_url(file_name="my_sync.py", folder="pipelines")
# 2. Upload using the returned signed URL:
curl -X PUT -T ./my_sync.py "SIGNED_UPLOAD_URL"
The file is now accessible at /home/user/drive/pipelines/my_sync.py when running on Definite.
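The curl PUT can equally be done from Python's standard library. A sketch — the signed URL below is a placeholder, and build_upload_request is an illustrative helper:

```python
import tempfile
import urllib.request
from pathlib import Path

def build_upload_request(file_path, signed_url):
    """Build the same PUT request that `curl -X PUT -T file URL` would send."""
    data = Path(file_path).read_bytes()
    return urllib.request.Request(signed_url, data=data, method="PUT")

# Demo with a throwaway file; pass the request to urllib.request.urlopen() to actually send it.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write("print('hello')\n")
req = build_upload_request(tmp.name, "https://storage.example.com/signed-upload-url")
```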
Step 3: Create Pipeline Doc
Use create_doc with pipeline YAML:
version: 1
schemaVersion: "2025-01"
kind: pipeline
metadata:
  name: "Daily User Sync"
datasets:
  prepare_staging:
    engine: sql
    sql: |
      CREATE OR REPLACE TABLE LAKE.STAGING.daily_snapshot AS
      SELECT * FROM LAKE.PUBLIC.users
      WHERE created_at >= CURRENT_DATE - INTERVAL '1 day'
  run_sync:
    engine: python
    command: "uv run /home/user/drive/pipelines/my_sync.py"
    dependsOn: [prepare_staging]
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 300000
automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"
Key Pipeline Fields
| Field | Description |
|---|---|
| command | Shell command to run (e.g., uv run /home/user/drive/script.py) |
| code | Inline Python (alternative to command for short scripts) |
| dependsOn | Array of dataset IDs that must complete first |
| fiEnvId | Always set to "<_THREAD_ID>" for scheduled jobs. Ensures a persistent sandbox with installed packages and drive access. |
| timeoutMs | Timeout in ms. The default of 6000 (6s) is too short; use 300000 (5 min) to 1800000 (30 min). |
| inputs | Array of dataset IDs available as DataFrame variables in Python |
Automations
automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *" # Daily at 9am UTC
    destination: # Optional: send snapshot
      type: slack
      channel_id: "C1234..."
Cron examples: "0 */4 * * *" (every 4h), "0 9 * * 1" (Mondays 9am), "0 0 1 * *" (1st of month)
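The five cron fields read left to right as minute, hour, day-of-month, month, day-of-week. A small sketch that names them (cron_fields is an illustrative helper, not a scheduler):

```python
def cron_fields(expr):
    """Split a standard 5-field cron expression into named parts."""
    minute, hour, dom, month, dow = expr.split()
    return {"minute": minute, "hour": hour, "day_of_month": dom,
            "month": month, "day_of_week": dow}

fields = cron_fields("0 9 * * 1")  # Mondays at 9am
# → {'minute': '0', 'hour': '9', 'day_of_month': '*', 'month': '*', 'day_of_week': '1'}
```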
Definite SDK
Install with uv add definite-sdk or use inline script dependencies as shown above.
from definite_sdk import DefiniteClient
import os
# In pipelines, env vars are auto-injected
# For local use, set DEFINITE_API_KEY env var or pass directly
client = DefiniteClient(
os.environ["DEFINITE_API_KEY"],
api_url=os.environ.get("DEFINITE_API_BASE_URL", "https://api.definite.app")
)
SDK source: github.com/luabase/definite_sdk
SQL & Cube Queries
sql_client = client.get_sql_client()
result = sql_client.execute("SELECT * FROM LAKE.PUBLIC.users LIMIT 10")
cube_query = {
    "measures": ["sales.revenue"],
    "timeDimensions": [{"dimension": "sales.date", "granularity": "month"}],
    "limit": 1000
}
result = sql_client.execute_cube_query(cube_query, integration_id="YOUR_CUBE_INTEGRATION_ID")
KV Store (Pipeline State)
state = client.get_kv_store('my_pipeline') # Unique name per pipeline
last_sync = state.get('last_sync_at')
state['last_sync_at'] = '2026-01-31T00:00:00'
state.commit() # Persist changes
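The KV store behaves like a dict with an explicit commit, which makes it a natural fit for incremental-sync high-water marks. A stdlib sketch of that pattern — FakeKVStore is a stand-in for illustration, not the SDK class:

```python
from datetime import datetime, timezone

class FakeKVStore(dict):
    """Stand-in mimicking the get / __setitem__ / commit shape of the SDK KV store."""
    def commit(self):
        self.committed = dict(self)  # the real store persists server-side

state = FakeKVStore()
# Incremental-sync pattern: read the high-water mark, do work, then advance it.
last_sync = state.get("last_sync_at", "1970-01-01T00:00:00")
# ... fetch rows where created_at > last_sync ...
state["last_sync_at"] = datetime.now(timezone.utc).isoformat()
state.commit()
```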
Secret Store
secrets = client.secret_store()
secrets.set_secret("external_api_key", "sk-...")
key = secrets.get_secret("external_api_key")
DuckLake (Local DuckDB)
import duckdb
conn = duckdb.connect()
conn.execute(client.attach_ducklake())
df = conn.sql("SELECT * FROM lake.PUBLIC.users LIMIT 10").df()
Integration Store
integrations = list(client.integration_store().list_integrations(integration_type="slack"))
slack_id = integrations[0]["id"]
Slack & Email Messaging
Slack Messages (with Block Kit)
msg = client.get_message_client()
# Simple text message
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Hello from Definite!"
)

# Rich message with Slack Block Kit
msg.send_slack_message(
    integration_id="YOUR_SLACK_INTEGRATION_ID",
    channel_id="C1234567890",
    text="Daily report ready",  # Fallback for notifications
    blocks=[
        {"type": "header", "text": {"type": "plain_text", "text": "Daily Report"}},
        {"type": "section", "text": {"type": "mrkdwn", "text": "*New users today:* 42\n*Revenue:* $12,500"}},
        {"type": "actions", "elements": [
            {
                "type": "button",
                "text": {"type": "plain_text", "text": "View Dashboard"},
                "url": "https://app.definite.app/docs/DOC_ID",
                "style": "primary"
            }
        ]}
    ],
    thread_ts="1234567890.123456"  # Optional: reply to thread
)
Find your Slack integration ID with the list_integrations(integration_type="slack") MCP tool.
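Block Kit payloads are plain dicts, so repetitive reports can be generated from data. A sketch of a hypothetical builder (report_blocks is not an SDK function):

```python
def report_blocks(title, fields, dashboard_url=None):
    """Build a Block Kit blocks list from a dict of metric name -> value."""
    lines = "\n".join(f"*{k}:* {v}" for k, v in fields.items())
    blocks = [
        {"type": "header", "text": {"type": "plain_text", "text": title}},
        {"type": "section", "text": {"type": "mrkdwn", "text": lines}},
    ]
    if dashboard_url:
        blocks.append({"type": "actions", "elements": [{
            "type": "button",
            "text": {"type": "plain_text", "text": "View Dashboard"},
            "url": dashboard_url,
            "style": "primary",
        }]})
    return blocks

blocks = report_blocks("Daily Report", {"New users today": 42, "Revenue": "$12,500"},
                       dashboard_url="https://app.definite.app/docs/DOC_ID")
```

Pass the result as the blocks= argument to send_slack_message.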
Email Messages
msg = client.get_message_client()
msg.send_email_message(
    to_emails=["team@example.com"],
    subject="Daily Report: 42 new users",
    body="<h2>New Users</h2><p>42 users signed up today.</p>"  # HTML supported
)
Alert Workflows
Combine pipelines + SDK messaging for data-driven alerts.
Email Alert (Daily New Users)
#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import html
import os

client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
result = sql.execute(
    "SELECT email, created_at FROM LAKE.PUBLIC.users "
    "WHERE created_at >= CURRENT_DATE - INTERVAL '1 day' LIMIT 2000"
)
rows = result.get("data", [])
if rows:
    # Escape values before embedding them in HTML
    table_rows = "".join(
        f"<tr><td>{html.escape(str(r['email']))}</td><td>{html.escape(str(r['created_at']))}</td></tr>"
        for r in rows
    )
    body = (
        f"<h2>New Users: {len(rows)}</h2>"
        f"<table><tr><th>Email</th><th>Signed Up</th></tr>{table_rows}</table>"
    )
    client.get_message_client().send_email_message(
        to_emails=["team@example.com"],
        subject=f"Daily: {len(rows)} new users",
        body=body,
    )
Threshold Alert (Slack with KV State)
#!/usr/bin/env -S uv run
# /// script
# dependencies = ["definite-sdk", "duckdb"]
# ///
from definite_sdk import DefiniteClient
import os
client = DefiniteClient(os.environ["DEFINITE_API_KEY"], api_url=os.environ["DEFINITE_API_BASE_URL"])
sql = client.get_sql_client()
state = client.get_kv_store('mrr_alert')
result = sql.execute(
"SELECT SUM(amount) as mrr FROM LAKE.PUBLIC.subscriptions WHERE status = 'active' LIMIT 1"
)
current_mrr = float(result["data"][0]["mrr"])
previous_mrr = float(state.get('last_mrr', '0'))
if previous_mrr > 0:
    pct_change = (current_mrr - previous_mrr) / previous_mrr
    if abs(pct_change) >= 0.20:
        direction = "up" if pct_change > 0 else "down"
        client.get_message_client().send_slack_message(
            integration_id="YOUR_SLACK_INTEGRATION_ID",
            channel_id="C1234567890",
            text=f"MRR Alert: {direction} {abs(pct_change):.0%}",
            blocks=[
                {"type": "header", "text": {"type": "plain_text", "text": "MRR Alert"}},
                {
                    "type": "section",
                    "text": {
                        "type": "mrkdwn",
                        "text": f"MRR moved *{direction} {abs(pct_change):.0%}*\nFrom ${previous_mrr:,.0f} to ${current_mrr:,.0f}"
                    }
                },
            ],
        )
state['last_mrr'] = str(current_mrr)
state.commit()
Wrap in Pipeline Doc
Upload either script to Drive, then create a pipeline doc:
version: 1
schemaVersion: "2025-01"
kind: pipeline
metadata:
  name: "MRR Alert"
datasets:
  check_mrr:
    engine: python
    command: "uv run /home/user/drive/alerts/mrr_alert.py"
    fiEnvId: "<_THREAD_ID>"
    timeoutMs: 600000
automations:
  - trigger:
      type: schedule
      config:
        type: cron
        cron: "0 9 * * *"
HTML Output from Python Datasets
To render custom HTML (Plotly, D3, etc.) in a dashboard tile:
import os

with open(os.environ["DEFINITE_RESULT_TYPE_FILE"], "w") as f:
    f.write("html")
with open(os.environ["DEFINITE_RESULT_FILE"], "w") as f:
    f.write(html_content)
Pair with type: html tiles in your dashboard layout.
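The same two-file contract, sketched with local fallbacks so the snippet also runs outside a pipeline (the fallback paths and sample HTML are illustrative; on Definite the env vars are already set):

```python
import os
import tempfile

# DEFINITE_RESULT_TYPE_FILE / DEFINITE_RESULT_FILE are injected when the dataset
# runs on Definite; fall back to throwaway paths for local testing.
type_path = os.environ.get("DEFINITE_RESULT_TYPE_FILE") or tempfile.NamedTemporaryFile(delete=False).name
result_path = os.environ.get("DEFINITE_RESULT_FILE") or tempfile.NamedTemporaryFile(delete=False).name

html_content = "<div style='font: 16px Inter, sans-serif'>Revenue: <b>$12,500</b></div>"

with open(type_path, "w") as f:
    f.write("html")  # tell Definite the result is HTML, not tabular data
with open(result_path, "w") as f:
    f.write(html_content)
```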
CLAUDE.md for Your Project
For the best experience with Claude Code, copy the Definite CLAUDE.md template into your project root alongside your .mcp.json. This gives Claude full context on all Definite MCP tools and patterns.