Definite ships an on-prem distribution you can run in your own AWS, GCP, or Azure account. You get the full platform (data lake, semantic layer, dashboards, Fi) inside your VPC, with your data never leaving your network. The install has three pieces:
- Cloud infrastructure: a Kubernetes cluster, a Postgres database, and an object store. For AWS we publish a Terraform module that provisions these with sane defaults. For GCP and Azure, follow the per-cloud guide below; a Terraform module for each is on the roadmap.
- The `definite` CLI: a single binary that installs the platform onto the cluster from a `config.yaml` you fill in with values from your infra.
- Day-2 operations: upgrades, maintenance, license, and logs, all driven through the same CLI.
Install guides
AWS (EKS + RDS + S3)
Provision with Terraform, install with the `definite` CLI.
GCP (GKE + Cloud SQL + GCS)
Provision with `gcloud`, install with the `definite` CLI.
Azure (AKS + Postgres + Blob)
Preliminary guide; reach out if you want to be a design partner.
Authentication
Google SSO
Sign in with Google Workspace via OIDC. Works on any cloud.
Local auth
Email + password (default). The first admin is bootstrapped from your `config.yaml`.
What you’ll need before you start
| Category | What |
|---|---|
| Cloud account | An account with permission to create a Kubernetes cluster, a managed Postgres, an object store bucket, and IAM resources |
| Local tooling | kubectl 1.28+, helm 3.12+, and your cloud’s CLI (aws / gcloud / az). AWS adds terraform 1.5+ |
| LLM credentials | Anthropic, Bedrock (AWS), Vertex (GCP), or Azure OpenAI |
| Optional | A Google Workspace account if you want Sign in with Google instead of local auth |
The `definite doctor` command verifies all of these against your `config.yaml` before deploying, so you don’t have to keep this list in your head.
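If you want to check your local tooling before you have a `config.yaml` to point the CLI at, the version minimums from the table can be verified with a short script. The following is a minimal Python sketch, not part of the Definite distribution: the helper names (`parse_version`, `meets_minimum`, `check_tool`) are hypothetical, and it only covers `kubectl`, `helm`, and `terraform` using each tool's standard version subcommand. It compares versions as integer tuples so that, for example, helm 3.9 is correctly treated as older than 3.12.

```python
import re
import shutil
import subprocess

# Minimum (major, minor) versions from the prerequisites table.
# terraform is only required for the AWS install path.
MINIMUMS = {
    "kubectl": (1, 28),
    "helm": (3, 12),
    "terraform": (1, 5),
}

# Standard version subcommands for each tool.
VERSION_COMMANDS = {
    "kubectl": ["kubectl", "version", "--client"],
    "helm": ["helm", "version", "--short"],
    "terraform": ["terraform", "version"],
}


def parse_version(text):
    """Return (major, minor) from the first 'X.Y' found in text, or None."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_minimum(text, minimum):
    """True if the version found in text is at least the (major, minor) minimum."""
    found = parse_version(text)
    return found is not None and found >= minimum


def check_tool(name):
    """Run the tool's version command and report whether it meets MINIMUMS."""
    if shutil.which(name) is None:
        return f"{name}: not found on PATH"
    result = subprocess.run(VERSION_COMMANDS[name], capture_output=True, text=True)
    need = ".".join(map(str, MINIMUMS[name]))
    if meets_minimum(result.stdout, MINIMUMS[name]):
        return f"{name}: ok"
    return f"{name}: needs {need}+"


if __name__ == "__main__":
    for tool in MINIMUMS:
        print(check_tool(tool))
```

This covers only the local tooling row; cloud permissions and LLM credentials still need to be checked against your actual account.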
Architecture
An on-prem Definite deployment is one Kubernetes namespace with these workloads:
- API: FastAPI service that all clients talk to.
- Frontend: the Definite web UI.
- Lakehouse: a single StatefulSet running our DuckLake query engine. Catalog file lives on a `ReadWriteOnce` PV; bulk data lives in your object store.
- Job runner: executes scheduled syncs, Python pipelines, and webhook-triggered docs.
- Fi sandboxes: ephemeral pods spun up per agent run via `agent-sandbox` CRDs.
The deployment also depends on:
- Postgres 15+ for the application database (users, docs, automations).
- An object store (S3, GCS, Azure Blob, or MinIO) for lakehouse Parquet files.
- An LLM provider for Fi.

