This guide walks you through a production-grade on-prem install of Definite on AWS. You’ll provision the infrastructure with Terraform, install the platform with the definite CLI, and tear it down cleanly when you’re done.
End to end: about 30 to 40 minutes, most of which is terraform apply waiting on EKS and RDS.
What gets created
| Resource | Notes |
|---|---|
| VPC | Private subnets for nodes and the database, public subnets for NAT and load balancers |
| EKS cluster | Kubernetes 1.30 (configurable), managed node group, OIDC provider enabled for IRSA |
| RDS Postgres 15 | Private (not publicly accessible), reachable only from the cluster security group |
| S3 bucket | Lakehouse data; public access blocked, encryption and versioning on |
| IAM IRSA roles | Least-privilege roles for the lakehouse and for Fi’s Bedrock access |
| IAM user + access key | Static-credential fallback for the lakehouse (see IRSA vs access key) |
| gp3 StorageClass | Cluster-default StorageClass the lakehouse PVC needs |
Every name, instance size, CIDR, and count is a Terraform variable with a sane default. Override only what you need in terraform.tfvars.
Prerequisites
AWS account access
AWS credentials in your shell with permission to create VPC, EKS, RDS, S3, and IAM resources. The Terraform provider reads the standard AWS credential chain (env vars, shared config, SSO, instance profile). No secrets live in the Terraform code.
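Before running Terraform, it can help to confirm which identity the credential chain actually resolves to. The helper below is a minimal sketch; the function name check_aws_identity is hypothetical, and the underlying call is the standard aws sts get-caller-identity command.

```shell
# Print the ARN of the identity the AWS credential chain resolves to.
# check_aws_identity is a hypothetical helper name, not part of any CLI.
check_aws_identity() {
  aws sts get-caller-identity --query Arn --output text
}

# Usage:
#   check_aws_identity
```

If this prints an unexpected user or role, fix the shell environment before provisioning anything.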
Local tooling
Install the following on the machine you’ll run terraform and definite from:
| Tool | Version | Check |
|---|---|---|
| terraform | 1.5+ | terraform version |
| aws CLI | recent | aws --version |
| kubectl | 1.28+ | kubectl version --client |
| helm | 3.12+ | helm version |
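A quick way to see whether anything from the table is missing is to check each tool against PATH. This sketch only tests for presence, not for the minimum versions listed above; the function name missing_tools is hypothetical.

```shell
# Print the name of every tool that is not found on PATH.
# Presence only: version minimums still need the "Check" commands above.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage:
#   missing_tools terraform aws kubectl helm
```

Empty output means all four tools were found.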
LLM access
Decide which LLM provider Fi will use. Bedrock is the most common choice for AWS deployments; this guide uses it as the default. Anthropic, Vertex, and Azure OpenAI are also supported. If you go with Bedrock, make sure you’ve requested model access for Claude in the target region.
Phase 1: Provision AWS infrastructure with Terraform
Clone the on-prem repo and change into the AWS Terraform module.
1. Configure inputs
Copy the example tfvars file and edit it. The resulting terraform.tfvars can be just two lines.
2. Init, plan, apply
terraform apply takes 20 to 25 minutes (mostly EKS and RDS). When it finishes, every value you need is in terraform output.
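The init, plan, apply sequence can be sketched as below. It assumes the current directory is the AWS Terraform module; the plan file name tfplan and the function name provision_infra are assumptions made for illustration.

```shell
# Run the standard Terraform sequence, stopping at the first failure.
# Saving the plan to a file means apply executes exactly what was reviewed.
provision_infra() {
  terraform init || return 1
  terraform plan -out=tfplan || return 1
  terraform apply tfplan
}

# Usage (from the Terraform module directory):
#   provision_infra
```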
3. Read the outputs
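Inspect the non-sensitive outputs with terraform output. Sensitive outputs are masked there; read them one at a time with -raw.
The outputs and where they go in config.yaml are summarized here: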
| Terraform output | Goes into |
|---|---|
| cluster_name | aws eks update-kubeconfig --name ... |
| region | object_store.region, llm.region |
| postgres_url (sensitive) | postgres.url |
| rds_password (sensitive) | POSTGRES_PASSWORD env var |
| lakehouse_bucket_name | object_store.bucket |
| lakehouse_prefix | lakehouse.prefix |
| lakehouse_s3_access_key_id | S3_ACCESS_KEY_ID env var |
| lakehouse_s3_secret_access_key (sensitive) | S3_SECRET_ACCESS_KEY env var |
| bedrock_irsa_role_arn | serviceAccount.annotations (Bedrock path) |
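Reading individual outputs from the table can be sketched with a small wrapper around terraform output -raw. The wrapper name tf_out is hypothetical; the output names are the ones listed in the table.

```shell
# Read one Terraform output as a bare string.
# -raw drops the quoting and also reveals values marked sensitive,
# so avoid echoing the result into shared logs.
tf_out() {
  terraform output -raw "$1"
}

# Usage (from the Terraform module directory):
#   terraform output              # lists outputs; sensitive ones are masked
#   tf_out lakehouse_bucket_name
#   tf_out postgres_url
```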
config.yaml snippet:
IRSA vs access key
The module emits both an IRSA role and an IAM user + access key for lakehouse S3 access. Both share one identical least-privilege policy.
| Option | When to use |
|---|---|
| Access key (lakehouse_s3_access_key_id, lakehouse_s3_secret_access_key) | Use this today. The lakehouse reads S3 through DuckDB’s httpfs extension, which speaks the S3 API with static credentials, not the AWS SDK, so it can’t assume an IRSA role yet. |
| IRSA role (lakehouse_irsa_role_arn) | The end state. Once the lakehouse gains SDK-based S3 support, annotate the ServiceAccount with this role’s ARN and delete the IAM user. No infra change needed. |
Phase 2: Install Definite with the definite CLI
1. Install the CLI
The installer places definite on your PATH (default: $HOME/.local/bin). Binaries are published for macOS and Linux (arm64 and x86_64). The release workflow uploads them using short-lived Workload Identity Federation credentials (no long-lived keys).
To pin a version, set DEFINITE_VERSION before piping to sh:
2. Point kubectl at the new cluster
Confirm that the cluster nodes are in the Ready state before continuing.
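One way to wire this step to the Terraform outputs is sketched below. It assumes the current directory is the Terraform module and uses the cluster_name and region outputs; the function name connect_kubectl is hypothetical.

```shell
# Write a kubeconfig entry for the new cluster, then list its nodes.
# Cluster name and region come from the Terraform outputs.
connect_kubectl() {
  cluster="$(terraform output -raw cluster_name)" || return 1
  region="$(terraform output -raw region)" || return 1
  aws eks update-kubeconfig --name "$cluster" --region "$region" || return 1
  kubectl get nodes
}

# Usage (from the Terraform module directory):
#   connect_kubectl
```

The STATUS column of kubectl get nodes shows whether each node is Ready.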
3. Bootstrap cluster prerequisites
definite bootstrap installs the cluster-level pieces that definite init assumes are already present:
| Prerequisite | What it provides |
|---|---|
| Ingress controller | HTTP/S routing for the deployment’s Ingress resource |
| cert-manager (+ CRDs) | TLS certificate issuance for tls: cert_manager |
| letsencrypt-prod ClusterIssuer | The issuer the ingress references for automatic Let’s Encrypt certs |
| agent-sandbox CRDs | Custom resources the Fi runtime uses to dispatch per-run sandboxes |
--dry-run prints what it would install without touching the cluster. The command is idempotent; safe to re-run.
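A cautious way to run this step is to preview first, install, and then spot-check two of the prerequisites. This is a sketch: the function name bootstrap_cluster is hypothetical, and the ingress-nginx namespace is taken from the troubleshooting table rather than from the bootstrap command itself.

```shell
# Preview, install, then spot-check the cluster prerequisites.
bootstrap_cluster() {
  definite bootstrap --dry-run || return 1      # preview only, no changes
  definite bootstrap || return 1                # idempotent, safe to re-run
  kubectl get pods -n ingress-nginx || return 1 # assumed controller namespace
  kubectl get clusterissuer letsencrypt-prod
}

# Usage:
#   bootstrap_cluster
```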
4. Discover the load balancer hostname
The ingress controller provisions an AWS Network Load Balancer. Wait for it to land, then grab its hostname, which has the form <random-id>.elb.<your-region>.amazonaws.com.
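As a sketch of this step, the helpers below read the hostname from the Services in the ingress controller namespace and, for the nip.io quick-install option, turn a resolved IP into the dashed host form. The ingress-nginx namespace is taken from the troubleshooting table; the jsonpath query and both function names are assumptions.

```shell
# Print the hostname of any LoadBalancer Service in the ingress-nginx
# namespace. Output is empty until AWS has finished provisioning the NLB.
lb_hostname() {
  kubectl get svc -n ingress-nginx \
    -o jsonpath='{.items[*].status.loadBalancer.ingress[*].hostname}'
}

# Turn an IPv4 address into the <ip-with-dashes>.nip.io form.
nip_host() {
  printf '%s.nip.io\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Usage:
#   lb_hostname
#   dig +short "$(lb_hostname)"   # an NLB may return several IPs
#   nip_host 203.0.113.10
```

Any one of the returned IPs can be passed to nip_host, though load balancer IPs are not guaranteed to stay fixed, which is one reason this option suits demos rather than production.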
You now have two choices for what to set as deployment.hostname in your config.yaml:
- Quick install (nip.io)
- Production (real CNAME)
For demos and internal pilots, resolve the LB hostname to an IP and use a <ip-with-dashes>.nip.io host. No DNS configuration is needed: nip.io wildcards every <ip>.nip.io host to that IP, so Let’s Encrypt issues a real cert with no extra setup.
5. Build config.yaml
Start from the EKS example in the repo: examples/minimal-eks.yaml. The shape:
6. Export secrets
config.yaml references env vars for every secret. Export them from Terraform outputs:
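The export step can be sketched as follows, using the three env vars named in the outputs table. It assumes the current directory is the Terraform module; the function name export_definite_secrets is hypothetical, and your config.yaml may reference additional secrets (for example an LLM API key) that are not covered here.

```shell
# Export the secrets that the outputs table maps to env vars.
# Each read aborts the function if terraform fails, so a half-populated
# environment is not exported silently.
export_definite_secrets() {
  POSTGRES_PASSWORD="$(terraform output -raw rds_password)" || return 1
  S3_ACCESS_KEY_ID="$(terraform output -raw lakehouse_s3_access_key_id)" || return 1
  S3_SECRET_ACCESS_KEY="$(terraform output -raw lakehouse_s3_secret_access_key)" || return 1
  export POSTGRES_PASSWORD S3_ACCESS_KEY_ID S3_SECRET_ACCESS_KEY
}

# Usage (from the Terraform module directory, in the shell that will
# later run definite):
#   export_definite_secrets
```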
7. Preflight with definite doctor
doctor runs a battery of preflight checks: it connects to Postgres and runs SELECT version(), validates the Kubernetes context, checks object-store config shape, and (for Anthropic) pings the LLM API. Fix anything it flags before moving on.
8. Deploy with definite init
init re-runs preflight, renders the bundled Helm chart with your values, and runs helm upgrade --install. It waits for pods to reach Ready by default. Useful flags:
| Flag | Purpose |
|---|---|
| --dry-run | Render values, don’t apply |
| --wait=false | Return as soon as Helm finishes; don’t wait for pods |
| --skip-preflight | Skip doctor (not recommended) |
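Steps 7 and 8 can be chained so that a failed preflight or a failed render stops the deploy. This is a sketch using only the subcommands and flags named above; any option for pointing the CLI at config.yaml is not shown in this guide and is omitted here, and the function name deploy_definite is hypothetical.

```shell
# Preflight, render without applying, then deploy.
# Stops at the first step that fails.
deploy_definite() {
  definite doctor || return 1
  definite init --dry-run || return 1
  definite init
}

# Usage:
#   deploy_definite
```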
Once the pods are Ready and cert-manager has issued a cert, open your hostname in a browser and log in.
Want Google SSO instead of local auth? See the Google SSO guide. You’ll add an auth.oidc block to your config.yaml and re-run definite upgrade; there is no need to redo the install.
Day-2 operations
The same CLI handles upgrades, logs, license, and lakehouse maintenance. A few of the most common commands:
Phase 3: Teardown
When you’re done, tear down the cluster and supporting infra with Terraform. Two variables guard against accidental data loss:
| Variable | Default | Effect |
|---|---|---|
| rds_deletion_protection | false (set to true for production) | When true, RDS refuses to be deleted; flip it to false first |
| lakehouse_force_destroy | false | Non-empty buckets won’t be deleted; flip it to true if you really mean to remove the bucket and its contents |
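A full teardown that also removes the database and the bucket contents can be sketched as below. The apply runs first because both settings have to be recorded on the live resources and in state before destroy can act on them. The function name teardown_all is hypothetical, and the -var flags take precedence over terraform.tfvars. This deletes lakehouse data irreversibly.

```shell
# Lift both deletion guards, then destroy everything.
# Step 1 records the relaxed settings; step 2 performs the teardown.
teardown_all() {
  terraform apply \
    -var 'rds_deletion_protection=false' \
    -var 'lakehouse_force_destroy=true' || return 1
  terraform destroy \
    -var 'rds_deletion_protection=false' \
    -var 'lakehouse_force_destroy=true'
}

# Usage (from the Terraform module directory):
#   teardown_all
```

Both commands prompt for confirmation, so there are two chances to back out.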
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| terraform apply hangs on EKS | Cluster takes 15-20 min to provision; this is normal | Wait |
| Lakehouse pod stuck Pending on PVC | No default StorageClass | Confirm gp3 StorageClass is present: kubectl get storageclass. The module creates it by default; set create_gp3_storage_class = true if you disabled it |
| definite doctor Postgres check fails | RDS security group only allows the cluster security group | This is by design. Run doctor from a pod in the cluster, or temporarily allowlist your IP on the RDS security group |
| LB hostname never appears | Ingress controller isn’t running, or the AWS Load Balancer Controller fights with ingress-nginx | kubectl get pods -n ingress-nginx; if you installed both controllers, pick one |
| Cert never issues | letsencrypt-prod ClusterIssuer missing, or DNS doesn’t resolve to the LB | kubectl describe certificate -n definite shows the cert-manager error |
| Fi can’t reach Bedrock | The Bedrock IRSA ServiceAccount isn’t annotated, or the model isn’t enabled in the region | Confirm serviceAccount.annotations in config.yaml, and request model access in the AWS Bedrock console |
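The kubectl checks named in the table can be gathered into one pass when something looks wrong. This sketch runs every check even if an earlier one fails; the function name collect_diagnostics is hypothetical.

```shell
# Run the kubectl checks from the troubleshooting table in one go.
# No early exit: each command runs regardless of the previous result.
collect_diagnostics() {
  kubectl get storageclass
  kubectl get pods -n ingress-nginx
  kubectl describe certificate -n definite
}

# Usage:
#   collect_diagnostics
```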
Next steps
- Connect a data source to start ingesting data.
- Set up the MCP server so Claude, Cursor, or Windsurf can query your deployment.
- For deeper config (custom container registry, sandbox network policies, OIDC tuning), see the config reference.

