Install on AWS

This guide walks you through a production-grade on-prem install of Definite on AWS. You’ll provision the infrastructure with Terraform, install the platform with the definite CLI, and tear it down cleanly when you’re done. End to end: about 30 to 40 minutes, most of which is terraform apply waiting on EKS and RDS.

What gets created

Resource	Notes
VPC	Private subnets for nodes and the database, public subnets for NAT and load balancers
EKS cluster	Kubernetes 1.30 (configurable), managed node group, OIDC provider enabled for IRSA
RDS Postgres 15	Private (not publicly accessible), reachable only from the cluster security group
S3 bucket	Lakehouse data; public access blocked, encryption and versioning on
IAM IRSA roles	Least-privilege roles for the lakehouse and for Fi’s Bedrock access
IAM user + access key	Static-credential fallback for the lakehouse (see IRSA vs access key)
`gp3` StorageClass	Cluster-default StorageClass the lakehouse PVC needs

Every name, instance size, CIDR, and count is a Terraform variable with a sane default. Override only what you need to in terraform.tfvars.

Prerequisites

AWS account access

You need AWS credentials in your shell with permission to create VPC, EKS, RDS, S3, and IAM resources. The Terraform provider reads the standard AWS credential chain (env vars, shared config, SSO, instance profile). No secrets live in the Terraform code.

Local tooling

Install the following on the machine you’ll run terraform and definite from:

Tool	Version	Check
`terraform`	1.5+	`terraform version`
`aws` CLI	recent	`aws --version`
`kubectl`	1.28+	`kubectl version --client`
`helm`	3.12+	`helm version`
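
The minimum versions in the table can be checked mechanically. The following is a minimal sketch, not part of the Definite tooling: `version_ge` and `require_tool` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent macOS releases do).

```shell
#!/bin/sh
# Hypothetical helpers for checking local tooling; not part of the definite CLI.

# version_ge HAVE WANT: succeed when version HAVE is at least WANT.
# A leading "v" (as in "v1.5.7") is stripped before comparing.
version_ge() (
  have=${1#v}
  want=${2#v}
  # With version sort, the smaller of the two versions comes out first.
  lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
  [ "$lowest" = "$want" ]
)

# require_tool NAME: succeed when NAME is on PATH, otherwise print a hint.
require_tool() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing tool: $1" >&2
    return 1
  }
}
```

For example, `require_tool terraform && version_ge "$(terraform version | head -n 1 | awk '{print $2}')" 1.5` should succeed on a machine that meets the Terraform requirement, since the first line of `terraform version` has the form `Terraform v1.5.7`.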

LLM access

Decide which LLM provider Fi will use. Bedrock is the most common choice for AWS deployments; this guide uses it as the default. Anthropic, Vertex, and Azure OpenAI are also supported. If you go with Bedrock, make sure you’ve requested model access for Claude in the target region.

(Optional) Remote Terraform state

For team use, configure an S3 backend in versions.tf before applying. A backend stub is committed in the module. Single-operator installs can skip this and use local state.
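
One way to keep the state bucket details out of versions.tf is Terraform's partial backend configuration, where the values are passed to `terraform init` on the command line. This is a sketch under assumptions: it presumes the committed backend stub is an S3 backend that leaves `bucket`, `key`, and `region` unset, and `init_remote_state` is a hypothetical helper name. The bucket must already exist.

```shell
#!/bin/sh
# Hypothetical helper; assumes versions.tf declares an S3 backend whose
# bucket, key, and region are left for the command line to supply.

# init_remote_state BUCKET KEY REGION: run `terraform init` with a partial
# S3 backend configuration.
init_remote_state() {
  terraform init \
    -backend-config="bucket=$1" \
    -backend-config="key=$2" \
    -backend-config="region=$3"
}
```

Usage, with placeholder values you would replace: `init_remote_state <state-bucket> definite/terraform.tfstate <your-region>`.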

Phase 1: Provision AWS infrastructure with Terraform

Clone the on-prem repo and change into the AWS Terraform module:

git clone https://github.com/definite-app/definite-onprem
cd definite-onprem/deploy/terraform/aws

1. Configure inputs

Copy the example tfvars file and edit it:

cp terraform.tfvars.example terraform.tfvars
$EDITOR terraform.tfvars

A minimal terraform.tfvars is just two lines:

region      = "<your-region>"
name_prefix = "<your-name-prefix>"

Everything else has a default. Common overrides:

# Lock the EKS public API endpoint to your office or VPN CIDRs.
cluster_endpoint_public_access_cidrs = ["<your-cidr>/24"]

# Production hardening.
rds_multi_az            = true
rds_deletion_protection = true

# Bedrock model access (must be enabled in the AWS console first).
bedrock_model_ids = ["anthropic.claude-sonnet-4-20250514-v1:0"]

See the module’s README for the full list of variables.

2. Init, plan, apply

terraform init
terraform plan
terraform apply

terraform apply takes 20 to 25 minutes (mostly EKS and RDS). When it finishes, every value you need is in terraform output.

3. Read the outputs

Inspect the non-sensitive outputs:

terraform output

For sensitive values, read them explicitly with -raw:

terraform output -raw rds_password
terraform output -raw lakehouse_s3_secret_access_key
terraform output -raw postgres_url

The values you’ll feed into config.yaml are summarized here:

Terraform output	Goes into
`cluster_name`	`aws eks update-kubeconfig --name ...`
`region`	`object_store.region`, `llm.region`
`postgres_url` (sensitive)	`postgres.url`
`rds_password` (sensitive)	`POSTGRES_PASSWORD` env var
`lakehouse_bucket_name`	`object_store.bucket`
`lakehouse_prefix`	`lakehouse.prefix`
`lakehouse_s3_access_key_id`	`S3_ACCESS_KEY_ID` env var
`lakehouse_s3_secret_access_key` (sensitive)	`S3_SECRET_ACCESS_KEY` env var
`bedrock_irsa_role_arn`	`serviceAccount.annotations` (Bedrock path)

The module also emits a ready-to-paste, non-secret config.yaml snippet:

terraform output -raw config_yaml_fragment
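
If you would rather save an output to a file than copy it from the terminal, a small wrapper can write it with owner-only permissions and avoid leaving a half-written file behind when the output does not exist. This is a sketch; `save_tf_output` is a hypothetical helper name, not something the module ships.

```shell
#!/bin/sh
# Hypothetical helper; run it from the Terraform module directory.

# save_tf_output NAME FILE: write one Terraform output to FILE, readable only
# by the current user. On failure, FILE is not created or modified.
save_tf_output() (
  umask 077                      # new files get mode 600
  tmp="$2.tmp.$$"
  if terraform output -raw "$1" >"$tmp"; then
    mv "$tmp" "$2"
  else
    rm -f "$tmp"
    exit 1
  fi
)
```

For example, `save_tf_output config_yaml_fragment config.fragment.yaml` writes the snippet to a file you can merge into config.yaml. The same helper works for sensitive outputs, which is why it restricts permissions.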

IRSA vs access key

The module emits both an IRSA role and an IAM user + access key for lakehouse S3 access. Both share one identical least-privilege policy.

Option	When to use
Access key (`lakehouse_s3_access_key_id`, `lakehouse_s3_secret_access_key`)	Use this today. The lakehouse reads S3 through DuckDB’s `httpfs` extension, which speaks the S3 API with static credentials, not the AWS SDK, so it can’t assume an IRSA role yet.
IRSA role (`lakehouse_irsa_role_arn`)	The end state. Once the lakehouse gains SDK-based S3 support, annotate the ServiceAccount with this role’s ARN and delete the IAM user. No infra change needed.

Fi’s Bedrock access uses IRSA today (no static credentials).

Phase 2: Install Definite with the `definite` CLI

1. Install the CLI

curl -fsSL https://storage.googleapis.com/definite-public/definite-onprem/install.sh | sh

The install script detects your OS and architecture, downloads the matching prebuilt binary from a public Google Cloud Storage bucket, verifies its SHA256 checksum, and places definite on your PATH (default: $HOME/.local/bin). Binaries are published for macOS and Linux (arm64 and x86_64), and are uploaded by the release workflow using short-lived Workload Identity Federation credentials (no long-lived keys). To pin a version, set DEFINITE_VERSION before piping to sh:

curl -fsSL https://storage.googleapis.com/definite-public/definite-onprem/install.sh \
  | DEFINITE_VERSION=v0.1.0 sh

Verify:

definite version

2. Point `kubectl` at the new cluster

aws eks update-kubeconfig \
  --region "$(terraform output -raw region)" \
  --name   "$(terraform output -raw cluster_name)"

kubectl get nodes

You should see your managed node group’s nodes in the Ready state.
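
To check readiness in a script rather than by eye, you can count the nodes whose STATUS column is anything other than `Ready`. A minimal sketch; `count_not_ready` is a hypothetical helper that only parses the text `kubectl` prints.

```shell
#!/bin/sh
# Hypothetical helper for scripting the readiness check.

# count_not_ready: read `kubectl get nodes --no-headers` output on stdin and
# print how many nodes have a STATUS (second column) other than exactly "Ready".
# A cordoned node ("Ready,SchedulingDisabled") is counted as not ready.
count_not_ready() {
  awk '$2 != "Ready" { n++ } END { print n + 0 }'
}
```

Usage: `kubectl get nodes --no-headers | count_not_ready` prints `0` once every node is ready.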

3. Bootstrap cluster prerequisites

definite bootstrap installs the cluster-level pieces that definite init assumes are already present:

Prerequisite	What it provides
Ingress controller	HTTP/S routing for the deployment’s `Ingress` resource
cert-manager (+ CRDs)	TLS certificate issuance for `tls: cert_manager`
`letsencrypt-prod` ClusterIssuer	The issuer the ingress references for automatic Let’s Encrypt certs
agent-sandbox CRDs	Custom resources the Fi runtime uses to dispatch per-run sandboxes

Run it once against the fresh cluster:

definite bootstrap --acme-email you@yourcompany.com

--dry-run prints what it would install without touching the cluster. The command is idempotent; safe to re-run.

4. Discover the load balancer hostname

The ingress controller provisions an AWS Network Load Balancer. Wait for it to land, then grab its hostname:

kubectl get svc ingress-nginx-controller -n ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
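
The hostname field stays empty until AWS finishes provisioning the load balancer, so the command above can print nothing for the first minute or two. A polling loop such as the following sketch can wait for it; `wait_for_lb_hostname` and its default of 30 tries at 10-second intervals are illustrative choices, not values taken from the Definite tooling.

```shell
#!/bin/sh
# Hypothetical helper; the retry count and delay are arbitrary defaults.

# wait_for_lb_hostname [TRIES] [DELAY_SECONDS]: poll the ingress-nginx Service
# until a load balancer hostname is assigned, then print it.
wait_for_lb_hostname() (
  tries=${1:-30}
  delay=${2:-10}
  while [ "$tries" -gt 0 ]; do
    host=$(kubectl get svc ingress-nginx-controller -n ingress-nginx \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null) || host=""
    if [ -n "$host" ]; then
      printf '%s\n' "$host"
      exit 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "load balancer hostname not assigned yet" >&2
  exit 1
)
```

Usage: `LB_HOST=$(wait_for_lb_hostname)`.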

You’ll get something like <random-id>.elb.<your-region>.amazonaws.com. You now have two choices for what to set as deployment.hostname in your config.yaml:

Quick install (nip.io)

For demos and internal pilots, resolve the LB hostname to an IP and use a <ip-with-dashes>.nip.io host. No DNS configuration needed:

LB_HOST=$(kubectl get svc ingress-nginx-controller -n ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
LB_IP=$(dig +short "$LB_HOST" | head -n 1)
# Use a name other than HOSTNAME, which bash already sets to the machine's own hostname.
NIP_HOST="${LB_IP//./-}.nip.io"
echo "Use this as deployment.hostname: $NIP_HOST"

nip.io resolves every <ip-with-dashes>.nip.io name to that IP, so Let’s Encrypt issues a real cert with no extra setup.

Some OAuth providers (Google, Slack, HubSpot) reject nip.io redirect URIs. Use a real CNAME for production integrations.

Production (real CNAME)

Create a CNAME record in your DNS provider pointing your chosen hostname (e.g. definite.acme.com) at the LB hostname:

definite.acme.com    CNAME    <lb-hostname>.elb.<your-region>.amazonaws.com

Wait for the record to propagate, then use that hostname in config.yaml. cert-manager will issue a Let’s Encrypt cert against it on first deploy.
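
Propagation can be checked from the shell by comparing the CNAME target your resolver returns with the LB hostname. A minimal sketch using `dig`; `normalize_dns_name` and `cname_points_at` are hypothetical helper names, and the result reflects only the resolver your machine uses.

```shell
#!/bin/sh
# Hypothetical helpers for checking a CNAME record.

# normalize_dns_name NAME: lowercase NAME and drop the trailing dot that
# `dig` prints on fully qualified names.
normalize_dns_name() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# cname_points_at HOST TARGET: succeed when HOST has a CNAME record whose
# target equals TARGET (case-insensitive, trailing dot ignored).
cname_points_at() (
  answer=$(dig +short "$1" CNAME | head -n 1)
  [ -n "$answer" ] || exit 1
  [ "$(normalize_dns_name "$answer")" = "$(normalize_dns_name "$2")" ]
)
```

Usage: `cname_points_at definite.acme.com "$LB_HOST" && echo "CNAME is live"`.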

5. Build `config.yaml`

Start from the EKS example in the repo: examples/minimal-eks.yaml. The shape:

deployment:
  name: definite
  namespace: definite
  hostname: <your-hostname>          # from step 4
  tls: cert_manager

postgres:
  url: postgres://<rds-user>:${POSTGRES_PASSWORD}@<rds-endpoint>:5432/<db-name>

object_store:
  type: s3
  bucket: <bucket-name>              # terraform output lakehouse_bucket_name
  region: <your-region>              # terraform output region
  credentials:
    key_id:
      env: S3_ACCESS_KEY_ID
    secret:
      env: S3_SECRET_ACCESS_KEY

lakehouse:
  prefix: lake/                      # terraform output lakehouse_prefix
  storage:
    size: 50Gi
    storage_class_name: gp3

auth:
  mode: oidc                         # or `local` for username/password auth
  issuer: https://<your-okta-domain>
  client_id: definite-onprem
  client_secret:
    env: OIDC_CLIENT_SECRET

llm:
  provider: bedrock
  region: <your-region>
  model: anthropic.claude-sonnet-4-20250514-v1:0
  # Credentials via IRSA; no credentials block needed.

resources:
  api:
    replicas: 2
    cpu: "1"
    memory: 2Gi
  lakehouse:
    replicas: 1
    cpu: "4"
    memory: 16Gi
  frontend:
    replicas: 2
    cpu: 500m
    memory: 512Mi
  job_runner:
    replicas: 1
    cpu: 500m
    memory: 1Gi

The fastest way to fill in postgres.url, object_store.bucket, and friends is to paste the output of terraform output -raw config_yaml_fragment directly into your config.yaml.

For the full list of knobs (image registry overrides, ingress class, sandbox configuration, etc.), see the config reference.
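
Before moving on, it can help to confirm that no angle-bracket placeholder from the example survived in your file. The following sketch greps for them; `find_placeholders` is a hypothetical helper, and the pattern only covers placeholders written like `<your-region>`.

```shell
#!/bin/sh
# Hypothetical helper for catching unfilled placeholders in config.yaml.

# find_placeholders FILE: print (with line numbers) every line of FILE that
# still contains a placeholder such as <your-region>.
# Returns 0 when none remain, 1 when any are found, 2 when FILE is unreadable.
find_placeholders() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
  if grep -n '<[A-Za-z][A-Za-z0-9_-]*>' "$1"; then
    return 1
  fi
  return 0
}
```

Usage: `find_placeholders config.yaml || echo "fill in the lines above first"`.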

6. Export secrets

config.yaml references env vars for every secret. Export them from Terraform outputs, running these commands from the Terraform module directory (deploy/terraform/aws) so that terraform output can find the state:

export POSTGRES_PASSWORD="$(terraform output -raw rds_password)"
export S3_ACCESS_KEY_ID="$(terraform output -raw lakehouse_s3_access_key_id)"
export S3_SECRET_ACCESS_KEY="$(terraform output -raw lakehouse_s3_secret_access_key)"
export OIDC_CLIENT_SECRET=...        # only if auth.mode: oidc

Don’t commit config.yaml, terraform.tfvars, or terraform.tfstate to a public repo. The state file holds the RDS password and the S3 secret in cleartext.
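
A short guard can confirm that every secret is set before you run the CLI, without ever printing the values. This is a sketch; `require_env` is a hypothetical helper name, and it checks shell variables in the current session, which includes the exported ones.

```shell
#!/bin/sh
# Hypothetical helper for checking that secrets are present.

# require_env NAME...: succeed only when every named variable is set and
# non-empty. Reports the names of missing variables, never their values.
require_env() (
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)
```

Usage: `require_env POSTGRES_PASSWORD S3_ACCESS_KEY_ID S3_SECRET_ACCESS_KEY`, adding `OIDC_CLIENT_SECRET` when `auth.mode` is `oidc`.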

7. Preflight with `definite doctor`

definite doctor --config config.yaml

doctor runs a battery of preflight checks: it connects to Postgres and runs SELECT version(), validates the Kubernetes context, checks object-store config shape, and (for Anthropic) pings the LLM API. Fix anything it flags before moving on.

8. Deploy with `definite init`

definite init --config config.yaml

init re-runs preflight, renders the bundled Helm chart with your values, and runs helm upgrade --install. It waits for pods to reach Ready by default. Useful flags:

Flag	Purpose
`--dry-run`	Render values, don’t apply
`--wait=false`	Return as soon as Helm finishes; don’t wait for pods
`--skip-preflight`	Skip `doctor` (not recommended)

Watch the rollout in another terminal:

definite status --config config.yaml
definite logs api --follow

When the pods are Ready and cert-manager has issued a cert, open your hostname in a browser and log in.
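
Certificate issuance can lag behind pod readiness. One way to block until cert-manager reports the certificate as issued is `kubectl wait` on the Certificate resources in the deployment namespace. A sketch under assumptions: `wait_for_certificates` is a hypothetical helper, the 600-second default timeout is arbitrary, and the namespace is whatever you set as `deployment.namespace`.

```shell
#!/bin/sh
# Hypothetical helper; the default timeout is an arbitrary choice.

# wait_for_certificates NAMESPACE [TIMEOUT]: block until every cert-manager
# Certificate in NAMESPACE has the Ready condition, or TIMEOUT elapses.
wait_for_certificates() {
  kubectl wait --for=condition=Ready certificates.cert-manager.io --all \
    -n "$1" --timeout="${2:-600s}"
}
```

Usage: `wait_for_certificates definite`. If it times out, `kubectl describe certificate -n definite` shows the cert-manager error.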

Want Google SSO instead of local auth? See the Google SSO guide. You’ll add an auth.oidc block to your config.yaml and re-run definite upgrade; there is no need to redo the install.

Day-2 operations

The same CLI handles upgrades, logs, license, and lakehouse maintenance. A few of the most common commands:

definite status   --config config.yaml         # `kubectl get pods,svc,ingress` for the namespace
definite logs api --follow                     # stream component logs
definite upgrade  --config config.yaml         # re-render and re-apply with the current CLI version
definite license  show                         # decode + verify the active license
definite run maintenance stats                 # lakehouse file/snapshot stats

See the CLI reference for every command and flag.

Phase 3: Teardown

When you’re done, tear down the cluster and supporting infra with Terraform:

cd deploy/terraform/aws
terraform destroy

Two variables act as safety rails. Only the bucket guard is on by default; RDS deletion protection is off unless you enabled it:

Variable	Default	Effect
`rds_deletion_protection`	`false` (set to `true` for production)	When `true`, RDS refuses to be deleted; flip it to `false` first
`lakehouse_force_destroy`	`false`	Non-empty buckets won’t be deleted; flip it to `true` if you really mean to remove the bucket and its contents

For production teardowns, snapshot RDS and back up the S3 bucket first; once Terraform deletes them, they’re gone.
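
A manual RDS snapshot can be taken with the AWS CLI before running the destroy. The following is a sketch: `snapshot_id` and `snapshot_rds` are hypothetical helpers, and the DB instance identifier is something you look up yourself (for example in the RDS console), since it is not among the Terraform outputs listed in this guide.

```shell
#!/bin/sh
# Hypothetical helpers; the "-final-" naming scheme is an arbitrary choice.

# snapshot_id PREFIX: build an identifier such as
# "<prefix>-final-20250101-120000" from PREFIX and the current UTC time.
snapshot_id() {
  printf '%s-final-%s\n' "$1" "$(date -u +%Y%m%d-%H%M%S)"
}

# snapshot_rds INSTANCE_ID REGION: create a manual snapshot of the instance,
# wait until it is available, then print the snapshot identifier.
snapshot_rds() (
  id=$(snapshot_id "$1")
  aws rds create-db-snapshot --region "$2" \
    --db-instance-identifier "$1" \
    --db-snapshot-identifier "$id" >/dev/null || exit 1
  aws rds wait db-snapshot-available --region "$2" \
    --db-snapshot-identifier "$id" || exit 1
  printf '%s\n' "$id"
)
```

Usage: `snapshot_rds <db-instance-id> <your-region>`. For the bucket, `aws s3 sync s3://<bucket-name> s3://<backup-bucket>` copies the current objects to a bucket you control; it does not copy older object versions.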

Troubleshooting

Symptom	Likely cause	Fix
`terraform apply` hangs on EKS	Cluster takes 15-20 min to provision; this is normal	Wait
Lakehouse pod stuck `Pending` on PVC	No default StorageClass	Confirm `gp3` StorageClass is present: `kubectl get storageclass`. The module creates it by default; set `create_gp3_storage_class = true` if you disabled it
`definite doctor` Postgres check fails	RDS security group only allows the cluster security group	This is by design. Run `doctor` from a pod in the cluster, or temporarily allowlist your IP on the RDS security group
LB hostname never appears	Ingress controller isn’t running, or the AWS Load Balancer Controller fights with `ingress-nginx`	`kubectl get pods -n ingress-nginx`; if you installed both controllers, pick one
Cert never issues	`letsencrypt-prod` ClusterIssuer missing, or DNS doesn’t resolve to the LB	`kubectl describe certificate -n definite` shows the cert-manager error
Fi can’t reach Bedrock	The Bedrock IRSA ServiceAccount isn’t annotated, or the model isn’t enabled in the region	Confirm `serviceAccount.annotations` in `config.yaml`, and request model access in the AWS Bedrock console

Next steps

Connect a data source to start ingesting data.
Set up the MCP server so Claude, Cursor, or Windsurf can query your deployment.
For deeper config (custom container registry, sandbox network policies, OIDC tuning), see the config reference.

Support

For issues or questions, contact hello@definite.app or open an issue on definite-app/definite-onprem.
