Install on GCP

This guide walks you through an on-prem install of Definite on Google Cloud. You’ll provision the infrastructure with gcloud, install the platform with the definite CLI, and tear it down cleanly when you’re done. End to end: about 30 to 40 minutes, most of which is GKE and Cloud SQL spinning up.

Unlike the AWS install, the GCP install does not yet have a Terraform module, so the steps below use gcloud directly. A GCP Terraform module (matching the AWS module pattern in deploy/terraform/aws) is on the roadmap. If you’d prefer to wait for it or want early access, contact hello@definite.app.

What gets created

Resource	Notes
VPC + subnet	Regional VPC with a subnet for the GKE nodes; private cluster recommended
GKE cluster	Kubernetes 1.30 (configurable). Autopilot works; Standard is also supported
Cloud SQL Postgres 15	Private IP, reachable only from the VPC
GCS bucket	Lakehouse data; uniform bucket-level access, versioning recommended
GCS HMAC key	Service-account-bound HMAC key the lakehouse uses to read/write the bucket (see HMAC vs Workload Identity)
Artifact Registry repo (optional)	Only needed if you mirror Definite images into your project
`premium-rwo` StorageClass	GKE shorthand for PD-SSD; the lakehouse PVC binds to this by default

Every name, region, machine type, and CIDR below is something you choose. Set them once at the top of your shell session as env vars (PROJECT_ID, REGION, etc.) and reuse them.

Prerequisites

GCP project access

A GCP project where you have permission to create GKE clusters, Cloud SQL instances, GCS buckets, service accounts, and IAM bindings. The gcloud CLI must be authenticated (gcloud auth login) and the project set (gcloud config set project <your-project-id>).

Local tooling

Install the following on the machine you’ll run gcloud and definite from:

Tool	Version	Check
`gcloud`	recent	`gcloud version`
`kubectl`	1.28+	`kubectl version --client`
`helm`	3.12+	`helm version`
`gke-gcloud-auth-plugin`	recent	`gke-gcloud-auth-plugin --version`

Enable APIs

Turn on the APIs the install touches:

gcloud services enable \
  container.googleapis.com \
  sqladmin.googleapis.com \
  compute.googleapis.com \
  servicenetworking.googleapis.com \
  iamcredentials.googleapis.com \
  aiplatform.googleapis.com

The last one (aiplatform.googleapis.com) is only needed if Fi will use Vertex AI as its LLM provider.
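To confirm the APIs are on before you start provisioning, a small check like the one below can help. It is a sketch, not part of the Definite tooling: `missing_apis` is a hypothetical helper name, and it assumes gcloud is already pointed at the right project.

```shell
# Print each API from the argument list that is NOT yet enabled
# in the current gcloud project. Prints nothing when all are enabled.
missing_apis() {
  _ma_enabled=$(gcloud services list --enabled --format='value(config.name)')
  for _ma_api in "$@"; do
    # -F: fixed string, -x: whole-line match, -q: quiet
    printf '%s\n' "$_ma_enabled" | grep -Fqx "$_ma_api" || echo "$_ma_api"
  done
}

# Example:
#   missing_apis container.googleapis.com sqladmin.googleapis.com \
#     compute.googleapis.com servicenetworking.googleapis.com \
#     iamcredentials.googleapis.com
```

Empty output means everything in the list is enabled; anything printed still needs gcloud services enable.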

LLM access

Decide which LLM provider Fi will use. Vertex AI is the most common choice for GKE deployments because it lives in the same project and authenticates via the cluster’s identity. Anthropic, Bedrock, and Azure OpenAI are also supported. If you go with Vertex, make sure the Claude model you want is enabled for your project (Vertex Model Garden -> Anthropic models).

Phase 1: Provision GCP infrastructure

The fastest path right now is to provision with gcloud directly. The shape mirrors what the AWS Terraform module produces. Set up shared env vars first:

export PROJECT_ID="<your-project-id>"
export REGION="<your-region>"                # e.g. us-central1
export CLUSTER_NAME="<your-cluster-name>"    # e.g. definite
export SQL_INSTANCE="<your-sql-instance>"    # e.g. definite-db
export BUCKET="<your-bucket-name>"           # e.g. <project>-definite-lake
export SA_NAME="<your-sa-name>"              # e.g. definite-lakehouse

gcloud config set project "$PROJECT_ID"
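Because every later command depends on these variables, it can be worth failing fast when one is unset. The guard below is a minimal sketch; `require_vars` is a hypothetical helper name, not something the definite CLI provides.

```shell
# Report every named variable that is unset or empty.
# Returns 1 if at least one is missing, 0 otherwise.
require_vars() {
  _rv_missing=0
  for _rv_name in "$@"; do
    # Indirect lookup via eval keeps this POSIX sh compatible.
    eval "_rv_value=\${$_rv_name:-}"
    if [ -z "$_rv_value" ]; then
      echo "missing: $_rv_name" >&2
      _rv_missing=1
    fi
  done
  return "$_rv_missing"
}

# Example:
#   require_vars PROJECT_ID REGION CLUSTER_NAME SQL_INSTANCE BUCKET SA_NAME || exit 1
```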

1. Create a GKE cluster

Autopilot is the simplest option (Google manages nodes):

gcloud container clusters create-auto "$CLUSTER_NAME" \
  --region "$REGION" \
  --release-channel regular

If you prefer Standard mode (more control over node pools, GPUs, sole-tenant nodes), use gcloud container clusters create with a --machine-type like e2-standard-4 and at least 3 nodes.

Once the cluster is up, wire up kubectl:

gcloud container clusters get-credentials "$CLUSTER_NAME" --region "$REGION"
kubectl get nodes
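For the Standard-mode path mentioned above, the create command might look like the sketch below. The function wrapper and its name are illustrative only; the flags are standard gcloud flags, and the machine type is the one suggested in this guide.

```shell
# Create a regional Standard-mode cluster.
# On a regional cluster --num-nodes is counted PER ZONE, so 1 node in
# each of the region's zones (typically three) yields 3 nodes in total.
create_standard_cluster() {
  gcloud container clusters create "$CLUSTER_NAME" \
    --region "$REGION" \
    --release-channel regular \
    --machine-type e2-standard-4 \
    --num-nodes 1
}

# Example:
#   create_standard_cluster
```

Raise --num-nodes if the workload sizes in your config.yaml do not fit on three e2-standard-4 nodes.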

2. Create Cloud SQL Postgres

gcloud sql instances create "$SQL_INSTANCE" \
  --database-version=POSTGRES_15 \
  --region="$REGION" \
  --tier=db-custom-2-8192 \
  --availability-type=REGIONAL

gcloud sql databases create definite --instance="$SQL_INSTANCE"

gcloud sql users create definite \
  --instance="$SQL_INSTANCE" \
  --password="<your-postgres-password>"

Stash the connection name (used by the Cloud SQL Auth Proxy and by postgres.url):

gcloud sql instances describe "$SQL_INSTANCE" --format='value(connectionName)'

The create command above assigns a public IP by default. For production, give the instance a private IP in your VPC (see Private IP setup) so the cluster can reach it without going through the auth proxy. If the instance is on the same VPC as the cluster, postgres.url looks like:

postgres://definite:${POSTGRES_PASSWORD}@<private-ip>:5432/definite
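Since the password is embedded in a URL, characters such as @, :, / or # in it would need percent-encoding. Whether the definite CLI encodes the value for you is not stated here, so the sketch below takes the cautious route and simply checks that the password uses URL-safe characters only. Both helper names are hypothetical.

```shell
# Succeed only if the value can sit in a URL without percent-encoding
# (RFC 3986 unreserved characters: letters, digits, '-', '.', '_', '~').
is_url_safe() {
  case "$1" in
    '' | *[!A-Za-z0-9._~-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Assemble the connection URL. Args: user password host database
postgres_url() {
  printf 'postgres://%s:%s@%s:5432/%s\n' "$1" "$2" "$3" "$4"
}

# Example:
#   is_url_safe "$POSTGRES_PASSWORD" || echo "password needs URL encoding" >&2
#   postgres_url definite "$POSTGRES_PASSWORD" "<private-ip>" definite
```

Generating the password from letters and digits only sidesteps the question entirely.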

3. Create a GCS bucket for the lakehouse

gcloud storage buckets create "gs://${BUCKET}" \
  --location="$REGION" \
  --uniform-bucket-level-access \
  --public-access-prevention

Versioning is recommended for production:

gcloud storage buckets update "gs://${BUCKET}" --versioning

4. Create a service account and HMAC key for the bucket

The lakehouse reads and writes GCS through DuckDB’s httpfs extension. httpfs speaks the S3 interop API, not native GCS, so it needs an HMAC key pair bound to a service account that has object-admin on the bucket. Workload Identity alone is not enough.

# Create a dedicated service account for the lakehouse.
gcloud iam service-accounts create "$SA_NAME" \
  --display-name="Definite lakehouse"

SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Grant it object-admin on the lakehouse bucket only (least privilege).
gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/storage.objectAdmin"

# Create the HMAC pair. The key + secret print once; capture them now.
gcloud storage hmac create "$SA_EMAIL"

The output gives you accessId (the HMAC key ID) and secret (the HMAC secret). These are what you’ll set as GCS_HMAC_KEY_ID and GCS_HMAC_SECRET before running definite init.
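If you have the DuckDB CLI installed locally, you can smoke-test the HMAC pair the same way the lakehouse will use it: as static credentials in a DuckDB GCS secret. This is a sketch under assumptions: the duckdb binary is on your PATH, GCS_HMAC_KEY_ID, GCS_HMAC_SECRET, and BUCKET are exported, and the function name is made up for illustration. It is not how Definite itself wires the credentials.

```shell
# Feed a short SQL script to the DuckDB CLI on stdin so the secret
# does not appear in the process argument list.
check_hmac_with_duckdb() {
  duckdb <<EOF
INSTALL httpfs;
LOAD httpfs;
CREATE SECRET (TYPE gcs, KEY_ID '${GCS_HMAC_KEY_ID}', SECRET '${GCS_HMAC_SECRET}');
SELECT count(*) AS objects FROM glob('gs://${BUCKET}/*');
EOF
}

# Example:
#   check_hmac_with_duckdb
```

A count (even 0 on a fresh bucket) suggests the key and the bucket binding work; an access-denied error points at the IAM binding or a mistyped key.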

5. (If using Vertex AI for Fi) Grant Vertex access

Fi needs roles/aiplatform.user (Vertex AI User) on the project. The simplest way to grant it is to bind the role to the node service account the cluster runs as:

NODE_SA=$(gcloud container clusters describe "$CLUSTER_NAME" \
  --region "$REGION" \
  --format='value(nodeConfig.serviceAccount)')

# `default` resolves to the Compute Engine default SA; replace if you set one.
[ "$NODE_SA" = "default" ] && \
  NODE_SA="$(gcloud projects describe "$PROJECT_ID" --format='value(projectNumber)')-compute@developer.gserviceaccount.com"

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${NODE_SA}" \
  --role="roles/aiplatform.user"

For tighter scoping, set up Workload Identity on the deployment’s API service account and bind roles/aiplatform.user to that specifically.
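The tighter-scoped setup could be sketched as follows. The gcloud and kubectl calls are the standard Workload Identity steps, but the Kubernetes ServiceAccount name is an assumption: check which ServiceAccount the API pods actually run as (kubectl get pods -n definite -o yaml) after the deploy, and be aware that a chart upgrade may overwrite a hand-applied annotation.

```shell
# Bind a dedicated Google service account with roles/aiplatform.user
# to one Kubernetes ServiceAccount via Workload Identity.
# Args: namespace  k8s-service-account  google-service-account-name
bind_vertex_workload_identity() {
  _wi_ns=$1
  _wi_ksa=$2
  _wi_gsa="$3@${PROJECT_ID}.iam.gserviceaccount.com"

  gcloud iam service-accounts create "$3" \
    --display-name="Definite API (Vertex)" &&
  gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:${_wi_gsa}" \
    --role="roles/aiplatform.user" &&
  # Allow the Kubernetes ServiceAccount to impersonate the Google one.
  gcloud iam service-accounts add-iam-policy-binding "$_wi_gsa" \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:${PROJECT_ID}.svc.id.goog[${_wi_ns}/${_wi_ksa}]" &&
  kubectl annotate serviceaccount "$_wi_ksa" --namespace "$_wi_ns" \
    "iam.gke.io/gcp-service-account=${_wi_gsa}"
}

# Example (ServiceAccount and GSA names are hypothetical):
#   bind_vertex_workload_identity definite definite-api definite-vertex
```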

HMAC vs Workload Identity

Option	When to use
HMAC key (`accessId`, `secret`)	Use this today for lakehouse bucket access. The lakehouse uses DuckDB’s `httpfs` extension, which speaks the S3 interop API with static credentials. Workload Identity can’t be used for the bucket.
Workload Identity	Use this for Vertex AI access (and any other GCP API the deployment’s pods call). The API pod runs as a Kubernetes ServiceAccount that’s bound to a Google service account with `roles/aiplatform.user`. No static credentials.

Phase 2: Install Definite with the `definite` CLI

1. Install the CLI

curl -fsSL https://storage.googleapis.com/definite-public/definite-onprem/install.sh | sh

The install script detects your OS and architecture, downloads the matching prebuilt binary from a public Google Cloud Storage bucket, verifies its SHA256 checksum, and places definite on your PATH (default: $HOME/.local/bin). Binaries are published for macOS and Linux (arm64 and x86_64), and are uploaded by the release workflow using short-lived Workload Identity Federation credentials (no long-lived keys).

To pin a version, set DEFINITE_VERSION before piping to sh:

curl -fsSL https://storage.googleapis.com/definite-public/definite-onprem/install.sh \
  | DEFINITE_VERSION=v0.1.0 sh

Verify:

definite version

2. Bootstrap cluster prerequisites

definite bootstrap installs the cluster-level pieces that definite init assumes are already present:

Prerequisite	What it provides
Ingress controller	HTTP/S routing for the deployment’s `Ingress` resource
cert-manager (+ CRDs)	TLS certificate issuance for `tls: cert_manager`
`letsencrypt-prod` ClusterIssuer	The issuer the ingress references for automatic Let’s Encrypt certs
agent-sandbox CRDs	Custom resources the Fi runtime uses to dispatch per-run sandboxes

Run it once against the fresh cluster:

definite bootstrap --acme-email you@yourcompany.com

On GKE Autopilot, definite bootstrap pins cert-manager’s leader-election lease to the cert-manager namespace. The chart default of kube-system is Google-managed on Autopilot and rejects writes; without the override the webhook CA is never injected. The CLI applies this automatically.

--dry-run prints what it would install without touching the cluster. The command is idempotent; safe to re-run.

3. Discover the load balancer IP

The ingress controller provisions a Google Cloud Network Load Balancer. Wait for it to land, then grab its external IP:

kubectl get svc ingress-nginx-controller -n ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

You’ll get something like 34.x.y.z. You now have two choices for what to set as deployment.hostname:

Quick install (nip.io)

For demos and internal pilots, build a nip.io hostname from the IP. No DNS configuration needed:

LB_IP=$(kubectl get svc ingress-nginx-controller -n ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# Avoid the name HOSTNAME: bash sets that variable to the machine's own hostname.
DEFINITE_HOSTNAME="${LB_IP//./-}.nip.io"
echo "Use this as deployment.hostname: $DEFINITE_HOSTNAME"

nip.io resolves every <ip>.nip.io host (dotted or dashed, e.g. 34-x-y-z.nip.io) to that IP, so Let’s Encrypt issues a real cert with no extra setup.

Some OAuth providers (Google, Slack, HubSpot) reject nip.io redirect URIs. Use a real A record for production integrations.

Production (real A record)

Create an A record in your DNS provider pointing your chosen hostname (e.g. definite.acme.com) at the LB IP:

definite.acme.com    A    34.x.y.z

Wait for the record to propagate, then use that hostname in config.yaml. cert-manager will issue a Let’s Encrypt cert against it on first deploy.
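The "wait for it to land" part at the start of this step can be scripted with a simple polling loop. This is a sketch: the function name and the 5-second interval are arbitrary choices, and the service name and namespace are the ones used in the kubectl command above.

```shell
# Poll until the ingress controller's Service has an external IP.
# Optional arg: timeout in seconds (default 300).
# Prints the IP on success; returns 1 on timeout.
wait_for_lb_ip() {
  _lb_timeout=${1:-300}
  _lb_waited=0
  while [ "$_lb_waited" -le "$_lb_timeout" ]; do
    _lb_ip=$(kubectl get svc ingress-nginx-controller -n ingress-nginx \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null || true)
    if [ -n "$_lb_ip" ]; then
      echo "$_lb_ip"
      return 0
    fi
    sleep 5
    _lb_waited=$((_lb_waited + 5))
  done
  echo "timed out waiting for load balancer IP" >&2
  return 1
}

# Example:
#   LB_IP=$(wait_for_lb_ip 600) || exit 1
#   echo "$(printf '%s' "$LB_IP" | tr . -).nip.io"
```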

4. Build `config.yaml`

Start from the GKE example in the repo: examples/minimal-gke.yaml. The shape:

deployment:
  name: definite
  namespace: definite
  hostname: <your-hostname>          # from step 3
  tls: cert_manager

postgres:
  url: postgres://definite:${POSTGRES_PASSWORD}@<cloud-sql-private-ip>:5432/definite

object_store:
  type: gcs
  bucket: <your-bucket-name>
  # GCS HMAC pair from `gcloud storage hmac create`.
  credentials:
    key_id:
      env: GCS_HMAC_KEY_ID
    secret:
      env: GCS_HMAC_SECRET

lakehouse:
  prefix: lake/
  storage:
    size: 50Gi
    storage_class_name: premium-rwo  # GKE shorthand for PD-SSD

auth:
  mode: local                        # or `oidc` for SSO
  initial_admin_email: admin@<your-domain>

llm:
  provider: vertex
  project: <your-project-id>
  region: <your-region>
  # Credentials come from the cluster's Google identity (node SA or Workload Identity); no credentials block needed.

resources:
  api:
    replicas: 2
    cpu: "1"
    memory: 2Gi
  lakehouse:
    replicas: 1
    cpu: "4"
    memory: 16Gi
  frontend:
    replicas: 2
    cpu: 500m
    memory: 512Mi
  job_runner:
    replicas: 1
    cpu: 500m
    memory: 1Gi

GKE Autopilot uses standard-rwo (Balanced PD) and premium-rwo (SSD PD) as its two built-in StorageClasses. The lakehouse benefits from SSD, so premium-rwo is the default. Run kubectl get storageclass to see what your cluster has.

For the full list of knobs (image registry overrides, ingress class, sandbox configuration, etc.), see the config reference.

5. Export secrets

config.yaml references env vars for every secret:

export POSTGRES_PASSWORD="<your-postgres-password>"
export GCS_HMAC_KEY_ID="<accessId-from-step-4>"
export GCS_HMAC_SECRET="<secret-from-step-4>"
export OIDC_CLIENT_SECRET=...        # only if auth.mode: oidc

Typing these exports interactively leaves the secret values in your shell history, and a config.yaml with secrets pasted in literally must never be committed. If you want to script the install, source the values from a secret manager (e.g. gcloud secrets versions access).
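Pulling the values from Secret Manager might look like the sketch below. The three secret names are hypothetical (create them under whatever names you like), and Secret Manager needs secretmanager.googleapis.com enabled, which is not in the API list earlier in this guide.

```shell
# Read each secret's latest version and export it for `definite`.
# Stops at the first secret that cannot be read.
load_definite_secrets() {
  POSTGRES_PASSWORD=$(gcloud secrets versions access latest \
    --secret="definite-postgres-password") || return 1
  GCS_HMAC_KEY_ID=$(gcloud secrets versions access latest \
    --secret="definite-gcs-hmac-key-id") || return 1
  GCS_HMAC_SECRET=$(gcloud secrets versions access latest \
    --secret="definite-gcs-hmac-secret") || return 1
  export POSTGRES_PASSWORD GCS_HMAC_KEY_ID GCS_HMAC_SECRET
}

# Example:
#   load_definite_secrets && definite doctor --config config.yaml
```

This keeps the values out of your shell history and out of any script you commit.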

6. (Optional) Use LiteLLM gateway

If you want a single LLM gateway in front of Vertex (for rate limiting, request logging, model routing, etc.), deploy LiteLLM into the cluster and set llm.provider: anthropic in config.yaml with api_key pointing at LiteLLM’s master key. LiteLLM presents an Anthropic-shaped API while talking to Vertex behind the scenes.

7. Preflight with `definite doctor`

definite doctor --config config.yaml

doctor runs a battery of preflight checks: it connects to Postgres and runs SELECT version(), validates the Kubernetes context, checks object-store config shape, and (for Anthropic) pings the LLM API. Fix anything it flags before moving on.

8. Deploy with `definite init`

definite init --config config.yaml

init re-runs preflight, renders the bundled Helm chart with your values, and runs helm upgrade --install. It waits for pods to reach Ready by default. Useful flags:

Flag	Purpose
`--dry-run`	Render values, don’t apply
`--wait=false`	Return as soon as Helm finishes; don’t wait for pods
`--skip-preflight`	Skip `doctor` (not recommended)

Watch the rollout in another terminal:

definite status --config config.yaml
definite logs api --follow

When the pods are Ready and cert-manager has issued a cert, open your hostname in a browser and log in.
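If you would rather block until the certificate is issued than keep checking by hand, kubectl wait can watch cert-manager's Certificate resources. A minimal sketch, assuming the deployment namespace is definite as in the config above; the function name is illustrative.

```shell
# Block until every cert-manager Certificate in the namespace is Ready.
# Args: namespace [timeout, default 600s]
wait_for_certificate() {
  kubectl wait --for=condition=Ready certificate --all \
    --namespace "$1" --timeout="${2:-600s}"
}

# Example:
#   wait_for_certificate definite
```

If it times out, kubectl describe certificate -n definite shows the cert-manager error.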

Day-2 operations

The same CLI handles upgrades, logs, license, and lakehouse maintenance. A few of the most common commands:

definite status   --config config.yaml         # `kubectl get pods,svc,ingress` for the namespace
definite logs api --follow                     # stream component logs
definite upgrade  --config config.yaml         # re-render and re-apply with the current CLI version
definite license  show                         # decode + verify the active license
definite run maintenance stats                 # lakehouse file/snapshot stats

See the CLI reference for every command and flag.

Phase 3: Teardown

# Lakehouse bucket (snapshot first if you care about the data).
gcloud storage rm -r "gs://${BUCKET}"

# Service account + HMAC key. A key must be deactivated before it can be deleted.
gcloud storage hmac list --service-account="$SA_EMAIL"
gcloud storage hmac update <accessId> --deactivate
gcloud storage hmac delete <accessId>
gcloud iam service-accounts delete "$SA_EMAIL"

# Postgres (export the database first if you want a backup).
gcloud sql instances delete "$SQL_INSTANCE"

# GKE cluster.
gcloud container clusters delete "$CLUSTER_NAME" --region "$REGION"

For production teardowns, snapshot Cloud SQL and back up the GCS bucket first; once they’re deleted, they’re gone.
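One way to take those backups is sketched below: a SQL export of the definite database plus a copy of the lakehouse bucket, both written to a separate bucket. The backup bucket, the function name, and the path layout are assumptions. The Cloud SQL instance's own service account needs write access to the backup bucket for the export to succeed.

```shell
# Export Postgres and copy the lakehouse bucket before teardown.
# Args: backup-bucket-name  stamp (e.g. a date, used in the paths)
backup_before_teardown() {
  # SQL dump of the `definite` database, gzip-compressed by extension.
  gcloud sql export sql "$SQL_INSTANCE" \
    "gs://$1/sql/definite-$2.sql.gz" --database=definite &&
  # Copy of every object in the lakehouse bucket.
  gcloud storage rsync --recursive "gs://${BUCKET}" "gs://$1/lake-$2"
}

# Example:
#   backup_before_teardown "<your-backup-bucket>" "$(date +%Y%m%d)"
```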

Troubleshooting

Symptom	Likely cause	Fix
GKE cluster takes 10-15 min to provision	Normal for Autopilot regional clusters	Wait
Lakehouse pod stuck `Pending` on PVC	No default StorageClass, or the name doesn’t exist	`kubectl get storageclass`; set `lakehouse.storage.storage_class_name` to one that exists (`premium-rwo` on Autopilot)
`definite doctor` Postgres check fails	Cloud SQL has no private IP, or the cluster can’t reach it	Confirm Cloud SQL is on the same VPC as the cluster with a private IP, or run `doctor` from a pod in the cluster
LB IP never appears	Ingress controller isn’t running, or the GKE ingress controller is fighting with `ingress-nginx`	`kubectl get pods -n ingress-nginx`; if you have both controllers, pick one
Cert never issues	`letsencrypt-prod` ClusterIssuer missing, or DNS doesn’t resolve to the LB	`kubectl describe certificate -n definite` shows the cert-manager error
cert-manager webhook never ready (Autopilot)	Lease pinned to `kube-system`, which is Google-managed	`definite bootstrap` pins the lease to `cert-manager`; if you installed cert-manager by hand, set `--set global.leaderElection.namespace=cert-manager`
Fi can’t reach Vertex	Node SA missing `roles/aiplatform.user`, or model not enabled in your region	Confirm the binding (step 5 above) and enable the model in Vertex Model Garden
Lakehouse can’t read/write GCS	HMAC key was created against a service account without object-admin on the bucket	Re-grant `roles/storage.objectAdmin` to the SA, or regenerate the HMAC pair

Next steps

Connect a data source to start ingesting data.
Set up the MCP server so Claude, Cursor, or Windsurf can query your deployment.
For deeper config (custom container registry, sandbox network policies, OIDC tuning), see the config reference.

Support

For issues or questions, contact hello@definite.app or open an issue on definite-app/definite-onprem.
