GCP Deployment
Deploy the FailZero agent on Google Cloud Platform using Compute Engine or GKE.
Compute Engine
1. Create Service Account
# Create service account
gcloud iam service-accounts create failzero-agent \
  --display-name="FailZero Agent"
# Store the service account email (replace YOUR_PROJECT with your project ID)
SA_EMAIL="failzero-agent@YOUR_PROJECT.iam.gserviceaccount.com"
2. Grant IAM Permissions
The agent needs permissions to execute failover operations:
PROJECT_ID="your-project-id"
SA_EMAIL="failzero-agent@${PROJECT_ID}.iam.gserviceaccount.com"
# Cloud SQL (database promotion)
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/cloudsql.admin"
# Cloud DNS (DNS updates)
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/dns.admin"
# Compute Engine (instance group scaling)
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/compute.instanceAdmin.v1"
# Secret Manager (reading secrets)
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/secretmanager.secretAccessor"
Grant only the permissions needed for your DR plan. These are examples for common failover operations.
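The bindings above differ only in the role, so a small helper can apply just the roles your DR plan needs. This is a minimal sketch, not something FailZero ships: grant_roles is a hypothetical helper name, and it assumes gcloud is installed and authenticated with permission to edit the project's IAM policy.

```shell
# grant_roles PROJECT_ID SA_EMAIL ROLE...
# Hypothetical helper (not part of FailZero): binds each listed role to the
# service account at the project level and stops at the first failed binding.
grant_roles() (
  project_id="$1"
  sa_email="$2"
  shift 2
  for role in "$@"; do
    gcloud projects add-iam-policy-binding "$project_id" \
      --member="serviceAccount:${sa_email}" \
      --role="$role" >/dev/null || exit 1
    echo "granted ${role}"
  done
)

# Example: grant only the two minimum roles.
# grant_roles "$PROJECT_ID" "$SA_EMAIL" roles/cloudsql.admin roles/dns.admin
```

Passing the roles as arguments keeps the list of granted permissions in one visible place, which makes it easier to review against the plan.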
3. Create VM Instance
gcloud compute instances create failzero-agent \
  --zone=us-west1-a \
  --machine-type=e2-small \
  --service-account="$SA_EMAIL" \
  --scopes=cloud-platform \
  --image-family=cos-stable \
  --image-project=cos-cloud \
  --metadata=startup-script='#!/bin/bash
docker run -d \
  --name failzero-agent \
  --restart unless-stopped \
  -e FAILZERO_AGENT_TOKEN=fzat_your_token \
  -e FAILZERO_API_URL=https://api.failzero.io \
  -e PROVIDER_TYPE=gcp \
  -e GCP_PROJECT_ID=your-project \
  failzero/agent:latest'
Replace fzat_your_token with your actual agent token. For production, store the token in Secret Manager rather than in instance metadata, where it is visible in plain text to anyone who can view the instance.
4. Using Secret Manager (Recommended)
Store sensitive values in Secret Manager:
# Create secret for agent token
echo -n "fzat_your_actual_token" | \
  gcloud secrets create failzero-agent-token \
  --data-file=-
# Grant access to service account
gcloud secrets add-iam-policy-binding failzero-agent-token \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/secretmanager.secretAccessor"
Update the startup script to fetch the token from Secret Manager:
#!/bin/bash
TOKEN="$(gcloud secrets versions access latest --secret=failzero-agent-token)"
docker run -d \
  --name failzero-agent \
  --restart unless-stopped \
  -e FAILZERO_AGENT_TOKEN="$TOKEN" \
  -e FAILZERO_API_URL=https://api.failzero.io \
  -e PROVIDER_TYPE=gcp \
  -e GCP_PROJECT_ID=your-project \
  failzero/agent:latest
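Container-Optimized OS images generally do not include the gcloud CLI, so the gcloud call in the startup script may fail on the cos-stable image used for the VM. One possible workaround, sketched here under that assumption, runs gcloud from Google's Cloud SDK container image instead. The image reference and the helper name fetch_secret are illustrative choices, not something the FailZero documentation prescribes.

```shell
# Assumed image; any image that ships the gcloud CLI should work.
CLOUD_SDK_IMAGE="${CLOUD_SDK_IMAGE:-gcr.io/google.com/cloudsdktool/google-cloud-cli:slim}"

# fetch_secret PROJECT_ID SECRET_NAME
# Hypothetical helper: prints the latest version of the secret to stdout.
# Inside the container, gcloud is expected to authenticate through the VM's
# metadata server as the attached service account.
fetch_secret() {
  docker run --rm "$CLOUD_SDK_IMAGE" \
    gcloud secrets versions access latest \
      --project="$1" \
      --secret="$2"
}

# In the startup script, in place of the direct gcloud call:
# TOKEN="$(fetch_secret your-project failzero-agent-token)"
```

This keeps the host image unchanged at the cost of pulling one extra container image on first boot.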
GKE (Kubernetes)
1. Create Kubernetes Secret
kubectl create secret generic failzero-agent \
--from-literal=token=fzat_your_token
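Before deploying, it can help to confirm that the secret holds a usable token. The check below is a sketch only: secret_has_token is a hypothetical helper, and the fzat_ prefix is an assumption inferred from the placeholder tokens in this guide rather than a documented token format.

```shell
# secret_has_token SECRET_NAME
# Hypothetical check: succeeds if the secret's "token" key decodes to a value
# starting with "fzat_" (prefix assumed from the placeholders in this guide).
# The token itself is never printed.
secret_has_token() (
  encoded="$(kubectl get secret "$1" -o jsonpath='{.data.token}')" || exit 1
  decoded="$(printf '%s' "$encoded" | base64 -d)" || exit 1
  case "$decoded" in
    fzat_*) exit 0 ;;
    *) exit 1 ;;
  esac
)

# Example:
# secret_has_token failzero-agent && echo "token stored"
```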
2. Deploy Agent
apiVersion: apps/v1
kind: Deployment
metadata:
  name: failzero-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: failzero-agent
  template:
    metadata:
      labels:
        app: failzero-agent
    spec:
      serviceAccountName: failzero-agent
      containers:
        - name: agent
          image: failzero/agent:latest
          env:
            - name: FAILZERO_AGENT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: failzero-agent
                  key: token
            - name: FAILZERO_API_URL
              value: "https://api.failzero.io"
            - name: PROVIDER_TYPE
              value: "gcp"
            - name: GCP_PROJECT_ID
              value: "your-project"
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "200m"
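The manifest above still has to be applied to the cluster. A minimal sketch of that step follows, assuming the manifest is saved to a local file (the filename failzero-agent.yaml in the example is hypothetical) and that kubectl points at the target cluster and namespace; deploy_agent is an illustrative helper name.

```shell
# deploy_agent MANIFEST [TIMEOUT]
# Hypothetical wrapper: applies the manifest, then blocks until the rollout
# of the failzero-agent Deployment finishes or the timeout (default 120s)
# expires.
deploy_agent() {
  kubectl apply -f "$1" &&
    kubectl rollout status deployment/failzero-agent --timeout="${2:-120s}"
}

# Example:
# deploy_agent failzero-agent.yaml
```

Waiting on the rollout surfaces image pull or scheduling problems right away instead of leaving them to be discovered later in the pod list.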
3. Workload Identity (Recommended)
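Use Workload Identity so the agent pod authenticates as the GCP service account without a key file. The Deployment in step 2 sets serviceAccountName: failzero-agent, so create the Kubernetes service account below before applying the manifest: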
# Enable Workload Identity on cluster
gcloud container clusters update YOUR_CLUSTER \
  --workload-pool=YOUR_PROJECT.svc.id.goog
# Create Kubernetes service account
kubectl create serviceaccount failzero-agent
# Bind to GCP service account
gcloud iam service-accounts add-iam-policy-binding \
  failzero-agent@YOUR_PROJECT.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:YOUR_PROJECT.svc.id.goog[default/failzero-agent]"
# Annotate Kubernetes service account
kubectl annotate serviceaccount failzero-agent \
  iam.gke.io/gcp-service-account=failzero-agent@YOUR_PROJECT.iam.gserviceaccount.com
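A wrong or missing annotation is an easy mistake to make in the steps above, so a quick check can save debugging time. The sketch below compares the annotation on the Kubernetes service account with the expected GCP service account email. wi_annotation_matches is a hypothetical helper, and it assumes the service account lives in the namespace kubectl currently targets (default in the binding above).

```shell
# wi_annotation_matches KSA_NAME EXPECTED_GSA_EMAIL
# Hypothetical check: succeeds only if the Kubernetes service account carries
# the iam.gke.io/gcp-service-account annotation with the expected value.
# Dots inside the annotation key are escaped for kubectl's jsonpath syntax.
wi_annotation_matches() (
  actual="$(kubectl get serviceaccount "$1" \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')" || exit 1
  [ "$actual" = "$2" ]
)

# Example:
# wi_annotation_matches failzero-agent \
#   failzero-agent@YOUR_PROJECT.iam.gserviceaccount.com && echo "annotation ok"
```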
IAM Permissions
Minimum Required
Resource Type    Role                    Purpose
Cloud SQL        roles/cloudsql.admin    Promote replicas
Cloud DNS        roles/dns.admin         Update DNS records
Optional (Based on DR Plan)
Resource Type     Role                                  Purpose
Compute Engine    roles/compute.instanceAdmin.v1        Scale instance groups
Secret Manager    roles/secretmanager.secretAccessor    Read secrets
Cloud Storage     roles/storage.admin                   Backup operations
Pub/Sub           roles/pubsub.publisher                Notifications
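To compare the roles in these tables with what is actually bound, the project IAM policy can be queried for the service account. The following is a sketch under stated assumptions: missing_roles is a hypothetical helper, and it inspects only project-level bindings made directly to the service account, so roles granted on individual resources or through groups will be reported as missing.

```shell
# missing_roles PROJECT_ID SA_EMAIL ROLE...
# Hypothetical helper: prints each listed role that is NOT bound to the
# service account in the project-level IAM policy. Prints nothing when every
# listed role is present.
missing_roles() (
  project_id="$1"
  sa_email="$2"
  shift 2
  granted="$(gcloud projects get-iam-policy "$project_id" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:${sa_email}" \
    --format="value(bindings.role)")" || exit 1
  for role in "$@"; do
    # -x matches whole lines, -F treats the role as a literal string.
    printf '%s\n' "$granted" | grep -qxF -- "$role" || echo "$role"
  done
)

# Example: check the minimum required roles.
# missing_roles "$PROJECT_ID" "$SA_EMAIL" roles/cloudsql.admin roles/dns.admin
```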
Verify Deployment
Compute Engine
# SSH into the VM
gcloud compute ssh failzero-agent --zone=us-west1-a
# Check Docker logs
docker logs failzero-agent
GKE
# Check pod status
kubectl get pods -l app=failzero-agent
# View logs
kubectl logs -l app=failzero-agent
Expected output:
[Agent] Starting FailZero Agent...
[Agent] Registering with FailZero API...
[Agent] Registered successfully for organization: your-org
[Agent] Agent started successfully
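For scripted checks, the registration line in the expected output can be matched directly. This is a minimal sketch that assumes the log wording shown above stays stable across agent versions; agent_registered is a hypothetical helper name.

```shell
# agent_registered
# Hypothetical check: reads agent log text on stdin and succeeds if it
# contains the registration line from the expected output.
agent_registered() {
  grep -q 'Registered successfully for organization:'
}

# Examples:
# kubectl logs -l app=failzero-agent | agent_registered && echo "agent registered"
# docker logs failzero-agent 2>&1 | agent_registered && echo "agent registered"
```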
Troubleshooting
Permission denied errors:
Verify IAM roles are assigned to the service account
Check the service account is attached to the VM/pod
Ensure Workload Identity is configured correctly (GKE)
Cannot reach API:
Check firewall rules allow outbound HTTPS (port 443)
Verify VPC allows egress to api.failzero.io
Agent not registering:
Confirm token is correct and not expired
Check logs for specific error messages
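The outbound connectivity checks can be run from the VM or from a pod that has curl available. The sketch below treats any HTTP response as proof that the network path works, since firewall or egress problems typically show up as connection failures or timeouts instead. api_reachable is a hypothetical helper, and the status code the API returns for an unauthenticated request to its root URL is not specified here.

```shell
# api_reachable URL
# Hypothetical check: succeeds if the request receives any HTTP response
# within 10 seconds, and fails on DNS, connection, TLS, or timeout errors.
api_reachable() (
  code="$(curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "$1")" || exit 1
  echo "HTTP ${code} from $1"
)

# Example:
# api_reachable https://api.failzero.io || echo "no route to FailZero API"
```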
Next Steps
Example Plans: GCP-specific DR configurations
CLI Reference: Deploy and manage plans