Is GCP Cloud Run Down? Google Cloud Status & Diagnostics in 2026

Is Google Cloud Platform Down Right Now?

Google Cloud Platform powers millions of production workloads — Cloud Run serverless containers, GKE clusters, BigQuery analytics, Cloud SQL databases, and hundreds of other services. When GCP has issues, the blast radius can be massive.

Here's how to check GCP status instantly, diagnose Cloud Run–specific issues, and know when it's GCP vs. your deployment.

Step 1: Check GCP's Official Status Page

Google maintains a real-time status dashboard at status.cloud.google.com.

Key things to check:

  • The specific service you depend on (Cloud Run, GKE, Cloud SQL, Pub/Sub, etc.)
  • Your specific region (us-central1, us-east1, europe-west1, asia-northeast1, etc.)
  • Whether the incident is marked as a service disruption (partial) or a service outage (complete)

The GCP status page is also available as a JSON feed:

curl -s https://status.cloud.google.com/incidents.json | \
  python3 -c "import sys,json; incidents=json.load(sys.stdin); \
  [print(f'{i[\"service_name\"]}: {i[\"most_recent_update\"][\"text\"][:100]}') \
  for i in incidents if not i.get('end')]"

Subscribe to the Atom feed at status.cloud.google.com/feed.atom to follow incident updates in a feed reader or alerting tool.
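For repeated checks, the one-liner above can be expanded into a small script. The sketch below assumes the same feed fields the one-liner relies on (service_name, most_recent_update.text, end); the function names are illustrative and not part of any Google tooling.

```python
import json
import urllib.request

INCIDENTS_URL = "https://status.cloud.google.com/incidents.json"


def open_incidents(incidents, service=None):
    """Return incidents with no 'end' value, optionally filtered by service name."""
    result = []
    for incident in incidents:
        if incident.get("end"):
            continue  # incident already resolved
        name = incident.get("service_name", "")
        if service and service.lower() not in name.lower():
            continue  # not the service we care about
        result.append(incident)
    return result


def summarize(incident, width=100):
    """One-line summary: service name plus the start of the latest update."""
    text = incident.get("most_recent_update", {}).get("text", "")
    return f"{incident.get('service_name', 'unknown')}: {text[:width]}"


def fetch_incidents(url=INCIDENTS_URL, timeout=10):
    """Download and decode the incident feed (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling open_incidents(fetch_incidents(), service="Cloud Run") and printing summarize() for each result gives the same output as the one-liner, restricted to one service.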


GCP Service Architecture: What Can Break

GCP incidents rarely affect all services simultaneously. The most common failure patterns:

| Service | What it does | Failure mode |
| --- | --- | --- |
| Cloud Run | Serverless container execution | New deployments fail, cold start timeouts, traffic-splitting errors |
| GKE | Managed Kubernetes | Control plane API unavailable, node auto-provisioning fails, cluster upgrades stuck |
| Cloud SQL | Managed PostgreSQL/MySQL/SQL Server | Connection failures, failover to replica, storage I/O degradation |
| Pub/Sub | Managed message queue | Message delivery delays, publish failures, subscription backlog growth |
| Cloud Storage (GCS) | Object storage | Upload/download latency, bucket operation errors |
| Cloud Functions | Serverless functions (gen 1/gen 2) | Deployment failures, invocation timeouts, IAM permission errors |
| BigQuery | Serverless analytics warehouse | Query job failures, slot exhaustion, streaming insert delays |
| Cloud Spanner | Globally distributed RDBMS | Regional leader election delays, transaction abort rate spikes |
| Cloud Load Balancing | Global/regional HTTP/TCP load balancers | Backend health check failures, SSL cert errors, routing misconfigurations |
| IAM / Cloud Identity | Authentication and authorization | Service account token failures, permission propagation delays |

Cloud Run Diagnostics: Is It GCP or Your Container?

Cloud Run issues are frequently misidentified as GCP outages when they're actually container, IAM, or configuration issues. Use this table to triage:

| Symptom | Most likely cause | How to verify |
| --- | --- | --- |
| Deployment fails with "Build error" | Container build issue (Cloud Build) | Check Cloud Build history in Console; look for Dockerfile errors |
| Service returns 500 on all requests | Container crash on startup | Cloud Run logs → filter for "container failed to start" |
| Service returns 503 on all requests | No healthy instances (crash loop or traffic not routed) | Check instance count in Cloud Run metrics; look for OOM kills |
| Cold starts very slow (>10s) | Large container image or CPU throttling | Check image size; enable CPU always-on or min-instances |
| Timeout errors under load | Max concurrency too low or backend slow | Increase --concurrency; check downstream service latency |
| 403 Forbidden on all requests | IAM policy missing allUsers (public) or invoker binding | Check Cloud Run → Security → Authentication setting |
| Can't connect to Cloud SQL | Missing Cloud SQL Client IAM role or wrong connection name | Verify --add-cloudsql-instances flag and service account permissions |
| New revision deployed but no traffic | Traffic split still pointing to old revision | Check traffic allocation in Cloud Run console |
| Works in one region, fails in another | Regional GCP incident or regional service limit | Check GCP status page for the specific region |
| Intermittent failures on ~1% of requests | GCP underlying infrastructure noise (common in large services) | Check error rate in Cloud Monitoring; implement retries with exponential backoff |
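The last row recommends retries with exponential backoff. A minimal sketch of that pattern follows; the function name, default delays, and the jitter factor are assumptions chosen for illustration, not values recommended by Google.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       retryable=(Exception,), sleep=time.sleep, rng=random.random):
    """Call `call()` until it succeeds or attempts run out.

    Waits base_delay * 2**attempt (capped at max_delay) between attempts,
    scaled by a random factor in [0.5, 1.0) so clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * (0.5 + rng() / 2))
```

In practice you would likely narrow `retryable` to transient errors (timeouts, connection resets, 5xx responses) and only wrap requests that are safe to repeat.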

Cloud Run Monitoring: Key Commands

Check service health via gcloud

# List all Cloud Run services and their status
gcloud run services list --platform=managed --region=us-central1

# Get details on a specific service
gcloud run services describe my-service --platform=managed --region=us-central1

# Check recent revisions
gcloud run revisions list --service=my-service --platform=managed --region=us-central1

# Read the 50 most recent Cloud Run log entries (newest first)
gcloud logging read "resource.type=cloud_run_revision resource.labels.service_name=my-service" \
  --limit=50 --format="table(timestamp,textPayload)" --order=desc

Check service URL and test

# Get the service URL
SERVICE_URL=$(gcloud run services describe my-service \
  --platform=managed --region=us-central1 \
  --format="value(status.url)")

# Test with auth token (for private services)
TOKEN=$(gcloud auth print-identity-token)
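# Cloud Run reserves some URL paths ending in "z", so /healthz may never reach
# the container; probe your app's own health path instead (/health is a placeholder)
curl -H "Authorization: Bearer $TOKEN" "$SERVICE_URL/health"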

Check Cloud Run quotas

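# Cloud Run quotas are not part of `gcloud compute project-info describe`
# (that command reports Compute Engine quotas). One way to list them, assuming
# the alpha Service Usage commands are installed and my-project is your project ID:
gcloud alpha services quota list \
  --service=run.googleapis.com \
  --consumer=projects/my-project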

Q1 2026 GCP Incidents (Public Record)

Google publishes incident post-mortems at status.cloud.google.com/summary. Notable Q1 2026 events from the public record:

| Date | Service | Affected regions | Description | Source |
| --- | --- | --- | --- | --- |
| Jan–Mar 2026 | Various | Multiple | See official incident summary for current Q1 incidents | status.cloud.google.com |

We link to GCP's public incident summary rather than reproduce it — their post-mortems contain accurate root cause analyses and timeline details.


Is It GCP, My Container, or My Network? Decision Tree

  1. Check GCP status page first — is there an active incident for your service + region?
    • Yes → watch the incident; implement fallback if possible
    • No → continue to step 2
  2. Can you deploy a hello-world container to Cloud Run in the same region?
    • No → likely a regional Cloud Run issue; try a different region
    • Yes → your container or config is the issue
  3. Does the container work locally?
    • docker build -t my-image . && docker run -p 8080:8080 -e PORT=8080 my-image
    • No → container bug
    • Yes → environment/config mismatch (secrets, IAM, env vars, connection strings)
  4. Is it a dependency issue?
    • Cloud SQL connection → verify Cloud SQL Auth Proxy or connector config
    • Secret Manager → verify service account has roles/secretmanager.secretAccessor
    • Pub/Sub → verify service account has roles/pubsub.publisher and/or roles/pubsub.subscriber
  5. Is it a quota issue?
    • Check IAM & Admin → Quotas in the Console, filtered to the Cloud Run Admin API (gcloud compute project-info describe only reports Compute Engine quotas)
    • Common limits: requests per second, CPU allocation, number of revisions
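The decision tree above can be encoded as a small function, which is handy in a runbook or an on-call bot. This is only a sketch: the parameter names and the returned messages are illustrative, and each boolean is something you determine yourself by running the corresponding check.

```python
def next_step(active_incident, hello_world_deploys, works_locally,
              dependencies_ok, within_quota):
    """Walk the five checks in order and return the first actionable finding."""
    if active_incident:
        return "GCP incident: watch the status page and fail over if possible"
    if not hello_world_deploys:
        return "Likely regional Cloud Run issue: try a different region"
    if not works_locally:
        return "Container bug: fix the image before redeploying"
    if not dependencies_ok:
        return "Dependency issue: check Cloud SQL, Secret Manager, and Pub/Sub access"
    if not within_quota:
        return "Quota issue: review Cloud Run limits for the project"
    # Everything checks out individually, so the deployed environment differs
    # from the local one in some way.
    return "Environment/config mismatch: compare secrets, IAM, env vars, connection strings"
```

For example, next_step(False, True, False, True, True) points at the container itself, because the hello-world deploy succeeded but the image fails locally.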

GCP Multi-Region Failover Pattern

Cloud Run is regional. If us-central1 has an outage, your service in that region goes down. Mitigations:

Option 1: Global Load Balancer + Multi-Region Cloud Run

# Deploy same service to multiple regions
for REGION in us-central1 us-east1 europe-west1; do
  gcloud run deploy my-service \
    --image gcr.io/my-project/my-image:latest \
    --region $REGION \
    --platform managed
done

# Create Network Endpoint Groups for each region
for REGION in us-central1 us-east1 europe-west1; do
  gcloud compute network-endpoint-groups create my-service-neg-$REGION \
    --region=$REGION \
    --network-endpoint-type=serverless \
    --cloud-run-service=my-service
done

Option 2: Traffic Director for automatic failover

Use GCP's Traffic Director (now part of Cloud Service Mesh) to route away from unhealthy regions automatically. This requires load balancer configuration; see the GCP documentation.

Option 3: Minimum instances to avoid cold start amplification during incidents

gcloud run services update my-service \
  --min-instances=1 \
  --region=us-central1

During a partial outage, when some instances are being recycled, keeping a minimum number of warm instances reduces the cold-start spike that would otherwise compound the problem.
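The options above handle failover on the server side. Where a load balancer is not yet in place, a caller can apply a simpler client-side fallback: try each regional URL in order and use the first one that passes a health probe. This is a different technique from the load balancer approach, and the URLs and function names below are hypothetical.

```python
import urllib.request


def http_probe(url, timeout=3):
    """True if a GET to `url` returns a 2xx status within `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return 200 <= response.status < 300


def first_healthy(endpoints, is_healthy=http_probe):
    """Return the first endpoint whose health probe passes, or None."""
    for url in endpoints:
        try:
            if is_healthy(url):
                return url
        except OSError:
            # urllib's URLError/HTTPError and socket timeouts all derive from
            # OSError; treat them as "unhealthy" and try the next region.
            continue
    return None
```

The endpoint list would hold the per-region service URLs in priority order, for example the status.url values returned by gcloud run services describe for each region.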


Cloud Run Monitoring Best Practices

Cloud Monitoring alerts for Cloud Run

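{
  "displayName": "Cloud Run Error Rate Alert",
  "combiner": "OR",
  "conditions": [{
    "displayName": "5xx error ratio > 5%",
    "conditionThreshold": {
      "filter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_count\" AND metric.labels.response_code_class=\"5xx\"",
      "aggregations": [{
        "alignmentPeriod": "60s",
        "perSeriesAligner": "ALIGN_RATE",
        "crossSeriesReducer": "REDUCE_SUM",
        "groupByFields": ["resource.label.service_name"]
      }],
      "denominatorFilter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_count\"",
      "denominatorAggregations": [{
        "alignmentPeriod": "60s",
        "perSeriesAligner": "ALIGN_RATE",
        "crossSeriesReducer": "REDUCE_SUM",
        "groupByFields": ["resource.label.service_name"]
      }],
      "comparison": "COMPARISON_GT",
      "thresholdValue": 0.05,
      "duration": "60s"
    }
  }],
  "alertStrategy": {
    "autoClose": "604800s"
  }
}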
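The intent of this policy is to fire when more than 5% of requests return a 5xx response for the whole evaluation window. The arithmetic can be sketched locally as follows; the function names and the per-window sample format are assumptions for illustration, not the Cloud Monitoring API.

```python
def error_ratio(count_5xx, count_total):
    """Fraction of requests that returned 5xx; 0.0 when there was no traffic."""
    if count_total <= 0:
        return 0.0
    return count_5xx / count_total


def should_alert(samples, threshold=0.05):
    """Decide whether the alert condition holds.

    `samples` is a list of (count_5xx, count_total) pairs, one per alignment
    period inside the alert duration. The condition holds only if every
    period is strictly above the threshold.
    """
    return bool(samples) and all(
        error_ratio(errors, total) > threshold for errors, total in samples
    )
```

Note that the comparison is strict: a window with exactly 5 errors in 100 requests does not trip a "greater than 5%" condition.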

External uptime check (monitors from outside GCP)

Cloud Monitoring's built-in uptime checks run from Google-operated checker locations, so an incident that affects Google's own network or monitoring stack can also affect the checks. Use an external monitoring service (like ezmon.com) that checks from multiple geographic locations outside GCP to detect regional failures independently.

External monitoring catches what GCP internal monitoring misses:

  • GCP Load Balancer routing failures
  • DNS propagation issues
  • Global anycast routing problems
  • SSL certificate expiry
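Of the items above, certificate expiry is the easiest to check yourself with the Python standard library. The sketch below reads a server certificate's notAfter field and converts it to days remaining; the function names and the idea of a warning threshold are illustrative assumptions.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter timestamp (negative if expired)."""
    now = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host, port=443, timeout=5):
    """Read the notAfter field from a server's certificate (requires network access)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call days_until_expiry(fetch_not_after("your-domain.example")) and warn when the result drops below a threshold such as 14 days.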

Bottom Line

When GCP has issues, the fastest path to resolution is:

  1. Check status.cloud.google.com for your service + region
  2. Check Cloud Run logs for container-level errors
  3. Verify IAM bindings and service account permissions
  4. Use gcloud run services describe to check deployment health
  5. If a GCP issue is confirmed: fail over to another region or wait for remediation

Set up external uptime monitoring at ezmon.com to get alerted the moment your Cloud Run service becomes unreachable — before your users report it.
