
Is GCP Cloud Run Down? Google Cloud Status & Diagnostics in 2026

ezmon.com team • March 18, 2026 • 9 min read

Is Google Cloud Platform Down Right Now?

Google Cloud Platform powers millions of production workloads — Cloud Run serverless containers, GKE clusters, BigQuery analytics, Cloud SQL databases, and hundreds of other services. When GCP has issues, the blast radius can be massive.

Here's how to check GCP status instantly, diagnose Cloud Run–specific issues, and know when it's GCP vs. your deployment.

Step 1: Check GCP's Official Status Page

Google maintains a real-time status dashboard at status.cloud.google.com.

Key things to check:

The specific service you depend on (Cloud Run, GKE, Cloud SQL, Pub/Sub, etc.)
Your specific region (us-central1, us-east1, europe-west1, asia-northeast1, etc.)
Whether the incident is marked as a service disruption (partial) or a service outage (complete)

The GCP status page is also available as a JSON feed. This one-liner prints every incident that has no end time yet:

curl -s https://status.cloud.google.com/incidents.json | \
  python3 -c "import sys,json; incidents=json.load(sys.stdin); \
  [print(f'{i[\"service_name\"]}: {i[\"most_recent_update\"][\"text\"][:100]}') \
  for i in incidents if not i.get('end')]"
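For anything beyond a one-off check, the same filter is easier to maintain as a small script. The sketch below reuses the field names from the one-liner above (service_name, most_recent_update, end). The currently_affected_locations field and all function names are assumptions for illustration, so verify them against the live feed before relying on the region filter.

```python
import json
import urllib.request

INCIDENTS_URL = "https://status.cloud.google.com/incidents.json"


def fetch_incidents(url=INCIDENTS_URL, timeout=10):
    """Download and parse the public incident feed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def open_incidents(incidents, service=None, region=None):
    """Return incidents with no end time, optionally filtered.

    service: case-insensitive substring matched against service_name.
    region:  region ID matched against currently_affected_locations
             (assumed field; incidents without it are kept so nothing is hidden).
    """
    matches = []
    for incident in incidents:
        if incident.get("end"):
            continue  # already resolved
        name = incident.get("service_name", "")
        if service and service.lower() not in name.lower():
            continue
        locations = incident.get("currently_affected_locations")
        if region and locations:
            ids = {loc.get("id") for loc in locations}
            if region not in ids:
                continue
        matches.append(incident)
    return matches


def summarize(incident, width=100):
    """One-line summary: service name plus the start of the latest update."""
    text = incident.get("most_recent_update", {}).get("text", "")
    return f"{incident.get('service_name', 'unknown')}: {text[:width]}"
```

A typical call would be open_incidents(fetch_incidents(), service="Cloud Run", region="us-central1"), printing summarize(i) for each result. Incidents that span several products may not carry the product name in service_name, so an empty result is a hint, not proof, that nothing is wrong.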

Subscribe to the Atom feed at status.cloud.google.com/feed.atom to receive incident updates in a feed reader or alerting tool.

GCP Service Architecture: What Can Break

GCP incidents rarely affect all services simultaneously. The most common failure patterns:

Service	What it does	Failure mode
Cloud Run	Serverless container execution	New deployments fail, cold start timeouts, traffic-splitting errors
GKE	Managed Kubernetes	Control plane API unavailable, node auto-provisioning fails, cluster upgrades stuck
Cloud SQL	Managed PostgreSQL/MySQL/SQL Server	Connection failures, failover to replica, storage I/O degradation
Pub/Sub	Managed message queue	Message delivery delays, publish failures, subscription backlog growth
Cloud Storage (GCS)	Object storage	Upload/download latency, bucket operation errors
Cloud Functions	Serverless functions (gen 1/gen 2)	Deployment failures, invocation timeouts, IAM permission errors
BigQuery	Serverless analytics warehouse	Query job failures, slot exhaustion, streaming insert delays
Cloud Spanner	Globally distributed RDBMS	Regional leader election delays, transaction abort rate spikes
Cloud Load Balancing	Global/regional HTTP/TCP load balancers	Backend health check failures, SSL cert errors, routing misconfigurations
IAM / Cloud Identity	Authentication and authorization	Service account token failures, permission propagation delays

Cloud Run Diagnostics: Is It GCP or Your Container?

Cloud Run issues are frequently misidentified as GCP outages when they're actually container, IAM, or configuration issues. Use this table to triage:

Symptom	Most likely cause	How to verify
Deployment fails with "Build error"	Container build issue (Cloud Build)	Check Cloud Build history in Console; look for Dockerfile errors
Service returns 500 on all requests	Container crash on startup	Cloud Run logs → filter for "container failed to start"
Service returns 503 on all requests	No healthy instances (crash loop or traffic not routed)	Check instance count in Cloud Run metrics; look for OOM kills
Cold starts very slow (>10s)	Large container image or CPU throttling	Check image size; enable CPU always-on or min-instances
Timeout errors under load	Max concurrency too low or backend slow	Increase `--concurrency`; check downstream service latency
403 Forbidden on all requests	IAM policy missing allUsers (public) or invoker binding	Check Cloud Run → Security → Authentication setting
Can't connect to Cloud SQL	Missing Cloud SQL Client IAM role or wrong connection name	Verify `--add-cloudsql-instances` flag and service account permissions
New revision deployed but no traffic	Traffic split still pointing to old revision	Check traffic allocation in Cloud Run console
Works in one region, fails in another	Regional GCP incident or regional service limit	Check GCP status page for the specific region
Intermittent failures ~1% requests	GCP underlying infrastructure noise (common in large services)	Check error rate in Cloud Monitoring; implement retries with exponential backoff
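The last row of the table recommends retries with exponential backoff for low-rate intermittent failures. A minimal sketch of that pattern follows; the function names, default delays, and the "full jitter" strategy are illustrative choices, not something prescribed by Cloud Run.

```python
import random
import time


def backoff_delays(max_attempts=5, base=0.5, cap=8.0):
    """Upper bounds (seconds) for the wait before each retry: base * 2^n, capped."""
    return [min(cap, base * (2 ** n)) for n in range(max_attempts - 1)]


def call_with_retries(func, is_retryable, max_attempts=5, base=0.5, cap=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call func() until it succeeds or attempts run out.

    is_retryable(exc) decides whether an exception is transient.
    Each wait is a random fraction of the capped exponential bound,
    which spreads out retries from many clients ("full jitter").
    """
    delays = backoff_delays(max_attempts, base, cap)
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise  # give up: out of attempts or not a transient error
            sleep(rng() * delays[attempt])
```

Only retry operations that are safe to repeat (idempotent requests), and keep max_attempts small so retries do not add load during a real outage.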

Cloud Run Monitoring: Key Commands

Check service health via gcloud

# List all Cloud Run services and their status
gcloud run services list --platform=managed --region=us-central1

# Get details on a specific service
gcloud run services describe my-service --platform=managed --region=us-central1

# Check recent revisions
gcloud run revisions list --service=my-service --platform=managed --region=us-central1

# Read the 50 most recent Cloud Run log entries (newest first)
gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=my-service" \
  --limit=50 --format="table(timestamp,textPayload)" --order=desc

Check service URL and test

# Get the service URL
SERVICE_URL=$(gcloud run services describe my-service \
  --platform=managed --region=us-central1 \
  --format="value(status.url)")

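# Test with auth token (for private services)
# Note: Cloud Run reserves some URL paths ending in "z", and /healthz is commonly
# reported as one of them, so expose your health endpoint on a path such as /health
TOKEN=$(gcloud auth print-identity-token)
curl -H "Authorization: Bearer $TOKEN" "$SERVICE_URL/health"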

Check Cloud Run quotas

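# Cloud Run quotas belong to the Cloud Run Admin API (run.googleapis.com).
# `gcloud compute project-info describe` lists only Compute Engine quotas, so it will not show them.
# Assuming the Cloud Quotas commands are available in your gcloud version:
gcloud beta quotas info list --service=run.googleapis.com --project=my-project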

Q1 2026 GCP Incidents (Public Record)

Google publishes incident post-mortems at status.cloud.google.com/summary. Notable Q1 2026 events from the public record:

Date	Service	Affected regions	Description	Source
Jan–Mar 2026	Various	Multiple	See official incident summary for current Q1 incidents	status.cloud.google.com

We link to GCP's public incident summary rather than reproduce it — their post-mortems contain accurate root cause analyses and timeline details.

Is It GCP, My Container, or My Network? Decision Tree

1. Check the GCP status page first: is there an active incident for your service + region?
- Yes → watch the incident; implement fallback if possible
- No → continue to step 2
2. Can you deploy a hello-world container to Cloud Run in the same region?
- No → likely a regional Cloud Run issue; try a different region
- Yes → your container or config is the issue
3. Does the container work locally?
- docker build -t my-image . && docker run -p 8080:8080 -e PORT=8080 my-image
- No → container bug
- Yes → environment/config mismatch (secrets, IAM, env vars, connection strings)
4. Is it a dependency issue?
- Cloud SQL connection → verify Cloud SQL Auth Proxy or connector config
- Secret Manager → verify service account has roles/secretmanager.secretAccessor
- Pub/Sub → verify service account has publish/subscribe roles (roles/pubsub.publisher, roles/pubsub.subscriber)
5. Is it a quota issue?
- Check IAM & Admin → Quotas in Console, filtered to the Cloud Run Admin API (gcloud compute project-info describe lists only Compute Engine quotas)
- Common limits: requests per second, CPU allocation, number of revisions
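The first three checks form a strict sequence, so they can be captured in a small runbook helper. This is only a sketch of the ordering described above: the function name and return strings are made up for illustration, and the dependency and quota checks are folded into the final branch.

```python
def triage(active_incident, hello_world_deploys, works_locally):
    """Walk the first three checks in order and return the first conclusion reached.

    active_incident:     status page shows an incident for your service + region
    hello_world_deploys: a hello-world container deploys in the same region
    works_locally:       your own image runs correctly under docker
    """
    if active_incident:
        return "GCP incident: watch the status page and fail over if possible"
    if not hello_world_deploys:
        return "Likely regional Cloud Run issue: try a different region"
    if not works_locally:
        return "Container bug: fix the image before redeploying"
    # Everything upstream is healthy, so the difference is in the environment.
    return "Environment/config mismatch: check secrets, IAM, env vars, dependencies, quotas"
```

Encoding the order matters more than the code itself: checking the status page first avoids debugging a healthy container during a real incident.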

GCP Multi-Region Failover Pattern

Cloud Run is regional. If us-central1 has an outage, your service in that region goes down. Mitigations:

Option 1: Global Load Balancer + Multi-Region Cloud Run

# Deploy same service to multiple regions
for REGION in us-central1 us-east1 europe-west1; do
  gcloud run deploy my-service \
    --image gcr.io/my-project/my-image:latest \
    --region $REGION \
    --platform managed
done

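# Create one global backend service (my-service-backend is a placeholder name)
gcloud compute backend-services create my-service-backend \
  --global --load-balancing-scheme=EXTERNAL_MANAGED

# Create a serverless Network Endpoint Group per region and attach it to the backend
for REGION in us-central1 us-east1 europe-west1; do
  gcloud compute network-endpoint-groups create my-service-neg-$REGION \
    --region=$REGION --network-endpoint-type=serverless --cloud-run-service=my-service
  gcloud compute backend-services add-backend my-service-backend --global \
    --network-endpoint-group=my-service-neg-$REGION \
    --network-endpoint-group-region=$REGION
done
# Still required: a URL map, target HTTPS proxy, and forwarding rule in front of the backend service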

Option 2: Traffic Director for automatic failover

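GCP's Traffic Director (now part of Cloud Service Mesh) can route traffic away from unhealthy backends automatically. It builds on a load balancer configuration like the one in Option 1. Serverless NEGs do not support regular health checks, so check the GCP documentation for which failover mechanisms, such as outlier detection on the backend service, apply to Cloud Run backends.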

Option 3: Minimum instances to avoid cold start amplification during incidents

gcloud run services update my-service \
  --min-instances=1 \
  --region=us-central1

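During a partial outage, when some instances are being recycled, keeping at least one warm instance reduces the cold-start spike that would otherwise compound the problem. Minimum instances are billed while idle, so weigh that cost against the latency benefit.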
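The failover idea behind Option 1 can also be illustrated from the caller's side: try regions in preference order and use the first one that answers. The sketch below is not a replacement for a load balancer. The function names and the example URLs are hypothetical, and the probe is injectable so the logic can be tested without network access.

```python
import urllib.request


def http_probe(url, timeout=3):
    """Return True when the URL answers with a 2xx status.

    urlopen raises for 4xx/5xx and for network errors; the caller treats
    any exception as an unhealthy region.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return 200 <= resp.status < 300


def first_healthy(region_urls, probe=http_probe):
    """Return (region, url) for the first region whose probe succeeds, else None.

    region_urls: mapping of region -> base URL, preferred region first
                 (dicts preserve insertion order).
    """
    for region, url in region_urls.items():
        try:
            if probe(url):
                return region, url
        except Exception:
            continue  # unreachable or erroring region: try the next one
    return None


# Hypothetical per-region URLs, preferred region first.
REGION_URLS = {
    "us-central1": "https://my-service-us-central1.example.com/health",
    "us-east1": "https://my-service-us-east1.example.com/health",
    "europe-west1": "https://my-service-europe-west1.example.com/health",
}
```

Calling first_healthy(REGION_URLS) returns the preferred region while it is up and falls through to the next region when it is not. A global load balancer does the same job for all clients at once, which is why it remains the primary recommendation.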

Cloud Run Monitoring Best Practices

Cloud Monitoring alerts for Cloud Run

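{
  "displayName": "Cloud Run Error Rate Alert",
  "combiner": "OR",
  "conditions": [{
    "displayName": "5xx ratio > 5%",
    "conditionThreshold": {
      "filter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_count\" AND metric.labels.response_code_class=\"5xx\"",
      "aggregations": [{
        "alignmentPeriod": "60s",
        "perSeriesAligner": "ALIGN_RATE",
        "crossSeriesReducer": "REDUCE_SUM"
      }],
      "denominatorFilter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_count\"",
      "denominatorAggregations": [{
        "alignmentPeriod": "60s",
        "perSeriesAligner": "ALIGN_RATE",
        "crossSeriesReducer": "REDUCE_SUM"
      }],
      "comparison": "COMPARISON_GT",
      "thresholdValue": 0.05,
      "duration": "60s"
    }
  }],
  "alertStrategy": {
    "autoClose": "604800s"
  }
}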
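The 0.05 threshold in the policy above is meant as a ratio: 5xx responses divided by all responses over the evaluation window. The sketch below shows that arithmetic on plain request counts. The function names and the dictionary shape are assumptions for illustration; the keys mirror the response_code_class label values.

```python
def error_ratio(counts_by_class):
    """Fraction of requests that returned 5xx; 0.0 when there was no traffic."""
    total = sum(counts_by_class.values())
    if total == 0:
        return 0.0
    return counts_by_class.get("5xx", 0) / total


def should_alert(counts_by_class, threshold=0.05):
    """Mirror COMPARISON_GT: fire only when the ratio is strictly above the threshold."""
    return error_ratio(counts_by_class) > threshold
```

For example, 60 errors out of 1,000 requests gives a ratio of 0.06 and fires, while 50 out of 1,000 sits exactly at the threshold and does not. The zero-traffic case matters for low-volume services, where a single failed request would otherwise read as a 100% error rate.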

External uptime check (monitors from outside GCP)

Cloud Monitoring's built-in uptime checks run from within GCP infrastructure — they may not catch regional issues. Use an external monitoring service (like ezmon.com) that checks from multiple geographic locations outside GCP to detect regional failures independently.

External monitoring catches what GCP internal monitoring misses:

GCP Load Balancer routing failures
DNS propagation issues
Global anycast routing problems
SSL certificate expiry
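Of the items above, SSL certificate expiry is the easiest to check yourself from any machine outside GCP. The sketch below uses only the Python standard library; the function names and the 14-day warning window mentioned afterwards are illustrative assumptions.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=443, timeout=5):
    """Connect over TLS and return the certificate's notAfter string."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Days between now and a notAfter value such as 'Jan  5 09:34:43 2018 GMT'.

    now is a Unix timestamp; it defaults to the current time and is
    injectable so the calculation can be tested deterministically.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return (expires - now) / 86400
```

A scheduled job could call days_until_expiry(fetch_not_after("my-service.example.com")) and warn when the result drops below, say, 14 days. Because the default context verifies the certificate, an already expired or mismatched certificate raises an ssl error during the handshake, which is itself a useful alert signal.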

Where to Get GCP Status Updates

Status page: status.cloud.google.com
Incident history + post-mortems: status.cloud.google.com/summary
RSS feed: status.cloud.google.com/feed.atom
JSON API: https://status.cloud.google.com/incidents.json
Personalized Service Health: GCP Console → Home → "View service health"
@GoogleCloud on X: official announcements during major incidents

Bottom Line

When GCP has issues, the fastest path to resolution is:

Check status.cloud.google.com for your service + region
Check Cloud Run logs for container-level errors
Verify IAM bindings and service account permissions
Use gcloud run services describe to check deployment health
If confirmed GCP issue: implement multi-region failover or wait for remediation

Set up external uptime monitoring at ezmon.com to get alerted the moment your Cloud Run service becomes unreachable — before your users report it.
