Is GCP Cloud Run Down? Google Cloud Status & Diagnostics in 2026
Is Google Cloud Platform Down Right Now?
Google Cloud Platform powers millions of production workloads — Cloud Run serverless containers, GKE clusters, BigQuery analytics, Cloud SQL databases, and hundreds of other services. When GCP has issues, the blast radius can be massive.
Here's how to check GCP status instantly, diagnose Cloud Run–specific issues, and know when it's GCP vs. your deployment.
Step 1: Check GCP's Official Status Page
Google maintains a real-time status dashboard at status.cloud.google.com.
Key things to check:
- The specific service you depend on (Cloud Run, GKE, Cloud SQL, Pub/Sub, etc.)
- Your specific region (us-central1, us-east1, europe-west1, asia-northeast1, etc.)
- Whether the incident is a Disruption (partial) or an Outage (complete)
The GCP status page is also available as a JSON API:
curl https://status.cloud.google.com/incidents.json | \
python3 -c "import sys,json; incidents=json.load(sys.stdin); \
[print(f'{i[\"service_name\"]}: {i[\"most_recent_update\"][\"text\"][:100]}') \
for i in incidents if not i.get('end')]"
Subscribe to the Atom feed at status.cloud.google.com/feed.atom in a feed reader to see updates as incidents are posted.
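If the one-liner above gets hard to maintain, the same check can be written as a small script. This is a minimal sketch that assumes only the fields the one-liner already uses (`service_name`, `end`, and `most_recent_update.text`). The function names are illustrative, not part of any Google tooling.

```python
import json
import urllib.request

INCIDENTS_URL = "https://status.cloud.google.com/incidents.json"


def fetch_incidents(url=INCIDENTS_URL, timeout=10):
    """Download and decode the public incident feed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def active_incidents(incidents, service=None):
    """Return incidents with no 'end' value, optionally limited to one service.

    The service match is a case-insensitive substring test on 'service_name'.
    """
    matches = []
    for incident in incidents:
        if incident.get("end"):
            continue  # already resolved
        name = incident.get("service_name", "")
        if service and service.lower() not in name.lower():
            continue
        matches.append(incident)
    return matches


def summarize(incident, width=100):
    """One-line summary: service name plus the start of the latest update."""
    text = incident.get("most_recent_update", {}).get("text", "")
    return f"{incident.get('service_name', 'unknown')}: {text[:width]}"
```

Calling `active_incidents(fetch_incidents(), "Cloud Run")` and printing `summarize(i)` for each result narrows the output to open Cloud Run incidents.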
GCP Service Architecture: What Can Break
GCP incidents rarely affect all services simultaneously. The most common failure patterns:
| Service | What it does | Failure mode |
|---|---|---|
| Cloud Run | Serverless container execution | New deployments fail, cold start timeouts, traffic-splitting errors |
| GKE | Managed Kubernetes | Control plane API unavailable, node auto-provisioning fails, cluster upgrades stuck |
| Cloud SQL | Managed PostgreSQL/MySQL/SQL Server | Connection failures, failover to replica, storage I/O degradation |
| Pub/Sub | Managed message queue | Message delivery delays, publish failures, subscription backlog growth |
| Cloud Storage (GCS) | Object storage | Upload/download latency, bucket operation errors |
| Cloud Functions | Serverless functions (gen 1/gen 2) | Deployment failures, invocation timeouts, IAM permission errors |
| BigQuery | Serverless analytics warehouse | Query job failures, slot exhaustion, streaming insert delays |
| Cloud Spanner | Globally distributed RDBMS | Regional leader election delays, transaction abort rate spikes |
| Cloud Load Balancing | Global/regional HTTP/TCP load balancers | Backend health check failures, SSL cert errors, routing misconfigurations |
| IAM / Cloud Identity | Authentication and authorization | Service account token failures, permission propagation delays |
Cloud Run Diagnostics: Is It GCP or Your Container?
Cloud Run issues are frequently misidentified as GCP outages when they're actually container, IAM, or configuration issues. Use this table to triage:
| Symptom | Most likely cause | How to verify |
|---|---|---|
| Deployment fails with "Build error" | Container build issue (Cloud Build) | Check Cloud Build history in Console; look for Dockerfile errors |
| Service returns 500 on all requests | Container crash on startup | Cloud Run logs → filter for "container failed to start" |
| Service returns 503 on all requests | No healthy instances (crash loop or traffic not routed) | Check instance count in Cloud Run metrics; look for OOM kills |
| Cold starts very slow (>10s) | Large container image or CPU throttling | Check image size; enable CPU always-on or min-instances |
| Timeout errors under load | Max concurrency too low or backend slow | Increase --concurrency; check downstream service latency |
| 403 Forbidden on all requests | IAM policy missing allUsers (public) or invoker binding | Check Cloud Run → Security → Authentication setting |
| Can't connect to Cloud SQL | Missing Cloud SQL Client IAM role or wrong connection name | Verify --add-cloudsql-instances flag and service account permissions |
| New revision deployed but no traffic | Traffic split still pointing to old revision | Check traffic allocation in Cloud Run console |
| Works in one region, fails in another | Regional GCP incident or regional service limit | Check GCP status page for the specific region |
| Intermittent failures ~1% requests | GCP underlying infrastructure noise (common in large services) | Check error rate in Cloud Monitoring; implement retries with exponential backoff |
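For the last row of the table, retries with exponential backoff can be sketched as follows. The helper name, the default delays, and the use of random "full jitter" are assumptions made for illustration. Apply it only to requests that are safe to repeat.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    After the n-th failure (counting from 0) it waits a random time between
    0 and min(max_delay, base_delay * 2**n), then tries again. The error
    from the final attempt is re-raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

Passing a narrower `retry_on` tuple (for example only connection and timeout errors) keeps the helper from retrying failures that will never succeed, such as a 403.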
Cloud Run Monitoring: Key Commands
Check service health via gcloud
# List all Cloud Run services and their status
gcloud run services list --platform=managed --region=us-central1
# Get details on a specific service
gcloud run services describe my-service --platform=managed --region=us-central1
# Check recent revisions
gcloud run revisions list --service=my-service --platform=managed --region=us-central1
# Read the 50 most recent Cloud Run log entries, newest first
gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=my-service" \
  --limit=50 --format="table(timestamp,textPayload)" --order=desc
Check service URL and test
# Get the service URL
SERVICE_URL=$(gcloud run services describe my-service \
--platform=managed --region=us-central1 \
--format="value(status.url)")
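# Test with an identity token (for private services)
# Cloud Run reserves some paths ending in "z" (such as /healthz), so this assumes your app exposes /health
TOKEN=$(gcloud auth print-identity-token)
curl -H "Authorization: Bearer $TOKEN" "$SERVICE_URL/health"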
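The status code returned by that request can be mapped back to the triage table. The sketch below is one way to do it. The hint strings paraphrase the table, and the function names are hypothetical. Treat the result as a first place to look, not a diagnosis.

```python
import urllib.error
import urllib.request

# Hints paraphrased from the triage table; starting points, not verdicts.
STATUS_HINTS = {
    403: "IAM: check the invoker binding or the authentication setting",
    500: "Container may be crashing on startup: check the Cloud Run logs",
    503: "No healthy instances: check instance count and look for OOM kills",
}


def triage_status(status_code):
    """Map an HTTP status code to a first thing to check."""
    if 200 <= status_code < 400:
        return "Service is responding normally"
    return STATUS_HINTS.get(
        status_code, "No specific hint: check logs and the GCP status page"
    )


def probe(url, token=None, timeout=10):
    """Send one GET request and return the HTTP status code."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions
```

`triage_status(probe(service_url, token))` gives a one-line hint. Connection-level failures (DNS, TLS, timeouts) are not caught here and surface as exceptions, which is itself a useful signal.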
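Check Cloud Run quotas
# Cloud Run limits are listed in Console → IAM & Admin → Quotas (filter by the Cloud Run Admin API).
# The command below reports Compute Engine project quotas only, which matters if your
# service depends on Compute Engine resources such as load balancers.
gcloud compute project-info describe --format="yaml(quotas)"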
Q1 2026 GCP Incidents (Public Record)
Google publishes incident post-mortems at status.cloud.google.com/summary. Notable Q1 2026 events from the public record:
| Date | Service | Affected regions | Description | Source |
|---|---|---|---|---|
| Jan–Mar 2026 | Various | Multiple | See official incident summary for current Q1 incidents | status.cloud.google.com |
We link to GCP's public incident summary rather than reproduce it — their post-mortems contain accurate root cause analyses and timeline details.
Is It GCP, My Container, or My Network? Decision Tree
1. Check GCP status page first: is there an active incident for your service + region?
   - Yes → watch the incident; implement fallback if possible
   - No → continue to step 2
2. Can you deploy a hello-world container to Cloud Run in the same region?
   - No → likely a regional Cloud Run issue; try a different region
   - Yes → your container or config is the issue
3. Does the container work locally?
   docker build -t my-image . && docker run -p 8080:8080 -e PORT=8080 my-image
   - No → container bug
   - Yes → environment/config mismatch (secrets, IAM, env vars, connection strings)
4. Is it a dependency issue?
   - Cloud SQL connection → verify Cloud SQL Auth Proxy or connector config
   - Secret Manager → verify service account has roles/secretmanager.secretAccessor
   - Pub/Sub → verify service account has publish/subscribe roles
5. Is it a quota issue?
   - Check Cloud Run → Quotas in Console; note that gcloud compute project-info describe reports Compute Engine quotas only
   - Common limits: requests per second, CPU allocation, number of revisions
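The decision tree above can also be written down as a function, which is handy for runbooks or chat-ops bots. This is a minimal sketch. The parameter names are made up for illustration, and each argument is the answer you collected by hand for the corresponding step.

```python
def diagnose(active_incident, hello_world_deploys, works_locally,
             dependencies_ok=True, within_quota=True):
    """Walk the decision tree top to bottom and return the first conclusion."""
    if active_incident:
        return "GCP incident: watch the status page and fall back if possible"
    if not hello_world_deploys:
        return "Likely regional Cloud Run issue: try a different region"
    if not works_locally:
        return "Container bug: fix the image before redeploying"
    if not dependencies_ok:
        return "Dependency issue: check Cloud SQL, Secret Manager, Pub/Sub access"
    if not within_quota:
        return "Quota issue: review Cloud Run quotas for the project"
    return ("Environment/config mismatch: compare secrets, IAM, env vars, "
            "connection strings")
```

For example, `diagnose(False, True, True, dependencies_ok=False)` points at dependency access rather than at GCP itself.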
GCP Multi-Region Failover Pattern
Cloud Run is regional. If us-central1 has an outage, your service in that region goes down. Mitigations:
Option 1: Global Load Balancer + Multi-Region Cloud Run
# Deploy the same service to multiple regions
for REGION in us-central1 us-east1 europe-west1; do
  gcloud run deploy my-service \
    --image gcr.io/my-project/my-image:latest \
    --region $REGION \
    --platform managed
done
# Create a serverless Network Endpoint Group for each region
for REGION in us-central1 us-east1 europe-west1; do
  gcloud compute network-endpoint-groups create my-service-neg-$REGION \
    --region=$REGION \
    --network-endpoint-type=serverless \
    --cloud-run-service=my-service
done
Option 2: Traffic Director for automatic failover
Use GCP's Traffic Director (now part of Cloud Service Mesh) to route away from unhealthy regions automatically. This requires load balancer configuration; see the GCP documentation.
Option 3: Minimum instances to avoid cold start amplification during incidents
gcloud run services update my-service \
--min-instances=1 \
--region=us-central1
During a partial outage, when some instances are being recycled, keeping a minimum number of warm instances limits the cold start spike that would otherwise compound the problem.
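The load balancer options above handle failover on Google's side. For scripts or internal clients that call regional services directly, the same idea can be sketched on the client: try regions in order of preference and use the first one that passes a health probe. Everything here is an assumption made for illustration, including the example.com URLs and the shape of the `is_healthy` callback.

```python
def first_healthy(endpoints, is_healthy):
    """Return the first (region, url) pair whose url passes is_healthy.

    endpoints: ordered list of (region, url) tuples, preferred region first.
    is_healthy: callable taking a url and returning True or False; any
                exception it raises is treated as "unhealthy".
    Returns None when no region passes.
    """
    for region, url in endpoints:
        try:
            if is_healthy(url):
                return region, url
        except Exception:
            continue  # network errors count as an unhealthy region
    return None


# Hypothetical per-region URLs; real ones come from
# `gcloud run services describe` in each region.
ENDPOINTS = [
    ("us-central1", "https://my-service-us-central1.example.com"),
    ("us-east1", "https://my-service-us-east1.example.com"),
    ("europe-west1", "https://my-service-europe-west1.example.com"),
]
```

Keeping the probe as a callback means the selection logic can be tested without any network access, and the probe can be as strict as your service needs.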
Cloud Run Monitoring Best Practices
Cloud Monitoring alerts for Cloud Run
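{
  "displayName": "Cloud Run Error Rate Alert",
  "combiner": "OR",
  "conditions": [{
    "displayName": "5xx error ratio > 5%",
    "conditionThreshold": {
      "filter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_count\" AND metric.labels.response_code_class=\"5xx\"",
      "aggregations": [{
        "alignmentPeriod": "60s",
        "perSeriesAligner": "ALIGN_RATE",
        "crossSeriesReducer": "REDUCE_SUM",
        "groupByFields": ["resource.labels.service_name"]
      }],
      "denominatorFilter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_count\"",
      "denominatorAggregations": [{
        "alignmentPeriod": "60s",
        "perSeriesAligner": "ALIGN_RATE",
        "crossSeriesReducer": "REDUCE_SUM",
        "groupByFields": ["resource.labels.service_name"]
      }],
      "comparison": "COMPARISON_GT",
      "thresholdValue": 0.05,
      "duration": "60s"
    }
  }],
  "alertStrategy": {
    "autoClose": "604800s"
  }
}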
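The 0.05 threshold in the policy is meant as a ratio: 5xx responses divided by all responses over the same window. The arithmetic can be sketched as below. The function names and the dictionary shape are assumptions for illustration, not part of the Cloud Monitoring API.

```python
def error_ratio(counts_by_class):
    """Fraction of requests whose response code class is 5xx.

    counts_by_class: dict such as {"2xx": 950, "4xx": 10, "5xx": 40}.
    Returns 0.0 when there were no requests at all.
    """
    total = sum(counts_by_class.values())
    if total == 0:
        return 0.0
    return counts_by_class.get("5xx", 0) / total


def should_alert(counts_by_class, threshold=0.05):
    """True when the 5xx ratio is strictly above the threshold."""
    return error_ratio(counts_by_class) > threshold
```

With 40 errors out of 1,000 requests the ratio is 4% and no alert fires. With 10 errors out of 100 it is 10% and the alert fires, even though the absolute error count is lower.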
External uptime check (monitors from outside GCP)
Cloud Monitoring's built-in uptime checks run from within GCP infrastructure — they may not catch regional issues. Use an external monitoring service (like ezmon.com) that checks from multiple geographic locations outside GCP to detect regional failures independently.
External monitoring catches what GCP internal monitoring misses:
- GCP Load Balancer routing failures
- DNS propagation issues
- Global anycast routing problems
- SSL certificate expiry
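Of the items in this list, certificate expiry is the easiest to check yourself. The sketch below uses only the Python standard library. The function names are illustrative, and the hostname you pass in is whatever domain fronts your service.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days between `now` (epoch seconds) and a certificate's notAfter string.

    not_after uses the format returned by ssl getpeercert(),
    for example "Jan  1 00:00:00 2030 GMT". Negative means already expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def fetch_not_after(hostname, port=443, timeout=10):
    """Open a TLS connection and return the peer certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

`days_until_expiry(fetch_not_after("my-service.example.com"))` returns the remaining days for a hypothetical domain. Because the default context verifies the certificate, an already expired certificate raises a verification error during the handshake instead of returning a negative number.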
Where to Get GCP Status Updates
- Status page: status.cloud.google.com
- Incident history + post-mortems: status.cloud.google.com/summary
- RSS feed: status.cloud.google.com/feed.atom
- JSON API: https://status.cloud.google.com/incidents.json
- Personalized Service Health: GCP Console → Home → "View service health"
- @GoogleCloud on X: official announcements during major incidents
Related Guides
- Is AWS Down? Amazon Web Services Status Guide
- AWS Lambda / RDS / EC2 Service Diagnostics
- Kubernetes Cluster Issues: Control Plane vs Node vs Networking
- Azure Q1 2026 Reliability Analysis
- Is Cloudflare Down?
Bottom Line
When GCP has issues, the fastest path to resolution is:
- Check status.cloud.google.com for your service + region
- Check Cloud Run logs for container-level errors
- Verify IAM bindings and service account permissions
- Use gcloud run services describe to check deployment health
- If confirmed GCP issue: implement multi-region failover or wait for remediation
Set up external uptime monitoring at ezmon.com to get alerted the moment your Cloud Run service becomes unreachable — before your users report it.