Building a Vercel-like Platform on Cloud Run: Architecture & Design Decisions
Vercel gets the developer experience right: push to git and you get production. Add preview deployments per branch, zero config for common frameworks, and a single dashboard. The goal here is to replicate that experience on GCP with Cloud Run as the compute layer—no vendor lock-in, full control over the pipeline and runtime. This post is for platform engineers weighing an internal PaaS. It covers the technical shape of the golden path we built.
Architecture Overview
Code lands in a Git host (in our case, GitHub). A webhook or trigger fires on push; that trigger runs the build. We use Cloud Build: it clones the repo, runs a builder (Buildpacks or Dockerfile), and pushes the resulting image to Artifact Registry. No image reaches Cloud Run without passing through this step. From the registry, Cloud Run deploys a new revision: each service is a container that scales to zero when idle and scales up on request. Traffic goes to the latest revision by default; you can split traffic for rollouts or pin a preview URL to a specific revision. The webhook can be the native integration (e.g. GitHub App or Cloud Build trigger) or a small orchestrator that receives the event and invokes the build; either way, the contract is "push to branch X → build → deploy to this service."
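To make that concrete, here is a minimal Cloud Build config for the Dockerfile path. This is a sketch, not our production pipeline: the substitution variables are placeholders, and the Buildpacks path swaps the Docker steps for a buildpacks builder.

```yaml
# cloudbuild.yaml: minimal sketch of build -> push -> deploy.
# _REGION, _REPO, and _SERVICE are placeholder substitutions set by the trigger.
steps:
  # Build the image from the repo's Dockerfile.
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', '${_REGION}-docker.pkg.dev/$PROJECT_ID/${_REPO}/${_SERVICE}:$COMMIT_SHA', '.']
  # Push it to Artifact Registry; nothing deploys except images from this registry.
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', '${_REGION}-docker.pkg.dev/$PROJECT_ID/${_REPO}/${_SERVICE}:$COMMIT_SHA']
  # Deploy a new Cloud Run revision from that image.
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - run
      - deploy
      - '${_SERVICE}'
      - '--image=${_REGION}-docker.pkg.dev/$PROJECT_ID/${_REPO}/${_SERVICE}:$COMMIT_SHA'
      - '--region=${_REGION}'
images:
  - '${_REGION}-docker.pkg.dev/$PROJECT_ID/${_REPO}/${_SERVICE}:$COMMIT_SHA'
```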
Routing and DNS sit in front of Cloud Run. For a single default URL you get a *.run.app domain and TLS out of the box. For custom domains you put a load balancer in front, attach a serverless NEG to the Cloud Run service, and point your domain at the load balancer. Google manages TLS for both. Secrets (API keys, DB URLs) live in Secret Manager and are injected at build time or at runtime as environment variables or mounted volumes. Config that varies by environment (feature flags, service URLs) can live in the repo as a small config file, in env vars, or in Secret Manager for sensitive values. The diagram shows secrets and config feeding into the build stage because that is where we inject most of them; runtime-only secrets are available to Cloud Run via the same store.
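Wiring a service behind the load balancer and injecting a runtime secret looks roughly like this. A sketch with placeholder names (my-app, us-central1); the URL map, forwarding rule, and IP reservation are omitted.

```bash
# Expose a Cloud Run service through the global load balancer via a serverless NEG.
gcloud compute network-endpoint-groups create my-app-neg \
  --region=us-central1 \
  --network-endpoint-type=serverless \
  --cloud-run-service=my-app

gcloud compute backend-services create my-app-backend \
  --global \
  --load-balancing-scheme=EXTERNAL_MANAGED

gcloud compute backend-services add-backend my-app-backend \
  --global \
  --network-endpoint-group=my-app-neg \
  --network-endpoint-group-region=us-central1

# Runtime secrets come from Secret Manager, mapped by name into an env var.
gcloud run services update my-app \
  --region=us-central1 \
  --set-secrets=DATABASE_URL=my-app-db-url:latest
```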
Why Cloud Run?
GCP offers several compute options. We chose Cloud Run for this PaaS because it gives container-native semantics without the operational cost of Kubernetes, and it scales to zero so previews and low-traffic apps do not burn budget.
| Option | Pros | Cons |
|---|---|---|
| GKE | Full control, rich ecosystem, long-running jobs, any workload. | More ops (nodes, upgrades, networking). Overkill for "push a web app and run it." |
| App Engine | Simple, fully managed, good for standard web apps. | Less container-native; you work in runtimes and config rather than "my Dockerfile." Harder to mirror local dev. |
| Cloud Functions | Event-driven, pay per invocation, no container to maintain. | Suited to events and small handlers, not full apps with many routes and dependencies. |
| Cloud Run | Containers, scale-to-zero, managed, no node management. | Cold starts, request timeout limits, no long-running background work. |
We wanted "build an image and run it" so that what developers run locally (e.g. Docker) matches what runs in production. App Engine would have forced us into its abstractions. GKE would have forced us to operate a cluster. Cloud Run sits in the middle: you bring a container, we run it. No nodes to patch, no cluster to size. Under the hood, Cloud Run is built on Knative; Google runs the control plane and the data plane. We get request-based scaling, revision management, and traffic splitting without operating Knative ourselves. That matters for a small platform team: we don't run etcd, we don't tune node pools, we don't manage Ingress controllers.
Scale-to-zero is a big deal for an internal PaaS. Many apps are internal tools or previews that get a few hits a day. On GKE or a VM you pay for idle capacity. On Cloud Run you pay for request time and CPU/memory only when the container is handling traffic. Previews per branch become affordable: each preview is a Cloud Run service that scales to zero when nobody is looking. We accepted a few tradeoffs:
- Cold starts: the first request after idle can take a second or two. We accept the default (min instances zero) for cost, and optionally set min instances to one for latency-sensitive production services (see the sketch after this list).
- Request timeout: Cloud Run caps request duration (the default is 5 minutes, configurable up to 60). Long-running HTTP requests or background jobs need another pattern (e.g. a queue and a worker, or a separate job runner). For web apps and APIs that respond in seconds, this was acceptable.
- No long-running processes: we did not try to run cron-like or batch workloads in the same Cloud Run service. The runtime is request-scoped; we kept the model "one request, one response." Jobs that run on a schedule or process queues live elsewhere (e.g. Cloud Run Jobs or a dedicated worker pool). That separation keeps the PaaS simple and avoids overloading a single abstraction.
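In practice, those tradeoffs reduce to a couple of flags per service. A sketch with placeholder names; previews keep the defaults, and only latency-sensitive production services get a warm instance.

```bash
# Pin one warm instance to avoid cold starts; raise the request timeout if needed.
# Cloud Run accepts timeouts up to 3600 seconds (the default is 300).
gcloud run services update my-api \
  --region=us-central1 \
  --min-instances=1 \
  --timeout=300
```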
Key Design Decisions
Decision 1 — Detect and build projects
We needed a single path that works for "drop a repo and go" and for teams that want full control. We chose Buildpacks by default, Dockerfile as escape hatch. The pipeline detects the repo: if it finds a Dockerfile, it builds with Docker; otherwise it uses Cloud Native Buildpacks. Buildpacks give zero-config for many languages and frameworks: we don't require a Dockerfile in the repo. Teams that need a custom build or a specific base image add a Dockerfile and we use it. We considered framework-specific builders (e.g. Next.js-only) but that would have multiplied the number of paths we had to maintain. One generic builder (Buildpacks) plus Docker covers almost everything. The detection rule is explicit: presence of a Dockerfile in the build context switches to Docker build; absence triggers Buildpacks. We document that so teams can opt into either path without guessing.
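The detection rule itself fits in a few lines. This is the logic, not our literal pipeline script; it assumes the pack CLI and the public Google Buildpacks builder.

```bash
# If the build context has a Dockerfile, the team owns the build; otherwise
# Cloud Native Buildpacks detect the language and produce the image.
if [ -f "$BUILD_CONTEXT/Dockerfile" ]; then
  docker build -t "$IMAGE" "$BUILD_CONTEXT"
else
  pack build "$IMAGE" --path "$BUILD_CONTEXT" --builder gcr.io/buildpacks/builder
fi
docker push "$IMAGE"
```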
Decision 2 — Configuration
We aimed for sensible defaults and an optional config file. No config file means "build from the repo root, deploy one service, use the default port." If the app needs a different root, port, or env, we support a small configuration in the UI.
Config that is secret (DB URLs, API keys) lives in Secret Manager and is mapped into the service by name. Non-secret env vars (feature flags, log level) are also managed in the platform UI. Developers can edit their environment variables there; a "sensitive" flag, on by default, marks values that should be hidden from the UI from then on.
Each secret and environment variable can be overridden per environment (staging vs prod) in the platform UI.
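Under the hood, a per-environment override is just a different set of flags on each environment's Cloud Run service. A sketch with placeholder names and values:

```bash
# Staging: verbose logs, staging database secret.
gcloud run services update my-app-staging \
  --region=us-central1 \
  --set-env-vars=LOG_LEVEL=debug,FEATURE_X=on \
  --set-secrets=DATABASE_URL=my-app-staging-db-url:latest

# Prod: quieter logs, prod database secret, same variable names.
gcloud run services update my-app-prod \
  --region=us-central1 \
  --set-env-vars=LOG_LEVEL=info,FEATURE_X=off \
  --set-secrets=DATABASE_URL=my-app-prod-db-url:latest
```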
Decision 3 — Multi-tenancy and isolation
We went with one Cloud Run service per application. Developers create Products; each product can contain multiple applications. Each application is a Cloud Run service with its own environment variables. One service per app keeps URLs predictable and isolation clear. Preview deployments and rollbacks were not implemented in the platform UI in the first version; the architecture supports them (Cloud Run revisions cover both: pin a preview URL to a specific revision, shift traffic back to a known-good revision to roll back), but we deferred the UI and lifecycle automation to a later phase.
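When we do add rollbacks, they will be thin wrappers over revision traffic. A sketch with an illustrative revision name:

```bash
# List recent revisions, then shift all traffic back to a known-good one.
gcloud run revisions list --service=my-app --region=us-central1
gcloud run services update-traffic my-app \
  --region=us-central1 \
  --to-revisions=my-app-00042-abc=100
```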
For the sake of simplicity, we provisioned a single GCP project that became the first "pool" of applications. In the future, we will let developers bring their own GCP project as their Product and deploy applications to it.
Decision 4 — Custom domains and TLS
We created a global HTTP(S) load balancer; each Cloud Run service sits behind a serverless NEG. We set up a wildcard subdomain *.apps.example.com, and each application gets its own hostname derived from the service name and the pool the service is deployed in—for example, service-name--pool-1.apps.example.com. The double-dash syntax is a convention we chose to parse the hostname into a service name and a pool name.
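The parsing is deliberately trivial. A bash sketch; this convention is also why plain service names must not contain a double dash.

```bash
host="service-name--pool-1.apps.example.com"
label="${host%%.*}"      # first DNS label: "service-name--pool-1"
service="${label%%--*}"  # everything before the double dash: "service-name"
pool="${label#*--}"      # everything after it: "pool-1"
```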
TLS is handled by Google-managed certificates on the load balancer. We used Cloud DNS so domain creation and cert attachment could be automated.
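Classic Google-managed certificates don't cover wildcard domains; one way to get a managed wildcard cert is Certificate Manager with a DNS authorization. A sketch with placeholder names; the CNAME record the authorization requires can be created in Cloud DNS.

```bash
# Prove control of the domain, then issue a managed wildcard certificate.
gcloud certificate-manager dns-authorizations create apps-authz \
  --domain="apps.example.com"

gcloud certificate-manager certificates create apps-wildcard \
  --domains="*.apps.example.com,apps.example.com" \
  --dns-authorizations=apps-authz

# Attach the certificate to the load balancer through a certificate map.
gcloud certificate-manager maps create apps-cert-map
gcloud certificate-manager maps entries create apps-wildcard-entry \
  --map=apps-cert-map \
  --hostname="*.apps.example.com" \
  --certificates=apps-wildcard
gcloud compute target-https-proxies update my-https-proxy \
  --certificate-map=apps-cert-map
```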
What Vercel Does That We Didn't Implement (And Why)
We scoped the platform to what we needed: git push to production, one production URL per app, and configuration. Here is what we did not build in the first version and why.
Preview deployments per PR. Vercel spins up a unique URL for every pull request. We did not implement PR previews in the first version, purely to save time. We plan to use Cloud Run revisions so each PR gets its own URL; the open question is how to map those URLs in the load balancer.
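One likely shape for this, sketched with placeholder names: deploy each PR build as a tagged revision that takes no production traffic, and use the tag URL as the preview.

```bash
# Deploy the PR's image as a new revision, tagged, with zero traffic.
gcloud run deploy my-app \
  --region=us-central1 \
  --image=us-central1-docker.pkg.dev/my-project/apps/my-app:pr-123 \
  --tag=pr-123 \
  --no-traffic
# Cloud Run then serves the tagged revision at a stable URL of the form
# https://pr-123---my-app-<hash>-uc.a.run.app; mapping such URLs onto
# *.apps.example.com through the load balancer is the open question above.
```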
Edge functions. Vercel runs code at the edge for low latency and geo distribution. We run everything on Cloud Run in a region (or multi-region) and did not add a separate edge layer. Reason: scope. Our apps are mostly APIs and server-rendered pages; we did not have a strong requirement for edge logic. If we need it later, we can add Cloud Run in multiple regions or use a CDN with edge logic.
Analytics and observability. Vercel gives you traffic, Web Vitals, and deployment analytics in the UI. We rely on GCP: Cloud Monitoring, Logging, and Trace. We did not build a custom analytics dashboard. Reason: GCP already provides the data; we did not want to duplicate it. We documented how to use existing GCP tools and, where useful, added links from our platform UI to the relevant Cloud Run or project metrics.
Serverless DB and storage. Vercel offers Postgres, KV, and Blob in the same ecosystem. We did not implement this in v1. The plan was to let developers provision databases, caches, and storage for their product via Pulumi templates maintained by the infra team—same idea as "provision from the platform," but using our existing GCP choices (e.g. Cloud SQL, Memorystore, GCS) instead of a bundled vendor stack. We may add "provision a DB from the platform" later as a convenience, also exploring other options like Supabase for Platforms.
ISR and static optimization. Vercel has Incremental Static Regeneration and smart caching for static assets. We run the container as-is: if the app is Next.js and uses ISR, it works on Cloud Run, but we did not add a platform-level ISR or edge cache layer. Reason: the app controls its own caching and static strategy; we did not want to sit in the middle. CDN and caching are app or load-balancer configuration, not a separate platform feature in v1. We may add a platform-level ISR or edge cache layer later as a convenience.
In short: we built the path from git to Cloud Run and left everything else to the existing GCP ecosystem or to a later phase. That kept the first version deliverable and let us learn what teams actually needed before adding more. If you're building something similar, start with that path and add features when usage justifies them.
The pipeline is straightforward: source control triggers a build, the build produces an image that lands in Artifact Registry, and Cloud Run serves it. The routing is handled with a global HTTP(S) load balancer and serverless NEGs. We traded edge features and some Vercel conveniences for control and GCP-native integration. That tradeoff fit our use case: we own the pipeline, we can plug in existing GCP services (Secret Manager, Cloud DNS, IAM), and we don't depend on a single vendor for the full stack. The result is a golden path that looks like Vercel from the developer's perspective—push and get a URL, without locking us into one ecosystem.
For the philosophy behind this kind of platform, see Gatekeeping vs. Golden Paths. In a follow-up post we'll cover the build pipeline and Buildpacks in detail.
- Platform Engineering
- GCP
- Cloud Run
- DevOps
- PaaS