April 24, 2026

AWS Lambda vs GCP Cloud Run: picking the right serverless model

aws, gcp, cloud

I have shipped the same kind of workload, a small API service sitting behind a queue, on both AWS Lambda and GCP Cloud Run. They get compared constantly as if they are interchangeable, and for a lot of workloads they are, but the differences show up quickly once traffic stops being predictable.

Deployment shape

Lambda wants a function: one handler, one entry point, packaged as a zip or a container image that AWS loads into an execution environment and then calls once per event. Cloud Run wants a container: you bring a normal HTTP server, put it in a Dockerfile, and Cloud Run runs it behind its own load balancer.

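# Build stage: compile the Go binary.
FROM golang:1.22-alpine AS build
WORKDIR /app
COPY . .
# Disable cgo so the binary is statically linked and does not depend on the runtime image's libc.
RUN CGO_ENABLED=0 go build -o server .

# Runtime stage: ship only the compiled binary on a pinned base image.
FROM alpine:3.20
COPY --from=build /app/server /server
# The server must listen on the port in the PORT environment variable (8080 by default on Cloud Run).
CMD ["/server"]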

That difference sounds small, but it changes how you write the service. On Cloud Run, an instance can hold state in memory across requests, run background goroutines, and keep a warm database connection pool, all the things a normal long-running server does. On Lambda, each execution environment handles one invocation at a time and is frozen between invocations, so in-memory state survives only as long as that environment happens to be reused, and background work cannot be relied on. You either accept that or you start working around it with provisioned concurrency and external connection poolers like RDS Proxy.

Cold starts

Both have cold starts, but they behave differently. Lambda’s cold start happens whenever a new execution environment has to be created, and it gets worse with larger deployment packages and with languages whose runtimes are heavier to initialize (a JVM cold start is a different order of magnitude than a Go binary). Cloud Run’s cold start is per container instance, and because it is a full container boot, you have more control over it: a smaller base image and a fast-starting server binary buy you a lot.

For a Go service specifically, both platforms have fast enough cold starts that it rarely mattered for us. Where it mattered was a Node service with a large dependency tree, where Cloud Run’s ability to keep a minimum number of instances warm made the tail latency far more predictable than Lambda’s provisioned concurrency, which is billed differently and needs its own capacity planning.
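On either platform, the cold-start cost is the setup that runs once per process before the first request can be served. A minimal sketch of that pattern, assuming nothing platform-specific: the instance type and its setup method are hypothetical, and the sleep is a stand-in for real initialization, not a measured figure.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// instance holds state that is built once per process (one Lambda execution
// environment or one Cloud Run container instance) and reused afterwards.
type instance struct {
	once  sync.Once
	inits int // how many times the expensive setup actually ran
}

// setup stands in for cold-start work such as loading config or opening
// connections. The sleep is a placeholder duration.
func (in *instance) setup() {
	time.Sleep(20 * time.Millisecond)
	in.inits++
}

// handle serves one request and reports whether it paid the cold-start cost.
func (in *instance) handle() (cold bool) {
	in.once.Do(func() {
		in.setup()
		cold = true
	})
	return cold
}

func main() {
	in := &instance{}
	for i := 1; i <= 3; i++ {
		start := time.Now()
		cold := in.handle()
		fmt.Printf("request %d cold=%v took=%v\n", i, cold, time.Since(start).Round(time.Millisecond))
	}
}
```

Only the first call pays for setup. Keeping instances warm, whether through minimum instances or provisioned concurrency, is a way of making sure that first call has already happened before real traffic arrives.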

Concurrency and cost

This is the one that actually changed our architecture. Lambda bills per request plus execution duration, and it scales by adding execution environments, each handling one request at a time by default. Cloud Run allows many concurrent requests per instance (up to 1000, though we usually cap it much lower for CPU-bound work), which means a single warm instance can absorb a burst that would spin up dozens of separate Lambda execution environments.

For spiky traffic with mostly idle periods, Lambda’s pay-per-invocation model is hard to beat. For steadier traffic where each request spends most of its time waiting (on a downstream API, for example), Cloud Run’s request concurrency model usually ends up cheaper, because you are not paying for N separate execution environments to all sit there waiting on the same kind of I/O.
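A back-of-the-envelope way to see the difference is Little's law: requests in flight is roughly arrival rate times latency. The sketch below uses hypothetical numbers (200 requests per second, 0.5 seconds per request, a concurrency cap of 40) and only counts environments. It is not a pricing model, and the function names are made up for this example.

```go
package main

import (
	"fmt"
	"math"
)

// inFlight estimates the average number of requests in progress at once
// using Little's law: arrival rate (req/s) times average latency (s).
func inFlight(reqPerSec, latencySec float64) float64 {
	return reqPerSec * latencySec
}

// environments returns how many execution environments or instances are
// needed to hold that many in-flight requests at a given per-instance
// concurrency.
func environments(inFlight float64, perInstance int) int {
	return int(math.Ceil(inFlight / float64(perInstance)))
}

func main() {
	// Hypothetical workload: 200 req/s, each waiting about 0.5 s on a downstream API.
	n := inFlight(200, 0.5) // 100 requests in flight on average

	fmt.Println("one request per environment:", environments(n, 1))
	fmt.Println("40 requests per instance:", environments(n, 40))
}
```

With these numbers, one request per environment needs 100 environments, while 40 concurrent requests per instance needs 3 instances. These are averages: real traffic is burstier, and CPU-bound requests will not pack 40 to an instance, which is why we cap concurrency lower for that kind of work.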

What we picked

We ended up running Lambda for event-driven, bursty background jobs (queue consumers, scheduled cleanup tasks) and Cloud Run for anything that looks like a normal API service with sustained traffic. Neither platform is strictly better. They are optimized for different traffic shapes, and the workload should decide which one you reach for, rather than picking one platform and forcing everything onto it.