Serverless Done Right

Event-driven workloads are the sweet spot. Here's how to make serverless work for them.


Datadog's 2024 serverless report found that 90% of AWS Lambda users now run Node.js or Python, runtimes chosen in large part for their fast cold starts. I criticized serverless as a lie because most implementations fail to deliver on the promises. But I've also built serverless systems that worked beautifully, systems that genuinely reduced operational burden and scaled effortlessly. Here's what separates the successes from the disasters.

TL;DR

Use serverless for event-driven, bursty workloads. Avoid for latency-sensitive or long-running processes. Configure provisioned concurrency to eliminate cold starts.


The difference isn't the technology. It's understanding what serverless is actually good for.

The Sweet Spot: Event-Driven, Bursty Workloads

Serverless excels at one thing: handling unpredictable, spiky traffic without maintaining idle capacity. If you have workloads that go from zero to thousands of requests and back to zero, serverless shines.

Examples that work well:

  • Webhook handlers: External services call your endpoint unpredictably. You don't know when or how often. Lambda handles this perfectly.
  • Image/video processing: User uploads a file, a function processes it. Traffic is inherently bursty and unpredictable.
  • Scheduled tasks: Daily reports, nightly cleanups, periodic syncs. Functions that run for minutes, not continuously.
  • API backends for mobile apps: Traffic varies wildly between 3 AM and peak hours. Scaling to zero during low periods actually saves money.

The pattern: short-lived, stateless operations that respond to events. If your workload fits this model, serverless can genuinely simplify your infrastructure.
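
To make the pattern concrete, here is a minimal sketch of a webhook handler as an AWS Lambda function behind API Gateway. The payload shape and the idea of deferring real work to a queue are illustrative assumptions, not prescriptions.

```python
import json

def handler(event, context):
    # API Gateway delivers the webhook payload as a JSON string.
    body = json.loads(event.get("body") or "{}")

    # Acknowledge quickly; hand real work to a queue so the caller
    # never waits on your processing.
    print(json.dumps({"received": body.get("type", "unknown")}))

    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```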

Avoiding Cold Start Pain

Cold starts kill user experience. A function that hasn't run recently needs to initialize, sometimes taking seconds. For user-facing endpoints, this is unacceptable.

Strategies that actually work:

Provisioned concurrency. AWS Lambda and other providers let you keep functions warm. You pay for idle capacity but eliminate cold starts. This makes sense for latency-sensitive endpoints.
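
For illustration, here is one way to set provisioned concurrency with boto3. The function name and alias are hypothetical, and in practice most teams declare this in their IaC tooling (Terraform, SAM, CDK) rather than calling the API directly.

```python
import boto3

lambda_client = boto3.client("lambda")

# Keep 10 execution environments initialized for the "live" alias.
# You pay for the idle capacity, but requests never hit a cold start.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="checkout-api",  # hypothetical function
    Qualifier="live",             # must be a version or alias, not $LATEST
    ProvisionedConcurrentExecutions=10,
)
```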

Language choice matters. AWS benchmarks show Python and Node.js cold start in hundreds of milliseconds. Java and .NET can take seconds. For latency-sensitive functions, choose languages with fast initialization.

Minimize dependencies. Every library you import adds initialization time. I've seen functions go from 3-second cold starts to 200ms by removing unused dependencies. Be ruthless about what you include.
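
One cheap tactic, sketched below, is deferring heavy imports to the code paths that need them: module-level imports run on every cold start, lazy ones only when that branch executes. The pandas import is a stand-in for any heavy dependency.

```python
import json  # lightweight, fine at module level

def handler(event, context):
    if event.get("generate_report"):
        # Heavy dependency loaded only on the rare path that uses it;
        # cold starts for the common path never pay this cost.
        import pandas as pd
        return {"statusCode": 200, "body": pd.DataFrame().to_json()}
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```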

Keep functions small. A function that does one thing initializes faster than a function that imports your entire application framework.

Use edge computing for latency-critical paths. Cloudflare Workers, Lambda@Edge, and similar services run closer to users with minimal cold start. The tradeoff is more limited compute capabilities.

Managing State Without Servers

Serverless functions are stateless by design. State management is where most serverless projects go wrong. They try to bolt state onto a stateless paradigm.

What works:

External state stores. DynamoDB, Redis, or managed databases hold state between function invocations. Design for this from the start, not as an afterthought.
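
A minimal sketch of externalized state, assuming a hypothetical DynamoDB table keyed by session: each invocation loads what the last one wrote, so no state lives in the function itself.

```python
import boto3

table = boto3.resource("dynamodb").Table("user-sessions")  # hypothetical table

def handler(event, context):
    session_id = event["session_id"]

    # Load whatever state the previous invocation persisted.
    item = table.get_item(Key={"session_id": session_id}).get("Item", {})
    count = int(item.get("request_count", 0)) + 1

    # Write state back so any future invocation, anywhere, can pick it up.
    table.put_item(Item={"session_id": session_id, "request_count": count})
    return {"request_count": count}
```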

Event sourcing. Instead of storing current state, store events that describe what happened. Functions process events and can reconstruct state as needed. This pattern fits serverless naturally.
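
In miniature, with illustrative event shapes: store facts, then fold over them to derive current state.

```python
# Append-only log of facts, not a mutable "current cart" record.
events = [
    {"type": "ItemAdded", "sku": "A1", "qty": 2},
    {"type": "ItemAdded", "sku": "B2", "qty": 1},
    {"type": "ItemRemoved", "sku": "A1", "qty": 1},
]

def apply(state, event):
    delta = event["qty"] if event["type"] == "ItemAdded" else -event["qty"]
    state[event["sku"]] = state.get(event["sku"], 0) + delta
    return state

# Current state is a pure function of the event history.
cart = {}
for e in events:
    cart = apply(cart, e)
print(cart)  # {'A1': 1, 'B2': 1}
```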

Step Functions for workflows. AWS Step Functions (and equivalents like Azure Durable Functions) manage multi-step processes with state. Instead of one complex function tracking state internally, you have simple functions orchestrated by a state machine. This pattern handles retries, timeouts, and error handling declaratively.
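
As a sketch, here is a minimal Amazon States Language definition (expressed as a Python dict) with a declarative retry policy. The state name, function ARN, and retry numbers are placeholders.

```python
import json

state_machine = {
    "StartAt": "ChargeCard",
    "States": {
        "ChargeCard": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:charge-card",
            # Retries live in the orchestration, not inside the function.
            "Retry": [{
                "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "End": True,
        }
    },
}
print(json.dumps(state_machine, indent=2))
```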

Accept eventual consistency. Serverless systems work best when you don't need strong consistency. If your requirements demand transactions across multiple services, serverless adds complexity instead of removing it.

The Right Granularity

Nano-services are as bad as monolithic functions. I've seen projects with hundreds of tiny functions (each handling one operation) become unmaintainable. I've also seen single functions trying to do everything.

The sweet spot:

One function per bounded context or feature. A function handles related operations, not every operation or just one. The "users" function handles create, read, update, delete for users. Not four separate functions, not one function for the entire API.
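
Sketched below with hypothetical handlers: a single "users" function that routes related operations, rather than four micro-functions or one function for the whole API.

```python
import json

def create_user(event): ...
def get_user(event): ...
def update_user(event): ...
def delete_user(event): ...

# Related operations share one deployable unit and one cold-start cost.
ROUTES = {
    ("POST", "/users"): create_user,
    ("GET", "/users/{id}"): get_user,
    ("PUT", "/users/{id}"): update_user,
    ("DELETE", "/users/{id}"): delete_user,
}

def handler(event, context):
    op = ROUTES.get((event["httpMethod"], event["resource"]))
    if op is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(op(event))}
```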

Deploy together, scale together. Group operations that need to scale together. If create-user and validate-user always correlate, they belong in the same function.

Shared code as layers. AWS Lambda Layers, Azure Artifacts, or similar mechanisms share common code without duplicating it across functions. Use layers for utilities, not for tightly-coupled dependencies.

Observability Is Non-Negotiable

Debugging distributed systems is hard. Debugging distributed serverless systems without proper observability is nearly impossible. Before building anything, set up:

Structured logging. Every log entry needs correlation IDs, function name, and context. Use JSON logging that tools can parse. "Error occurred" is useless; "Error in payment-process, orderId=123, error=timeout" is actionable.
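
A minimal sketch; the field names are conventions I'm assuming, not a standard, but the point is that every entry is machine-parseable and carries its context.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def log(level, message, **context):
    # One JSON object per entry; log pipelines can filter on any field.
    logger.log(level, json.dumps({"message": message, **context}))

def handler(event, context):
    log(logging.ERROR, "payment failed",
        correlation_id=event.get("correlation_id"),
        function=context.function_name,  # attribute of the Lambda context object
        order_id=event.get("order_id"),
        error="timeout")
```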

Distributed tracing. AWS X-Ray, Datadog, or similar tools trace requests across functions. Without this, you can't follow a request through your system.
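
With the AWS X-Ray SDK for Python, for example, a couple of lines instrument most of a function; the subsegment name below is illustrative.

```python
from aws_xray_sdk.core import xray_recorder, patch_all

# Patch supported libraries (boto3, requests, ...) so their calls
# appear as subsegments in the trace.
patch_all()

@xray_recorder.capture("load_order")  # illustrative subsegment name
def load_order(order_id):
    ...

def handler(event, context):
    return load_order(event["order_id"])
```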

Custom metrics. Business metrics, not just infrastructure metrics. Track what matters: orders processed, payments completed, errors by type. CloudWatch/Datadog custom metrics make this straightforward.
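
A sketch with a hypothetical namespace and metric names; at high volume, CloudWatch Embedded Metric Format logging is cheaper than API calls, but the idea is the same.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def record_order(order_value):
    # Business metrics, published explicitly by the function.
    cloudwatch.put_metric_data(
        Namespace="Shop/Orders",  # hypothetical namespace
        MetricData=[
            {"MetricName": "OrdersProcessed", "Value": 1, "Unit": "Count"},
            {"MetricName": "OrderValue", "Value": order_value, "Unit": "None"},
        ],
    )
```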

Alerts on errors, not just capacity. Serverless auto-scales, so capacity alerts matter less. Error rates and latency percentiles matter more. Alert on p99 latency, not average.
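
For instance, a p99 alarm on Lambda's built-in Duration metric; the threshold and names are placeholders to tune against your SLO.

```python
import boto3

boto3.client("cloudwatch").put_metric_alarm(
    AlarmName="payment-process-p99-latency",  # hypothetical name
    Namespace="AWS/Lambda",
    MetricName="Duration",
    Dimensions=[{"Name": "FunctionName", "Value": "payment-process"}],
    ExtendedStatistic="p99",                  # percentile, not Average
    Period=60,
    EvaluationPeriods=5,
    Threshold=2000.0,                         # milliseconds
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[],                          # SNS topic ARN(s) would go here
)
```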

The investment in real observability (not just dashboards) pays for itself on the first production incident.

Cost Control Strategies

Serverless billing is unpredictable. Functions that seem cheap at test scale become expensive at production scale. Manage costs by:

Setting concurrency limits. Cap maximum concurrent executions to prevent runaway costs. Better to queue requests than to incur unbounded spend during traffic spikes.
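
This is one API call, sketched here with a hypothetical function name.

```python
import boto3

# Reserve (and cap) concurrency: the function never runs more than
# 100 copies at once, so spend during a spike is bounded.
boto3.client("lambda").put_function_concurrency(
    FunctionName="image-resizer",
    ReservedConcurrentExecutions=100,
)
```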

Monitoring execution time. You're billed per millisecond. A function that could complete in 100ms but takes 500ms due to inefficiency costs 5x more. Profile and optimize hot paths.

Right-sizing memory. More memory means more CPU and faster execution, which can be cheaper than slow execution with less memory. Benchmark to find the optimal allocation.
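
A back-of-envelope model makes both points concrete. The per-GB-second rate below is an assumption for illustration (and it omits the per-request charge); check current pricing for your region.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # assumed on-demand rate, x86

def monthly_compute_cost(memory_mb, avg_duration_ms, invocations):
    gb_seconds = (memory_mb / 1024) * (avg_duration_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND

# Doubling memory can be cheaper if the extra CPU cuts duration enough:
print(monthly_compute_cost(512, 500, 10_000_000))   # ~$41.67
print(monthly_compute_cost(1024, 200, 10_000_000))  # ~$33.33
```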

Caching aggressively. Lambda execution environments persist between invocations. Cache database connections, configuration, and expensive computations in the function context. This single optimization often cuts execution time in half for database-heavy functions.
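
The pattern is simply module scope, as in this sketch (table and environment variable names assumed): anything created outside the handler survives across warm invocations.

```python
import os
import boto3

# Created once per execution environment, reused on every warm invocation.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ.get("TABLE_NAME", "orders"))
_config_cache = {}

def get_config(key):
    # Expensive lookup cached for the environment's lifetime.
    if key not in _config_cache:
        _config_cache[key] = table.get_item(Key={"pk": key}).get("Item")
    return _config_cache[key]

def handler(event, context):
    return get_config(event["config_key"])
```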

Knowing when containers are cheaper. At sustained high load, containers or servers cost less than serverless. If your function runs 24/7 at high concurrency, you're paying a premium for scaling you don't need. The crossover point varies, but I've seen teams save 60-70% by moving always-on workloads from Lambda to ECS.

The Hybrid Approach

The best serverless architectures aren't purely serverless. They use serverless where it fits and containers or servers where it doesn't.


A typical successful pattern:

  • Serverless: Event handlers, webhooks, scheduled tasks, infrequent operations
  • Containers: Core API with predictable load, long-running processes, stateful services
  • Managed services: Databases, caching, queues. Don't reinvent infrastructure.

This hybrid approach captures serverless benefits without forcing every workload into the serverless model. Choose managed services strategically. They reduce operational burden for the right use cases.
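
The decision logic fits in a few lines. This toy sketch paraphrases the criteria from this article; the trait labels are my own shorthand, not a formal taxonomy.

```python
# Signals paraphrased from this article's criteria.
SERVERLESS_SIGNALS = {"bursty", "event-driven", "stateless",
                      "short-lived", "idles-often"}
CONTAINER_SIGNALS = {"sustained-load", "long-running", "stateful",
                     "latency-sensitive", "always-on"}

def suggest(traits):
    s = len(traits & SERVERLESS_SIGNALS)
    c = len(traits & CONTAINER_SIGNALS)
    if s > c:
        return "serverless"
    if c > s:
        return "containers"
    return "hybrid: split the workload"

print(suggest({"bursty", "event-driven", "stateless"}))  # serverless
```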


The Bottom Line

Serverless works when you use it for what it's good at: event-driven, bursty, stateless workloads. It fails when you force continuous, stateful, latency-sensitive workloads into the model.

Success requires understanding the constraints: cold starts, statelessness, execution limits, and unpredictable costs. Work within these constraints instead of fighting them. Invest in observability before you need it. Know when containers are the better choice.

The serverless projects that succeed don't treat it as a universal solution. They treat it as one tool among several, deployed where its strengths matter and avoided where its weaknesses hurt.

"The serverless projects that succeed don't treat it as a universal solution. They treat it as one tool among several, deployed where its strengths matter and avoided where its weaknesses hurt."

