Durable Objects: A Coordination Primitive Built From Failure

- KinfeMichael Tariku

I've already mentioned this on a podcast we did at Devtopia, but one thing I genuinely appreciate about Cloudflare is how they drive new infrastructure primitives instead of just reselling old ideas with a nicer API. Storage, networking, edge compute: they keep pushing the boundary.

One of those inventions is Durable Objects.

Durable Objects are not perfect. In fact, they are very much an invention born from failure, shaped by the hard limits of the system they run inside. But that's exactly what makes them interesting.

At a high level, a Durable Object is a special kind of Worker (still running inside a V8 isolate, not a VM) but with a strong opinion:

it behaves like a tiny server with memory + disk, living inside Cloudflare's edge.

And that distinction matters a lot.
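To make that concrete, here is a minimal sketch of what a Durable Object class looks like. The class name `Counter` and the key `"count"` are illustrative, not from this article; the general shape (a class with a `fetch` handler and a `state.storage` API) is the standard Workers pattern.

```ts
// Minimal sketch of a Durable Object class (class and key names are
// illustrative). It behaves like a tiny server: fields survive between
// requests while the instance stays warm, and state.storage is its disk.
export class Counter {
  state: DurableObjectState;

  constructor(state: DurableObjectState) {
    this.state = state;
  }

  async fetch(request: Request): Promise<Response> {
    // Persistent "disk": survives restarts and migrations.
    let count = (await this.state.storage.get<number>("count")) ?? 0;
    count += 1;
    await this.state.storage.put("count", count);
    return new Response(String(count));
  }
}
```

Every request addressed to this object's ID lands on this one instance, which is the property the rest of this post builds on.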

Why serverless makes distributed state hard

Traditional serverless systems are stateless by design. V8 isolates are lightweight, fast, and disposable, but that also means:

  • isolates can be destroyed at any time

  • memory is ephemeral

  • no shared state between isolates

  • no guarantees about execution order

On top of that, Workers intentionally cannot access many low-level VM APIs. This is not accidental; it's a consequence of sandboxing and the risk of side-channel attacks. The isolation model is a core security feature.

All of this is great for scalability and safety, but it makes distributed coordination extremely hard.

If you try to do things like:

  • syncing shared state across isolates

  • coordinating writes

  • managing counters, locks, or sessions

you quickly fall into the classic distributed systems traps.

The wrong approach: syncing everywhere

One possible approach is to say:

"Let's just sync everything."

For example:

  • distribute a WAL across workers

  • let every isolate talk to SQLite

  • coordinate writes like a distributed DB

This is roughly the problem space D1 lives in.

But doing this does not guarantee race-condition safety unless you introduce:

  • leaders

  • locks

  • consensus

  • conflict resolution

And at that point, you're basically rebuilding a database, inside a serverless runtime, across multiple isolates.

That's expensive, complex, and fragile.

Durable Objects choose a different model: ownership, not sync

Durable Objects flip the model entirely.

Instead of saying "everyone syncs", they say:

"Choose an owner."

You bind a piece of shared state to one logical instance:

  • per organization

  • per chat room

  • per user

  • per shard

Everything related to that state is localized to that one Durable Object instance.

No traditional syncing.

No distributed WAL.

No multi-writer conflicts.

Requests are routed to the same instance, and that instance becomes the coordinator.

This is the key insight.

Durable Object stubs: how ownership actually works

Ownership only works if Cloudflare can reliably route every request to the correct instance.
That mechanism is the Durable Object stub.

A stub is a stable handle to a specific Durable Object instance, identified by its ID.

The stub is not the object itself. It does not contain state. It contains identity.

When a Worker sends a request through a stub, Cloudflare resolves where the object currently lives and routes the request there.

If the object is restarted or migrated, the stub remains valid.

This is how Cloudflare enforces ownership without exposing location, networking, or replication details.

You are not load-balancing across instances. You are always talking to the single instance that owns the state.
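Here is a rough sketch of the Worker side. The `ROOMS` binding name and the path-based routing scheme are hypothetical; `idFromName`, `get`, and the stub's `fetch` are the standard namespace API.

```ts
// Worker that routes each request to the Durable Object owning a "room".
// ROOMS is an assumed binding configured in wrangler; the path scheme is
// just an example.
export interface Env {
  ROOMS: DurableObjectNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const roomName = new URL(request.url).pathname.slice(1) || "lobby";

    // Same name -> same ID -> same owning instance, wherever it lives.
    const id = env.ROOMS.idFromName(roomName);
    const stub = env.ROOMS.get(id);

    // The stub carries identity, not state; Cloudflare resolves the
    // object's current location and routes the call there.
    return stub.fetch(request);
  },
};
```

If the object restarts or migrates, the same `idFromName` call still resolves to it, so the caller never needs to know where it actually runs.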

But that introduces a big trade-off: single-threaded execution

To make this model correct, Cloudflare enforces a strong rule:

A Durable Object processes one event at a time.

That means:

  • no parallel request execution

  • no concurrent mutation

  • everything is serialized

This is where input and output gatekeeping comes in.

Input & output gates: correctness over cleverness

Cloudflare introduced input gates and output gates to control side effects.

Input gate

Blocks new incoming events while asynchronous work (such as storage operations) is still in flight.

Output gate

Prevents sending responses or external effects until all writes are safely committed.

This solves two hard problems at once:

  • race conditions

  • optimistic writes being acknowledged before they are actually durable

You don't get to tell the outside world "this worked" until the system knows it worked.
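In practice, that means a read-modify-write inside a Durable Object needs no explicit locking. A sketch, written as a method on a class like the `Counter` above (names are illustrative):

```ts
// While the get/put below are in flight, the input gate holds back other
// incoming events, so two overlapping increments cannot both read the
// same stale value. The output gate keeps the return value from reaching
// any caller until the write is durably committed.
async increment(): Promise<number> {
  let count = (await this.state.storage.get<number>("count")) ?? 0;
  count += 1;
  await this.state.storage.put("count", count);
  return count;
}
```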

Where batching and O(1) complexity come in

This is the clever part.

Originally, storage operations (get() / put()) were expensive:

  • every put() meant a network round trip

  • more writes = more latency

  • request time scaled with the number of writes → O(n)

Cloudflare changed this model.

Now:

  • put() writes to an in-memory cache

  • it completes almost instantly

  • even if you await it

The real I/O happens at the output gate.

All writes during a request are:

  • coalesced

  • batched

  • flushed together in a constant number of network round trips

So instead of:

n writes → n network waits → O(n)

you get:

n writes → 1 commit → O(1)

This is huge.

It means:

  • input gates stay blocked for far less time

  • throughput improves

  • latency becomes predictable

  • developers don't have to manually batch writes

This is one of those infra decisions that looks small but fundamentally changes how you design apps.
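Here is roughly what that looks like from inside a Durable Object. The handler below is a sketch; the JSON shape and key names are made up.

```ts
// Each awaited put() resolves against the in-memory write cache almost
// immediately. The buffered writes are coalesced and flushed before the
// output gate releases the response, so the cost stays roughly constant
// no matter how many keys you touch.
async fetch(request: Request): Promise<Response> {
  const data = (await request.json()) as Record<string, unknown>;

  for (const [key, value] of Object.entries(data)) {
    // Looks like n round trips; actually buffered and batched.
    await this.state.storage.put(key, value);
  }

  // By the time this response escapes the output gate, the batch has
  // been committed.
  return new Response("ok");
}
```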

Why this works (and why it's safe)

Even though `put()` returns immediately, Cloudflare does not lie to the outside world.

Because of the output gate:

  • no response is sent

  • no side effects escape

  • until storage is actually committed

So correctness is preserved, while performance improves.

It's very similar to how databases use:

  • write-ahead logs

  • transactional commits

  • delayed fsyncs

Just adapted to a serverless edge runtime.

The downsides (and they matter)

This model isn't free.

Some real drawbacks:

  • single-threaded bottlenecks under high contention

  • long-running requests block everything behind them

  • batching can hide write cost until the end, causing latency spikes

  • debugging performance issues requires understanding gates

  • you have to choose shard boundaries carefully, and think about where each Durable Object instance lives, since placement is implicit by default

Durable Objects reward good domain modeling and punish lazy modeling.
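A small sketch of what that difference looks like in practice (the `SESSIONS` binding and naming scheme are hypothetical):

```ts
// Lazy modeling: every request funnels through one single-threaded
// instance, which becomes the bottleneck under load.
function lazyStub(env: Env): DurableObjectStub {
  return env.SESSIONS.get(env.SESSIONS.idFromName("global"));
}

// Better modeling: one object per user, so contention stays local to
// each user's own state.
function stubForUser(env: Env, userId: string): DurableObjectStub {
  return env.SESSIONS.get(env.SESSIONS.idFromName(`user:${userId}`));
}

interface Env {
  SESSIONS: DurableObjectNamespace;
}
```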

Final thought

Durable Objects are not magic.

They are not a general database.

They are not "serverless but with state".

They are a carefully constrained coordination primitive, built to work within the hard limits of V8 isolates and edge security.

And honestly?

That constraint-driven design is what makes them impressive.

If you treat them like a tiny server with ownership, not like a distributed cache, they shine.

If you fight the model, they will absolutely fight back.
