Dynamo Characteristics and Criticism

Think first
Dynamo is completely decentralized — no master node, no central coordinator. What responsibilities must EVERY node in the cluster take on as a consequence of this design?

What each Dynamo node does

Because Dynamo is completely decentralized (unlike GFS or BigTable), every node serves three functions simultaneously:

| Function | How it works |
| --- | --- |
| Request coordination | Any node can act as coordinator for any key -- it manages get()/put() operations, forwards to appropriate replicas, or serves directly if it owns the key |
| Membership & failure detection | Every node tracks the ring topology and other nodes' health via gossip protocol |
| Local storage | Each node stores its assigned key ranges using a pluggable storage engine |
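
The request-coordination row can be made concrete with a small sketch. It is a minimal illustration, not Dynamo's actual code: the `Ring` class, the `route` helper, the token naming scheme, and the choice of 8 tokens per node are all assumptions made for the example. It hashes keys onto a ring, walks clockwise to find the N distinct nodes responsible for a key, and lets the receiving node either coordinate the request or forward it.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a position on the ring using an MD5 hash."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    """Hypothetical ring model: every physical node owns several tokens."""

    def __init__(self, nodes, tokens_per_node=8):
        self.tokens = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self.positions = [pos for pos, _ in self.tokens]

    def preference_list(self, key: str, n: int):
        """Walk clockwise from the key's position, collecting n distinct nodes."""
        start = bisect.bisect_right(self.positions, ring_position(key))
        owners = []
        for step in range(len(self.tokens)):
            node = self.tokens[(start + step) % len(self.tokens)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == n:
                break
        return owners


def route(ring, receiving_node, key, n=3):
    """Coordinate locally if this node holds a replica, else forward."""
    owners = ring.preference_list(key, n)
    if receiving_node in owners:
        return ("coordinate", receiving_node, owners)
    return ("forward", owners[0], owners)


ring = Ring(["A", "B", "C", "D", "E"])
action, target, owners = route(ring, "A", "cart:42")
```

Because every node holds the full ring, any of them can run `route` without asking a central service, which is exactly what makes the coordinator role interchangeable.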

Pluggable storage engines

Dynamo doesn't mandate a specific storage backend. Different Amazon services choose based on their needs:

| Engine | Best for |
| --- | --- |
| BerkeleyDB Transactional Data Store | General-purpose key-value storage with transactions; suited to objects in the tens of kilobytes |
| MySQL | Larger objects |
| In-memory buffer + persistent backing | Maximum read performance (data served from RAM) |
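
One way to picture a pluggable engine is as a narrow interface that the node depends on, with the concrete backend chosen at configuration time. The sketch below is an assumption-laden illustration: `StorageEngine`, `InMemoryEngine`, `WriteThroughCache`, and `Node` are hypothetical names, and the dictionary-backed engine merely stands in for a real persistent store such as BerkeleyDB or MySQL.

```python
from abc import ABC, abstractmethod
from typing import Optional


class StorageEngine(ABC):
    """Hypothetical minimal interface a node needs from its local store."""

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]: ...

    @abstractmethod
    def put(self, key: bytes, value: bytes) -> None: ...


class InMemoryEngine(StorageEngine):
    """Stand-in for a persistent engine; keeps everything in a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class WriteThroughCache(StorageEngine):
    """In-memory buffer in front of a backing engine."""

    def __init__(self, backing: StorageEngine):
        self._cache = {}
        self._backing = backing

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # served from RAM
        value = self._backing.get(key)
        if value is not None:
            self._cache[key] = value
        return value

    def put(self, key, value):
        self._backing.put(key, value)  # write to the backing store first
        self._cache[key] = value


class Node:
    """The node only sees the interface, so engines are interchangeable."""

    def __init__(self, engine: StorageEngine):
        self.engine = engine


backing = InMemoryEngine()
node = Node(WriteThroughCache(backing))
node.engine.put(b"cart:42", b"book")
```

Swapping `WriteThroughCache(backing)` for a different `StorageEngine` changes the performance profile without touching the node's request-handling code.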

Characteristics

| Property | What it means for you |
| --- | --- |
| Distributed | Runs on hundreds or thousands of machines -- no single machine limits |
| Decentralized | No leader, no central coordinator, no single point of failure. Every node has the same responsibilities. |
| Incrementally scalable | Add a node and capacity grows roughly in proportion. The new node takes over key ranges from existing nodes, so rebalancing happens incrementally rather than all at once. |
| Highly available | If one node goes down, others cover for it. The system is designed to keep serving even through a data center outage. |
| Fault-tolerant | Data is replicated to N nodes, so a key's data survives the simultaneous loss of up to N-1 of its replicas. |
| Tunable consistency | Applications tune N, R, and W, which are configured per Dynamo instance rather than per request. Setting R + W > N makes read and write quorums overlap; stronger consistency is possible this way, but eventual consistency is the baseline. |
| Durable | Data persisted to disk on multiple nodes. |
| Eventually consistent | Default behavior -- writes propagate asynchronously, reads may temporarily see stale data. |

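
The tunable-consistency and fault-tolerance rows come down to simple arithmetic on N, R, and W. The helpers below are a minimal sketch with hypothetical names; they model a strict quorum only. Dynamo's sloppy quorum and hinted handoff mean that, during failures, the overlap computed here is not guaranteed in practice.

```python
def quorum_overlaps(n: int, r: int, w: int) -> bool:
    """True if every read quorum must intersect every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> dict:
    """Replica failures that still leave a strict quorum reachable."""
    return {"reads": n - r, "writes": n - w}


# (3, 2, 2) is the common configuration reported in the Dynamo paper.
print(quorum_overlaps(3, 2, 2))     # True: a read sees at least one replica with the latest write
print(quorum_overlaps(3, 1, 1))     # False: a read may hit only stale replicas
print(tolerated_failures(3, 2, 2))  # {'reads': 1, 'writes': 1}
```

Lowering R or W buys latency and availability at the price of a higher chance of stale reads, which is the trade-off the table describes.
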
Think first
Dynamo was designed for Amazon's internal trusted network. What security risks arise if you deployed a Dynamo-style system on an untrusted network?

Criticism

No system is perfect. Dynamo's design has real limitations that are worth understanding:

Full routing table on every node

Every node stores the entire ring topology -- every node, every token, every range. As the cluster grows to thousands of nodes, this metadata gets expensive to maintain and gossip. This is a scalability ceiling that systems with a dedicated metadata server avoid, at the cost of depending on that server.
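
To see why this cost grows with cluster size, here is a toy gossip simulation. It is a sketch under stated assumptions, not Dynamo's protocol: the function names, the versioned `(version, tokens)` entries, the push-pull exchange, and the single seed `node-0` are all invented for illustration. Each node starts knowing only itself and the seed, then repeatedly exchanges its whole view with one known peer.

```python
import random


def initial_views(node_count, tokens_per_node, seed="node-0"):
    """Each node starts with its own entry plus a placeholder for the seed."""
    views = {}
    for i in range(node_count):
        name = f"node-{i}"
        tokens = tuple(f"{name}/token-{t}" for t in range(tokens_per_node))
        views[name] = {name: (1, tokens)}
        # Version 0 means "address known, state not yet learned".
        views[name].setdefault(seed, (0, ()))
    return views


def merge(view, other):
    """Keep the higher-versioned entry for every node mentioned in other."""
    for node, entry in other.items():
        if node not in view or view[node][0] < entry[0]:
            view[node] = entry


def converged(views):
    everyone = set(views)
    return all(
        set(view) == everyone and all(version > 0 for version, _ in view.values())
        for view in views.values()
    )


def gossip_until_converged(views, rng, max_rounds=1000):
    """Run push-pull gossip rounds; return how many rounds were needed."""
    rounds = 0
    while not converged(views):
        if rounds >= max_rounds:
            raise RuntimeError("gossip did not converge")
        for node in views:
            peers = sorted(p for p in views[node] if p != node)
            if not peers:
                continue  # knows nobody yet; waits to be contacted
            peer = rng.choice(peers)
            merge(views[node], views[peer])
            merge(views[peer], views[node])
        rounds += 1
    return rounds


views = initial_views(node_count=20, tokens_per_node=4)
rounds = gossip_until_converged(views, random.Random(7))
entries_per_node = len(views["node-5"])              # one entry per cluster member
total_entries = sum(len(v) for v in views.values())  # grows as nodes squared
```

After convergence every node holds one entry per cluster member, so per-node state grows linearly with the cluster and the cluster-wide total grows quadratically. The sketch also shows where the seed matters: without it, nodes that know only themselves would have nobody to gossip with.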

Seeds compromise symmetry

Despite claiming "every node is equal," Dynamo uses seed nodes for bootstrapping -- externally discoverable nodes that prevent logical partitions. This is a form of asymmetry in an otherwise symmetric design.

Leaky abstraction

Clients must sometimes resolve conflicts themselves (merging shopping cart versions, for example). The abstraction "just put and get data" leaks when concurrent writes create conflicting versions that Dynamo can't automatically reconcile. This can make the user experience feel buggy if not handled carefully.
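
The leak is easiest to see with vector clocks. The following is a minimal sketch with hypothetical helper names and made-up cart contents: it detects that two versions are concurrent (neither descends from the other) and then performs the kind of client-side merge a shopping cart might use, taking the union of the items.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def concurrent(a: dict, b: dict) -> bool:
    """Neither version supersedes the other, so the store cannot pick a winner."""
    return not descends(a, b) and not descends(b, a)


def merge_clocks(a: dict, b: dict) -> dict:
    """Element-wise maximum of two vector clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def reconcile_carts(siblings):
    """Client-side merge: union of items, combined clock."""
    items, clock = set(), {}
    for sibling_clock, sibling_items in siblings:
        items |= sibling_items
        clock = merge_clocks(clock, sibling_clock)
    return clock, items


# Two writes coordinated by different nodes (Sy and Sz) after a common ancestor.
v1 = ({"Sx": 2, "Sy": 1}, {"book", "pen"})
v2 = ({"Sx": 2, "Sz": 1}, {"book", "lamp"})

if concurrent(v1[0], v2[0]):
    clock, items = reconcile_carts([v1, v2])
```

A union merge never loses an added item, but it can bring back an item the user deleted in one of the branches. That resurfacing behavior is one concrete way the experience can feel buggy.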

No security model

Dynamo was built for Amazon's trusted internal network. There's no authentication, authorization, or encryption. In an untrusted environment, a buggy or compromised node could act like a malicious actor -- and DHTs are particularly susceptible to certain attacks (Sybil attacks, routing table poisoning).

Systems built on Dynamo's principles

Dynamo is internal to Amazon, but its ideas spread widely:

| System | What it took from Dynamo | What it changed |
| --- | --- | --- |
| Cassandra | Consistent hashing, gossip, hinted handoff, vnodes | Dropped vector clocks (uses last-write-wins), added BigTable-style data model |
| Riak | Nearly everything -- the most faithful Dynamo clone | Added CRDTs for automatic conflict resolution |
| Voldemort | Key-value model, consistent hashing, quorum | Built for LinkedIn's internal use cases |
| DynamoDB | Inspired by Dynamo's ideas | Completely different implementation -- managed service with strong consistency option |

Quiz
You are building a new distributed key-value store for a public cloud service, inspired by Dynamo. Which of Dynamo's design decisions would be MOST problematic to carry over unchanged?

Interview angle

When discussing Dynamo's limitations, show that you understand the trade-offs are intentional. Full routing tables are the price of decentralization. Leaky abstractions are the price of eventual consistency. No security is a conscious scope limitation. Every design decision has a cost -- the question is whether the benefit justifies it for your use case.