DynamoDB, a fully managed NoSQL database, is an impressive piece of technology, and it’s amazing that AWS has opened it for the entire world to use. What took millions of dollars in R&D to build, a product that serves millions of queries per second with low latency, can effectively be rented for dollars per hour by anyone with a credit card. For those who need a key-value store that can hold massive amounts of data reliably, there aren’t many better options.
While DynamoDB generally works quite well, it’s inevitable that we all run into issues. A few months ago at Segment, my colleagues wrote a detailed blog post about our own DynamoDB issues. Mainly, we were hitting our rate limits due to problems with our partitioning setup – a single partition was limiting throughput for an entire table. Solving the problem took a superhuman effort, but it was worth it (to the tune of $300K annually).
You can read the full post. But to save you the time, I’ve distilled the experience of our engineering team into eight pieces of advice that should help you make the most of DynamoDB and ensure that it really works for you.
1. Ask yourself: Do you actually need DynamoDB?
First, you should know whether DynamoDB is actually the right tool for the job. If you have a small amount of data, or you need aggregations or the fine-grained ability to join lots of data together, DynamoDB probably isn’t the right tool for you. or is probably your best fit, or in the case where durability doesn’t matter and you don’t need aggregations, .
(my team has been using a fork of the for quite some time, so this is a welcome development). For additional cost savings, you can use AWS Lambda and CloudWatch events to adjust provisioned DynamoDB throughput so that it tracks how much you are actually using.
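To make the throughput adjustment concrete, here is a minimal sketch of the sizing arithmetic a scheduled Lambda function might run. It assumes you have already read a consumed-capacity sum from CloudWatch (the ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits metrics are reported as totals over the period, so they need dividing by the period length). The function name, the 70% target, and the min/max bounds are illustrative assumptions, not values from our setup.

```python
import math


def target_capacity(consumed_sum, period_seconds, target_utilization=0.7,
                    min_units=5, max_units=1000):
    """Suggest provisioned capacity units for one table dimension.

    consumed_sum: total capacity units consumed during the period
                  (the CloudWatch "Sum" statistic).
    period_seconds: length of that period in seconds.
    target_utilization: hypothetical fraction of provisioned capacity
                        we want average usage to sit at.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    # Convert the period total into an average per-second rate.
    per_second = consumed_sum / period_seconds
    # Provision enough headroom that usage lands at the target fraction.
    wanted = math.ceil(per_second / target_utilization)
    # Clamp so one odd period cannot scale the table to zero or to the moon.
    return max(min_units, min(max_units, wanted))
```

The result would then be passed as ReadCapacityUnits or WriteCapacityUnits in the ProvisionedThroughput argument of an UpdateTable call. Keep in mind that this reacts to averages, so short bursts can still be throttled.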
7. Leverage DynamoDB Streams
DynamoDB has a Streams feature that can publish all changes to what is essentially a Kinesis feed. Streams are very useful for building other pipelines, so that you aren’t constantly running SCANs or doing your own eventing.
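As a rough illustration of what consuming a stream looks like, here is a sketch of the dispatch logic for a batch of stream records, such as the event a Lambda function receives. Each record carries an eventName (INSERT, MODIFY, or REMOVE) and a dynamodb section with the item's Keys in attribute-value form. The function name and the two callbacks are hypothetical; wire them to whatever your downstream pipeline needs.

```python
def handle_stream_event(event, on_upsert, on_delete):
    """Route each stream record in a batch to the matching callback.

    on_upsert(keys, new_image) and on_delete(keys) are hypothetical hooks.
    Returns the number of records that were dispatched.
    """
    handled = 0
    for record in event.get("Records", []):
        change = record.get("dynamodb", {})
        keys = change.get("Keys", {})
        name = record.get("eventName")
        if name in ("INSERT", "MODIFY"):
            # NewImage is only present when the stream's view type
            # includes new images; otherwise this passes None.
            on_upsert(keys, change.get("NewImage"))
        elif name == "REMOVE":
            on_delete(keys)
        else:
            # Skip anything we do not recognize rather than failing the batch.
            continue
        handled += 1
    return handled
```

For example, on_upsert could write to a search index and on_delete could remove the entry, which replaces a periodic SCAN with incremental updates.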
8. Log all of your hot shards
When my team faced excessive throttling, we figured out a clever hack: Whenever we hit a throttling error, we logged the particular key we were trying to update. In aggregate, this gave us a holistic view of what was going on and allowed us to blacklist certain problematic “hot keys.”
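The idea is simple enough to sketch in a few lines. What follows is an illustration rather than our production code: it wraps a write function, counts throttling errors per key, and reports the worst offenders. ThrottledError and HotKeyTracker are hypothetical names; with a real client you would catch whatever error it raises for throttling (DynamoDB reports it as ProvisionedThroughputExceededException).

```python
import logging
from collections import Counter


class ThrottledError(Exception):
    """Hypothetical stand-in for the client's throttling exception."""


class HotKeyTracker:
    """Wrap a write function and count throttled writes per key."""

    def __init__(self, write, throttle_exc=ThrottledError):
        self._write = write              # callable(key, item) doing the real write
        self._throttle_exc = throttle_exc
        self.throttled = Counter()       # key -> number of throttled attempts

    def put(self, key, item):
        try:
            return self._write(key, item)
        except self._throttle_exc:
            # Record which key was being written, then let the caller
            # decide whether to retry, back off, or drop the write.
            self.throttled[key] += 1
            logging.warning("throttled write for key %s", key)
            raise

    def hot_keys(self, n=10):
        """Return the n most frequently throttled keys with their counts."""
        return self.throttled.most_common(n)
```

Reviewing hot_keys() periodically (or shipping the log lines to your aggregation tool of choice) shows which keys are concentrating traffic on a single partition.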
DynamoDB will perform very differently depending on how your data is laid out and how you decide to query it. It’s important to understand your query patterns, indexing needs, and throughput. The fact that DynamoDB is a cloud service run by AWS engineers doesn’t excuse poor architecture decisions. There is no “magic” under the hood. DynamoDB is a great piece of technology. Using it correctly can make the difference between a $1M services bill and one that’s a mere fraction of that.