Episode #35: Advanced NoSQL Data Modeling in DynamoDB with Rick Houlihan (Part 2)

February 10, 2020 • 52 minutes

Jeremy continues his conversation with Rick Houlihan about NoSQL Data Modeling. They discuss why you likely don't want to use LSIs, when sharding is necessary, the benefits of denormalization, how to efficiently store large document deltas, and much more.

This is PART 2 of my conversation with Rick Houlihan. View PART 1.

About Rick Houlihan

Rick has 30+ years of software and IT expertise and holds nine patents in Cloud Virtualization, Complex Event Processing, Root Cause Analysis, Microprocessor Architecture, and NoSQL Database technology. He currently runs the NoSQL Blackbelt team at AWS and for the last 5 years have been responsible for consulting with and on boarding the largest and most strategic customers our business supports. His role spans technology sectors and as part of his engagements he routinely provide guidance on industry best practices, distributed systems implementation, cloud migration, and more. He led the architecture and design effort at Amazon for migrating thousands of relational workloads from Oracle to NoSQL and built the center of excellence team responsible for defining the best practices and design patterns used today by thousands of Amazon internal service teams and AWS customers. He currently work on the DynamoDB service team as a Principal Technologist focused on building the market for NoSQL services through design consultations, content creation, evangelism, and training.


Jeremy: So one of the things that you have never mentioned or at least I don't think I've ever seen you mention it, at least not in any of your talks for your modeling is local secondary indexes.

And I used to think, "Hey, this is great. They've got really strong guarantees and then it's sort of this great use case if you want to do a couple of different sorts." But LSIs are not quite ...

Rick: Not the panacea you might think they are.

Jeremy: Yes, correct.

Rick: So LSIs, I'm not exactly sure. I mean, I think you're exactly correct. The biggest value of LSI is the strong consistency, right? But the limiting factor of the LSI is it doesn't really let you kind of regroup the data, right?

Jeremy: Right.

Rick: You have to, you have to use the same partition keys to the table. So the only thing you can really do is resort the data, right? So right there, that's a limited set of use cases, right? There's not a lot of access patterns. I mean there are, but there's not necessarily a ton of access patterns or applications that only required me to resort the data. Most applications are going to require to group the data on multiple dimensions so that limits the effectiveness of the LSI. The other thing about the LSI that kind of stinks is they have to be created at the time the table is created, they can never be deleted.

So if you mess it up, then you've got to recreate the table to get rid of them and I find them to be extremely limited use. I mean, most developers can tell you that strong consistency is an absolute requirement, but when you get down to it and started looking at the nature of their application, yeah, what they really need is read after write consistency, right? It's worth kind of talking about the difference, right? Strong consistency implies that no update to the database is going to be acknowledged to the client unless all copies or all indexed or copies of that data are also updated, right?

Jeremy: Yeah.

Rick: That's strong consistency. That means if I'm in a highly concurrent environment, that no two clients could read different data, okay? Unless the read is not, or the write is not yet fully committed. As long as the right hasn't committed, you're not going to get two copies of the data. Well, most use cases are really more about like if I make the write and I read back, did I get the right data?

So what we're really talking about is read after write consistency. If you think about the round trip between the client and the system, if I have a let's say in DynamoDB, GSI replication is 10 milliseconds or less, it's highly unlikely that you're ever going to be able to return to the client, that the client is ever going to be able to returned to the server and ask for the same data back in 10 milliseconds.

Jeremy: And honestly, if you do, welcome to distributed systems.

Rick: That's exactly right. I mean, that's the other thing I was going to say and most distributed systems, what you'll find is there's a propagation delay on configuration data. So oftentimes, even if you get to the point where the developers are going to tell you that there's going to be concurrent access on this data, when you back up a step, you're going to find configuration data is going to live in multiple entities. So hey, all bets are off, right?

So let's take a look at that need for strong consistency and not make arbitrary requirements because as developers when we make arbitrary requirements, it's like hooking a fire hose up to our wallets. Let's make sure that we're actually making requirements that are meaningful to our business. 90% of the application workloads I work with, I would say even maybe even higher don't require strong consistency. So let's just use those GSI. They're much more flexible, right? They can be completed anytime, they carry their own capacity allocations. They don't pillage capacity from the table. Overall they're just a lot more flexible.

Jeremy: Yeah, and you've got more control. I mean, that's one of those things too. If you are doing the single table design and you're using all those different entity types and so forth, what are the chances that all those LSIs and the sorts all align with one another too. It seems like a lot of wasted capacity.

Rick: Inevitably, you're going to end up using GSIs, right?

Jeremy: Right. Exactly.

Rick: You may be able to use an LSI for one use case, but you can't use them for all of them.

Jeremy: Yeah, and I mean, and I think just the important thing about LSIs too is regardless of the inflexibility of them, there's also a doubles the costs, right?

Rick: Well, all indexes double the cost, right? I mean [crosstalk 00:49:35]

Jeremy: Of course, yeah.

Rick: Because actually, one of the things people kind of ... It's kind of an incorrect assumption about LSIs is that customers believe that, "Oh, they use the same capacity as the table. Oh, they must be free." No, they're not free. You still pay for the storage, you still pay for the capacity. I'm just going to have to allocate twice as much capacity to the table now.

Jeremy: Moving on from LSIs and GSIs, the other thing that always comes up is this idea of hot keys or hot partitions where you basically have one key that gets access quite a bit. You sort of pointed this out in your slides where you see sort of as big red marks and sort of heat, this heat map where you get one partition that is red or is being accessed quite a bit.

So we can talk a little bit about the performance of those things, but I'm actually curious what happens if a partition exceeds that 10 gigabyte partition limit?

Rick: Oh sure. Okay. So yeah, so as you pointed out, there's a partition size limit in DynamoDB, there's capacity and throughput limits in DynamoDB and the reason we chose to do this is because we wanted a system that was responsive and scales in minutes, right? If the larger systems like you look at a MongoDB or DocumentDB that has used very large storage nodes, they have large capacity storage nodes, it takes them a long time to be able to add new capacity and what we wanted was a system where a user can come in and say go from 10 WCUs to a million WCUs and do that in realtime, right? Not months, literally months.

So what happens when a partition exceeds its 10 gigabytes is the system behind the scene is going to say, "Okay, I need to move this data into multiple storage partitions." So the way that NoSQL databases scale is they're going to add partitions. When they add partitions, they need to copy data to those new partitions in order to be able to bring them online so to speak. If I use extremely large storage nodes, then it takes me a long time to copy that data, okay?

Jeremy: Right.

Rick: So in large NoSQL clusters, I mean, the largest MongoDB cluster is about 64 shards right now. They're adding shard 65, they started in November, they expect to be done sometime in the next couple of weeks and that's no joke. That is no joke.

Jeremy: That's really scary.

Rick: Yeah, it is scary. It's really scary for your business. I mean, what happens if they see a surge in traffic in the meantime, right?

Jeremy: Yup.

Rick: They're DOA. So and they're actually talking to us to migrate because they know that when they go to add shard 66, it's going to take them nine months, right? So it's not something that's going to work for their business. So anyway, so we want to be able to scale in minutes and we can do that because we use small storage nodes. When a storage nodes hits 10 gigabytes, is going to split into two nodes.

Now, I can copy five gigabytes of data in literally seconds, right? And I can do that in parallel and number of times. So that's how DynamoDB table scale gracefully is they have these large number of smaller stories nodes when you want to add capacity, we just split those storage nodes very quickly in parallel and we can bring that capacity online in minutes and that's the advantage there. So that's kind of what happens.

Jeremy: Yeah, and so that only works though if you are not plagued by a local secondary index though.

Rick: Yes. So the local secondary index, again, another one of those limiting factors, the local secondary index. Since we only allow you to resort the data, not regroup the data, that's what gives us the ability to support strong consistency. But the only way we can do that is to ensure that all the data between the local secondary index and the table actually live on the same partition. So if you have a partition in it's single logical partition in DynamoDB on a local secondary index that exceeds 10 gigabytes, it's going to throttle the table and stop the rights because if you think about, if I resort the data inside of a logical partition and it's larger than 10 gigabytes, then that's automatically going to mean that some of the data lives in two ... The data lives in two places and maintaining consistency on two physical hosts is hard.

So we punted on that idea and said, "We'll give you a consistent indexing, but don't ask for more than 10 gigabytes in a single logical partition." That doesn't mean that an LSI can't be larger than 10 gigabytes, it just means that a single logical partition value can not contain more than 10 gigabytes of data.

Jeremy: Right. And the other thing is in order for it to split data across multiple nodes, it has to have a sort key.

Rick: That's correct. Yeah, yeah, yeah. If you're going to split a single logical partition, I mean, if you don't have a sort key than that, you're limited to 400 kilobytes in a single partition because that's the item size limit in DynamoDB.

Jeremy: Got you. Okay. All right. So then in terms of throughput performance, if you actually are on multiple nodes, wouldn't you have better throughput?

Rick: If you are on multiple nodes, yes, you have better throughput and that is another advantage of DynamoDB with lots of small storage nodes, right? We can increase the throughput of the system more easily. Now, we do need to maintain some proportion of throughput to capacity with storage allocation, right? So if I have a storage device, a storage node that has X terabytes of data, and I'm carving that up into 10 gigabyte chunks, I kind of also need to carve out the IOPS as well because otherwise, there's no way for me to guarantee that that capacity will be there for you when you come ask for it, right?

So that's kind of what you're doing. When you reserve capacity in DynamoDB, it's guaranteed you're going to get it and it's up to us to make sure there's enough capacity on the system that to satisfy your request, but whatever you allocate, there's ... Nobody is going to be able to take that from you and nobody's going to brown out your workload because they're too busy and you're sharing a storage node.

Jeremy: Yeah, so now if you were to create partitions with hundreds of gigabytes of data, it's going to spread and split itself across multiple nodes. There's sort of a throughput benefit, I think they are performance game because you're sort of doubling or increasing that throughput, but is that something we should avoid? Should we try not to create partitions more than 10 gigs?

Rick: Well, I mean, it's going to be hard to do that, right? I mean, most applications it's going to be ... You've got to have the data to aggregate. I mean, if I'm partitioning data and I'm saying I want orders by a customer, the 10 gigabytes of orders could be a lot of orders.

Jeremy: Right.

Rick: That's really what it comes down to. I wouldn't say avoid it. The one thing to be aware of when you're working with the data moving in and out of these individual logical partitions is you do want to be aware of velocity. How fast am I moving the data in and out?

Now, having 10 gigabytes of data in a single logical partition is really no big deal, but if I have to read it really quickly, that's going to be a problem because you're only going to get 3,000 RCUs, that's a megabyte a second. So you can only really at one megabyte a second. If I have a gigabyte of data inside of that logical partition, that's going to take me a hundred or a thousand seconds to read that gigabyte of data. So if I had 10 gigabytes, yikes, I'm going to be reading for a while. Right?

Jeremy: Right.

Rick: This is where we start to talk about right sharding and read sharding.

Jeremy: Yeah, sharding. So yeah, so let's talk about charting for a second because that is something that I think some people see that as, "Oh, if I want to be able to read a bunch of data back quickly or whatever, I have to have to split it up, I have to use some sort of hashing algorithm maybe, I have to figure out how much I want to sort of spread out that key space." But actually, there are quite a few benefits to doing that, right? Because you can read it in parallel and things like that, right?

Rick: Right, absolutely. I mean you want to increase the throughput of any NoSQL database, you'd talk about parallel access, right? So in DynamoDB, what we're going to try and do is if your access pattern exceeds, 1,000 WCUs or 3,000 RCUs for a single logical key and now bear in mind that I had ... It sounds like that's not a lot, but I have architected, I don't even know how many thousands of applications at this point on DynamoDB and right, sharding comes into play like, I don't know, less than 1% of the time.

Jeremy: Okay.

Rick: Most workloads are just fine with those capacity limits, right? And if there was a problem with that, then we would be working to adjust them. We just don't see that as being a problem. We see that as being more of a concern that developers might have when they start learning about the system, but when we actually start going into the implementation cycle, what we find is nobody writes shards.

Now, that's not to say nobody, some people absolutely need to and when ... But that is just a nature of the beast when you're dealing with NoSQL, right? Because we're dealing with a partition data store. If I want to increase throughput, I need to increase the number of storage nodes that are participating. This is true for every NoSQL database. It's just that the individual throughput of the storage nodes and the legacy NoSQL technology is higher because they're using entire physical servers is the storage node whereas DynamoDB takes it as physical server and chops it up into a thousand storage nodes.

So, and again, the reason we do that is we want to scale gracefully and we found that the write throughput and read throughput settings that we've adopted tend to accommodate the vast majority of workloads. So again, if your throughput requirements are higher on a per logical key basis, let's talk because it's not that hard to do, right? It's just a mechanical chore. Once you've kind of implemented that mechanism underneath the data layer API, most of your developers don't even know what's happening.

Jeremy: And I think if you're an application developer and you run into a problem like that, it's a good problem to have because obviously, you're doing pretty ... Your application is being used.

Rick: [crosstalk]

Jeremy: Exactly. All right, so let's move on to denormalization, right? This is another thing I think that the trips a lot of people up. We talk about third normal form and stuff like that, that when you're optimizing a regular SQL database, we want to split everything up into separate tables and so forth. But in in DynamoDB and NoSQL databases, we often have to denormalize the data, we have to put logical data together. Sometimes we have to copy things to multiple records, sometimes we have to copy things into the same attribute and things like that. So what are of the advantages though to de normalization?

Rick: Time complexity on your queries. I mean, that's really what it comes down to and we'd started talking about NoSQL, we're talking about cost efficiency, right? We're talking about the low latency consistent performance at scale and the way we get that is through denormalizing the data, right? Because now, everything instead of select star from inner join, inner join and inner join, it's select star from where X equals, right? Now a single table filtered select from a relation database is blindingly fast, right?

Jeremy: Mm-hmm.

Rick: And that's what we're really doing. NoSQL reduces every query to a single table filtered select and that is why it's going to be faster and it requires us to denormalize in order to achieve that effect.

Jeremy: Yeah. And one of the things that I really like about denormalization is this is something I designed SQL databases for a very, very long time, built a lot of applications, a lot of eCommerce products on there and one of the things that always drove me nuts was you have a history of orders, maybe two years of a customer's order or customer orders and then they update their email address and then suddenly your join, now you have the email address that they currently have, not the email address they had when they placed the order because of the way that that works. And so unless you're denormalizing data which is eventually what I ended up doing anyways was to keep a record ...

Rick: Right. Yeah, you got to have the history of ... Some of the data is going to be immutable, even if the user changes it, right? You're going to want to know that it was ordered by so-and-so when they had this name before they were married.

Jeremy: And what their address was at the time.

Rick: What their address was at the time and what was their phone number when it happened, right? Yeah, absolutely and so when you normalize data, when you would eliminate that data from those records, then you're eliminating the ability of the system to keep track of it and as you pointed out, the only way to do that is to de normalize it, right? And so at that point, and this is again, actually it's a good point you bring up because it's one of the things we found at Amazon retail that we were denormalizing our data inside of our relational databases to deal with the scale of the system that we were trying to support, right? We couldn't calculate these common KPIs using queries anymore. Things like the counts on the downloads of the tracks for Amazon Music, right?

Jeremy: Mm-hmm.

Rick: I mean, for a while they were just select count from download table. Okay.

Jeremy: Oh geez.

Rick: Yeah, oh geez, is right. So after a while they're like, "Oh, well, let's create a roll up table and we're going to have a top level counter for downloads for the song and we'll just update that every now and then." Right?

I'm like, "Yeah, that makes a lot more sense, but what have you done? You've de-normalized the data, right?" And so at this point then, why am I not using a first-class NoSQL database? I'm trying to turn my relational database into a denormalized database, and then the next step you'll see people do, I see it all the time.

We have thousands of RDS customers that are doing sharded Postgre, sharded MySQL. I mean, hey, there's things out there, there's technologies people have built, pgRouter and stuff like this to be able to support and I'm telling you, as soon as you shard your relational database, man, you've gone down the road, let's go into NoSQL, right? I can't join in class instances anymore, right? So now, let's go back into a database that's built for that.

Jeremy: Yeah. No, and actually, one of the former startups I was at, we built an entire, MySQL cluster that had like a master master sort of directory service that would tell you which shard a particular customer was in. And then you were replicating relationships and things like that and you're like, "Wait a minute, why don't I just store this in one place that I can actually read this data from?"

Rick: That's exactly right.

Jeremy: But yeah, no, so I totally feel the pain there. So the audit trail piece of this I think is something that's really, really interesting and another thing you had mentioned in your 2019 re:Invent talk was this idea of sort of partial normalization where you might have, and the example you gave was this big insurance quote and you said that, "You don't want to store copies of the same quotes, especially if it's big. You don't want to store the same thing over and over and over again, but the immutable data, sure. Things that aren't going to change, their address probably isn't going to change, things like that."

But if you're updating maybe what the value of the quote is or something like that, you actually talked about breaking those up into smaller attributes.

Rick: Yeah, yeah. So in that particular example, it was an interesting use case. We had a customer very happy with the system. They were an insurance service, they had about 800 quotes per minute I think was their update rate. They were provisioned about thousand WCUs and their use case was pretty simple. Users came in, they create an insurance quote. They might edit that quote two or three times. Then they'll go ahead and execute the contract or drop the transaction and every ... The way they were kind of store the data is they would have the customer ID was the partition key.

The quote ID and version was the sort key and if they came in to get a particular quote, they'd say, "Okay, select star from customer ID." Where quote ID starts with or where sort key starts with quote ID and they would get the quote and all the versions of that quote and then the customer could go back and page through them. The thing was each one of those items was 50 kilobytes, 99% of the data in those items never changed. So every time they created a version of the quote, they're storing 50 kilobytes of data that what really existed in the last version, right?

So that's basically what I basically recommended to him was, "Hey, create the first version of the quote and then store deltas. Every time someone changes something, just store what changed." Now when you go use the same query where customer ID equals X starts with quote ID, but what you're getting is the top level quote and all the deltas and then the client side, you can just quickly apply the deltas and show them the current version and then whenever they need to see the previous versions, you just back the deltas off as they back through the various versions of the quote. So this caused a significant decrease in their WCU provisioning after they went from a thousand WCUs provision to 50.

Jeremy: That's amazing.

Rick: That's a 95% reduction. So that was a really good example of how understanding that ... Don't store data you don't need to store. Denormalization does not always mean copying data, right?

Jeremy: Yup.

Rick: And you've got, and this is a really good example of how you can look at what you're doing with your data, how is the data moving through the system, right? Because this is really oftentimes what we find is we're reading data we don't need to read, we're writing data we don't need to write.

One of the biggest problems we see in NoSQL and it's facilitated by the databases that support these really, really large objects, right? Things like I think MongoDB supports a 16 megabyte document, right? And the reality is that I don't know, very many access patterns, and again, I've worked with thousands of applications at this point that need to get 16 megabytes of data and it gives you in single request.

So oftentimes you'll see in MongoDB these really giant data BLOBs and users are going to say, "Get me the age of this user." [crosstalk] into this big giant data BLOB and they'll pull out a four byte in, right?

Jeremy: Yeah, and you have to read the whole thing. You have to read the whole thing in order to get it. Yeah.

Rick: ...to get this four byte in, right? So one of the things I do quite frequently is I work with a lot of customers on legacy NoSQL technologies like MongoDB or Cassandra or Couchbase and I'll actually get them to the correct modeling state of their application and so they'll come to me and they'll talk to me about migrating to DynamoDB and when I'm done, they ended up staying on MongoDB and I'll talk to them and get in a year when they actually do have to scale, but you know what I mean?

It's like it's nice because the design pattern's best practices and data modeling that we've built and that we've developed over the years working with the CDO and doing that large migration, it turns out that all of that stuff is directly translatable to every NoSQL technology and what it really exposed was how wrong the implementation philosophy is and how much of the industry is revolving around some really incorrect assumptions and things that they say. And it was eye opener to go through that. I was one of those people.

Jeremy: Well, I mean, I think that's what's really great is the ... To see the thinking evolve though and sort of get to that point. So I want to move on to something else so just quickly ... That's the quote thing. So one of the things you had mentioned too is this idea of pushing a lot of that complexity in terms of maybe reassembling the quote like pushing that down to the client. So is that something you ...

Rick: Yeah, absolutely. You know what? Those clients are 99.9%, "I had a loop man. Make them do some work." Right? I mean, we [crosstalk] a lot though. I've got some reservation use cases where I was talking to some customers and maybe they were creating items in the database in the table, an item for each one of the availabilities in the calendar or something like that and then they would come in and they would update that item with who booked it.

And I was like, "Don't do that." Just store the items that people booked and then on the client side, when they just say, "Here's the day that I want to book an appointment for, send them down the things in the book then let them figure out what slots are available."

Jeremy: Which one's are booked.

Rick: Otherwise, I've got to do a more complex query to kind of figure out which items are available and which items had been booked and I got more work to do with the application server.

I am a big fan of pushing whatever logic I can down to the end point, right? Give them a chunk of data and let them triage this, give them enough data to do the two or three things that I know they're about to do as soon as they make that request, right? I mean, it's a way better experience for the end user to have a responsive application, right?

Jeremy: Yeah.

Rick: Preload some of that data so that they know, I know 99% of the users that come in here when they ask for this, the next thing they hit is that, okay, great or the next thing they hear is one of these three things. Great, guess what? They're going to get all three of those things and it saves round trips to the server and what are we talking about? Most of the time we're talking about pushing down a couple of kilobytes of data, right? And [crosstalk 01:09:14]

Jeremy: Right, it's not a ton of data.

Rick: Yeah, it's not a lot. So let's get it down there. Yup.

Jeremy: And now with 5G on your mobile devices ...

Rick: I know and you've got one of the unlimited data plans and all that stuff.

Jeremy: Exactly. All right, so you mentioned MongoDB and you mentioned Cassandra and obviously, one of the new things that was announced was managed Cassandra. So I know the characteristics are very much very similar to DynamoDB, but other than somebody sort of already using Cassandra, why would you use the Managed Cassandra? What would be the reason for ... Would you start with that or would you just suggest people start with DynamoDB?

Rick: I think what ... The Managed Cassandra service is awesome. Okay, it's actually DynamoDB DNA. So when you're using the managed standard service, you're using the backing, a lot of the backing infrastructure from DynamoDB, but it's not DynamoDB. It's actually 100% full version of the opensource Cassandra. What we basically did was replace the DynamoDB request router with the Cassandra instance. It's fully managed in the back-end.

So it's actually a really neat piece of technology. I love how they did the implementation. However, it is more expensive for us to run those Cassandra front-ends than it is the DynamoDB head node. So as a result, the Cassandra cluster MCS is going to be ... Not significantly, but it will be noticeably more expensive than DynamoDB. So if I was looking at a brand new workload today, I'd go DynamoDB first. That's still our approach, it has always has been our approach.

I mean, we released DocumentDB for a subset of customers that have the need to have a fully managed MongoDB solution. They don't necessarily want to pay two vendors, right? If you go to Atlas, you're kind of paying MongoDB and paying us through MongoDB. They wanted an AWS native managed solution. So we did that for them. However, we are still running a DynamoDB first philosophy and for all the reasons that we've talked about in the past, right?

Cost efficiency, scale of the service, the robust nature of the system is unparalleled. It's unmatched. We get a lot of that with MCS. You get almost all of it. As a matter of fact, you do get all of it, but you're paying a premium for that managed Cassandra head mode.

Jeremy: Well, I'm just wondering too because I mean we hear all this stuff about people who want to be multi-cloud and they want to be some sort of vendor agnostic or something like that which again, if you were to choose Cassandra, you're still locking yourself into a vendor, but I wonder, I just wonder if it's something that would help maybe customers that don't consider DynamoDB first that this might be ...

Rick: Oh sure. I mean, if you are hung up on a cloud agnostic or a vendor agnostic. I guess like you said, vendor agnostic, what does that mean when I choose Cassandra or MongoDB, but whatever. I mean, I think what they're really worried about is cloud agnostic, right? They want to and I've seen ... This is a fun factor argument for legacy technology providers, right?

I mean, when you go to the cloud, you've got two choices, right? I can lift and shift my existing data center and deploy exactly as is and I'll never know the difference and I've done it. I've taken very complex enterprise IT infrastructures and recreated them 100% to the point where the IT admins have no idea that they're not working on their whatever, on prem facility, right?

It looks exactly the same, okay? Now, that's not a really great way to use the cloud, right? I mean, you're not going to maximize your benefit, right? Then you're probably going to see a slight cost benefit, maybe even not a cost benefit, right? Because you're, you're really not taking advantage of any of those cloud native services.

They're giving the elasticity and the consumption based pricing and all the things that you need in the cloud. So and with databases, if you think about this, what ... It's not the database that locks you in, it's the data, right? When I deploy 10 terabytes of data someplace, I'm locked in. It doesn't matter if I'm on MongoDB or Cassandra or whatever. If I want to go somewhere else, I've got to move the data. That takes a long time, right? And then the third factor to look at here is that once you boil down your NoSQL to the lowest common denominator which is the data model, none of those other fancy features matter.

As a matter of fact, those are the things you'd never want to use. Things like aggregation framework or nickel or any of these other query languages. Now CQL I'm not going to say that because CQL did the right thing. They actually said, "We're not going to try and implement join operators here. We're not going to give people the ability to modify the data. We'll just give them a nice familiar syntax to select their items."

I like that. Okay? I really like that, but if you look at like nickel or you look at MongoDB's aggregation framework, the bane of my existence at MongoDB was going around and dealing with customers and all those terrible aggregation queries and unwinding all that stuff and changing their data models to be more efficient and you really don't want to use that stuff.

Anyways, when you boil it down to that lowest common denominator, it doesn't really matter if I'm using Cassandra, MongoDB, DynamoDB or Cosmos DB, who cares? It's all ... The data model is the data model and it's all select star where X equals and they all do that just as well as each other.

Jeremy: Right, yeah. And I actually, so I mean my recommendation would always be just because I've been working with it for a while. I really do love DynamoDB, but I worked with Cassandra and I had to sort of peripherally manage the Cassandra ring and I can say using Cassandra was great, NoSQL was good, but managing it was not fun.

So at the very least, having a managed services is a nice alternative to somebody who is really hell bent on using Cassandra. So anyways, all right. So I actually got a couple of questions from a few people that I'd love to ask you and just sort of just give me a brief ... Just your, your two cents on some of these things.

Now, of them was about analytics, right? We talked earlier about the difference between relational databases and NoSQL. Obviously, you're not running a lot of analytics workloads on NoSQL. But we have DynamoDB streams, we have the ability to do scans and exporting data and some of that stuff. So just what are some of the best practices when it comes to taking that data and being able to analyze...

Rick: Absolutely. Yeah. You hit the good points there. Streams. For operational analytics, things that need real time aggregations. We're looking at like top end, last end counts, sums, averages, computed KPIs.

Streams and Lambda is your friend, right? I mean Streams is the running change log of DynamoDB. It's like a change data capture pipeline that's built into the process, built into the system and so when you update the DynamoDB table with either a write or an update or delete, any write operation, it's going to show up on the stream which causes a trigger to fire and that trigger can be picked up by a Lambda function and the Lambda function can process that change and update any operational metrics that are affected by that change.

So this is a really neat system because there's 100% SLA guarantee between the update to the table, the right to the stream and the fire of the Lambda function, it's going to process every single update at least once. So this is really useful for customers who are trying to maintain these operational analytics because you're guaranteed the process, right? If you try and process it all yourself, I mean, how processes die, right?

And if you managed to update the table, but not your analytics, then you know you're not going to be able to make sure that happens. We'll make sure that happens for you. So that's really neat. The other process you get is like you said, you can table scan, export, but one of the things I see people do a lot is they just actually snapshot the table, it's one of the nice things about table snapshots is they're fully consistent, right?

Rick: A table scan is not necessarily consistent, right? It starts at the first item and ends at the last item. If anything changed in between, right? I don't know. So if you do a snapshot, then you can that to a new table and then you can table scan that new table and it's like a point in time picture of your storage which is really what most people want when they're running these types of offline analytics, right?

And so you're going to snapshot this thing, you're going to restore it to this new table and then you can go ahead and export it to S3 as parquet files. You can run a Athena queries on top of this thing and do whatever you want. That's not the most efficient way to query the data, but it's highly effective and what you have is it a report that doesn't run with high frequency, that can be a really, really nice solution for you because you don't have to export it into a relational database or even putting into Redshift or anything.

Now, if you're running constant queries against this data, then maybe a regular process to using Streams, Lambda to export the data in realtime into a relational database to maintain kind of a synchronized view, so to speak as a normalized structure that you can run ad hoc queries against. I see that a lot too. All right, so depending on the nature of the system and the requirements of the analytics, we can handle it, but let's just make sure we do it the right way so that those ... We don't want to end up having to run a lot of random analytics queries on the NoSQL database. It's just not going to do it, right? It's not going to do it well.

Jeremy: Yeah, and one of the ... I mean, I guess the mindset that I follow is if I'm doing time series data or something that is immutable, it's just not going to change, it's writing the data in. I like to dump that into like Kinesis data fire hose and maybe S3 and then be able to query with Athena, but I absolutely love when I use operational type data where it's a lot of crud type stuff that's happening. Just copying that over into a SQL database so I have all that flexibility, but from an operational standpoint, that's my source of truth and I think is...

Rick: Yeah, no, absolutely, absolutely. Yeah, absolutely. As far as the time series data goes, that's a really good use case for Dynamo and we'll see a lot of people roll that time series data and use that Streams, Lambda processing to update those partitioned analytics, right?

Top end, last end, average and all that stuff and then they'll age out the actual item data, right? And do exactly what you said. So they'll TTL that data off the table, it will roll up into S3 go into parquet files and sit in S3 and then when they need to query it, they just select the top level roll-ups out of DynamoDB.

If they need to do some kind of ad hoc query, then they do exactly what you said, run the Athena queries and whatnot, but for the summary aggregations, they're still serving that up out of DynamoDB. They're just doing it and it works really well like you said because once those time-bound partitions are loaded and they're loaded, they're in chain.

So why calculate it every single time? Right, exactly. Yeah. Yeah.

Jeremy: So one of the things you mentioned about Streams too is that it guarantees at least one's processing. So you still have to think about item potency and some of those other things, if you're updating something else. I mean are there any best practices that you can think of for that or ...

Rick: I mean, I think it's just always a one-off, right? I mean it depends on what nature of the computation ... Some computations aren't even affected. If you process it multiple times, others you're going to want to make sure that you included this in the average already.

There's usually what I'll do is maintain track of items, events that are processed, what are the last end events or something like that so that if something processes twice, I'll see that it actually made the update.

Write the configuration data that normally would write from the Lambda function. You know these things are processing in order on a per item basis. If you have per item metrics, you can always record the last ID, right? So that when you come to update again, make sure that the ID isn't equal to my ID. If it is fail.

I mean, it's basically, it's going to be some trick somewhere, somehow. Oftentimes I end up tagging a UUID onto the items that I can use in exactly that way so that I know it's processed, it processed right? That way, balance those double-process things. Now again, it doesn't happen very often. I mean, like one in a million is going to and how you could run for months and never see it.

And of course, there's going to be that one random time where the lightning hit the data center and the container crashed and the thing was in the middle of your Lambda process, right?

Jeremy: It's always the customer who pours over their data that's going to find the issue.

Rick: Exactly. He's going to notice it. Right. Exactly.

Jeremy: All right, So another question that I got was the performance impact of transactions.

Rick: Okay. Yeah, yeah. So transactions are heavy, right? There's no doubt about it. It's basically a multi-phase commit across multiple items. So I don't know the exact numbers, but it's about three X the cost, right? To have the normal insert. So be aware of that when you're using the transact write API. If you have the need for that kind of strongly consistent update, then let's do that, but there's a couple of caveats to transactions people need to be aware of.

First-off is the isolation level is low. So this does not prevent you from seeing the changes on the table, right? You will see those changes appear across the table. Someone selects an item that's in the middle of a transaction, they will see it, nothing blocks the read. So if that's ... As long as that type of transactional functionality is right for you and that's really what we're doing.

We're giving you an ACID guarantee. It's an ACID guarantee with a low level of isolation. You also get essentially the same ACID guarantee from a GSI replication and that is guaranteed and it doesn't cost you any more than the cost of the right and the cost of the throughput. So again, as long as it's a consistency at the client, really that's what transactions gives you. You won't acknowledge the write to the client until all the updates have occurred. That is the only difference between a transact write API and a GSI replication.

So if that guarantee is something you absolutely have to have, then great. And there are plenty of use cases for that. I want to block until I know that all copies of this item have been updated. I don't want the to continue until X, right?Great, no problem, but it was transact write API. That should be a real subset of your use cases. I wouldn't use it by default.

Jeremy: All right, and then what are some of the biggest mistakes you see people make with modeling? And we only have a few more minutes.

Rick: Yeah, no problem. Look, the biggest mistake. Hands down. We use multiple tables, right? I mean and the bottom line is multi-table designs are never going to be efficient and NoSQL no matter what the scale. I mean, you can have the smallest application that you're working with, you can have the largest application you're working with, it's just going to get worse, right?

The small application you might not notice the cost that you're paying, but you will pay more and honestly, it's not easier to write data, write the application for a multi-table design, right? It's just not. I have to write multiple queries, I have to execute multiple requests. I have a lot more code that I have. I mean, if you want to compare code, it's going to explode the code to run multiple payables, right?

So you're going to be running with less code, less complexity, more efficiency with a single table design. Let's learn how to use that. I think that's really the biggest mistake I see people make. The bottom line is if you can't get over that, then stick with your relational database. You're going to be way better off, right? So I would definitely advocate that if you're saying that my app is too small and I don't need a single table, then okay, your app is too small, you don't need NoSQL, right?

Jeremy: So what about like a relational modeling? Is that something you see quite a few people doing?

Rick: Oh, in the relational modeling in NoSQL?

Jeremy: Yes. Yeah.

Rick: I'd probably say 90% of the applications I see for NoSQL like single items. There's not a lot of highly relational data. I would say that the majority of applications, it's probably about 50 to 60% of the applications that we migrated at at Amazon were ... Had a fairly significant relational model and that was because we were taking ... We were under an edict that basically every app that we had, all of our tier one applications, most of our tier two applications were moving to NoSQL and there was no choice.

So across customers, I'm seeing a larger number of apps these days. I'm seeing people starting to realize that, "Hey, you know what? We can manage this relational data. It's just it's not non-relational. It's de-normalized." Right? I think people are starting to understand that that non-relational term is a misnomer and that we're really just looking at a different way of modeling the data. So you're starting to see more complex relational data in these NoSQL databases which I'm really glad to see and I expect that trend will continue.

Jeremy: All right, and then the other thing I think you've mentioned in the past too is we talked about it a bit with that partial normalization on the insurance quote is storing really large objects.

Rick: Yeah, yeah, don't store the ... Don't use that large objects unless you need them, right? I mean, if the access pattern is I need all this data all the time, then great. Use the biggest objects you can. It's actually drives a lot of efficiency. But generally speaking, I don't see that, right? Most applications need little small chunks of data, right? This couple of rows from that table, this couple of rows from that table, they don't need the entire hierarchy of data that comprises this entity in the application space, right? And so if you're storing blobs of data that represent every single thing that could ever be related to X, then the chances are that you're actually working and using a lot more throughput and storage capacity than you need.

Jeremy: Yeah, and I think a thing to remember about that too is you're not paying for the number of items you read, you're paying for the amount of data that you read.

Rick: That's right. Yeah, that is the case with every NoSQL database.

Jeremy: Exactly. And so what's cool about that too is, I mean, if you go back to those GSIs, you may have certain attributes in a document, some of which you have to index a different way. You can use sparse indexes and only replicate some of those attributes or some of that data to another index.

Rick: That's correct. Absolutely. And you can pick and choose which pieces of that hierarchy end up getting projected onto those index is absolutely ...

Jeremy: Right. So optimize for the write, right? That's another sort of main thing to think about, the velocity of the workload and so forth. Those are the ...

Rick: It's a choice. It's optimized for the read or for the write, depends on the velocity of the workload. Depends on the nature of the access pattern. There could be times I want to do one versus the other.

Jeremy: Absolutely. Okay. All right. So just a couple more questions here, but I think it's really interesting and I had posted something about this a while back and you had mentioned all the applications that you moved over from amazon.com to to NoSQL, what are those numbers look like? Because I think a lot of people are like, "Well maybe I can't use DynamoDB." Or something like that. And I always say, "Well, if Amazon can fit 90% of their workloads into DynamoDB, you probably can too."

So just, do you have some of those numbers? What has the growth been like and how much data do you process on a given day or whatever?

Rick: Yeah, sure. So a raw number is, I mean, if you want to just think transactions per second, 2017 prime day, I think we peaked Amazon Tables peaked it about 12.9 million transactions per second. We thought that was pretty big. That was 2017. In 2019, Amazon CDO tables peaked at 54.5 million transactions per second. So it's in a [crosstalk 01:28:11]

Jeremy: So somebody else's application can probably be just fine using DynamoDB.

Rick: And this is the thing, so we get this question a lot. I mean, I get the question of is DynamoDB powerful enough for my app? Well, absolutely. As a matter of fact, it's the most scaled out NoSQL database in the world, nothing does anything like what DynamoDB has delivered. I know single tables delivering over 11 million WCUs. It's absolutely phenomenal and then the other question is is not DynamoDB too much overkill for the application that I'm building?

I think we can have great examples across the CDO of services. Not every one of our services is massively scaled out. Hell, I've got services out there, I've got five gigabytes of data and they're all using DynamoDB and the reason why I used to think that NoSQL was the domain at the large scaled out high performance application, but with cloud native NoSQL, when you look at the consumption based pricing and the pay per use and auto scaling and on demand, I just think you'd be crazy.

If you have an OLTP application, you'd be crazy to deploy on anything else because you're just going to pay a fraction of the cost. I mean, literally, whatever that EC2 instance cost you, I will charge you 10% to run the same workload on DynamoDB.

Jeremy: Yeah. And actually, I really like this idea too of using it in just even small applications as sort of a very powerful data store that yeah, if for some reason I happen to get 54 million transactions per second at some point ...

Rick: I can do it. Right. Yeah.

Jeremy: Most likely, I'm going to have 10 transactions per minute or something on some of these smaller things, but I always love the fact that DynamoDB tables are such ... They're so easy to spin up, right? And again, you can add GSIs, but if I think about building microservices, especially with serverless applications, I don't want to be spinning up a separate RDS cluster or Aurora serverless for every ...

Rick: Yeah, no. I mean ...

Jeremy: It's crazy and if most of what I can do, if my workload is fit, if my access pattern is fit and there are some that don't, but yeah, I love it as sort of this like go-to data store that you can do all these great things in and the other thing is is that if you do have some slightly complex queries that you need to run, but it is small scale, a couple of indexes don't cost a lot of money.

Rick: Yeah, processing data and memory doesn't cost much money. I mean, the best example is in what the community tells us. I got a tweet from a customer the other day said, he told me just deprecated as MongoDB cluster had three small instances. They were costing about $500 a month. It took him 24 hours to write the code, to migrate the data and migrate the data. He switched it all over to DynamoDB, he's paying $50 a month.

Jeremy: Oh geez.

Rick: I mean, it's just, it's amazing when you look at it. I mean, when you think about it, it's like, "That actually makes sense because the average data center utilization of a an enterprise application today is about 12%." Right? That means 88% of your money is getting burned into the vapor.

Jeremy: And you're paying those people to maintain the ... Paying those ops people to maintain that ...

Rick: On top of that, exactly. And that's the thing. So this guy is like, "Hey, I'm saving 90% of my base cost and that didn't even calculate his human cost of maintaining all those systems."

Jeremy: Which is likely much more expensive than ...

Rick: Than the 500 bucks a month, right. Yeah.

Jeremy: I will just tell you, I actually had a small data center. I had a co-location facility that I used when I had a hosting company when I was doing a web development company and I used to go there and swap out drives and do all that kind of stuff. I can tell you right now, Amazon or AWS can do it a lot better than [crosstalk 01:31:39]

Rick: We've been doing it for a while now.

Jeremy: Right. Exactly.

Rick: Petty much the scale that blows anybody else out of the water, right?

Jeremy: Exactly. Exactly. So definitely a wise choice to do that. All right, so let's move on to tools for development. So there's the NoSQL Workbench, I've been playing around with it, very cool.

Rick: Yeah, yeah. No, I love this thing. This was a tool. Again, and I said it and reinvented, built by the specialists, for the specialists. The North American Specialists SA team had been spending a significant amount of time plowing around in Excel, manually creating GSI views for customers and demonstrate to them how these things will work. It's an error prone process. It's a pain. To try and create those pivot tables, it's like, "Look, you're copying data out. Oh did I get the keys right? Is the sort order right?" All this kind of stuff, right? Whereas with this, what you do is that those NoSQL Workbench for DynamoDB, you just take a bunch of JSON data, you can load it into the tool, it gives you a nice view of what the aggregate looks like based on the partition key and sort keys that you configure in the tables.

It gives you all your GSI views. That's the best part about it. As you go from GSI to GSI, GSI, it pivots the data automatically. So you can put the sample data in and then you can visualize what happens when I translate that data across multiple indexes and then see what those sorts look like. It has a code generator. I don't know if you had a chance to play with that yet, but...

Jeremy: I did.

Rick: Oh, it's really nice because you know what I mean? One of the biggest problems in dealing with any database development is writing the queries, right? I mean, it's like we all know what conditions we need. Okay. Write the code, make sure that there's no errors, everything's correct. And the code generator for the Workbench basically lets you just set query conditions and hit the button and it generates all the code for your application. You can just cut and paste it. Actually, generates a runnable, executable.

Jeremy: It does. Yeah, that's right.

Rick: You can validate it from the console. Just go ahead and run it. Yup, it works. Okay, great. You can embed it in your application.

Jeremy: Well, and and the other thing that's great about it too is, and I think this is something that, because there's no interface to do this yet and you still have to add it to JSON manually is the facets. And I've been playing around with those a little bit. I'm sure you'll [crosstalk 01:33:49] to do it through an interface, you don't have to download and stuff, but the facets are actually really cool because this I think will be helpful to people who are thinking about entity-based type stuff.

Rick: Correct.

Jeremy: So we can sort of create this new entity for each different ... Facet for each type of entity, you go in, you can enter data for each facet which it maps or aliases, your PK and your SK. It just makes a lot more sense I think than people just entering everything into one big table.

Rick: Yeah. You hit the nail on the head there. Facets were intended to be the entities in your model. For each type of object that you have on the table, that's a facet, right?

Jeremy: Yes.

Rick: Now, I expect that over time, the functionality of facets is going to grow. It's going to start to align well with like data migration where customers want to move from a relational database into workbench and we're going to start to provide tooling to help them do this.

That will help them translate their normalized relational models into a single table and keep them in the context they're used to mentally because these items are related to those items. Here's a one to one, here's a one to many, here's a many to many. I'll create all my facets. One of the facets is my lookup table. Just to help people organize their data the way they like because we get that question a lot. "Well, I put all my data in this table, how do I visualize it?"

Jeremy: Right. Right.

Rick: There you go.

Jeremy: No, that's great. All right, so then other tools, there's the best practices guide. On the AWS site, there's some courses out there, the Linux Academy course.

Rick: Yup, Linux Academy, we just put out that new course out. That was published just last year, it incorporates all of our best practices and design patterns, it's completely updated content.

We've got, of course my content online. There's some great content from the community out there. Alex DeBrie has some really good content out there for people to pick up and we've got a couple of books in the queue here, so look...

Jeremy: Yeah, I heard. No, I know Alex DeBrie is writing a book and then I think you said I'm not going to be outdone by Alex DeBrie, but I'm also going to write a book. You have a book you're working on as well?

Rick: Yeah, we do as a matter of fact. We're leveraging the entire North American Specialist SA Team at AWS. They're all contributing content. I'm going to be editing the content. I'll probably write the forward and I look for that to come out sometime probably early next year I think is when that's going to hit.

Jeremy: Awesome. All right, so Rick, listen, I mean honestly, it's been absolutely awesome to have you here. I will say I don't think I would have ever discovered or found a love for NoSQL and DynamoDB if it wasn't for the presentations that you've done and the work that you've done.

So I really appreciate it. I know everyone that's listening and everybody in the sort of the DynamoDB community is a very appreciative of the work you did so again, thank you so much for being here. So if people do want to find out more about you, follow you or whatever, how do they do that?

Rick: Sure, you can hit me up on Twitter houlihan_rick and or hit me up on LinkedIn and we can connect there.

Jeremy: Awesome. Okay.

Rick: Thanks so much Jeremy. I really appreciate it. It's been great.

Jeremy: All right. Thanks Rick.

Rick: All right, thanks. Bye.