Episode #141: MongoDB Atlas Serverless with Kevin Jernigan
June 20, 2022 • 57 minutes
In this episode, Jeremy and Rebecca chat with Kevin Jernigan about MongoDB's road to serverless, how it enables developer productivity, why it's so hard to build serverless databases, what a new serverless pricing model could look like, and so much more.
About Kevin Jernigan
Kevin started his career on the first product management team at Oracle, with responsibilities for utilities, benchmarks, and Oracle Parallel Server. After Oracle, he built a consulting business focused on data warehousing and high-end transactional systems, and then built a SaaS business providing booking capabilities to the health club industry. He returned to Oracle to manage a team delivering storage and performance features in Oracle Database, and then joined AWS to launch Aurora PostgreSQL, which he helped build into the fastest-growing service in the history of AWS. In early 2021, Kevin joined the Atlas Serverless product team, and is focusing on bringing Atlas Serverless from preview to general availability, and on working with customers to ensure it exceeds their expectations in all dimensions, including ease of use, performance, pricing, scalability, functionality, and integration with the broader serverless application landscape.
Hi, everyone, I'm Jeremy Daly. Rebecca:
And I am Rebecca Marshburn. Jeremy:
And this is Serverless Chats. Hey Rebecca! Rebecca:
Hey Jeremy, how are you doing today? Jeremy:
I'm doing really well. I'm excited about our guest, but I also had a really good Monday night. Rebecca:
And? Please go on. Jeremy:
Yeah, I went to see Sammy Hagar and The Circle, and if you don't know who Sammy Hagar is, I don't know about this audience, I have no idea. Sammy Hagar was the lead singer of Van Halen after David Lee Roth, but he now has a band called The Circle with Vic Johnson, who is just an amazing guitarist, and Jason Bonham, who is the son of John Bonham from Led Zeppelin, and then Michael Anthony, who was the original bassist in Van Halen. I saw him, but also opening for him was George Thorogood. Do you know who George Thorogood is?
No, you don't. Well, you would. So "Bad to the Bone," "One Bourbon, One Scotch, One Beer," "I Drink Alone," like all these great songs. Anyway, he's 72 and it was just amazing to see him out there. I mean, he looks 72. Like, you knew he was 72. But then Sammy Hagar, 75. I mean, when I'm 75, if I could even go to a concert when I'm 75, I would be pretty excited. But anyway, it was a good start to the week. Rebecca:
Well, I gotta say I think that you're still gonna be rocking out when you're 75. And I had a similar past few days. One of my closest friends from Detroit is in an 80s rock band, very similar to a Van Halen style. They are from Detroit. They were on tour for the last two weeks and came to Seattle. I got to see them, and it's not every day that I go to a rock show, but it was a rock show. I mean, like, moshing and crowd surfing, and I was like, wow, this still DOES happen. That being said, when you said what you did on Monday night, our guest today gave literally on the screen two thumbs up, so I think that we might be in good company today. Would you like to introduce our guest? Jeremy:
I would love to. I had the amazing opportunity to speak at MongoDB World last week. I think we talked about this, and I got to co-present with the principal product manager at MongoDB who's also the lead product manager for MongoDB Atlas Serverless. So Kevin Jernigan is here with us today. Kevin, thank you so much for joining us. Kevin:
Thanks for having me Jeremy. And I'm very jealous about the concert you went to. I wish I could have been there. Jeremy:
It was pretty exciting. So before we get started with anything else, let's get a little bit of background here. Tell us a little bit about yourself, what your experience has been in the crazy tech world that we live in, and what you are doing at MongoDB right now. Kevin:
Yeah, so my experience in the crazy tech world started a long time ago. When I graduated from college I got a job with what was a small software company at the time called Oracle. Back when Oracle had 800 people. So that was a long time ago. That was 1987. And I was part of the first product management team in the Oracle database team, which was the first product management team at Oracle, 'cause that was the only product we had back then. And I stayed there for four years. We grew fast. We went from 800 to 8,000 people. Went from 125 million to about a billion in revenue in those four years. It was hyper growth times for Oracle and Microsoft, another smaller company that had gone public around the same time. And Sun Microsystems. So those were kind of the peers back then. But I left in 1991 after doing a bunch of database performance stuff. We launched some fancy things like Oracle Parallel Server, the ancestor of Oracle Real Application Clusters, and other stuff. While I was there I wrote the first Oracle database performance tuning guide. And that was based upon the experiences we had with customers after launching Oracle version 6, with this new row-level locking thing that was innovative in the database industry and was targeting real transactional workloads. So customers believed us and tried to use it and ran into a bunch of performance issues. So we worked with them to fix their issues and fix the technology and all that, and learned a lot about how customers use databases and how they want to interact with them and tune them and improve them. And that's why we wrote the book: The Performance Tuning Guide. We then tried to productize all that fun stuff. And that was the hard part. Even today it's kind of hard, but back then it was impossible. For the stuff we wanted to automate and productize, we didn't have the data, in terms of workload data, and we didn't have the compute power. We didn't have the storage.
Everything was thousands of times more expensive and slower than today. So these were nice ideas that were just impossible to do back then. And I left Oracle in 1991, did a bunch of consulting in the 90s, built a consulting business focusing on what we called scalable solutions. So still doing database performance. We expanded. We hired about 45 consultants over time and focused on not just Oracle but DB2 and Informix and Ingres and the whole list of database vendors at the time. Ended up doing a lot of data warehousing, which had just been invented as an idea in the early 90s. So we were chasing the holy grail of a one terabyte data warehouse, which was really hard to build back then. Seriously, it took a week to load a terabyte across thousands of disks. It was crazy. Thousands of tiny, expensive disks. And eventually we sold that business right at the beginning of the dot-com bubble. I continued to do a bunch of consulting. I started another software as a service business focused, believe it or not, on the health and fitness industry, cuz I've been an active squash player most of my life. And I wanted to book a court online. Which doesn't sound that hard, except even today in 2022 the health club industry still doesn't do a good job of online self-service for their members. We tried and failed to build a successful business. We built a very good platform but not a successful business in that industry. Lots of barriers to selling technology to health clubs. I'll just leave it at that. Went back to Oracle in 2009. So I'd been away for 18 years. Came back to the same product management team. Of course it was a much bigger company, like 60,000 people. And Oracle was in the process of acquiring Sun at the same time. So we suddenly jumped from 60 or 70 thousand to 110 or 120 thousand employees. And you know, very big business. But I was in the middle of the product management team inside of the Oracle database team. Same team I left 18 years before.
Focusing again on storage and performance related stuff. So I ran a small team. We launched about a dozen different things while I was there for about six years. And again the same challenge was there. It's like customers wanted to do all this performance tweaky stuff. They wanted to get their hands in there and change every parameter, turn all the dials and knobs, just to make it perfect for their workloads. And we still didn't have what I thought we should have had 20 years ago: the ability to do that automatically. And to do it at scale and to understand across customer workloads. Cuz Oracle didn't really have a cloud focus back then. And so when the opportunity came up to leave Oracle to go to AWS, it just made sense for me to do that. And so I joined AWS as technically part of the RDS team: the Relational Database Service team. But within the RDS team I worked on the Aurora Postgres project. So I was changing gears from Oracle to Postgres, but it's still relational. So the mental model wasn't gonna change that much. The more important thing was moving into the cloud and understanding how customers were using the cloud and specifically trying to use databases in the cloud. And at AWS, the opportunity was there. Well, hey, we have visibility into all these customer workloads. Yeah, we can't go look at their data, that would violate privacy stuff. But we can certainly see some level of metrics and characteristics about how their workloads work over time and different patterns and seasonality, whether it's daily, weekly, monthly, annual, whatever it is. But we were still blocked, to a large degree, in terms of implementing the kinds of automation, and not just automation, but kind of proactive tuning, I guess, or automatic self-driving, if you will. We were still blocked from really doing a good job on that, partly because all the database engines we were running were somebody else's. Because RDS runs Oracle and SQL Server and MariaDB and MySQL and Postgres.
The Aurora projects are based on Postgres and MySQL, still dependent on those open source projects where the more changes we made to Postgres or MySQL, the more merge burden we had. Because we needed to stay close to the main line of those open source projects. So if we were gonna take the next release of Postgres and adapt it into what we did in Aurora Postgres, the more adaptations we added, the bigger the merge burden. And the harder it was to keep up, or the harder it would be to keep up, without introducing Aurora-specific bugs, et cetera. So we were pretty limited in what we could do without taking on all that extra work on an ongoing basis. We were also limited, as I kind of discovered over time, by the relational model itself. And so when the opportunity came up to actually leave AWS for MongoDB, that was a bigger change for me actually than going from Oracle to AWS. Going from on premises to the cloud wasn't as hard to get my head around as going from, you know, I've been doing relational for 30ish years and now I'm gonna think about non-relational and the document model and all that. And why is this so interesting? I mean, why does MongoDB even exist as a company? What happened that they got all this traction and they're doing so well? So I had to dig in on that before I felt comfortable moving away from AWS to take the role at MongoDB. But joining MongoDB, my focus from the start has been working on our serverless offering. And I've been here about a year and a half now. So I joined early last year. And then we launched the preview of Atlas Serverless in July. And we went GA last week at MongoDB World. And one of the things I've learned is just how customers use MongoDB. Certain things are just so much easier than what customers or developers have to do to use a relational database. So it's nothing to do with the cloud, nothing to do with serverless. This is just the core document model versus relational.
One of the things that you just take for granted when using a relational database is that to change the schema you're gonna have to do some gymnastics. You might have to take some downtime. You have to worry about the schema up front. And you have to think through, 'Oh okay, I'm gonna have these kinds of tables and they're gonna be related this way. I gotta build these indexes. Maybe my tools, my development environment, will build those for me. But I still have to think conceptually that way.'
And that was all driven by how expensive storage was in the 60s and 70s. And so when you think about it, the relational model is optimized for minimizing how much storage you use to store your data. And that made sense. Storage was insanely expensive. Remember the Y2K problem? That came from storage being expensive. Let's not put the "one nine" in the date, because that saves those two bytes. Saves us a lot of money at scale. Fast forward to even 20 years ago, and storage was getting pretty cheap. And of course it's a lot cheaper now. So why are we still using a model that's optimized to minimize your storage footprint, when that's the cheap, fast part of your system? And so the document model that MongoDB uses is really focused on optimizing for something different than your storage footprint. And so you start thinking that way: 'Hey wait, you're right. If I optimize that way then these other things get better.' And one way that shows up is just in how you model your data. In data modeling in MongoDB, the way we phrase it is 'data that gets accessed together gets stored together.'
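To make the "stored together" idea concrete, here is a minimal sketch that is not from the episode. It uses Python's built-in `sqlite3` for the relational side and a plain dict of JSON strings as a stand-in for a document collection. The table names, fields, and sample customer are all made-up assumptions, and this does not use the MongoDB driver or reflect how MongoDB stores data internally.

```python
import json
import sqlite3

# Relational layout (hypothetical schema): customer data is normalized
# across two tables and must be joined back together on every read.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 'squash racquet'), (11, 1, 'court shoes');
""")

def load_customer_relational(customer_id):
    """Join rows from two tables and reassemble them into one object.

    Sketch only: assumes the customer has at least one order.
    """
    rows = conn.execute(
        "SELECT c.name, o.item FROM customers c "
        "JOIN orders o ON o.customer_id = c.id "
        "WHERE c.id = ? ORDER BY o.id",
        (customer_id,),
    ).fetchall()
    return {"name": rows[0][0], "orders": [item for _, item in rows]}

# Document layout: everything that is read together is stored together.
# A dict keyed by id stands in for a document collection.
documents = {
    1: json.dumps({"name": "Ada", "orders": ["squash racquet", "court shoes"]}),
}

def load_customer_document(customer_id):
    """One keyed lookup returns the whole customer; no join is needed."""
    return json.loads(documents[customer_id])

relational_result = load_customer_relational(1)
document_result = load_customer_document(1)
```

Both functions return the same object; the difference is that the relational path touches two tables and reassembles the result, while the document path reads one stored value.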
So that if you're looking up all the data about a customer, if it's all stored together you just do one or a couple of I/Os of contiguous data to pull it into memory. And it's all there. In a relational database you pull a bunch of rows from a bunch of different tables and then you do a join to connect 'em all together. Well, guess what? That takes more CPU but it takes less storage. So you've optimized for the wrong thing. If you optimize instead for faster access at the cost of maybe storing data more than once, then yeah, you're "wasting" storage, quote unquote. But your access times are much, much faster. And so that core difference in the document model kind of filters through everything we do. And everything that developers benefit from using MongoDB, as compared to relational. So that was kind of the first transition. But then of course we have our own cloud, our own managed MongoDB in the cloud service called Atlas. Which runs on all three of the major cloud providers. And going serverless is kind of a big next step. And you can't really do serverless well unless you get to the point that you can automatically manage most of what people manage by hand in databases. And so, from my perspective, inevitably serverless is gonna lead down a path of automatic fleet management, automatic scaling up and down, automating things like indexes, automating things like schema optimizations. All that stuff. We need to make it transparent and automatic so that developers don't have to worry about it, even at scale. I mean, it's easy to make things fast when there's five documents or five rows in the table. It's when you scale and have a million users and billions of rows that automating it is harder. But that's where I think we're headed. Jeremy:
So I wanna get into all that stuff. But first I've gotta apologize to our guest, cuz clearly after we've heard your background you have no experience in this at all. So you're clearly not the right guest to be talking about this. But other than that... No, amazing. Absolutely. I mean, you've seen all these different things. And I think that's one of those, I forget who says this, but there's no compression algorithm for experience, right? So you can't just go in and start working on relational or non-relational databases or whatever NoSQL databases and have that same perspective. So amazing that you have that perspective. But I know Rebecca has some questions that she's itching to get to, so. Rebecca:
I do. I think that's a Wernerism, by the way. It might not be a Wernerism but I think it is. I think it is. He can correct us on that one. Kevin:
It's Werner or Andy Jassy. Rebecca:
Yeah, it also might be Andy Jassy. If it was, maybe I wrote that. No, I didn't. I did not write that line. I wish I could claim it. Andy, I'm sorry. So that was an amazing journey, from the dawn of Oracle, little baby Oracle in 1987, to serverless today. I think before we get too far into MongoDB Atlas for our listeners, there's actually an incredible stat that you referenced in your recent talk. And I think it's really helpful to benchmark where serverless is, and was even two years ago, to talk about where we think serverless is going and how MongoDB Atlas serves that. So in your talk you noted that the global serverless architecture market was valued at 7 billion USD in 2020. And it's projected to reach 37 billion USD by 2028, which is a 22% growth rate year over year. And that's according to Verified Market Research. We always like to attribute stuff. And that also means there's gonna be a lot more voices, and have been a lot more voices every year, in terms of their use cases, their edge cases, where they wanna go, how they wanna use it, the futures and ideas, what they think they can solve with it, how they wanna stretch into that. Voices that are stepping into serverless or that are now familiar enough with it to keep building on it and want to do more complex things or solve different problems with it. And so I'm curious, even in that benchmark context, right? Serverless is growing a lot and has already been growing a lot. How at MongoDB did you decide what the right next thing to build is? And then how was that, for example, in terms of MongoDB Atlas? Like where did you arrive there? Where you're like, okay, this is where the market's going and this is what our customers need today. Kevin:
Yeah. So MongoDB's always been a developer focused company, right? Since way before I got here, that's how the company grew from zero to where it is today: by really focusing on developers. I mean, that was really the motivation for building the database in the first place. The original founders, Dwight Merriman and Eliot Horowitz, wanted to make it easier for developers to build applications. And they had built a bunch of applications themselves before starting MongoDB. And they were writing code, like every other developer in the 2000s, that manipulated JSON objects. And storing JSON objects in relational sucked. Cuz you had to use an ORM to rip it apart into tables and rows and columns. And then the ORM would reconstitute it back into JSON. Just an impedance mismatch. There was a lot of friction there. Lots of opportunities for mistakes and translations and SQL statements going wrong and all that crap. And so they just said, let's put together a database that stores data in the form that I use it in my code. So the motivation of the company from the beginning has always been developers. And as we grew, it made sense for us to build and manage a MongoDB service in the three major cloud providers, because developers were starting to build stuff in the cloud. And they were downloading MongoDB and managing it themselves on an EC2 instance, or the equivalent in the other cloud providers. So it just made sense for us to say, 'Hey, well, they're complaining about all the work they have to do to build and manage databases and do upgrades and patches and all that fun stuff. Why don't we just take that load off their back? And we will do that with Atlas.'
But Atlas still at the time required them to think about infrastructure. When you provision an Atlas dedicated instance today, you have to decide how big your instance is. How many CPUs, how much memory? And of course there's a price with that. And so you're looking at the cost as well. Oh, and how much storage do I need? And do I need more storage for more IOPS, not just for storing data? So you have to think about all that stuff as a developer who's just trying to build an app that you might just be playing around with, that might take off someday into some big viral thing. But you don't wanna think about all that infrastructure. And so serverless is kind of the obvious next step in that developer journey: 'Well, let's just give them a magic endpoint.' And that's it. And let the endpoint figure out what they need based upon how they're using it. And so today in Atlas Serverless they choose a cloud provider and a region and they give their database a name, and that's all they have to do. They need to create a database, or create what we call an Atlas serverless instance. But that really just gives 'em an endpoint that behaves just like a regular non-serverless endpoint, still running MongoDB. The same MongoDB code. They just don't have to think about how many CPUs or how much storage or any of that stuff. We just automatically scale up and down and bill them for what they use. And that's what developers have been asking for. That's why we started on the serverless project. And why I think we're gonna head in the directions I've been talking about. We're gonna keep making it easier. We're gonna keep making it more magic. It's not magic. It's all just real, understandable technology. But it's gonna feel like magic to most developers. Jeremy:
Yeah. And you had mentioned this idea too, that the founders originally wanted to make it easier for developers to build applications. Not only to do it more quickly, but just easier, to reduce that friction. And Dev Ittycheria, the CEO of MongoDB, in his keynote actually gave this quote. And I pulled this quote out cause I thought it was really interesting. He said, "Everything we do is all about removing friction and increasing developer productivity. In the eight years I've been CEO of this company, customers have told me many things but no customer has ever complained about innovating too quickly. Legacy architectures with brittle and inflexible characteristics are what have held people back." That was just summing it up, right? This is the problem, right? Like we have all this old stuff. And not only do we have old stuff, but we have old stuff that we've now moved to the cloud. So it's like you're doing the same old stuff, you're just doing it in the cloud now. And we talk about cloud native. And again, I have a real big problem with what we've ascribed or subscribed or whatever to what that term is and what that actually means. But I'm curious about your thoughts on that. It seems that even MongoDB at the time, I think the older version of MongoDB, was a little bit more legacy, and now we get to Atlas and that's more cloud-related, and you get to serverless and you get to those other characteristics that we wanna see. But I'm just curious about your thoughts on that. Kevin:
Yeah. So there's a few things. A few dimensions to talk about there. One: MongoDB obviously was architected originally way before the cloud was much of anything, though right around the same time AWS was launching was when MongoDB was created as a business, in the 2006, 2007 timeframe. But you know, back then cloud obviously wasn't a thing you were gonna target. But look at the way they architected MongoDB even back then. So the name MongoDB comes from humongous. They wanted to build a database that could handle humongous scale, because they felt that relational databases could not handle the scale of web based applications that they saw already being built. And they saw more and more coming. And so it was all about scale, right? So they wanted it to handle humongous things. And so they called it MongoDB. Mongo database. But because of that they built in the ability to scale out, not just scale up. And by that I mean scale up is you scale up to the limits of the biggest computer you can find. And the way you make computers bigger is by adding more CPUs and making the CPUs faster. You know, what people sometimes still call SMPs, or symmetric multiprocessors. The biggest system you can get today probably has 64 CPUs, maybe 128 at the high end. What happens if you can't scale up? If you need more power than that big server? You need to scale out. You need to spread your workload across multiple servers. And so they built that into MongoDB from the start with what we call sharding, which most database people call sharding, to help you scale out your workload across multiple servers. So that was built in from the beginning. Now that plays really well in the cloud, when the cloud came along later.
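As a rough illustration of what "spreading your workload across multiple servers" means, here is a minimal hash-based routing sketch. It is not from the episode and is not MongoDB's actual sharding implementation; the shard names, key format, and helper functions are hypothetical.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names

def shard_for(shard_key, shards=SHARDS):
    """Map a shard key to one shard using a stable hash.

    The same key always routes to the same shard, so reads can find
    what writes stored.
    """
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]

def distribute(keys, shards=SHARDS):
    """Group keys by the shard that would own them."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[shard_for(key, shards)].append(key)
    return placement

# Spread 1,000 made-up user keys across the three shards.
placement = distribute(f"user-{i}" for i in range(1000))
```

Each server ends up owning only a slice of the keys, which is the scale-out idea in the passage above. One known limitation of this naive modulo scheme is that changing the number of shards remaps most keys, which is one reason real systems use more elaborate placement.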
The other thing they built in from the beginning was a high availability architecture using what we call replica sets. And a replica set is simply a set of servers, each of them running a copy of the MongoDB code, that work together to provide high availability for the data that that replica set's managing. So there'll be a primary and multiple secondaries. It kind of needs to be an odd number when you think about failure cases, so that when you have, say, split brain, if you have a three member replica set and one of them gets split from the other two by a network outage, you have to have two out of three vote together to recognize that we have a majority, right? Simple stuff in terms of how databases work these days. But this was kind of new for building a database in the mid 2000s, and so those two things, replica sets with the high availability model plus the scale-out capabilities of sharding, were built in from the start. Which plays really well when you move into the cloud. The high availability model plus the scale-out model in a cloud environment makes MongoDB work really well at scale. Especially when you have the typical scenarios in the cloud at scale. There's always something failing somewhere. And you need to write every line of code assuming that something's gonna fail. That everything's gonna fail. And they were thinking that way from the start. But then the other dimension you talked about is, 'Okay, I've got all these legacy things,' where legacy, to me, means relational, even pre-relational, some of the older mainframe stuff. 'How do I move those kinds of applications off of their old, relatively expensive on-premises infrastructure into the cloud?' A lot of customers are doing lift and shift. They just take the workload and move it into the cloud and say, 'Ooh, now I'm in the cloud. And I should be saving money. Or I'm no longer managing my own data center and paying my own people to rack servers and swap out disks when they fail.
And paying for power, space, and cooling. That's handled by the cloud provider.' In my experience that's 5 to 10% of the benefit of moving to the cloud, just lifting and shifting and no longer being responsible for the data center. The real benefits come from actually rearchitecting for the cloud. Taking advantage of things like MongoDB that are optimized for cloud environments. And in the case of moving from, say, a relational database to MongoDB, you get those other benefits you talked about earlier of going with the document model versus the more rigid schemas that you have to follow to use a relational database. Which opens up a whole bunch of speed and agility for developers, since they're no longer hitting that friction of working with an ORM, an object-relational mapper, in the middle and working with a DBA on the other side who you have to schedule your schema changes with. You don't have any of that friction anymore. You can just go as fast as you want to go when you're working with MongoDB. Then with Atlas and serverless in the cloud, the whole idea is just to get out of the way as much as possible. Let's, again, give them that magic endpoint. But what you'll see if you dig deeper into some of the stuff we announced last week at MongoDB World is a focus on helping customers migrate workloads. From relational into MongoDB, or from self-managed MongoDB, either on premises or self-managed in the cloud, into Atlas. So part of our focus in that developer data platform is to help customers migrate those legacy workloads. Whether you consider legacy to be self-managed Mongo, or whether you consider legacy to be Oracle, SQL Server, DB2, Postgres, MySQL. Whatever relational source it is. We're working really hard to help customers move those workloads. Though one of the stats we like to throw around is, there is some stat, I unfortunately can't quote the source, cause I don't remember it.
But that in the next 5 years something like 750 million new applications will be built. That's more than all the applications that have been built in the last 40 years. And so a big part of our focus is making sure developers know all the advantages of building those applications on Atlas with MongoDB, rather than on a relational database. Jeremy:
Right. And you keep talking about going from relational databases to the document model. And you mentioned earlier that idea of writing an ORM or something that transforms JSON into relational data. And I just had a waking nightmare, because I remember writing nested sets. If anybody has ever used nested sets, it's a really brilliant technique for creating hierarchical data in a relational database, but it is a... it's a nightmare. Kevin:
Done that before by hand. Jeremy:
Yeah that's what I did. Kevin:
Before JSON existed. I know what you're talking about. Jeremy:
I was actually using it to do XML, to constitute XML documents and split those apart into relational data. It was very cool stuff, but also so much easier just to drop a JSON object in there than to do... Kevin:
Yeah when you're done with that you're really proud of yourself cuz it's really cool. Jeremy:
It is cool. Kevin:
Then you realize wait this doesn't scale. I don't scale. I don't really want to count on finding people who do this really hard stuff to scale my business or scale my application. Rebecca:
Something that stood out to me, and I also love that quote by Dev, the MongoDB CEO, was that last line. He had said, "Legacy architectures with brittle and inflexible characteristics are what have held people back." And I mean, on its surface, certainly I agree. And even below it I agree. But I bolded that because I've been thinking about this a lot recently: someday serverless will be legacy. And we often couple legacy and brittle together. And I think that's for so many reasons, right? It's cuz it is built on older technology. It was on-prem. Then people just move it to the cloud. And then now we just have this problem in the cloud, and there's so many things smashed together. And so there's so many pieces that could break. But are legacy and brittle inherently tied? Or how do we keep that from also being the fate of serverless? Especially as we keep adding bells and whistles, adding knobs, adding things that you can change or decide on. And Jeremy and I talk about this a lot, maybe it's the theme of this season, but we love serverless. So how do we actually keep that idea of simplicity, that idea of keeping undifferentiated heavy lifting, as AWS would say, or non-differentiated work, away from the developers, so they don't have to do all those things and can totally focus on their code? So is brittleness and legacy also the fate of serverless? And how do we prevent that? How do we either change our paradigms or our thinking? Or is serverless, too, just one day the next brittle legacy app? Kevin:
On the one hand if you could predict the future like that then Rebecca:
Then you'd be working as an analyst. Kevin:
Well no, you wouldn't be working at all. You would've sold out of Bitcoin three months ago. Yeah, so I think serverless is gonna go away. But not really. It's gonna go away as a term. So in my naivete, when I was still at Oracle the second time and I was considering the opportunity to go to AWS, at first I thought the cloud providers, not just AWS but all of them, were already delivering databases serverlessly. But I didn't use the word serverless in my thinking. I just assumed that in RDS, cuz I hadn't touched it yet, I hadn't tried it out before my job interviews, I assumed before I tried it out that I would get an endpoint that just did what I needed. I thought they had already figured all this stuff out. And when I got to RDS, I was like, 'Wait, this is kind of clunky. I have to provision infrastructure.' Jeremy:
Just a bit clunky. Kevin:
Can you just be smart about this? You guys are already at scale. What the heck? And then I get there and I realize well it's a lot harder than I thought. And for a whole bunch of reasons. But it turns out that, Jeremy you say this in your talks all the time, most of AWS's early services that they launched with were serverless. In the sense that there are a bunch of servers behind S3 and there's a bunch of servers behind EBS and a bunch of servers behind all these different services. Which the customer has no idea what they're doing. They can't see them. They can't tune 'em. They can't monitor 'em. They shouldn't have to. But it's AWS's job- and Azure and Google Cloud and the rest of 'em- it's their job and their challenge to provision a bunch of servers in the big data center with CPUs and memory and storage and present the S3 behavior to the customer in a way that is affordable. That works. That scales. And that doesn't lose money for the cloud provider. I mean they gotta make money too or else they go out of business. So how do you do that at scale with the right performance and availability characteristics, but at low enough cost? And there's a whole bunch of hard work you have to do in the technology to manage density across a fleet of servers and do all the right things to give that experience to customers. And doing that for databases is the hardest thing because customers or developers expect their databases to not lose their data and to be consistent and to be transactional and to scale and to not crash when you look at it sideways. They expect all those characteristics. Doing that at scale and not losing money, as the provider, is actually the really hard part. So I think serverless is gonna go away as a term because that's gonna be the only way you consume things in the cloud eventually. It's just that databases have always been the hardest thing, so that's why it's the last thing that you're seeing go that way. 
But you know fast forward five years from now, I don't think we'll call it serverless. Rebecca, when you ask about legacy, yeah I mean what's gonna be the next thing? If I go way out on a limb I'll say that a lot of the stuff's gonna get burned down into chips. Into workload specific chips. We've seen it a lot already. In the Apple ecosystem and the Arm chips they're using there. Or in the Graviton specific chips that AWS is building. I think we're just gonna see more of that. And so the stuff we're talking about now, we're writing in software that runs in processes in Linux and with all the good and bad parts of that. Once it's really nailed down we'll be able to burn it into the chip. And then it's gonna be a different thing. It's just gonna be, I'm reluctant to use the word intelligence cuz that gets into AI, but there's gonna be that kind of magic capability. But it's gonna be burned down into chips rather than software running on general purpose servers. And why would that be better? Well it would probably be cheaper and faster. Rebecca:
It's like embedded efficiency. Kevin:
Yeah I mean, like I said, I'm way out on a limb here cuz I'm 10 years out. But I mean... Jeremy:
Well, I'll go out on a limb. I'll say the next things that will be legacy will be Java as a language and Kubernetes. But that's just me. I just think that Kubernetes is already being absorbed. 75%, 80% of Kubernetes clusters are already run by managed service providers. So you're not even managing them yourself. So basically it's just a hosting platform. But anyways, let's talk about serverless databases for a second. And you said this a couple of times. I said this in my talk. I say this all the time. And it's clear, based off of a lot of the offerings out there, serverless databases are hard. They are just really difficult to do and to do well. And again I think you could do them and lose money on them. But so if you wanna do them right and not lose money, it gets a little bit tricky. But so beyond just the obvious things right? So you've got responsiveness and scalability right? Obviously those need to exist in a serverless database. Also serverless developers, they want integration with their serverless tools like frameworks and so forth. I think cloud provider flexibility is a huge thing. If you have a database that only runs in AWS or only runs in Azure, that becomes really hard. Especially depending on where you're building your application. Like maybe if your application runs at AWS you don't wanna host your MongoDB cluster in Azure because your data is so far away from your application. You wanna get those close. And then of course consumption based pricing is the biggest thing. So how has MongoDB Atlas serverless, how has that addressed a lot of those concerns? Kevin:
Yeah, so you touched on multi, I'll talk about multicloud first. That yeah from the early days of Atlas, way before serverless, we focused on making sure it ran the same way in all three of the major clouds, so that customers could do exactly what you described. They could say well okay I've built my stack on AWS in this specific location cuz my current customer base is there. I want low latency all the way around. So yeah I want my managed MongoDB database to be in that same exact location geographically. And the cloud providers don't have identical coverage geographically, first off. So we wanted to make sure that kind of at a base level that we could address customers' needs in terms of where they're building the rest of their apps. And some customers are deep into AWS, some are deep into Google, some are deep into Azure. So we didn't wanna force them to learn how to do stuff in a different cloud provider than the one they know today either. That was kind of a basic thing.
But then we built multi-cloud. And by multi-cloud I mean true multicloud. And in Atlas for, I think, the last two or two and a half years or so you've been able to build a single Atlas database cluster that spans regions within a cloud provider and will also span cloud providers within a single database. So remember that replica set capability that we built in from the start with the primary and multiple secondaries? Well that primary and those secondaries, they can be spread across different cloud providers. They don't have to all be in the same cloud provider. And so we routinely demo this where we say 'here's your Atlas database running in AWS in US-east-1. Oh and wait you wanna migrate to US-west-2. And in AWS. Well you just create another replica set member in US-west-2.' We hydrate it automagically and then you can just fail over to it. And now you're in another region.
Or you can do the same by going from AWS to Azure or GCP. You don't have to do a bunch of migration of data yourself. You just create another replica set member and we do it for you. And it's really just push a button on the console or make an API call to do that. You don't have to migrate, you can just run that way with different replica set members in different cloud providers. So that's a big win for a lot of customers, where they may not run that way but they may want to be able to move stuff around over time. Some of our bigger customers are in M&A scenarios where 'yeah we're all on AWS, but then we bought a company that runs on Azure.' And now what do we do? And so it just makes it easier to rationalize, to do whatever they feel like doing. They can just stay in two clouds. They don't have to bring it all to one. But some customers, they wanna know it's gonna be easier to bring things together. Rebecca:
Well, let's talk about that data API a bit. What is it? How does it differ from the standard MongoDB connection model? And when or why should Jeremy use it with his serverless applications? Kevin:
Yeah, so before the data API we built up, pretty much from the early days, a bunch of drivers. Support for a bunch of different drivers from a bunch of different languages to connect to your MongoDB database. And of course customers use these drivers however they use MongoDB, whether it's community edition that they just download and run on their own, or the enterprise advanced version with extra support and tools, or Atlas in the cloud. So we have support that we build, that we manage internally ourselves for a bunch of drivers. There's also some open source drivers. I think there's about 20 or so of them. And the drivers have lots of functionality to manage connections and do things at scale that are pretty useful. But they're drivers. And if I'm writing Lambda code and I wanna connect to a MongoDB database and I have to put a bunch of driver code into my Lambda function, that can be a little heavyweight, that can add a little bit of friction. Especially if I'm just getting started. I'm just trying to build a toy application that might turn into a business someday. And so what we launched, and actually went GA with this at World last week, is the data API which lets you access an Atlas database, whether it's dedicated or serverless, through an HTTP call with the right parameters and access keys and all that. But you don't have to stand up a driver and create a connection every time you want to talk to a MongoDB database from your function as a service code. So the data API just lets you, real quickly, just start poking at a MongoDB database- works with serverless. So you could just, as I described earlier, create that Atlas serverless instance and use a data API to just touch it whenever you need to without having to think too much about it. Just so that you can keep focusing on building your application and not even worry about how to connect to it. Now what may happen is you might scale that application over time. 
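[Editor's sketch: the "HTTP call with the right parameters and access keys" Kevin describes might look roughly like the Python below. It only builds the request and does not send it. The endpoint shape, the `api-key` header, and the `dataSource`/`database`/`collection`/`filter` body fields are written from memory of the Data API documentation and should be checked against current Atlas docs; the app ID, key, cluster, database, and collection names are hypothetical placeholders.]

```python
import json
import urllib.request

# Hypothetical placeholders: substitute the values from your own Atlas project.
APP_ID = "data-abcde"
API_KEY = "REPLACE_ME"
# Assumed endpoint shape; verify against the current Atlas documentation.
BASE_URL = f"https://data.mongodb-api.com/app/{APP_ID}/endpoint/data/v1"


def build_find_one_request(database, collection, filter_doc, data_source="Cluster0"):
    """Build (but do not send) an HTTP POST that asks for one matching document."""
    payload = {
        "dataSource": data_source,  # name of the Atlas cluster or serverless instance
        "database": database,
        "collection": collection,
        "filter": filter_doc,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/action/findOne",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
        method="POST",
    )


request = build_find_one_request("shop", "orders", {"status": "open"})
# To actually send it from a function: urllib.request.urlopen(request)
```

[The point of the sketch is that no driver or connection pool appears anywhere: the function body is a single stateless HTTP request.]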
You might end up needing some of the functionality that's in the drivers. And at that point it might make sense for you to switch over to using drivers. But you know, you don't have to. The data API is there just to make it very low friction. Again back to the concept of friction. It's a very low friction way of using your Atlas database from your serverless code or even serverful code, if you want to. The data API doesn't really care. It's just an endpoint. You hit it from whatever code you have. Yeah so it's just again, let's reduce friction. Let's make it still easier for developers to get at Atlas however they want. Jeremy:
Yeah and I love the connection model for MongoDB anyways. It's not much of a problem to use the connection, the drivers. Because you can be outside of a VPC. You don't have to worry about, like you do when you're connecting to RDS or something like that, you have to be in a VPC which means then if you want to access the internet from your Lambda functions, you need to have a managed NAT gateway. Which we all know, I hope we know, how expensive that can be. So I do like that. But that flexibility of the connection model, I mean that HTTP connection was always sort of the holy grail for serverless developers. Like 'Don't make me connect to a VPC.' But even with the regular connection model it works really well. But we're running outta time. And I really wanna get this question in because this is something that was a little bit, I don't wanna say a little bit, a lot fascinating to me with the way that MongoDB Atlas serverless did their pricing. A common complaint, often, from me, is that most serverless services, from a cost standpoint, scale linearly right? So the more you use, this cost just keeps going up and to the right. And over time that can get really expensive as compared to provisioned resources. Now there is some sort of threshold- the two lines cross right? Where if you don't have a lot of activity, whatever, you eventually get to the point where it is cheaper to go provisioned or whatever. But I guess that there's great flexibility in having on-demand pricing. Whether that's for an early app, whether that's for apps that are low volume and of course my favorite scale to zero use case is the idea of isolated instances for every single developer, for PRs, for feature branches, things like that. And you don't want to have provisioned resources running in a hundred different accounts that are just charging you when you're not actually using them. 
But what stuck out to me about Atlas serverless was this idea of the transaction volume discount. So can you explain how that works? Cuz I think this could be a new pricing model for the way serverless services work. Kevin:
Yeah no we saw exactly what you're describing from other serverless offerings that yeah, the more you use, the more you pay, linearly. And customers can run into trouble with that model if they don't really understand their workload fully, down at that level. And we, even in the preview, we launched preview, like I said, last July, we had discount tiers in the pricing model then. But of course part of the reason for doing a preview is you learn from customers and their workloads et cetera. And our learnings there showed us that we could significantly reduce the prices and still make things work for everybody. For us financially and for customers, obviously financially for them. But also just still keep a high functioning, reliable, high performing fleet. And so when we went GA we dropped our prices by about two thirds, actually. And expanded the discounting, the volume discounting. And you can see it, just on our pricing page, the details of that, of how that works. But effectively at scale it's about a 90% discount if you get up to the higher end of the discounting. Now this kind of goes hand in hand with something else you mentioned which is 'Yeah I'm playing around and I'm just testing things out. And it's great that I have this pay as you go model. But I might get to a steadier state workload later, when it's a more mature application. And provisioned might make more sense.' And we're working on stuff to make it a lot easier for you to move workloads back and forth between serverless and provisioned so that, hey, if you need to move to provisioned, great. Just push a button, we'll migrate it for you with essentially no downtime and vice versa. So those capabilities are coming soon for the Atlas serverless or the Atlas platform. What I think is gonna come later, and I don't think I talked about this even in our talk last week at MongoDB World, is the concept of own the base and rent the peaks. Which I'm stealing from Corey Quinn, if you know Corey Quinn. 
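[Editor's sketch: the volume discounting Kevin describes can be thought of as tiered pricing, where each successive band of usage is billed at a lower unit price. The Python below is only an illustration of that idea. The tier boundaries and prices are hypothetical numbers chosen so that the later tiers are 50% and 90% cheaper than the first; they are not MongoDB's actual price list, which is on the Atlas pricing page.]

```python
# Hypothetical tiers: (upper bound in millions of units per day, price per million).
# An upper bound of None means "everything above the previous tier".
HYPOTHETICAL_TIERS = [
    (50, 0.10),    # first 50M units at the full price
    (550, 0.05),   # next 500M units at a 50% discount
    (None, 0.01),  # everything beyond that at a 90% discount
]


def tiered_cost(units, tiers):
    """Bill each band of usage at its own unit price and sum the bands."""
    cost = 0.0
    lower = 0.0
    for upper, unit_price in tiers:
        if units <= lower:
            break  # usage never reached this tier
        top = units if upper is None else min(units, upper)
        cost += (top - lower) * unit_price
        lower = top
    return cost


def linear_cost(units, unit_price):
    """The 'more you use, more you pay, linearly' model, for comparison."""
    return units * unit_price


daily_units = 1000  # millions of units in one day, also hypothetical
discounted = tiered_cost(daily_units, HYPOTHETICAL_TIERS)
undiscounted = linear_cost(daily_units, 0.10)
print(f"tiered: ${discounted:.2f}  linear: ${undiscounted:.2f}")
```

[Under these made-up numbers the tiered bill grows much more slowly than the linear one once usage passes the first band, which is the effect being discussed.]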
Rebecca:
Indeed, we do. Kevin:
That's a great Quinny-ism.
Yeah I'm not sure if he coined the phrase, but I heard it from him. But the idea is 'Hey if I know my workload's gonna be at a certain level most of the time, and it's a non zero level, but I want the serverless capabilities, handling sudden spikes and surges. But I don't wanna pay for that peak.' Well maybe I could pay for the base level. Prepay, pre-provision the base level, for a provisioned price. And then have serverless behavior and serverless pricing above that. Jeremy:
That's the dream. Kevin:
So yeah cuz not every workload is at zero and then spikes. A lot of them are at some level and then spike. And serverless pricing, for that base level, is still going to probably be quote unquote too expensive or more expensive. And so I think eventually we're gonna get to a place where we do have a kind of 'own the base, rent the peaks' kind of model. It's not in the pricing today but I think we will get there eventually. Jeremy:
Yeah I mean that'll be amazing cuz I mean if you just look, and you also say at scale. And that number is not as high as I think some people assume, and people are like 'well what does scale mean?' I did the math and it's something like, if you spend $5 a day on read capacity for MongoDB, you then start to get a 50% discount on anything over $5. And then it's only like another $25 that you have to spend per day. Again that's a lot of money. I mean, not a lot of money, but compared to other things. But then you're up to a 90% discount if you're spending $35 a day or $30 a day or something like that. Whereas something like even HTTP APIs at AWS, for their API Gateway, they do have discounts. But you have to spend $10 a day just to get a 10% discount. And then it just doesn't go down after that. So I think this is really interesting. And I do love this idea of 'own the base and then pay for the overage, pay for the peaks.' I do love that methodology there and that thinking because I really think that is something that would hold people back, to say 'Well why would I choose a serverless service if I know eventually at this point it's gonna cross that threshold and it's gonna get super expensive for me?' Kevin:
Yeah I just think we're gonna see that over time as customers bring bigger, more complex workloads into serverless. Where we're gonna see that pattern. They're gonna ask for that kind of behavior. Jeremy:
In the pricing. Rebecca:
So MongoDB World was last week. A lot of cool things happened, as you said right? Atlas serverless went GA. And then you also launched a bunch of other exciting things. So two questions: this is one of those sub-question bullets right? Where it's like that math problem that has 24 math problems inside of it. Jeremy:
I have three questions each with 24 parts. Rebecca:
Yeah exactly. It's a 72-question question. So hope everyone has at least three hours. What are the future plans for Atlas serverless? And then what are some of the other exciting launches that people should know about if they missed World or if there's so much going on there, where it's like, 'Hey you should know that this is also happening.' Kevin:
Yeah. So first for serverless. I think I already described, we will be adding more capabilities to migrate in and out of serverless. Back to dedicated et cetera. We did also announce some capabilities like cluster to cluster replication, which will help customers move data from say self-managed Mongo into Atlas or vice versa. And that Atlas could be serverless. So there's some replication tooling that we announced. Yeah I'm not gonna remember everything we announced but I think one of the more important things we announced which got a lot of attention is what we call queryable encryption. And so, lots of databases have the ability to do what's called field level encryption. Where you encrypt a field in the application. And so it's encrypted across the wire. It was already gonna be encrypted with TLS. But it's encrypted all the way through, even in the database's memory on the database side. So if an attacker somehow can dump the memory cache of that database server, they're gonna get encrypted data. They still can't see it in clear text. Problem is it's hard to query that stuff in the database. You kind of have to do a bunch of work after you get it back to the app to decrypt it in the app and do work there.
So we've got some technology, as part of an acquisition we did in the last couple years, that we've incorporated in MongoDB that basically lets you run queries in the server on encrypted data while it stays encrypted. And so the attacker can dump memory all they like, they'll never see unencrypted data. But the queries are still gonna run well, good performance, at scale, that kind of thing. And so we think that's a big differentiator for Atlas as compared to our competitors. And that did get a lot of press when you dig into it, for that exact reason. That it looks like an interesting enhancement that we put in that makes it that much harder for attackers to get at your data, even if they can get to the server. Even if they get root on the server, that just doesn't matter. They still can't see any of it in clear text. So that's gotten a lot of attention.
We added some capabilities to Atlas Search. So Atlas Search basically, you push a button on your Atlas instance and we will build a Lucene search index, right next to your instance. And then you can do all kinds of fancy Lucene-based searching. We've added support for what we call facets in Atlas Search, which let you do fancier searching and sub searching and all kinds of stuff that I don't fully understand, cuz I'm not a search genius. But yeah so, Atlas Search is cool. Like I said, you just push the button or make the API call and we build and manage the search index for you on whichever parts of your data you want us to do that for.
I will rant about this: search is your only hope in life right? This is why Google is who they are. Cuz they figured that out early. But you know, Yahoo didn't right? Yahoo thought that they could build a directory that would categorize all the websites out there. And that worked for a while, until people started creating too many websites. And that was, you know, your only hope is search. Your only hope is search for any pile of data. And so search is just hugely important as people put more data, more and more data, into their databases. Jeremy:
Right. And then there's the stuff with the developer platform. What's the developer platform? Rebecca:
Developer database platform. Jeremy:
Database developer database platform, yeah. Kevin:
Yeah, we call it the developer data platform. And some of it is simply 'Hey let's just give people a conceptual way to understand and describe all these different capabilities.' Like for example, in MongoDB itself, you can manage lots of different types of data. Of course the base document or JSON data, but also time series and graph data and key value data. You can manage data in a relational way if you want, though it's kind of an anti pattern. Cuz we have full support for ACID transactions and you can join data between collections, though you probably shouldn't, cuz that's not how you should use MongoDB. But you can. So there's all these different ways of managing data, some of which we only just added in the last year or two. Then there's the data API which gives you a different way to access the data, queryable encryption, we've talked about the Migrator. And then there's a bunch of capabilities we added on the analytics side. Especially with data federation, or query federation, if you wanna call it that, where basically you can run a query in your Atlas instance that can access data in that instance, of course. But also access data from other data sources, all from within that one query without having to think too hard about it. Once you set up the connections to those data sources, including just files and S3 buckets. And then we added support for SQL. So Atlas SQL lets you use SQL to access your instance. And we have connectors to Tableau and I think JDBC right now. So that, when it makes sense, you can use SQL to access the data in your instance. You may have Tableau itself, but you may also have people in your company who understand SQL and don't understand MQL, the MongoDB Query Language. So, we're just trying to broaden the ways you can access and use the Atlas platform for managing data. So we try to encapsulate all that in what we call the developer data platform. Jeremy:
Nice. It sounds like your dog is very excited about it. Which is great. Kevin:
Yeah. Yeah, sorry about that. Jeremy:
So we also, in the talk that you and I gave, you went into some stuff about auto tuning and some stuff like that. So I think our talk is gonna be published. It'll be available. People can see that. So I'll leave that as a homework assignment. But Kevin, thank you so much for being here. This is awesome. I'm really excited about what Mongo is doing. There's a couple of other serverless database companies that are trying things like this as well. I just love this idea of pushing the market and really pushing this. But yeah, thank you so much. So if people want to find out more about MongoDB Atlas or Atlas serverless or find out more about you, what are the best ways for them to do that? Kevin:
Yeah I mean obviously our website has lots of information on it. But honestly go to the console. And poke around, create an instance. You'll see how easy it is. Yeah, you have to create an account. But you know, that's kind of standard these days. You have to tell us who you are. But then see how easy it is to actually create and use a serverless instance. And there's lots of examples and tutorials and MongoDB University has a lot of useful stuff. Most of that's free. All the courseware and how to use MongoDB, how to use Atlas. So yeah there's a lot of resources on our website. And just, I mean the best way to learn is to do it. Just go poke at it and see what happens. Rebecca:
Cool. Well we will put all this in the show notes especially, or in addition to, how people can find you both on Twitter and LinkedIn: kJerniga. If anyone needs to find any of that, all the notes will be where they always go for the shows. And Kevin, thank you so much again. It was so nice to have you. Kevin:
Thank you Rebecca. Thank you Jeremy. This has been great.