Episode #80: Revolutionary Serverless at re:Invent with Ajay Nair

December 21, 2020 • 78 minutes

On this episode, Jeremy chats with Ajay Nair about recent serverless launches at AWS and the use cases they target, what makes serverless such a revolutionary way to build modern applications, and what AWS is doing to ensure a serverless future for everyone.


About Ajay Nair

Ajay Nair is the Director of Product Management at AWS. Ajay is one of the founding members of the AWS Lambda team and, in his current role, drives the serverless product strategy and leads a talented team driving the product roadmap, feature delivery, and business results. Throughout his career, Ajay has focused on building and helping developers build large scale distributed systems, with deep expertise in cloud native application platforms, big data systems, and streamlining development experiences. He is also a co-author of Serverless Architectures on AWS, which teaches you how to design, secure, and manage serverless backend APIs for web and mobile applications on the AWS platform.


Watch this episode on YouTube: https://youtu.be/QMOLE2-SUjU
 

Transcript

 

 

Jeremy: Hi everyone. I'm Jeremy Daly and this is Serverless Chats. Today I'm speaking with Ajay Nair. Hey, Ajay. Thanks for joining me.



Ajay: Hey, Jeremy! Finally on the show, yay!

 


Jeremy: Well, I am glad you're here. So, you are the Director of Product Management for AWS Lambda at Amazon Web Services. So I'd love it if you could tell the listeners a little bit about your background, kind of how you ended up at AWS and then what does the Director of Product for AWS Lambda do?

 


Ajay: Okay, first, thanks for having me on the show. I've been a great follower of both your talks and blogs for a long time, so I'm excited to kind of finish what's been an interesting year by spending time with you. I've been at AWS now coming up on about seven years; been with the Lambda team for pretty much the whole time. Tim Wagner and I were the founding folks for AWS Lambda way back when. I spent a whole bunch of time at Microsoft and some other software companies before that in a combination of development and program/product management roles. I ended up at AWS just looking for an opportunity to go and build a new product or a new service in the cloud space. I'd done a whole bunch of things with developers and big data platforms before that, and they signed me on to this top-secret effort which they said was going to be a new way of doing compute. Here we are seven years later with me as the Director of Product Management.

 


So my role as Director of Product really is to help figure out the why and the what of what we should be building and evolving Lambda for, so everything that's happened to Lambda over the last seven years is in some way my fault. In all seriousness, I get to spend time with customers to figure out what the right thing to go and build for them is, help the team figure it out and build it, and then help the marketing and sales teams sell it. That's kind of what my day job is, and it's been a great ride for the last seven years, and here I am.

 


Jeremy: That's awesome. Well, I am super excited to have you here. You know, you said it was an interesting year, and that's probably an understatement, but not only an interesting year in terms of everything that's been happening, but also an interesting year for serverless as well. And we just finished, I think it was ... what? ... like week 27 of re:Invent. Oh, no, it's just week 3! Everything this year just felt like it dragged on incredibly long. But there were a lot of really cool things that happened with serverless this year, and your purview is more around Lambda, obviously, you're the Director of Product there, but there are so many services and things that happen at AWS that interact with it.

 


And I think what would be really great to do, and I want to be respectful of your time and of our listeners' time, because I'm sure you and I could talk for the next ten hours about this stuff and then have to take a break and talk for another ten hours, so we'll timebox this a little bit. But I do want to start with just kind of a year-in-review of the things that have happened with serverless and with Lambda. What are some of the new capabilities, and what use cases do those open up? So let's start with re:Invent. Let's start with the big ones that happened at re:Invent. We can sort of work our way backwards and then hopefully you can kind of put all this stuff together. So let's start there. Let's start with the big one. At least I think this is a huge one because it opens up a lot of capabilities for other people to get involved, and that has to do with container packaging support. So what's the deal with that?

 


Ajay: Yes, the idea behind this, as you said, is allowing you to bring Lambda function packages as container images and run them on Lambda. You know, this is an evolution of a theme we have seen for a while, where there's a set of people who say, I like to build my code a certain way, but I want to run it the Lambda way, and zip just isn't my style. And actually, more specifically, I think the interesting aspect is that Lambda has enforced this sort of dynamic packaging structure, right, where the runtime and layers are bound at execution time rather than statically, and I think something that has happened since the beginning of Lambda is this evolution toward more consistency across local and cloud development, and trying to push that forward.

 


And we just saw a great opportunity to say, you know, the container ecosystem's done a really nice job on the tooling and developer experience front of this, and driving consistency across the two brings the best of both worlds over there. Then we tried to do some interesting bits over there too, like the runtime interface client, which allows you to work with Lambda's event-driven execution model while using the container development model. The runtime interface emulator lets you get much more consistency in your sort of local testing than we have had in the past.

 


I mean, you have great Community Heroes like Michael Hart essentially powering large pieces of that, you know, and we have taken some of the burden off his back too by standardizing some of those components and taking them over. But, to your point, it just opens up a whole set of new use cases, right? Like if you've previously committed to the container ecosystem and its tooling as a building block for you, you now have access to all the goodness that Lambda brings as well.

 


Jeremy: All right, and that's one of the things that I thought was sort of interesting when I first heard about containers on Lambda. I was like, oh no, what's happening here? And I love containers. I know I sort of joke about it. Not a fan of Kubernetes, but that's for different reasons. But the idea of containers running on Lambda, I was thinking that seems like ... you know, now we're really confusing things. But it's not really a container running on it, it's really the packaging format, right?

 


Ajay: Yeah. Yeah, it is. It's just a packaging format. You know, on the team I joke that if people thought serverless was a confusing term, then wait till they hear about containers! You start realizing that "containers" is used as a packaging format, as an execution model, as slang for Kubernetes, as a stand-in for an architectural pattern like microservices, and when you go and tell people, "Hey, Lambda now supports containers," they're like, wait, all of the above now works on Lambda? And so you kind of have to do a little bit of separation and say, no, it's the packaging format; the execution model stays the same. It's still that same event-driven invocation behavior that you get. You still get all the security and isolation that you're used to, you can just package code in a much more familiar way and get access to a broader ecosystem of tools.
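To make the packaging-format distinction concrete, here is a minimal sketch of a container-packaged Node.js function using one of the AWS-provided base images, which bundle the runtime interface client Ajay mentions; the file names are illustrative:

```dockerfile
# AWS-provided base image: includes the Node.js runtime and the
# runtime interface client that speaks Lambda's invoke API.
FROM public.ecr.aws/lambda/nodejs:12

# Copy function code and dependencies into the task root.
COPY app.js package*.json ./
RUN npm install --production

# Handler to invoke, in <file>.<exported function> form.
CMD ["app.handler"]
```

From there the image is pushed to ECR and the function is created with the Image package type; locally, the runtime interface emulator lets you invoke the same image for testing.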

 


Jeremy: Right, and especially with the idea of the images being able to be 10 gigabytes, I mean, now you have this ability to put all of those libraries and their packages all together and not necessarily have to worry about the Lambda layers and connecting all those things.

 


Ajay: Yeah. Exactly. The size is one … that's one we actually debated quite a bit. Like I'm a big fan of less is more. I'm not a big fan of writing fat Lambdas and all the other variants that are out there, but I think the 10 gig one is really interesting especially for some of the emerging use cases like, you know, machine learning, and larger images, and dependencies coming in. So yeah, I'm just excited to see what people do with this larger limit and kind of play around with the two.

 


Jeremy: Now, I know Andy Jassy had announced EKS Anywhere and ECS Anywhere, but really what you kind of get with Lambda now, too, with this packaging format is you kind of get Lambda anywhere, right. You can sort of run this Lambda execution model in different places. Not that you would want to, but if for some reason you did that's certainly an opportunity.

 


Ajay: Yeah, got it. I would say you can run Lambda functions anywhere; that's a key distinction. I think, you know, one thing I just want to make sure is clear ... this is not about recreating Lambda the entire service, which is what ECS and EKS Anywhere really let you do. They let you replicate the service in your own environment. But from a perspective of taking the same code and running it in multiple places, kind of what that container ethos really embraces, that's very much possible with the runtime available to you. There's still a lot of work for you to do, you know, Lambda does a bunch of work for you underneath the covers, but at least it's possible right now, much, much more than it was before.

 


Jeremy: Awesome. All right. So then the other big one for me had to do with reducing it down to one millisecond billing. And I’ve talked to a lot of people about this. They said, “Well, you know, my Lambda bill’s only $100 a month anyways, right? So it's really not that big of a deal.” We'll get into a couple of reasons why it is a big deal connecting some other services, but one millisecond billing. What's the thought behind that?

 


Ajay: You know, Lambda has to get faster, better, cheaper. That's been my driving philosophy from the beginning. It's funny, 1 millisecond billing was one of the things that Tim and I discussed immediately after Lambda GA, and we were like, well, we're not quite there yet. We just got the service started. Let's figure out what the response is to the product before we go and push there. But realistically, what we saw was, you know, this breadth of use cases on Lambda where a whole bunch of people are, like, look, it's a couple of seconds for larger workloads, big data, Kinesis, etc., for these data-intensive processes, but a lot more interactive workloads, especially where there's a bigger push for performance, you know, lightweight runtimes like Go and others, are running in that sub-hundred-millisecond bucket.

 


And we're, like, look, there's a way for us to save them, you know, 40, 50, 70 percent of their bill by just changing the billing granularity, at no effort to themselves. Like, that's a huge rallying point, and if you can go and tell a customer, hey, you making the performance better is actually going to make that difference, right? You're literally saving money by making the experience better for your customers. That became a really compelling value proposition for us. We just felt like there was an entire class of use cases we could make 70 percent cheaper. We found a way to make the money work and we're, like, let's ship it.

 


Jeremy: Right. No, that's awesome. And I actually saw a tweet, somebody showed a graph of what their Lambda bill was before and after 1 millisecond billing, and it was, like, a 40 or 50 percent reduction. I mean, it was huge. And I think there are some of those use cases where, if you run enough of them at enough frequency, you know, that really matters. And optimizing to get under 100 milliseconds, you know, and not quite making it, running 101 milliseconds and getting billed for 200 milliseconds. I mean, there's a huge cost savings there.
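As a rough worked example of that math, assuming the x86 price of about $0.0000166667 per GB-second and ignoring per-request charges:

```typescript
// Duration cost for one million invocations of a 128 MB function
// that actually runs for 30 ms.
const PRICE_PER_GB_SECOND = 0.0000166667; // price at the time of recording
const MEMORY_GB = 128 / 1024;             // 0.125 GB configured

const cost = (billedMs: number, invocations: number) =>
  (billedMs / 1000) * MEMORY_GB * PRICE_PER_GB_SECOND * invocations;

console.log(cost(100, 1_000_000)); // old 100 ms rounding: ~$0.21
console.log(cost(30, 1_000_000));  // 1 ms granularity:    ~$0.06, a 70% drop
```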

 


So another big thing, just in terms of optimizing speed, and this is something great that Alex Casalboni has done with the power tuner, right, is that there's now 10 gigabytes of memory with up to 6 virtual cores. So now you have an opportunity to really tune that even more and get those things just cranking, and of course the use cases that opens up.

 


Ajay: Yeah, exactly. I think one of the internal demos we had done was running a 30 millisecond ML inference which used about 4 and a half cores, which spun up to about 15,000 concurrent at peak, and the bill was, you know, sub 50 dollars at that particular point because it ran for such a short duration of time. And I think for us these two features are sort of enabling different points in the spectrum, right. Like 10 gigs and 6 cores enables far more data-intensive use cases, compute-intensive use cases, running sort of these bigger, beefier workloads without feeling capped or restricted by the amount of compute available to you.

 


And 1 millisecond billing was saying, well, you can still get that cost efficiency at the micro end of the usage spectrum, even when you're running these really, really large-scale workloads. And to your point, Jeremy, I think the combination is really powerful, because you can now performance-tune to a point where you get to sub hundred milliseconds and you are incentivized to do so, right. You're incentivized to keep pushing the number even lower with the new 1 millisecond billing.
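For reference, raising a function to the new ceiling is a one-line configuration change, with CPU allocation scaling alongside memory. A minimal sketch using the AWS SDK for JavaScript v3 (the function name is hypothetical):

```typescript
import {
  LambdaClient,
  UpdateFunctionConfigurationCommand,
} from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({ region: "us-east-1" });

// 10,240 MB is the new maximum; CPU scales with memory up to 6 vCPUs.
await lambda.send(
  new UpdateFunctionConfigurationCommand({
    FunctionName: "ml-inference-fn", // hypothetical function name
    MemorySize: 10240,
  })
);
```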

 


Jeremy: Yeah, that's amazing. All right. So then a couple of other things that launched at re:Invent had to do with event sources and controlling event sources, and this is a really long conversation, so we're going to have to try to keep it somewhat short, but the big news, I think, was Apache Kafka and Amazon MQ opening up as additional event sources for Lambda, which is important, I think, because you do have a lot of enterprise workloads or existing workloads that run on those types of services. So it's a nice little gateway into serverless to start introducing some more of these serverless things. You can comment on that if you want to; it's less interesting to me because I moved away from those, since I use all serverless things now.

 


But what I thought was really interesting were two things that launched. One was custom checkpoints for Kinesis batches and for DynamoDB streams, right? So, if people are unfamiliar with this, and you should probably be the one explaining this, you've had the ability to bisect batches in the past, where you keep splitting the batch so you can get rid of that poison pill, but you would reprocess the same event or the same message over and over and over again until it was finally in a batch that fully succeeded. So that's changed now with custom checkpoints. Can you explain that a little bit better than I just did?

 


Ajay: I'm going to try; that was a pretty good one. But the philosophy behind it is that this is a more advanced failure management scenario when you're processing complex batches, right. So, Kinesis and Dynamo are the oldest event sources for Lambda at this point, right? So you're going to see more of the enhancements coming out on that front, and what this now enables you to do is get far more granular control when you're processing these larger batches, right?

 


So one core use case we've seen for Kinesis and Dynamo is these sort of analytical and aggregation patterns, API signals or, like, machine operational data coming in, and we kept seeing this pattern of people saying, well, it's not the whole collection of records that showed up over the window of time that's bad. There's this one record that's malformed. And this just enables them to say, up to here I succeeded, this is where I failed; let's checkpoint right there and move on. That was one thing people really liked about what they could do with, sort of, KCL if they were self-hosting, but that sounded like an artificial choice. You know, like, I have to either use KCL and self-host, or give that control up. So we went over there.
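Mechanically, the feature surfaces as a new response contract: the event source mapping opts in with FunctionResponseTypes set to ReportBatchItemFailures, and the handler reports the first failing record so Lambda checkpoints everything before it. A rough sketch, with the record-processing logic as a stand-in:

```typescript
import type { KinesisStreamEvent, KinesisStreamRecord } from "aws-lambda";

// Stand-in for real processing; throws on a malformed record.
const processRecord = (record: KinesisStreamRecord) => {
  JSON.parse(Buffer.from(record.kinesis.data, "base64").toString("utf8"));
};

export const handler = async (event: KinesisStreamEvent) => {
  for (const record of event.Records) {
    try {
      processRecord(record);
    } catch {
      // Report the poison pill; records before it are checkpointed,
      // and the retry resumes from this sequence number.
      return {
        batchItemFailures: [
          { itemIdentifier: record.kinesis.sequenceNumber },
        ],
      };
    }
  }
  return { batchItemFailures: [] }; // whole batch succeeded
};
```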

 


Jeremy: Yeah, no. And again, that just opens up a huge level of efficiency, and the other thing is, maybe people don't think about it this way, and I don't know why I'm always thinking about the billing aspect of it, but that reduces billing too, because that's fewer invocations of your Lambda functions in order to process those events, right? Because if you're processing the same batch over and over and over again, you're reprocessing the same thing multiple times. It's just another way to kind of bring that down.

 


The other one, though, is the idea of tumbling windows in Lambda, and this is super exciting for me because one of the PMs reached out to me many, many, many months ago and I gave some early feedback on the design of this. And I'm sure I was just one of hundreds of people who probably commented on this, but I think it's a huge testament to what AWS does in really getting these things in front of their customers early and trying to figure out what those use cases might be, so you're not just, you know, sort of building things in a void. But tell us a little bit about tumbling windows.

 


Ajay: Yeah. No, it's funny, this is actually a really good example of that, because when we originally started out this feature, we just thought of it as stateful processing for Kinesis, and over time we were like, look, it doesn't make sense for us to just go and say, "Here's state; good luck." You have to kind of make it work for a specific aspect, something that works in this particular scenario. And, again, tying back to that analytical use case, right? We kept seeing this repeated pattern of people running code in their Lambda functions, writing a little bit of state into DynamoDB, reading one record back from DynamoDB, and then processing and moving forward, over and over again. And we said, look, we can just simplify this entire stack if we enable Lambda to have a little bit of smarts in how it passes data back into the reprocessing of the Kinesis or DynamoDB stream, and that's kind of where we took this tumbling window operation. That's a pretty standard operator in most stream analytics engines out there, and we said, we now support that operator.

 


The actual primitive underneath is what's more fascinating for me; tumbling windows is just one specific pattern that this enables, right? Like you could look forward and say, hey, can you do other kinds of aggregations on this? Like, you know, this is a primitive reduce; can you do something even smarter over there? Can you now combine it with something like Step Functions and start building even more smarts into how all these orchestrations come together? And I think that's what is really cool about this in terms of how it's actually built out. It's been less than a week since it came out, but the internal data shows it's quite popular already.
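In practice this shows up as extra fields on the stream event: Lambda passes the running state in, and the handler returns it until the window closes. A minimal sketch of a per-window record counter, assuming a tumbling window is configured on the event source mapping:

```typescript
import type { KinesisStreamEvent } from "aws-lambda";

// Shape of the windowed event; the state object (up to ~1 MB) is
// carried by Lambda between invocations within the same window.
type WindowedEvent = KinesisStreamEvent & {
  state?: { count?: number };
  isFinalInvokeForWindow?: boolean;
};

export const handler = async (event: WindowedEvent) => {
  const count = (event.state?.count ?? 0) + event.Records.length;

  if (event.isFinalInvokeForWindow) {
    console.log(`window total: ${count}`); // emit the final aggregate
    return {};                             // window closed, nothing to carry
  }
  return { state: { count } }; // passed into the next invocation
};
```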

 


Jeremy: I can imagine. Yeah. No, it is. It's a huge solution, because every other solution you'd build around that was something janky, right? It was, like, loops in step functions or, like you said, writing data into DynamoDB and then reading it back just to pass in the aggregation, or whatever it is. And the aggregation, the state itself, I think can hold up to a megabyte of data, so it can hold a pretty good chunk of data that gets passed from invocation to invocation. So, just super interesting there.

 


You mentioned step functions. And again, I know step functions are a little bit outside of your purview, but super important as an interface into AWS Lambda because they enable you to do orchestration and this is going to go back to why I think the 1 millisecond billing is so important because what we used to see with Lambda functions is ... I'm sorry, with step functions ... is you would use that oftentimes to do function composition, right? You might have a function that does some sort of conversion of your data, another function that maybe writes that data somewhere, another function that then maybe, you know, processes it some other way or generates an event, and then maybe something that returns it, you know, and back to another system.

 


And that was always one of those things where it's like you're paying for every transition, you're paying a minimum of 100 milliseconds for every single function that runs and, by the way, it's asynchronous, so really it's got to be a background job that runs anyways. Synchronous Express Workflows, I think, are one of the coolest things I've seen come out of AWS in a very, very long time. Maybe even cooler than Lambda, because what this allows you to do ... this is the answer to function composition, at least in my mind.

 


So no more fat Lambdas, no more, you know, Lambda lifts, no more, you know, having to push a bunch of things behind the scenes. This is now a way that you can say, I have a Lambda function that does this very specific thing, generically, by the way, right? It doesn't have to have a whole bunch of specific things that it does, or be tied to specific resources. Then I have another Lambda function that does this, another Lambda function that does this, I want to put those all together, and I want that to happen in a synchronous flow. That's possible now.

 


Ajay: Yeah. No, you know, it has been six years since Lambda came out, so it's about time something cooler than Lambda launched, for sure. This was actually one of the really exciting launches for me too. As you can imagine, in my role I end up working quite closely with a lot of the broader serverless portfolio at AWS anyway, and, you know, Step Functions and Lambda are sort of PB&J for us at this particular point. So this particular pattern ... it's funny you bring this up, one of the things we were really excited about was exactly this granularity of resourcing and duration that it will enable because of the synchronous patterns, right? So we had customers who were doing metadata retrieval, analytics and processing using an ML model, and then a long I/O wait time to write the output into something, all in one Lambda function, but then they had to run it as, you know, I think a 1.8 or 2 gig function, because they were like, hey, we have to run this major processing on it.

 


Now that whole thing splits up, where you have, like, the cheapest Lambda, with, like, actually now you can do 128 megs, a small function just doing simple I/O, finishing in a couple hundred milliseconds, your ML model beefing up to 10 gigs and doing the whole thing, and then a simple I/O write there at the end, you know, running super cheap. Your overall costs are reduced by probably at least 20 to 30 percent, but your performance behavior looks kind of consistent, right? I mean, this is one of the things we were really excited about even when we put Express out there; it enables you to run a whole bunch of things at scale async, which was the first pattern that it went out with, and now with the sync use cases you have, like you said, all these new things that are opening up. So that was one of my favorite non-Lambda launches at re:Invent as well.
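For the curious, the synchronous invocation is a single API call that returns the workflow's output in the response. A sketch using the SDK v3 Step Functions client (the state machine ARN and input are placeholders):

```typescript
import { SFNClient, StartSyncExecutionCommand } from "@aws-sdk/client-sfn";

const sfn = new SFNClient({ region: "us-east-1" });

// Runs an Express workflow synchronously: the composed result of all
// the steps comes back in one round trip.
const result = await sfn.send(
  new StartSyncExecutionCommand({
    stateMachineArn:
      "arn:aws:states:us-east-1:123456789012:stateMachine:order-flow", // placeholder
    input: JSON.stringify({ orderId: "1234" }),
  })
);

console.log(result.status, result.output);
```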

 


Jeremy: But it ties so closely to Lambda and the reason … and then you mentioned this, this is the pattern, it's that 64 megabyte or the 128 megabyte of RAM in order to do something simple. And I wrote a blog post a while back that was basically, you know, about if you're paying to wait for Lambda, especially if you're paying to wait for something like an API call, then don't, you know, don't run it at a gig of memory. I mean, that doesn't make any sense, right. Run it at 128 because it's not going to be any faster or slower and if it's any slower we're talking milliseconds.

 


But this is what I love about single-purpose Lambda functions in the first place: the isolation model. Not just the idea that, you know, the code is very simple and it's running there, but you have the security isolation. You have the concurrency isolation. You have the memory isolation; you have all that stuff there. And if you think about the ability to say, I can run this particular Lambda function to maybe, you know, hit an API, and all it's going to do is bring that data back, and I can run that, like you said, at 64 megabytes, then pass that response into something that can do the actual processing on it, or whatever. So that's super huge. I love that pattern.

And another thing that was announced at re:Invent is that you can now invoke these synchronous Express Workflows directly from HTTP APIs, so you don't need a Lambda function sitting there saying, okay, I'm going to invoke this, wait for the response, and then spit it back. Now you just take that step out. You go directly into the step function and, again, anything you want to add, of course, security and authentication, that's all added at the API level. But anything you want to do within that, you get all that data, you get everything you need to do these really complex synchronous workflows, by the way, in parallel, too, if you want to run parallel executions. There are all kinds of crazy things you can do, all within this, you know, single round trip to the server. I'm gushing about this, but to me, this is just fascinating.
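Under the hood this is a first-class HTTP API integration subtype, so the gateway calls Step Functions directly. A hedged sketch of wiring it up programmatically (the API ID, role, and ARNs are placeholders):

```typescript
import {
  ApiGatewayV2Client,
  CreateIntegrationCommand,
} from "@aws-sdk/client-apigatewayv2";

const apigw = new ApiGatewayV2Client({ region: "us-east-1" });

// Direct HTTP API -> Step Functions integration: no Lambda in between.
await apigw.send(
  new CreateIntegrationCommand({
    ApiId: "a1b2c3", // placeholder HTTP API id
    IntegrationType: "AWS_PROXY",
    IntegrationSubtype: "StepFunctions-StartSyncExecution",
    PayloadFormatVersion: "1.0",
    CredentialsArn: "arn:aws:iam::123456789012:role/apigw-sfn", // placeholder
    RequestParameters: {
      StateMachineArn:
        "arn:aws:states:us-east-1:123456789012:stateMachine:order-flow",
      Input: "$request.body",
    },
  })
);
```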

 


Ajay: Yeah, Jeremy. I mean, the stacking was conscious, right? So HTTP APIs on synchronous Express Workflows instead of a Lambda function was exactly the pattern we were going for. I think this is one where I'm really excited to see how the CDK folks respond to it, because it lends itself really nicely to, sort of, this programmatic creation of more complex things using these service primitives, like the API, the workflow, and the functions, expressed as code and brought together. And on your point about the isolation model, one pattern we've started seeing already is customers splitting up their execution roles for all these individual functions. So the first function that's retrieving metadata only talks to the DynamoDB table. The second function only talks to S3. And the third one only talks to the HTTP API. Like, that granularity of isolation, and, you know, even interface isolation, so to speak, in what they talk to, is extremely powerful, not to mention the fact that the teams can work on those things separately if they so choose.

 


Yeah. So yeah, this is one that hopefully next year you and I are having an entire dedicated session just about this. With my Step Functions friends, of course.

 


Jeremy: And the reuse. I mean, that's the other thing to me that I think is really interesting, the reuse aspect of it. So, you're right. You can have one function that can only talk to DynamoDB while another function only has the secrets available to communicate with the Twilio API. I mean, there's just so much isolation and, you know, principle of least privilege there, but then the ability to reuse it. I mean, build a generic function that queries Twitter, and all you do is pass in, you know, the hashtag that you want, or something like that. There's just a lot of really cool stuff you can do with that.

 


Okay, I want to move on because we've got more to talk about. So, before re:Invent, there were a number of really cool things that came out as well. One of the big ones was EFS integration. I know this happened quite a while back, but this was something big, I think, not for a lot of my own workloads, not a lot of use cases that I have, but certainly for the naysayers on serverless ML, you know, "you can't do machine learning in serverless." This was a big one, I think, to kind of quiet them down a little bit.

 


Ajay: Yeah, I mean ... I will say one thing that is always fascinating to me is how much social media chatter happens about Lambda for web and API use cases, whereas so much of our internal usage shows up for these really brutal data-intensive use cases, right? Like, you know, one of our big public customers who talks about it all the time is Fannie Mae, running Monte Carlo simulations on Lambda, right, and, like, this whole ML inference thing is a huge segment that has kind of grown for us even more, and you saw this in the 10 gig launch as well. I think for me EFS is a combination of things: one is, like you said, just knocking off the "I can only use 512 megs" limit inside Lambda.

 


You're actually getting a solid persistent store that comes with it, that is performant and, you know, matches Lambda's behavior in terms of familiarity and billing and the rest that go with it, sort of just enabling these fast file access patterns on Lambda that meet the performance needs the customers have. Like, I think Asurion has a story out there about how they're using Lambda for ML model storage at this point, using EFS for exactly that reason. Like, they're able to serve customer-facing requests on that particular stack running really, really fast, and that combination of 10 gigs, EFS, etc. is kind of pushing you towards this new use case direction of ML inference that I suspect we'll be hearing a lot more of in the future.
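From the function's point of view, the EFS access point is just a mounted POSIX path. A small sketch of the model-loading pattern described here (the mount path and file name are hypothetical):

```typescript
import { readFileSync } from "fs";

// The file system is attached via the function's EFS configuration,
// mounted here at /mnt/models. Loading outside the handler means warm
// invocations reuse the already-loaded bytes.
const model = readFileSync("/mnt/models/classifier.onnx"); // hypothetical path

export const handler = async () => {
  // Run inference against `model` here; size check as a placeholder.
  return { modelBytes: model.byteLength };
};
```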

 


Jeremy: Yeah, no, that's ... and again, to me, it's a lot about use cases. I mean, even one of the other ones that launched sort of pre-re:Invent, I think it was maybe in October or November, was SQS batch windows. A simple, simple thing. But again, when you added SQS as a trigger to Lambda functions, I think it was in 2018, that opened up a whole bunch of really cool things, where now you didn't have to have cron jobs running to poll it. But then I think what people quickly realized was, if you've got small batches or things coming in not quickly enough, I should say, you've got this issue where you're executing a lot of Lambda functions over and over and over again, potentially needlessly. Whereas now you can put them in batches of up to 10,000 and process some big batch. Now again, no bisecting there, no iterators and things like that on SQS yet, but, I mean, I thought that was a really interesting one.
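The knobs described here live on the event source mapping. A sketch of setting them (queue and function names are hypothetical); note that batch sizes above 10 require a batching window:

```typescript
import {
  LambdaClient,
  CreateEventSourceMappingCommand,
} from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({ region: "us-east-1" });

// Buffer up to 10,000 messages or 5 minutes, whichever comes first,
// instead of invoking the function for every tiny batch.
await lambda.send(
  new CreateEventSourceMappingCommand({
    FunctionName: "batch-processor", // hypothetical
    EventSourceArn: "arn:aws:sqs:us-east-1:123456789012:jobs", // placeholder
    BatchSize: 10000,
    MaximumBatchingWindowInSeconds: 300,
  })
);
```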

 


And then Lambda extensions. That was another one that was fairly recent. So this is something where, and maybe this is a pattern that we're starting to see, the extensibility of Lambda, right? This idea of extending Lambda so that you have more control over the runtime, more control over the execution model, the lifecycle hooks, and things like that. So, what's the thinking? Well, maybe explain what Lambda extensions are and then the thinking behind them.

 


Ajay: Yeah. Now that's a good one for us to get into. So, Lambda extensions are built on something we launched called the extensions API. It's a peer to the runtimes API that we launched in late 2018, and it allows you to access, essentially, lifecycle events from the Lambda execution environment. So you know when the execution environment is spinning up, when it's being shut down, and when something is running inside it, so when your function is actually being invoked. We also launched something called the logs API that gets you programmatic access to the log streams that are being generated from the Lambda execution environment and the code that's running in it.

 


Now what this conceptually enabled ... this was very much one for our partner ecosystem. Right? So one thing we realized very quickly is customers like using their own tools, right, like it's good if there are defaults, but they like their own tools and we've had this whole great ecosystem even around Lambda with, you know, people like Epsagon and Thundra and, you know, IOpipe, and Datadog, and others who were trying to make sort of the Lambda debugging and diagnostic experience really powerful, but the fact that they didn't have access to this additional metadata was kind of kneecapping them. So we said okay, let's open that up.

 


You will notice one of the things we really tried hard to do with extensions is that it doesn't change the experience of the person writing the function, right. The person writing the function still just includes a layer and doesn't do anything different. It's these partner ecosystem people who are building the extension who get these additional capabilities, like, oh, now I can know when a function started, so I can start tracking a specific trace at that point. I know when the execution environment is spinning down, so I can flush my log buffer and send it out there, or I can expire my credential because the function has finished.

 


And so I need to write something back to Vault or CheckPoint. Like, it just enables a whole bunch of these patterns around how the function itself is operated that I think are really, really powerful, and it's another one of those blockers that you're clearing out for Lambda, right, where you're like, I would love to use Lambda, but I can't use my own operational tools. Well, now you can, with capabilities really identical to any other compute that you have out there.
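The lifecycle access Ajay describes comes through plain HTTP calls against the extensions API inside the execution environment. A rough sketch of the register-and-poll loop an external extension runs (error handling omitted, node-fetch as an assumed HTTP client):

```typescript
import fetch from "node-fetch";

const api = `http://${process.env.AWS_LAMBDA_RUNTIME_API}/2020-01-01/extension`;

const main = async () => {
  // Register for lifecycle events before the function serves traffic.
  const res = await fetch(`${api}/register`, {
    method: "POST",
    headers: { "Lambda-Extension-Name": "my-extension" },
    body: JSON.stringify({ events: ["INVOKE", "SHUTDOWN"] }),
  });
  const id = res.headers.get("lambda-extension-identifier")!;

  // Long-poll for the next event; the environment may freeze in between.
  while (true) {
    const next = await fetch(`${api}/event/next`, {
      headers: { "Lambda-Extension-Identifier": id },
    });
    const event = (await next.json()) as { eventType: string };
    if (event.eventType === "SHUTDOWN") break; // flush buffers, then exit
  }
};

main();
```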

 


Jeremy: And I want to get into the partner aspect of that, but I want to finish up with these launches first and then we can tie that back in. So, a couple of other things, and I'll just mention these quickly: EventBridge archive and replay, again, not necessarily something that you're working on directly, but I feel like most events end up hitting a Lambda function in some way at the tail end, and then, you know, X-Ray integration with S3. Just this ability now, if you're using S3 with your Lambda functions, being able to trace that all the way through is super important. There were a number of really cool launches with Amplify and some AppSync things, just giving you different ways to do stuff.

 


And these patterns, and we talked about this a little bit earlier, you know, whether they're DLQs for Lambda functions or, you know, Lambda destinations, or it's tumbling windows, or it's iterator control, or it's more control over how Lambda consumes these events from other things. You know, what is it? They seem, I don't want to say they seem inconsistent, that's not the right way to say it, but some services offer X and some services don't. Is that a general goal? Is that something we might see, some more consistency across the consumption side?

 


Ajay: No. So, I will say inconsistency is not the goal, but neither is consistency, right? So I think for me, event-driven architectures are something serverless has brought to the forefront: the idea of composing services together with strong contracts, APIs and events, is the lifeblood of what, you know, people like you and me spend our days obsessing about. So making that pattern more resilient and performant is something you're going to keep seeing. What you're seeing with these controls is having them show up where they make the most sense first, right? So with archive and replay, what we kept seeing was, sort of, this idea of backlogging and replaying past events, especially for the services that are currently connecting to EventBridge, made the most sense, so that's, kind of, where it showed up first.

 


With tumbling windows, analytics was the big use case that we saw the control work the most for, and that's why you're kind of seeing it show up with Kinesis first. My prediction, and please don't hold me to this as a roadmap commitment, is that you will eventually see that sparse matrix get more filled up, right. Sometimes it's a bet that says, hey, this is something we think will be useful for this customer base, because we're seeing sort of this cross pattern, but in other cases it's just because, like you said, the demand will start showing up, and I think DLQs are a great example of this. Like, SQS started with it way back when, Lambda launched it, EventBridge now has it. And I would predict each of these integrations is going to see that pattern more and more.

 


So, yes, you will see this get more consistent over time, driven both by customer demand and by where we see opportunities to make life simpler for these event-driven patterns built using AWS services.

 


Jeremy: Yeah. Well, I did a talk that I gave a number of times called "How to Fail with Serverless" that basically was analyzing all the different failure scenarios and how AWS is built to handle them. A lot of different things, you know, synchronous versus asynchronous, stream-based versus push, a lot of different ways that these things get handled. So seeing these patterns get broader, so that you can use them on different services, is, I think, going to be a great thing. So, awesome stuff there.

 


I want to mention two other things and then ask you a question about something that wasn't launched. So, Aurora Serverless v2, again outside of the Lambda team, but I just think it's generally a much better way to do MySQL or Postgres, you know, at a serverless scale. So just a really good on-ramp for serverless. It lessens the pain of somebody moving into Lambda functions and realizing, "Oh, I need to set up RDS Proxy and I'm going to have all these issues," you know, just the scale, the reliability of the databases when you overwhelm them. So, thoughts on Serverless v2? I mean, obviously it's a good thing, but, you know, just overall impressions on those types of services being built that are really, I guess, complementing the scalability of Lambda.

 


Ajay: Yeah. No, I think you're going to see this pattern of serverless-driven primitives more and more, right, so databases as a service in the true sense: pure usage-based, highly scalable, and burstable like Lambda, really cost efficient on a per-request basis. You know, the Aurora team's done a really nice job with Serverless v2. I've had a chance to play with the early versions of it as well and see how it plays with Lambda. I do anticipate that it will drive some consistency between that and RDS Proxy over time, so that you kind of have this continuum of: your own database with serverless connectivity, you know, a serverless relational database with Serverless v2, and then who knows, right?

 


Move on to DynamoDB if that's kind of where your flavor stands, but it's more about enabling that continuum for me, and kind of making sure that you have good checkpoints along the way as you're going for it. So, if you are a customer who cares about the RDBMS pattern, but is willing to kind of go AWS native on using some of these core capabilities, with the cost efficiencies and performance you can get, it's a great choice that works really well with the way most people are running applications right now. And it's not just Lambda, right? Like, even if you're using containers or EC2, it's the same behavior that you will see.

 


Jeremy: And I’d love to see v2 sort of handle the RDS proxy thing for you so that you didn't have to do that yourself, you know, and again data API was a step in that direction. Anyways, very cool. Right. I want to shift to my favorite sort of runtime environment, which is Node, right? So they announced the other day AWS SDK for JavaScript version 3—very, very cool because it allows you to import just individual service packages as opposed to the whole thing. Probably not overly exciting for you Python and Go developers and things like that, but exciting for us Node developers, but the question that I’ve got was why no Node 14 runtime for Lambda?

 


Ajay: Yet!

 


Jeremy: Yet!

 


Ajay: Look, I think we've had a pretty good record of keeping up with Node releases. We have not done as well at keeping the window close to 90 days as I would have hoped, but, you know, it's a question of when, not if.
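On the SDK v3 point, the difference is visible right at the import site: v2 pulls the whole SDK into the bundle, while v3 imports only the client you use, so bundlers can tree-shake the rest (table and key names here are hypothetical):

```typescript
// v2 style: the entire SDK ends up in the bundle and the cold start.
// import AWS from "aws-sdk";
// const dynamo = new AWS.DynamoDB();

// v3 style: one modular client package, command-based calls.
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";

const dynamo = new DynamoDBClient({ region: "us-east-1" });

const item = await dynamo.send(
  new GetItemCommand({
    TableName: "users",            // hypothetical table
    Key: { pk: { S: "user#123" } },
  })
);
console.log(item.Item);
```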

 


Jeremy: All right, I'm sure that will make people happy, and again, you can always run your own custom runtime if you really need it. All right, so let's move on to the talk that you gave at re:Invent this year. It was called, you know, "Revolutionary Serverless" or something. What was the name of your talk?

 


Ajay: It was "Building Revolutionary Applications the Serverless Way." There were a couple of versions of the title as the launches kept coming in, but that's the final one it ended up with.

 


Jeremy: So I watched this talk, and I was highly anticipating it because, again, you're involved with all the different teams that are building these features for AWS, and honestly, you know, connected with all the other teams that are building these ancillary services and other things. I was really looking forward to this talk and I wasn't disappointed. I thought you did a good job given all of the things you can't say, because I know that's a tough thing with AWS. It's like, I wish you could say, "Oh, we're building this, we're building this, we're building this," but you can't say that, and that's fine. But I think what the talk did, for me at least, was reaffirm the commitment that AWS has made to serverless.

 


And I know I was disappointed last year that there were very few mentions of serverless, even though there were things that were launched and there were a lot of sessions. I felt like serverless was very much front-and-center this year, and I think your talk, and also Dave Richardson's talk and things like that, kind of went over what has been launched and why you're launching it, what the tenets behind those things are. So I'd love to get into this a little bit, and I'm going to put the link to your talk in the show notes. I think you probably have to be registered for re:Invent to watch it. It will probably eventually be on YouTube, but it should be on demand at this point. But I do want to start with the idea of these tenets, and I guess there are maybe three or four of them in here, but let's start with this idea of serving builders. What's that about?

 


Ajay: Yeah, I know. For me, the reason I included that in the talk was to reaffirm the idea that the serverless motion is ultimately about delivering value to your customers, right. The ultimate customer for us is someone who's building software to deliver customer value. The goal is not to make, you know, infrastructure cheaper. The goal is not to just, you know, drive utilization of Amazon servers. The goal is not to offload workloads off data centers. It's to help builders go faster.

 


And that's a tenet that's repeated often within the team, just to kind of reaffirm that the ultimate customer is the developer. They have an entire ecosystem of people helping them out: you have operators and others to kind of go and do that. But there's a developer problem we're solving. The developer's job is to, you know, deliver value. The challenges they face are doing that at scale, with cost efficiency, and so on. The things they fret about, things like security, performance, and scale, that's what needs to be our world and how we go and build over there.

 


And hopefully you'll see this reflected in the whole thing. Like you’ll see our services are designed as application primitives. We talk about applications front-and-center all the time. We talk about application patterns that are enabled. Everyone who's out there talks about how quickly they are able to build and deliver value, and I think that's what resonates for us when we say, okay, this thing has actually got legs, you know. This whole motion is about doing less to do more, as I think you've said quite often, too.

 


Jeremy: Yeah. I know, and I think the idea of, you know, being more productive, building things that are ... that aren't, you know, this term has been used a million times, but undifferentiated heavy lifting, right? This idea of doing the same things and just enabling people to build better things. I remember, I think it was last re:Invent, we were having breakfast together actually, and you asked, “How do we explain serverless to people?” or something like that. And I said serverless is just the way, at least for me, it is. And then it's funny because now, you know, “this is the way,” is from the Mandalorian. I don't know if you watch that show. Anyways, I think he's talking about serverless stuff.

 


But there are some design philosophies that have to sort of go into, you know, understanding how it is that you provide people with the tools that they need. And so you've got some design philosophies, like this idea of, you know, "ephemeral," which I love, right? I've had so many companies that I worked for that have had a server up and running for, like ... you know, this server has been running for six years, don't reboot it, because if you reboot it we have no idea what will happen, and that's the worst thing you can do.

 


Also this idea of mixing requests, right? So you have things that handle multiple concurrent requests. I know it sounds good that you can handle a lot of requests on a single server, but that introduces a lot of problems, right? And then you also have this issue, again back to the idea of not being able to shut that server down, where you have to babysit something, because if something changes or the memory gets wiped, something happens, you know. And this goes back to that sort of "cattle not pets" argument. So talk a little bit about that design philosophy.

 


Ajay: Yeah, so I spoke about this in the context of what I call "compute for all," right. So one of the big things Lambda is enabling is the ultimate democratization of compute. Like, how can I give you access to the entirety of AWS' compute power without you having to become an expert in distributed computing, right, like dealing with all the scale problems and others that go over there. And the reason we kind of picked these three as, sort of, our driving tenets was exactly what you called out, right. If you look at the core problems our customers deal with around, you know, security and maintenance and otherwise, part of it is driven by the fact that they assume this thing is alive for a long time.

 


The longer it stays, the more cruft it accumulates, and the more complicated it becomes to spread the workload around, right? Like you get into, say, things like affinity and state, which get far more complicated. The entity becomes addressable not just for your attachment purposes, but for security purposes, right, and security is really top of the list for us as we're building through, and it's one that we want to pass on to customers on that particular front as well. And for me, you know, people often say, oh, you're saying ephemeral; ephemeral means not durable. For me it's more the temporal ephemerality of it. Like, it exists, but it only exists for a short period of time, when you need it to exist. And, you know, as long as there's a finite bound to its existence, that's what matters; that's the key contribution.

 


So if I start saying, oh, it's finitely bound, but it's bound for a month, that kind of breaks the model a little bit over there, right? But like 15 minutes, 30 minutes, an hour? Sure, that's within bounds of things being cycled and cleared out over there. And then, like you said, the same thing with tenancy and isolation, right? Like, one request, one execution environment, was driven by the same thing. You need consistency of performance; that's a hard problem in distributed systems. You need isolated resources to run and execute the code that you're doing; that's a hard problem to go solve. You can enable multi-tenancy.

 


But again, like you said, it sounds great in theory, and there are a whole bunch of patterns people have learned in the past that do it really for efficiency reasons, right? Like, the funny thing I keep seeing when I talk to people about multiple requests per execution environment is they're like, well, it's because it's cheaper. I'm like, great. So what if I made it cost you, you know, $0.01 per billion? Is it okay then? And they're like, yeah, oh yeah, then I don't want to write multi-threaded code. I'm like, great. So let me do that then; I'll make it cheaper for you rather than forcing you to write multi-threaded code on that particular front. It's that combination for it.

 


And for me, the last bit that Lambda does really uniquely even now is saying you deal with resources: you get CPU, you get memory, as configured, and your code is the thing that's important. That's the addressable thing, not the resources. You're not binding your function to a collection of machines or a pod or anything that's addressable. You're just saying, I want this much memory every time it runs, and whether I spin up 18,000 cores in the backend or 1.8 cores, that's not your problem. Where those cores exist, where all that happens, that shouldn't be something you worry about. That's the driving philosophy of it. The whole idea is the fewer things you have to think about from a distributed computing perspective, the more you can focus on what you need to build to deliver value to your customers.

 


Jeremy: Right. Yeah. And so the other piece of this is that you provide all these primitives and capabilities, and I love this idea of democratizing, you know, compute, because that's one of those things where I remember, long ago, having to order racks and racks of servers from Dell and paying thousands of dollars a month to put them into a colocation facility somewhere. I mean, even EC2 instances and VMs were a huge step forward. I remember I was paying, I don't know, something like five or six thousand dollars a month just to run a colocation facility, moved it all into EC2 instances, and dropped to $700 a month, right? Just a huge shift there. But now we're not even talking about $700 a month. We're talking about, you know, maybe a dollar a month if you're building a start-up. I mean, it can get so cheap to do this that now the barrier to entry is incredibly low.

 


But there's a caveat, there's always a "but" to these things, and that has to do with one of the philosophies you mentioned: personal productivity. And I find this to be one of the most frustrating things, and the thing I hear consistently from other people I talk to all the time, the frustration over developing serverless applications. There's SAM. There's Serverless Framework. There's Claudia.js. There's Begin and the Architect framework. There are all these different ways that try to make it easier for you to build serverless applications, but one of the complaints that I've had, and I know other people have had this complaint, is that, you know, AWS isn't always the best with tooling, right? I mean, the tooling is somewhat disconnected.

 


It's not always consistent. And again, I'm not even sure there's a question in here, but it's more along the idea of saying, I get that that's a tenet, so what are the plans? What are you planning on doing to make people more productive? I mean, obviously I think the container aspect is one thing, right, you know, meeting people where they are, giving them more capabilities, but what are the other things maybe that you're trying to do, or have done, that you think are sort of solving that problem?

 


Ajay: As you can imagine, Jeremy, this is one of the top things that keeps me up at night as well. On one hand we're talking about making builders more productive; on the other hand, you know, the feedback we hear is, you're not doing enough. Like, you have to kind of keep pushing the ball forward there. And there are a lot of things AWS does well. I think we are really good at building services and growing them, but when it comes to aspects like developer experience, and this is where the personal productivity aspect comes in for me, I think the nice thing about the philosophy, at least the one my team follows, and I know AWS overall does this, is we can't do this alone. Right? Like, it's not easy to go in and just say, this is the way you're going to do it, take it or leave it, and then get out of the way, right? And that's fine. That will always work for a niche audience. But if you want to go broad with your story, it has to be something that you do in combination with others.

 


So, I think for me, that's kind of where the big push behind the runtime API, container support, and even the extensions comes in. Like, how do you get the rich ecosystem of partners around AWS to help customers solve that problem and kind of do better on that particular front? I think that's one. And, you know, I'll go back even to Serverless Framework. Serverless Framework was actually put out by Austen and company first, even before SAM and others came out, right? Like, that was one big enabler for the early drive around it, and that was really nice innovation out there. It kind of started tackling the problem of standardizing the deployment of serverless applications, and that inspired a whole bunch of other tooling pieces that came around as well. So that's one.

 


I do think serverless has a unique challenge in that there's a new conceptual learning that you have to go through. Applications are built composed of services talking to each other through APIs and events, where there really is no defined pattern. So you're now starting to create tribes, right? So you have a bunch of people who are, like, no, this should look and behave exactly like web frameworks, and I'm trying to build end-to-end stuff, and you have kind of the Claudias and others show up over there. Then others are, like, no, I'm just going to treat this as a general-purpose, slightly better infrastructure-as-code story. And that's where you kind of have the SAMs and the CDKs, with their own tabs-versus-spaces debate that kind of sparked off over there.

 


And I would say the real big opportunity, and I actually really like what the Begin and Architect folks are doing over here, is starting to sort of embrace that serviceful nature of it and go, how do you make it easy to compose services together and build forward and do more over there? And I would say the place you will see AWS step in, and we do this pretty much across our services, is when you see commoditization of these particular patterns, right? That's kind of what happened with SAM. We saw every single tooling provider going and saying, I need a way to express a serverless application as a combination of services. We're like, okay, all of you don't need to solve it ten different ways; we'll do it. We'll give you a default standard CLI. But by no means are we saying only use the SAM CLI. It's an easy default way, and we're going to keep trying to make it better, but there's always a rich ecosystem of tools that you're going to go and do it with. The same thing with diagnostics, with extensions: you can go past CloudWatch, you know, you can use Datadog as much as you do anywhere else.

 


So, like, I gave a non-answer to your non-question, but the whole idea is that I think we're not going to be able to do this alone. This is going to be an ecosystem story over there. AWS has to get better at offering more vertical solutions in these particular things, and I think that's the space where you will see us investing more. I love what the Amplify team is doing. That's my favorite example of a good, you know, simple experience, able to go with good defaults into an experience focused on a single use case.

 


Jeremy: All right. Yeah, and that is always where … you know, and again, my criticism is only because I want it to get better, and I think constructive criticism is always good. But AWS is very good at building these stacks of services … like, these services that each do one simple thing very well. And I guess, like you mentioned in your talk, there's this idea of application primitives as a service, which I think is a really good way to think of it, where you've got your compute, your data, your integrations, your tools, security and admin all baked in.

 


I think those are great things, but you mentioned something that is probably the hardest thing to explain to people sometimes, when you said Architect does a good job of connecting services. Cross-service connectivity is extremely hard. It is not an easy thing to, one, do, but also to grok. I mean, just to understand how service X connects to service Y, and then, of course, the observability challenge that's in there. So just a little bit more on that: I know there have been things that launched at re:Invent and this past year, but what is AWS doing to make that easier?

 


Ajay: Oh, man, I think this is actually something AWS just needs to solve on our own. This is not a tooling or ecosystem problem. We own the services and the interfaces between them; we have to make it easy. So, as with all things, security comes first, so you will see this consistent pattern of resource-based policies between each one, IAM role-based policies between each one, granular controls for each one, across the spectrum of services that can talk to each other. I think for me the next big one is around these reliability controls.
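
Before the reliability controls Ajay turns to next, that resource-based-policy pattern might look like this in practice: a boto3 sketch granting one service invoke rights on one function, with a placeholder function name and topic ARN:

```python
import boto3

lambda_client = boto3.client("lambda")

# Allow SNS -- and only this one topic -- to invoke the function.
# The function name and ARN are placeholders, not real resources.
lambda_client.add_permission(
    FunctionName="process-orders",
    StatementId="AllowInvokeFromOrdersTopic",
    Action="lambda:InvokeFunction",
    Principal="sns.amazonaws.com",
    SourceArn="arn:aws:sns:us-east-1:123456789012:orders",
)
```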

 


So, the DLQs, the checkpointing, and the others that enable you to go and do that over there. Because with a message, you need to know who you're talking to, you need to be able to talk to them securely, you need to be able to get the message to them quickly and in a reliable fashion, and be sure it gets from one to the other. And I think then it opens up this pattern of saying, now what are the new use-case-specific capabilities you can enable, right? This is where your batching, your replays, your aggregation time windows, and all of those show up.
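
Those controls map onto options on Lambda's event source mappings. A sketch, assuming a hypothetical Kinesis stream with an SQS queue as the on-failure destination (all ARNs and names are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

response = lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:kinesis:us-east-1:123456789012:stream/orders",
    FunctionName="process-orders",
    StartingPosition="LATEST",
    BatchSize=100,                     # batching
    MaximumRetryAttempts=2,            # bounded retries
    BisectBatchOnFunctionError=True,   # split batches to isolate poison records
    TumblingWindowInSeconds=60,        # aggregation time windows
    DestinationConfig={                # DLQ-style on-failure destination
        "OnFailure": {
            "Destination": "arn:aws:sqs:us-east-1:123456789012:orders-dlq"
        }
    },
)
print(response["UUID"])  # identifier for the new mapping
```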

 


But because we own both sides of the equation, that's where the power of AWS can really come in. Like, we're really good at doing security and scale in a common, consistent way; we should make that as easy as possible for you to use. My prediction for you is that you're going to start seeing a far more consistent API-driven story for enabling all these controls across all these connections. You're going to see less and less of this required to be solved at the tooling level, and more and more of it solved at the API level within the AWS services.

 


And for the broader set of services to participate in this, I think that's where EventBridge comes in. EventBridge has to encapsulate all of these connectivity-as-a-service capabilities, so if you have your own service and you're like, well, in order to talk to an AWS service I need these security, availability, and reliability controls, you just plug into EventBridge and that gives you all of that in a box. But for all the other connectivity, it should just be part of the API, right? Like, my dream is that the event source mapping we have for Lambda just universally works for any pattern you see out there. X-Ray tracing flows across the services by default, logging is enabled by default. But, you know, it's a long road, and as with all things AWS, it comes together incrementally over time.
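
To make the "plug into EventBridge" idea concrete, publishing a custom event onto a bus is a single call; the source, detail type, and payload below are invented for illustration:

```python
import json

import boto3

events = boto3.client("events")

# Any producer can publish; any consumer can subscribe via rules.
# IAM, retries, and failure destinations are handled by the service
# rather than bespoke glue code. All values here are placeholders.
events.put_events(
    Entries=[
        {
            "EventBusName": "default",
            "Source": "com.example.orders",
            "DetailType": "OrderPlaced",
            "Detail": json.dumps({"orderId": "1234", "total": 42.5}),
        }
    ]
)
```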

 


Jeremy: Yeah. Well, and I think, you know … you mentioned, too, the API economy in your talk, and I think this is fascinating, and I know one of the developer advocates, I think from the Amplify team, had written an article about the big difference between the Haves and the Have Nots in the API economy, and that's probably a longer discussion. But there are a lot of APIs, and people are building service-full applications now; that's just the way it's going, right? So you have this idea of saying, okay, if I want authentication I can get that from Cognito, but I can also get it from Auth0; if I want email as a service I can use SES or I can use SendGrid; if I want SMS I can use Twilio or I can use SNS.

 


So, there's all these different services that are there. But what I think is really interesting in terms of what can be done, and you mentioned this, is the interconnectivity between those services, where everything is in silos right now. So you say SNS is a service, and I have to understand the nuances of interacting with that particular service like I do with any API. And that, I think, is a huge opportunity for AWS to say: if you just need a queue, you just attach Lambda to it and it handles all the things you would want it to handle, and you're not writing all that code, right, just reducing the amount of code that you're writing. And again, I don't know if there's a question in there. But I think that's really interesting, especially considering that serverless to me is more service-full now; that's really a good way to think of it. So maybe just for the benefit of the listeners who are not quite convinced yet: why is this service-full movement such an important thing?

 


Ajay: Flat out, I think that is the biggest factor in the speed that serverless brings to the table. The fact that you can cherry-pick components of your product by relying on other people's expertise, right? So going out there and saying, hey, Jeremy Daly, you have built this great chat service, and I trust you to offer me four nines of availability and a certain performance guarantee, and as long as I use the API, I'm good. The incentive for me to go and rebuild that elsewhere is negligible. It doesn't help my business to go and rely on anything else.

 


And what that basically does is you're now recruiting an entire collection of really deep domain experts to be part of your operational team, to be part of your development team, where they're continuously improving their portion of that tiny little product and making it better so you can move faster. The scale is getting better. The performance is getting better. The capabilities are getting better, while you innovate on the part of the stack that you want to. And what's fascinating for me is, you know, that is the true vision we all had when we went down the microservices path as well. You can do independent development of different pieces; they're all small pieces loosely joined that talk to each other, and they can innovate separately. The only difference is it's not just your organization, your two-person startup, sitting and doing it. You now have, you know, 22-person startups and AWS innovating on your behalf, just to make your product better. Right?

 


Like, your 1 millisecond example is a great one. If you were a startup running on us today and you happened to use Lambda for your backend compute, your bill just got 40% cheaper, which you can now pass on as end-user savings with you doing nothing. Imagine how much work you would have to do to go and get that kind of behavior otherwise. And just one more thing, Jeremy, since you brought that up: I do believe the true power is going to be connecting all these services together and getting them to interconnect a lot more.


You're starting to see this with some of the bigger ones, right? So Twilio, Workday, Atlassian, they've all added this programmable side to them. There are Lambda-style offerings showing up, like Twilio Functions and Netlify Functions and others, that allow them to add just a little bit of logic and then talk to other services via API calls, and build forward over there. So I think just the flexibility and power this enables is really, really cool. And the fact that you can swap out one API for another is quite a testament to the whole dance around "am I really locked into a particular provider or not?", because it's quite easy to change the API call more than anything else.
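
That swap-one-API-for-another point is easiest to see behind a narrow seam. A minimal sketch, assuming SNS as the current SMS provider; the function name and phone number are made up for illustration:

```python
import boto3


def send_sms(phone_number: str, message: str) -> None:
    """Send an SMS through SNS. Because callers only know this seam,
    swapping in Twilio (or any other provider) means changing this
    one function, not the whole codebase."""
    sns = boto3.client("sns")
    sns.publish(PhoneNumber=phone_number, Message=message)


# Callers never mention the provider:
# send_sms("+15555550123", "Your order has shipped")
```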


Jeremy: Right, no, absolutely, and the speed of that, I mean, just the speed and the lower maintenance and all that kind of stuff ... I actually saw a tweet the other day that said something like, "your library only handles 99% of my use case, so I built my own." Right? It's the same thing with APIs; it could apply there too. So, you know, if it does 80%, just use that, you know what I mean, and work around the rest another way. But yeah, building your own service is crazy.


All right, we're running out of time here. But I do want to go over at least a couple more things quickly. One of the big things is that at the end of your presentation you had this slide on why going serverless is revolutionary: because it's 30x faster development, 60% lower TCO, and, you know, availability, security, and scale all built in, trillions of invocations per month. The biggest thing for me here is the TCO, because I think people miss the total cost of ownership of having to maintain these other things. Yes, maybe a particular Lambda function costs you 30, 40, 50 dollars a month, or maybe thousands of dollars a month depending on your use case, and you might say, well, it's 20% cheaper if I just run an EC2 instance or spin it up on containers or something like that. But there's a lot of operational work you're missing there.

Ajay: Yeah. I think this is the hardest construct for people to understand but also the most powerful one for serverless to internalize. To your point, infrastructure costs are difficult to compare. I would argue what we've actually seen is that in most cases, unless you're running a really highly utilized EC2 instance or a container instance, Lambda would look cheaper for you, as would most of the other managed services that are out there. But there are cases where you can say, I can run this cheaper if I really squeeze it out of my own infrastructure. At that point, though, you are the one squeezing that money out, you are the one pushing the efficiency out of it, you are the one managing the infrastructure. And this is just my personal note: builders are really bad at putting a dollar amount on their own time.

Jeremy: Absolutely.

Ajay: They don't know how valuable their time is. And, you know, even if you just value your time at a hundred dollars an hour, that quickly adds up. This is one of my favorite discussion points. People say, well, I can run this on a t3 instance, and I'm like, wait, that's good. How many people do you spend on this? It's like, oh, it's one on-call for a month. I'm like, great, how much are you paying for on-call? And then they're like, well, I don't know, what, five grand, ten grand a month. I'm like, okay, so now how does that compare to that Lambda bill? And you kind of see the gears turning as they go through the graphs and figures. We have to do a lot more to help customers internalize that, and you're going to see more material and content come out from the AWS team on helping customers understand both their individual costs and how to think about the overall TCO.

There's a great paper by a VC out there, you know, that would be a good link to include in your show notes as well, that people can read to get a really good model on how to think about the overall TCO, too. But, yeah, that's the big one. 60% cheaper over a five-year window is big savings, whether you are a small company or a big one.
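
As a toy illustration of that five-year TCO framing, the arithmetic looks roughly like this; every figure below is an assumption made up for the sake of the calculation, not AWS data:

```python
# Toy five-year TCO comparison: a modestly utilized instance fleet plus
# on-call time versus a pay-per-use Lambda bill. All numbers are assumed.
years = 5
ec2_monthly = 300       # instances, storage, load balancer (assumed)
oncall_monthly = 5_000  # engineer time at an assumed loaded rate
lambda_monthly = 800    # compute billed at actual usage (assumed)

diy_tco = (ec2_monthly + oncall_monthly) * 12 * years
serverless_tco = lambda_monthly * 12 * years
print(f"DIY: ${diy_tco:,}  serverless: ${serverless_tco:,}")
# DIY: $318,000  serverless: $48,000
```

The point is not the specific numbers; it's that the people line dominates the infrastructure line, which is exactly the cost builders tend to leave out.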

Jeremy: Absolutely. All right. So again, we've been talking for a while, and I do want to move on to a couple of other things. One thing that was really exciting to me was during Andy Jassy's keynote, he mentioned that 50 percent of all new services being built on AWS use Lambda, which is just an insane number if you think about it. Which is great from a serverless adoption standpoint. And I wonder why this is … is it just because it's becoming more popular? Is Lambda just one of those things that is becoming more mainstream? I'd love to think that, yes, it's just gaining in popularity. But I think part of it has to do with this "we can't use Lambda because X," and all those objections just getting crossed off the list. And the talk that I want to bring up is Adrian Cockcroft's on architecture trends and topics for 2021, where he spent the first part talking about serverless.

And in the talk, he basically said serverless is the fastest way to build a modern application. I agree with this, you agree with this, we know this, right? Not everybody agrees with this, and most of the objections have been around things like portability, scalability, cold starts has always been a good one, state handling, run duration, complex configurations. And back over the summer, there was a conference where he gave a talk that basically picked these things off one by one, and he updated that in this talk and mentioned a few things: portability, new container support; scalability, now 10 gigs of space; complex configurations, AWS Proton, which we don't have time to talk about. But, you know, maybe some other time.

Ajay: Part two, right?

Jeremy: But just your thoughts on this growth of serverless. I mean, if you go all the way back to the beginning, it was a new thing. So what's happened over the course of the last six or seven years that has made this thing such a juggernaut?

Ajay: Wow. I will say, when we originally wrote the plan for Lambda and we launched it, I remember Tim and I sitting there the hour after it was announced, watching the preview sign-up counter go up, wondering, will people grok this? Will the idea that you can have a managed compute service that does things for you, this whole concept of events and others, really land? And it did. Luckily, here we are, trillions of invocations later, so to speak. I think for me, Jeremy, the big thing we've seen is that we found a new way of helping people move faster.

The core problem we are solving is democratizing access to big distributed compute so you can build these complex applications in a way you couldn't before. That's always been the underlying philosophy behind it. You don't have to know how to build a service; just give us code, and you're basically getting code as a service over there. I think the journey of the last six, seven years has been enabling new patterns, as you called out. I would actually tie it back to the same three things, right?

One has been expanding the capabilities the compute can offer you: things like EFS; things like Firecracker, where we've enabled a better isolation model; expanding the amount of compute available to you to 10 gigs and beyond. The second dimension has been expanding your patterns, a big push being, again, event-driven computing, connecting services to each other, service-full architectures. I think we are at around a hundred and forty event sources at this point in terms of what you can use with Lambda, where we started out with three: Kinesis, S3, and DynamoDB. That's been a huge growth factor for us. And now we're bringing that to event sources that are non-AWS, so, you know, Amazon MQ and the self-managed Kafka that we just announced, alongside AWS-managed Kafka, and that pattern continues to evolve.

And then there's this third one around enabling more developer tooling and productivity. Honestly, on that last one, the big flip for us internally was when we became standardized with the internal AWS tooling; that was when we saw the big inflection point, and that was one of our earliest signals, where we said, you know, you can have the compute be really powerful and enable a whole bunch of new application classes, and motivated people will jump the gap to do it, but you have to keep smoothing the plank, so to speak, to help customers come on board. That's where this push toward opening up the ecosystem and enabling the tools customers care about is something you will keep seeing us do a lot more of.

I think for me, the most fascinating thing about the serverless ecosystem is what Lambda has sparked. We never went in defining Lambda as serverless and serverless as the new way of doing things. It was, like, hey, here's an easier way for you to run code, and look how that sparked the serverless ecosystem. Think of the kinds of services we've seen inspired by the idea that you can have, you know, scale-down-to-zero, highly scalable, billed-by-the-millisecond conversations with services out there, which wouldn't have been the case six years ago.

Like, we were still talking about billing instances by the hour and discussing the next XXL instance type that came out at that particular time, and that conversation's changed. The most ironic moment for me was getting into a debate with a customer in June who was basically really upset at us about the fact that we were doing 100 millisecond billing. He's like, that is not acceptable; that's a really low standard for AWS. I'm like, dude, I'm so happy that you're complaining that I'm billing you for too many milliseconds. That we're getting into that argument at all is the dream.
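
A rough back-of-envelope of what the move from 100 ms rounding to 1 ms billing means; the price, memory size, duration, and volume below are all illustrative assumptions, and the exact saving depends on how far real durations fall below the old rounding:

```python
# A ~42 ms function was previously rounded up to 100 ms per invocation.
PRICE_PER_GB_SECOND = 0.0000166667  # approximate Lambda duration price
memory_gb = 0.5                     # 512 MB function (assumed)
invocations = 10_000_000            # monthly volume (assumed)
actual_ms = 42                      # real average duration (assumed)

old_bill = invocations * (100 / 1000) * memory_gb * PRICE_PER_GB_SECOND
new_bill = invocations * (actual_ms / 1000) * memory_gb * PRICE_PER_GB_SECOND
print(f"100 ms rounding: ${old_bill:.2f}  1 ms billing: ${new_bill:.2f}")
# 100 ms rounding: $8.33  1 ms billing: $3.50 -- a 58% cut for this workload
```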

So when you look at companies like Stedi, or the stuff Joe Emison and Steve are doing over at Branch, you're now starting to see this new flavor of single-digit-headcount startups that are just going to become, you know, hundred, two-hundred-billion-dollar companies, and this is very similar to what you saw in the early days of AWS as well. I think there's a new micro-sized ecosystem that serverless has spun up that's really, really fascinating to me, which then also feeds back into how other people are building applications, right? The services themselves are going to enable new application patterns.

So I do feel the big thing that's happened is the vision of saying "builders build, let them do more with less" has been realized, not just because of Lambda. The entire ecosystem has evolved around it as people have realized this and put things together. And that's the biggest excitement about this for me: now people are building things they never thought they would, they're launching companies they never thought they would. You saw this whole wave during Covid, when people were building, you know, response sites and distribution systems and others in, like, weeks, and handling millions of requests as they came in. For me that's really humbling and powerful, right? Something that you built is enabling other people to build things faster and deliver value, and I just hope to see more of that coming out.

Jeremy: Yeah, and I love that you use the term "revolution" or "revolutionary" in your talk, because I've heard a lot of people be like, oh, it's an increment, it's an evolution of whatever, and I just don't think it is. I think it is a revolution. I think it is a completely different way to think about building applications, and it's a revolution in that people can rise up and start building these things without the walled garden we've talked about in the past.

All right, I want to ask you one more thing, and I think this would be a good way to end this conversation, and that's to go back to the idea of the partners. Because I think AWS has been very, very good about creating partnership opportunities for people. But you also run into this sort of interesting dichotomy of building the tools to allow people to build things and then trying to build the tooling to solve the other side of the problem, right?

So you think about observability. I know observability is a frustrating thing for a lot of people. CloudWatch had some ... you know, CloudWatch is CloudWatch, you know, CloudWatch Logs. I mean, you add in metrics, there was Insights, there were other things that have certainly gotten better over time. But really, they don't compare to the Epsagons and the Lumigos and some of those; they just do a better job, you know. And so the question is, you know, where is the line? Is AWS the product or is it the platform, right? And where do you see that going in the future? I know I'm asking a lot of questions here, but I guess I'm curious what you see as the continued opportunities for builders out there to build tools and services for other builders?

Ajay: I should just say yes and call it done, right? But I think it is going to continue to be both, Jeremy, and I think this goes back to AWS' philosophy of working backwards from the customer. Right? I think what you're seeing reflected in the way AWS is evolving is also the sheer breadth of customer feedback and signals that you end up getting. And in my time at AWS, what I've seen is there is a class of people who want AWS-native. They're, like, I need this to be AWS, otherwise I'm not going to get it approved, I'm not going to get it through; I'm not going to use a pick-your-own-from-the-toolbox-on-the-side thing. You have to give me a native solution end to end, and it needs to do the basics; that's good enough, right? So there's one school over there.

And there's the other one who's saying, no, look, I want to use what I consider best-of-breed, whatever works for my style, my productivity, and so on. And like I said in the beginning, there's no way AWS can do all of this alone. So for AWS, you're always going to see the core investments in what we consider the core aspects of the service, so security, performance, scale, and capabilities, and in enabling sort of API- and service-driven innovation. You know, I always think of this space as being big enough. It's not like, well, if AWS releases a service, it's one and done. Ultimately customers are going to use the best services for what they care about, and I don't think AWS is the only one who can build the best service.

All the examples you just called out, right, those are great ecosystem stories that are thriving over there and continue to grow. I think you see the same thing reflected in the way Lambda's evolving, like we talked about. The core aspect of the service, which is sort of that distributed compute democratization, is going to continue to be something we innovate in. And we have opened up APIs in the areas where we think other people can do stuff. Like, hey, you want to bring your own runtimes and patch them better than we do? Go for it. You think you can manage operational controls better than we do? Here's an API; go for it, have fun with it. And we're going to continue to offer an end-to-end vertical solution for those who do care about it, right? So sensible defaults, I think, is the strategy we're going to follow over there.

One thing I have a lot of partners bring up is how you can enable better crossover, you know, discovery; how do you make sure customers aren't just picking CloudWatch because they're forced into CloudWatch, etc. And I think that's something we're going to continue to evolve. I actually like what the EventBridge team has done really nicely over here. If you just go to the Lambda console and try to select an event source from EventBridge, it shows all the different ones that are out there; all the different SaaS providers show up on the same footing as any other AWS service, and I think that's a philosophy you will see evolve more.

So, if I bring it back to your original question, I think AWS' core value-add is going to be solving what you call the undifferentiated heavy lifting, which lends itself to being more of the service tier, so to speak, not necessarily a platform, and then enabling these experiences on top of it, some of which are going to be AWS-native and others which are going to be enabled by the broader partner ecosystem over there.

Jeremy: Yeah, no. And you said to me earlier, you know, the goal is not to get everything right, it's just to make everything possible.

Ajay: Yes.

Jeremy: Which I think is quite fascinating.

Ajay: Yes. Exactly.

Jeremy: Well, listen, Ajay, this was awesome. I love talking to you. Maybe I'll stop recording and we can talk for another 10 hours, just not necessarily for the listeners. But this was great. Thank you so much, not only for being on the show, but for everything that you're doing at AWS. I know you were there right from the beginning with Tim Wagner and the others, making this what it is at this point. I think I've said this to others who have been involved early on: this has just changed my life, right? It's been a revolution for me and the way I build applications, the way I think about applications, and I think this has changed the world for a lot of people, so "revolutionary" is the word that I would use.

So if people want to find out more about you, find out more about serverless and what AWS is doing there, how do they do that?

Ajay: So, first of all, Jeremy, thanks to you and the community as well. It's the customers who keep us honest in building the world of serverless over here, and I think one of the most powerful aspects of serverless has been the community around it. So, you know, keep spreading the word, and keep telling us, and the builders listening to your talks, how we can do better. Like I said, the path to the revolution is serverless and the fastest way to build is serverless, and we're going to keep taking that through. I hang out on Twitter quite often; you can find me @ajaynairthinks. You can find me on LinkedIn, and I'm usually pretty responsive over there. But otherwise, you could always go yell at Chris Munns, who is our Principal Developer Advocate, and he has a way to find me, too.

Jeremy: Right. And a great resource that was launched not too long ago was serverlessland.com, which is really good. So go check that out. Awesome. All right, Ajay, thanks again.

Ajay: Yes. Thanks, Jeremy. Happy Holidays.

This episode was sponsored by Epsagon: https://epsagon.com/serverlesschats