January 6, 2020 • 44 minutes
In this episode, Jeremy chats with James Beswick about the most popular serverless tools and services companies are adopting, how built-in cloud features are making apps more resilient, and what serverless will look like in 2020.
Jeremy: Hi, everyone. I'm Jeremy Daly, and you're listening to Serverless Chats. This week I'm chatting with James Beswick. Hey, James. Thanks for joining me.
James: Hey, Jeremy. Good to see you.
Jeremy: So you are a senior developer advocate at AWS. Why don't you tell the listeners a little bit about your background and what you've been doing on the AWS developer advocacy team.
James: Sure, so I've been working with serverless for about three years now. So I'm really a self-confessed serverless geek. I've used it to build quite a few applications, front to back using only serverless. And then in April last year, I joined AWS in the developer advocate team, and so this is truly the best job in the world because I like talking about serverless to people, so I get to go around doing conferences, blog posts, webinars, applications, and also some other things to show people how to build things. Since then I've just been going all over the place doing these things, but it's been pretty amazing just to see what customers are building all over the place with these tools.
Jeremy: Awesome. All right, so I was talking to Chris Munns when I was out at re:Invent, and I put together a podcast there, and we were talking about all these new things that AWS was launching. And I think what happens with serverless is that it's moving so fast that things are constantly changing. There's always new things being released. What serverless is is still up for debate, right? I mean, there's still a lot of questions around that.
So I wanted to talk to you because you and I talk as much as we can because I love talking to you. You have great insights when it comes to this stuff, and I wanted to talk to you about sort of what are we going to see with serverless in 2020, right? Because this is the year now where all of these pieces are starting to come together. We've got all of these tools, all of these things we've been complaining about like RDS Proxy, and we can't do this, and we can't do that. These problems are going away at a rapid clip. Maybe you can give me your take just on, I mean, what does 2020 look like for Serverless?
James: It's a great, great question. In the last five years (you know, Lambda's really only five years old), what's been happening is the space has been emerging and developing so quickly, we're simply seeing customers pick up the tools and build things and then find they need more features. So we've been building all these features as quickly as possible. And I think what's different this year is that this whole space is starting to mature very rapidly. And we're seeing customers, both startups and huge enterprises, using all of these tools at scale, and starting to see the same patterns emerging from their use cases.
So what we're doing for the next 12 months is essentially looking at the entire list of requests that's coming right from customers where they want certain things and dedicating those resources to building out the features they want. So AWS is famous for listening to customers and building those features, but I'd say in serverless, I mean it really is the case their entire road map is coming back from these early adopters and these users and helping us to find what we now build.
Now in terms of actual concrete things, most of that comes down to improving performance all the time, always making sure we can make performance as good as possible, but also improving tools, making sure that we integrate with the developer tools that people are using all the time, and making sure that with all features, we sand off any rough edges that we have. So a lot of the time with AWS features, what we're doing is we're deploying them out to customers as quickly as possible so that people get the first look at what we're building. And then when we get that feedback, we build the additional bells and whistles to make sure it's exactly what people want.
Jeremy: Yeah, no that's great. And the other thing that I, I keep hoping for this, right? And maybe we're not there yet, and I ask everybody about this, but I really want serverless to go mainstream, right? Like it's just what you're doing. It's the way to build cloud applications, right? Because I think you have all of these use cases that are out there now, and from my newsletter I'm always trying to capture use cases to say oh someone's doing this with it or someone's doing that with it, and they have these interesting ways of solving those problems.
And like I said, these problems now have official solutions in many cases. What's your take on this idea of serverless really becoming mainstream, and more customers starting to use it, or it just being the first choice of what to use when they're building something in the cloud?
James: So in my career, I've been one of these early-adopter people: I was one of the first in the cloud, and I got into mobile development very early. And one of the patterns I see over and over is that the tipping point of things becoming mainstream isn't always obvious. You go through this period where it seems like you're always walking uphill to convince people that this is something that's going to become the standard way.
And then magically at some point, it just does, and you didn't notice it happening. And I've started to feel that's becoming the way of serverless, because many of the groups I spoke to a couple of years ago, who didn't know what serverless was or didn't think it was a good fit for their use case, are now starting to openly talk about serverless as an option at least, and actually discuss how they could use it.
The great example is that last year I went to the DC Public Sector Summit for AWS, where all of the government customers were there, and a lot of people were very interested in serverless. And I've seen the same thing at all the summits and events we go to: even people who haven't actually built anything yet are interested in what it can provide in terms of both agility and scalability for building their applications.
Jeremy: Awesome. So you mentioned tools and giving people tools in order to build stuff. One of my complaints about serverless, right from the beginning, is even though we are abstracting away all of this infrastructure, there's still a lot of configuration that has to happen. And with AWS that comes down to ultimately using either CloudFormation or writing complex interactions with the APIs, which nobody wants to do.
So on the CloudFormation side of things, there are abstractions on top of that. We've got SAM, the Serverless Application Model, that makes it a little easier. It's very similar in feel to the Serverless Framework. Then we have the CDK, which is relatively new, that allows you to just write code, and that will generate the infrastructure for you. There's Amplify. I just talked to Nader Dabit the other day. We were going through this amazing tool that is Amplify and how it sets up all these things for you, does back end, does front end, and ultimately all of these things end up generating CloudFormation.
So the question I have is, which one do you choose? If you're new to serverless, or you've been using serverless, maybe you're using something like Claudia.js or you're using Architect or you're using the Serverless Framework, and you want to use something that's more AWS-native, where do you start? Which one do you choose?
James: I think it depends on where you're coming from. If you're a startup and you're building greenfield applications where you can pick whatever you want, that's a very different situation to be in than if you're an enterprise and you're migrating legacy software into serverless. Most of these tools are designed for really different developers and different use cases.
So I'd say if you're in a greenfield space and you're starting from scratch, then using a framework like SAM or Serverless makes a lot of sense, because you're starting at a point where it's going to build everything out the right way for you. Otherwise, maybe if you're in an enterprise and you've got a certain set of tools, you might find that the CDK is a more comfortable way to go. But all we're really trying to do, instead of saying to people this is the one tool for everybody to go and learn, is to really meet developers where they are and give them the tool they feel most comfortable with given their use case.
Jeremy: Yeah, I think that makes a lot of sense. I know for me that, again, sometimes you have to get into that CloudFormation template and start doing things in there, and it can be, I always complain about this, but again it's configuration, it's [inaudible 00:08:21] language. It makes sense. I mean, it's just as hard with Terraform or something like that, but there are a lot of configuration options there. So certainly as a developer that is new to infrastructure in a way, it is certainly a leap to learn some of that stuff. But again, those tools do make it easier.
James: Yeah, and if you look at something like Amplify, what's been built there is really interesting, because when you've got CloudFormation, it essentially gives you every knob and lever you have on the entire infrastructure, as you know, as YAML basically. And then when you look at something like Amplify, what it's doing is looking at the most common, sensible defaults for given use cases, and helping those developers in an opinionated way.
So if you're building those sorts of apps, that's a great fit. So when we hear customers saying to us that they want certain types of use case over and over and over, and they don't need to have all these controls, then we're happy to build tools that simplify that.
Jeremy: Yeah, that's awesome. All right, so let's get into some of these tools and products, right? Your insights on this, I think, will probably be more enlightening than mine. I think there are a few of these, even things that haven't just recently launched, but tools that have existed, that shape the way that we've been building serverless applications. Some of them are new, some of them are existing, but these are the ones that excite me the most, and I'd be interested to hear about this from you as well.
But for me, I think one of my favorite AWS products right now is EventBridge. And when this first came out, which by the way, was back in July, right? July of last year. Werner announced it at the New York summit. When it first came out, there were a bunch of people in our space, you know we've got a very tight-knit group of serverless geeks that like to write about this stuff, but there was all this talk like, this is going to change the way we do serverless, or this is the biggest thing since Lambda itself.
And I totally agree with that. But there hasn't been a lot of fanfare around this. It sort of came out and then not much. There was no CloudFormation support, which I think was part of the problem, but then you got all of this new stuff like the Event Schema Registry and some of these other new features that launched at AWS re:Invent this past year. So what are your thoughts on EventBridge, and how do you think people are going to be using it in 2020?
James: Yeah, I'm one of the people who are really super excited about EventBridge. I think it has transformational possibilities for the way that you build serverless apps, because at the very least it can help decouple these applications. If you've built complicated serverless apps, you often find that you end up getting functions and services that become entangled with each other by accident. And by putting EventBridge in the middle, you can totally decouple the producers and consumers in a way where your application is so much simpler.
Now I think what's really interesting is in the last month or two, some of the features that have come out have really evolved the product in a very dramatic sort of way, so the schema registry and discovery features that you mentioned are, to me, just fantastic because I know from building event-driven applications before, one of the hardest problems is just keeping track of events, knowing what they look like, how they're shaped, and when services change versions, the events change.
Having a registry that's built for you just makes life that much easier. Then the discovery feature, where you essentially just pipe your events to this discovery service and it builds out the schemas, is just amazing because it does all the work for you. You get 5 million events per month for free, and that should cover most use cases. And then once you've got a schema in place you can pull it into your IDE and then build applications directly off of that and use events as classes in your applications [inaudible 00:12:23] types. Those two features alone are just amazing.
And then recently we've introduced content filtering. So what that means is: before, we just had rules, and rules were kind of this blunt object; things either matched or they didn't match. With content filtering, you can now put much more dynamic rule sets in place, in terms of ranges of values and things that make it much more queryable. So the net effect is that you can push that logic back out of your code into a service, so we're back in the business of less code and more services in your applications. So all of these features have been coming out pretty fast.
You know EventBridge has a huge road map of things ahead of this, so I just keep watching in amazement. I'm super excited about it.
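The content filtering described here shows up in the shape of an event pattern. Here is a minimal sketch in Python: the event source, field names, and values are invented for illustration, and the matcher below is a toy approximation of just two operators, not the real service's matching semantics.

```python
# A content-filtering event pattern of the kind you would pass to EventBridge's
# put_rule. Before content filtering, patterns could only match exact values;
# operators like "numeric" and "prefix" make rules far more expressive.
pattern = {
    "source": ["com.example.orders"],           # hypothetical event source
    "detail": {
        "amount": [{"numeric": [">", 100]}],    # numeric range operator
        "region": [{"prefix": "eu-"}],          # prefix operator
    },
}

def matches(pattern, event):
    """Toy local approximation of matching: exact values, numeric >, and prefix."""
    for key, conditions in pattern.items():
        value = event.get(key)
        if isinstance(conditions, dict):        # nested block such as "detail"
            if not isinstance(value, dict) or not matches(conditions, value):
                return False
            continue
        ok = False
        for cond in conditions:
            if isinstance(cond, dict) and "numeric" in cond:
                op, bound = cond["numeric"]
                ok = ok or (op == ">" and isinstance(value, (int, float)) and value > bound)
            elif isinstance(cond, dict) and "prefix" in cond:
                ok = ok or (isinstance(value, str) and value.startswith(cond["prefix"]))
            else:
                ok = ok or value == cond        # plain exact match
        if not ok:
            return False
    return True

event = {"source": "com.example.orders",
         "detail": {"amount": 250, "region": "eu-west-1"}}
```

With a rule like this in place, the range check lives in the service rather than in every consumer function, which is the "less code" effect James describes.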
Jeremy: Yeah, so one of the things I really like about EventBridge, and maybe this is even too geeky for this podcast, but if we look at architectural patterns, right, and I'm a huge fan of architectural patterns: we've got our monolithic applications, then we went to service-oriented architecture, then we went to microservices, and now people call things nanoservices, which I'm not a fan of as a term. So think about the way that microservices work, because that's how I like to think about serverless applications: building small services with clear boundaries, with their own database to back them.
It might be four or five functions or a hundred functions that are part of one service, but essentially you are encapsulating all of that service logic in one CloudFormation template. Or, you know, whatever, but basically you're breaking it down that way. With service-oriented architecture, that's when we introduced the message buses and things like that and being able to pass messages. But we were still sharing databases, so it's probably not a perfect comparison.
But what I find really interesting about serverless applications, certainly when you're thinking about serverless applications as microservices and then introducing something like EventBridge, is that what you're now doing is sort of using this enterprise service bus, if you want to call it that, which handles this communication asynchronously. So everything is completely decoupled. You're adding the rules and the filtering into EventBridge, but all of the configuration for it is tied back to the individual services that are subscribing to EventBridge, so you now have this sort of new type of architecture.
And I don't even know what you call it, microservices maybe, but the way that we communicate asynchronously just feels so much different. And honestly it feels a lot better to me. I don't know. What are your thoughts on that?
James: I think a lot of what we're building makes distributed computing just easier for developers, and when you think about the scale that lots of developers now have to face with their applications, even things like mobile apps, these are complicated problems to solve when you get spiky workloads and just huge numbers of transactions coming through. So a lot of these tools just make it that much easier.
But the mental hurdle is going from this synchronous model to this asynchronous model. And so if you're used to building synchronous APIs, initially it can seem a bit alien trying to figure out the different patterns that are involved. But it seems like the natural evolution, given the fact that you've got all these services in the middle that have to handle all of this traffic, and the timing issues involved start to evolve from where you are in the synchronous space. I think what's been put in place is not too difficult to understand.
Once developers start using this, they find that actually, for many cases, it's the right way to go. But it's interesting to watch, because I know that even 12 months ago people were talking about API Gateway's 29-second, 30-second limit problem and working around it throughout their infrastructure. Or you heard about the Lambda limits of five minutes, then fifteen minutes, because people were trying to work this way.
I think now we're going back to thinking about how do we break up these tasks. So it's shorter-lived tasks that run between services in an asynchronous fashion. So the whole model is really evolving.
Jeremy: Yeah, and actually that is a good segue into talking about failing in the cloud, right? So I'm doing a couple of talks at some ServerlessDays events this year, and the title of my talk is How to Fail with Serverless, and basically I should have a subheadline like, How to Fail with Serverless so that Your Serverless Applications Don't Fail, or something like that.
But basically what I'm talking about is, when you start doing things asynchronously, I just generate a job, and now my Lambda function, or whatever my service is, my client that's generating it, says, okay, here's a job or here's a request. And then it says, okay, I got the request, and then it disconnects. So now this is somewhere out there in the ether. You have some thing and it's routed through EventBridge or it's in an SQS queue or something like that.
So you just have this thing out there, and at some point, you hope, it will trigger something else to process it and do something with it. And those guarantees are very, very strong, so you don't have to worry that it won't be processed. What you do have to worry about is what happens if, when I go to process it, something goes awry: the Lambda function fails to process it, or there's some conversion issue and it can't insert it into the database, because that was what it wanted to do.
And what I find really interesting about what AWS has done with the way they've architected this stuff is to say, listen, we know things are going to fail. As Werner says, "Everything fails all the time." I think I've said that about 12,000 times on this podcast, but I totally agree with it because I know. I watch things fail all the time. So when something fails, there are provisions in place so that the cloud will handle those failures for you. There are ways to configure things to be handled for you.
So you have things like DLQs, which used to be the primary way that you would handle an asynchronous event that called a Lambda function: if that Lambda function failed, it would put the event in a DLQ. They just introduced Lambda Destinations, so now, rather than using a DLQ and just getting the event itself, you get all kinds of context along with it: why it failed, the stack trace, things like that, which are super helpful.
You also have a success path, so if the Lambda function succeeds, I don't have to go and put some code in my Lambda function and say, oh, now do this with it. It will just automatically do that as part of the configuration. You have failure handling built into SQS. You added DLQs for SNS now. There are all these different ways that we should let the cloud fail for us, and not be capturing or swallowing these events with try-catches.
So I have a whole bunch of stuff that I'm working on to try to come out with that. But your thoughts on that: what is AWS's, if you can answer this, what's their philosophy on this? Because this is an essential part of distributed computing.
James: Werner's quote is the philosophy, that everything fails all the time, and so the question is how do you make your application resilient to survive those failures? Most of these new features are really just extensions of ideas we've had, they're in the infrastructure already in one way or another, but you know if you look at DLQs, they've been around for quite a while and Destinations is an extension of that.
Now it's not that we're telling everybody to go and replace DLQs with Destinations. It just becomes another way that you can handle a failure if you choose to, and there are so many different ways of figuring out where failures occur in your application. It depends on the sort of scale that you're working with as well. But I still meet lots of developers who don't take advantage of these features in the way that they should.
Although our infrastructure is very reliable, it's not 100% reliable. There's always a possibility that you have a service disruption in different services, so these features give you the ability to improve that reliability even further if you use them appropriately. But we're starting to see now with some of these new features with Destinations that it makes it easier to understand as a concept. So I think now developers are getting more comfortable with how you can build this into their serverless applications.
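The DLQ-versus-Destinations choice discussed above is ultimately a per-function configuration. Here is a minimal sketch: the function name and ARNs are hypothetical, and the commented-out line shows where the real boto3 call (`put_function_event_invoke_config`) would go.

```python
# Build the async-invoke configuration for a function: how many built-in
# retries to attempt, and where to send the full invocation record (the event
# plus context such as the error and stack trace) on success or failure.
def destination_config(function_name, on_success_arn, on_failure_arn):
    return {
        "FunctionName": function_name,
        "MaximumRetryAttempts": 2,  # Lambda's built-in retries for async invokes
        "DestinationConfig": {
            "OnSuccess": {"Destination": on_success_arn},  # e.g. an EventBridge bus
            "OnFailure": {"Destination": on_failure_arn},  # e.g. an SQS queue
        },
    }

params = destination_config(
    "process-order",  # hypothetical function
    "arn:aws:events:us-east-1:123456789012:event-bus/default",
    "arn:aws:sqs:us-east-1:123456789012:failed-orders",
)
# boto3.client("lambda").put_function_event_invoke_config(**params)
```

Unlike a plain DLQ, the failure record delivered this way carries the extra context mentioned in the conversation, not just the original event.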
Jeremy: One of the things that I really like about Lambda Destinations is that the success path allows you to, again, just write code. So traditionally, say I do some processing, I do some transformation, and I want to send that result into SQS to do something else: I would have to include the AWS SDK, and I would have to make a call to the SQS service in order to post that event, or to post that message.
Now if that fails, I could retry it in my code, or I could do some of these things, but there's a lot of logic I was building in there. So now I don't need to do that, right? I can send stuff to EventBridge, which again opens up a whole new possibility of what I can do with it. So I'm not just limited to SQS, SNS, a Lambda function, or EventBridge; I basically have every service that EventBridge integrates with that I can utilize.
But my question is, I think that some of those reliability and retry mechanisms were things people were trying to, and this is probably not the right way to say it, jury-rig in a sense by using a Step Function. Because Step Functions are great, and you should totally use Step Functions if you have complex workflows. But even for some of those simpler workflows, it was just easier to say, hey, this is supposed to do X and then send the data to SQS; if that fails, I want to retry it. I can encapsulate that in a Step Function workflow, and that would kind of handle the retry for me.
But you don't need to do that anymore because of some of these new features that are added. But obviously Step Functions don't go away, but what are your thoughts on, you know, this obviously makes it easier, right?
James: I do get asked quite a lot by people, should I use Destinations instead of Step Functions? The answer is usually no, but it also depends, because if you've got a very, very simple process, and it's really just a couple of steps, and you don't want to incur the cost of using Step Functions, perhaps this could be an alternative for you. But generally speaking, Step Functions provides a lot more functionality than that, in terms of both the length of the workflows and the complexity of things that you can build in. So in most cases you wouldn't want to go from Step Functions to this.
But it really is another option for developers to use when they're figuring out the right sequence of events for their Lambda functions. And the net effect of all of this is just less code because there's this boilerplate that you talk about, if you build it into one function, by the time you start building out these applications at 20, 30, 50 functions, you've got this duplicate code that you've got appearing everywhere. So if you can take that all out of those functions, it's a huge win in terms of shrinking your code base.
Jeremy: Absolutely. That is certainly something that I'm pushing in 2020: do some research, figure out how to fail correctly, because there are so many features built in. There are retries, there's throttling, there's all kinds of things that are automatically handled for you.
I think one thing a lot of people don't understand, and I've probably mentioned this before, but if you call a function asynchronously, or you invoke a function asynchronously, and that function gets throttled, there's actually a built-in queue that will handle that throttling for you. So when you talk about using SQS queues and function concurrency so that you can do some throttling, so you can reduce back pressure, or pressure on downstream services...
Some of that is actually already built in, and if you don't need visibility into those queues, and there might just be temporary times where the concurrency spikes a little bit, some of that stuff is just automatically dealt with for you. So understanding some of that, and not arbitrarily putting another SQS queue in front of it when you most likely don't need to, I just think these are really interesting things.
Of course you have to know that, which is part of the difficulties of serverless, is sort of understanding how some of these pieces work.
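The built-in queueing Jeremy describes kicks in simply based on how you invoke the function. A minimal sketch with a hypothetical function name; the commented-out line is where the real boto3 call would go.

```python
import json

# InvocationType "Event" makes the invoke asynchronous: the Lambda service
# accepts the payload immediately (HTTP 202), queues it internally, and
# retries on throttling or errors, so a transient concurrency spike often
# doesn't require putting your own SQS queue in front of the function.
invoke_params = {
    "FunctionName": "process-order",  # hypothetical
    "InvocationType": "Event",        # async; "RequestResponse" would be synchronous
    "Payload": json.dumps({"orderId": "1234"}),
}
# boto3.client("lambda").invoke(**invoke_params)
```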
James: Yeah, so the team I'm on has grown from just Chris to seven people now, and what we're trying to do is surface some of these things in examples and things we've written. So my friend and colleague Ben Smith wrote a great piece on some of this: how you can figure out the DLQs and retry mechanisms through applications. And so we're hoping, as the months go on, we'll start to build out more of these examples for people to make it more obvious.
Jeremy: Awesome. Let's talk about something else, and hopefully you won't get in trouble for answering this question, but why should you never use Provisioned Concurrency?
James: So Provisioned Concurrency, as a feature, is pretty interesting engineering. Behind the scenes, how it's been built is pretty extraordinary. In the general serverless space, we like the idea of on-demand Lambda functions, and mostly we focus our time on how we can improve the performance of those all the time. But there's definitely a subset of cases where you have this requirement for close-to-zero latency. And so you find that there are some of these cases where there's an enormous burst of traffic at a given time of day.
And the scale is so enormous that someone needs 15,000 or 20,000 functions to run immediately. So really this is a great solution for that, and since releasing it we've already seen so many people use it for exactly that. It solves that problem because it addresses both the cold-start cost of setting up the execution environment and the cold-start cost in your static initializer. So it's a really neat solution for that.
Now what it isn't designed for is the everyday Lambda use case that people generally have. If you're using asynchronous flows, it's not something that's going to provide you any value, and in many use cases it's not something that you'd necessarily want to add to your applications. It's an interesting feature because where it's necessary, it's absolutely necessary, and it works really well. But you need to evaluate first whether it's right for your use case.
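For the burst scenario James describes, Provisioned Concurrency is configured against a specific function version or alias. A minimal sketch with hypothetical names and numbers; the commented-out line shows where the real boto3 call (`put_provisioned_concurrency_config`) would go.

```python
# Keep a fleet of execution environments initialized ahead of a known traffic
# burst, removing both the environment setup and static-initializer cold starts.
pc_params = {
    "FunctionName": "checkout-api",           # hypothetical
    "Qualifier": "live",                      # must target a version or alias, not $LATEST
    "ProvisionedConcurrentExecutions": 3000,  # environments kept warm
}
# boto3.client("lambda").put_provisioned_concurrency_config(**pc_params)
```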
Jeremy: I totally agree, and I'm joking obviously about never using it, but I think it's one of those things where, with cold starts, right from the beginning everyone was like, cold starts, cold starts, and I noticed them only minimally when I was doing a lot of user-facing stuff. And all of a sudden you get that, like through API Gateway you get a 10-second cold start, but then you take things out of VPCs, and of course that problem has gone away too, and you start optimizing your code and tweaking some knobs here, and suddenly that cold start is two seconds or whatever.
I get a two-second delay sometimes when I go to load another website, so you know, those aren't cold starts, those are coming just from network latency and some of these other things that I think most people are pretty used to at this point. I mean, I watch my kids, if something's not loading, they just hit the refresh button about 7,000 times. So clearly that's not going to be the limiting factor.
If you were getting them all the time, it would be a huge problem, but I've just found that for most applications, the cold-start piece is not that big of a deal. Certainly if you needed to do the pre-warming and some of these other things to have the amount of concurrency available to you, that's a different thing, but purely to solve the cold-start problem, I guess I'm not 100% onboard with it being a necessary solution.
James: I think with cold starts, it's a complicated problem, because I would say 80% of the time when I meet people who are new to serverless and they hit the cold-start problem, it turns out they just haven't allocated enough memory to the Lambda function. And so you meet people and they show you how something's taking 6 to 8 seconds, and I'll have a look at their function, and you just change the memory and the problem goes away. But there are also lots of other reasons in terms of the code that's been implemented. So I think as people come onto the serverless way of doing things, they start to learn that time matters and the resources you use matter. They start to improve the code, the performance improves overall, and a lot of these issues start to go away by themselves.
Jeremy: Yeah. Oh, and also the thing that's great about serverless is that you're not really touching, or in control of, a lot of that underlying compute. A lot of those optimizations, you know, Chris said this, they just make them and implement them, and you just start seeing the benefits immediately.
But speaking of speed: one of the complaints for quite some time has been the latency and the complexity that API Gateway adds. The new HTTP APIs are out now, so why are these so much better?
James: This is a feature I really like as well, because API Gateway as a whole is a service with a lot of extra features that many people are often unaware of. It provides all these extra features in terms of managing stages and API keys, DDoS protection, Cognito integration, all these other things. If you're using them all, it's fantastic, because you basically pay a fixed price and you get all this extra feature set applied for you.
But in many cases, customers have told us they don't want all those features. They want a more vanilla API in front of their services, and they want something that costs less. So this is really for that set of use cases, where you want something that's much more straightforward and slimmed down. And the nice thing about it is that typically we're seeing latency levels that are lower and much more consistent, because it's a smaller service. And people have taken to this because it's obviously over two thirds cheaper: it's a dollar per million requests, and that drops to 90 cents at a certain volume.
So again, it's something where, if you're using all the features of API Gateway, you probably don't care, but if you only need this smaller feature set, this is great because you can use this and save quite a lot of money on your AWS bill along the way.
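The pricing claim checks out if you run the numbers. A rough sketch using the HTTP API prices quoted here plus API Gateway REST's list price at the time (about $3.50 per million requests in us-east-1); the exact tier boundary is an assumption, so treat this as illustrative and check current pricing.

```python
def http_api_cost(millions):
    """HTTP API: $1.00 per million requests, dropping to $0.90 above a
    volume tier (assumed here to be 300 million requests per month)."""
    tier1 = min(millions, 300)
    return tier1 * 1.00 + max(millions - 300, 0) * 0.90

def rest_api_cost(millions):
    """REST API list price at the time, flat for the volumes considered here."""
    return millions * 3.50

monthly = 100  # million requests per month
savings = 1 - http_api_cost(monthly) / rest_api_cost(monthly)
# savings works out to roughly 71%, i.e. "over two thirds cheaper"
```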
Jeremy: So what is the use case for HTTP APIs versus API Gateway? Because API Gateway has service integrations and, like you said, quota management and some of these other things. So what can I do with HTTP APIs, and what do they cover that API Gateway already did?
James: It's designed for the basic API management feature set you would expect: requests that pass through to Lambda functions and interact with your application, where you don't need anything beyond that. And you actually see a lot of applications people have written, especially internal applications or things that are just small scale, where this is absolutely fine.
But I still think there's this split where some people use API Gateway, the original version, with all of those features in place, while many use it without knowing those features are there. So the conversation I frequently have with people is that they end up building that functionality inside their Lambda functions, and then you show them something like VTL and they realize they could have done it all in API Gateway. So I think it's a good opportunity to look at what feature set each service has and see what fits your application.
You tend to find that one or the other is a very strong fit rather than it being a toss-up between the two.
Jeremy: Yeah. I've been using WebSockets quite a bit, which obviously is an API Gateway feature and not an HTTP API feature, but certainly there are a lot of use cases for the more straightforward approach: I just need low latency and that routing, and of course the Lambda proxy integration is sort of how that all works. So it's a very, very cool service.
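For readers who haven't worked with the Lambda proxy integration on HTTP APIs: these APIs send Lambda a simpler "payload format 2.0" event, with the method nested under `requestContext.http`. A minimal handler sketch (the `/hello` route and response bodies are illustrative, not from the episode):

```python
import json

def handler(event, context):
    # HTTP APIs (payload format 2.0) expose the method under requestContext.http
    method = event["requestContext"]["http"]["method"]
    path = event["rawPath"]

    if method == "GET" and path == "/hello":
        return {"statusCode": 200, "body": json.dumps({"message": "hello"})}
    return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
```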
All right, so another thing I wanted to ask you about: RDS Proxy came out, so the question is, should we just forget about DynamoDB and use RDS again?
James: People often have this reaction where you use one or the other. I'm a huge fan of DynamoDB. I think it's an absolutely essential database for serverless development: the incredibly low latency, the massive scale it provides, and just the pure simplicity from a serverless point of view. I've used it for a while, so I've learned a lot of the ways you have to construct your application around it. So I'm still very much in the DynamoDB camp.
But at the same time, everybody's got a [inaudible 00:33:11] database somewhere, and you speak to enterprise customers who have all their data in RDS, so they have to use RDS as a data source for their application. So I think for customers in that position, it makes a lot of sense to use the proxy, because it takes a lot of the headache out of managing this connection problem where, when your Lambda function scales up, you can drain the resources of your database. And we've made some other improvements there too, in security and failover speed and other things.
But I think at a conceptual level, this DynamoDB-or-RDS question is one that you and I will be talking about for a while, because I like some of the alternative ideas where you use both. You know, why not have DynamoDB as the operational database for your app and then use streams to push the data to RDS for analytics? So I think there are lots of interesting other ways of doing things. But certainly for people who just need to use RDS and not worry about it, the proxy's a great answer.
Jeremy: I think the other thing about RDS Proxy, and I really love the idea of it because I do agree with you: there are people who have analytics workloads where you need to use something like RDS, MySQL or [inaudible 00:34:27] or whatever. What I don't like about some of these things, and this is just me being opinionated, is that the very fact that it was kind of tough to use RDS with serverless forced you into using something that was a little bit more cloud scale.
And if you were using RDS with a Lambda function and you had a low workload, say you're doing some ETL task or administrative APIs, something with low interaction and low concurrency, it was never really a problem. Those zombie connections eventually cleaned themselves up. I built the serverless-mysql package that worked really well for those sorts of use cases, even if you got to a point where you were close to your connection limit.
But now what I'm sort of afraid of is that people will say, oh, I can just use my relational database with Lambda functions now, and that discipline kind of goes away. But I do agree with you that this hybrid approach is probably the best way to go about it. And I have this in a ton of applications now: I have DynamoDB as the operational database. You can pound against that thing. It will handle as much traffic as you want to throw at it. It will handle the write speed, the write [inaudible 00:35:53] you put on it. It's amazing.
And then you just have a DynamoDB stream set up, and you take that data and push it into RDS. What's interesting is that lately I've been using the Data API, which, again, I think people forget exists. What's great about the Data API is that it doesn't require your Lambda function to be in a VPC. So you can have your RDS database running in a VPC (obviously it has to be Aurora Serverless to use the Data API), and then you just take that data off the stream and use the Data API from a function that is not in the VPC, so you don't have to worry about configuring any of that, and you push the data over there. So I really like that combination of things. Now granted, I see your point. There are many people who are on RDS and need to use relational databases.
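The stream-to-RDS pattern Jeremy describes hinges on reshaping DynamoDB's typed stream records into the parameter format the Data API's ExecuteStatement call expects. A sketch of that transformation, handling only strings and numbers for brevity (the attribute names in the example are hypothetical):

```python
def stream_record_to_sql_params(record):
    """Flatten a DynamoDB stream NEW_IMAGE into Data API-style parameters."""
    image = record["dynamodb"]["NewImage"]
    params = []
    for name, typed_value in image.items():
        # Each attribute is a single-key dict like {"S": "..."} or {"N": "42"}
        (dynamo_type, value), = typed_value.items()
        if dynamo_type == "N":
            params.append({"name": name, "value": {"doubleValue": float(value)}})
        else:
            params.append({"name": name, "value": {"stringValue": str(value)}})
    return params

# The result would then be passed to the rds-data client, for example:
# rds_data.execute_statement(..., sql="INSERT INTO orders (pk, total) "
#                            "VALUES (:pk, :total)", parameters=params)
```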
I still think that even though RDS Proxy is going to handle the connection issue, you're still going to run into scale problems at some point.
James: There's a couple of things that I was thinking about recently. One is it comes down to choosing the right database for the right reason.
Jeremy: Also true.
James: It's been so easy to spin up a MySQL database for so long that it's become almost a habit, and you lean on the database's capabilities because it handles so much for you in terms of multi-threading and developing large applications. Now that we have all these other tools available, I think you have to reevaluate: are you using RDS for the right reason? And it opens up this broader question of where your data should live in a serverless application, because now we have all of these solutions: the S3-based ones, RDS, MySQL, and everything in between.
And so as developers we actually have a more complicated choice now about where the data goes and how I should manage it. But I think overall if you make the right choices, it gives you more resilient applications.
Jeremy: Yeah, and I think you make a good point about the source of truth: where do you want that source of truth to be? Obviously with something like MySQL, you can export that data and move it to other places fairly easily. It's not quite as easy to export data out of DynamoDB. I mean, you can run scan operations and do it that way, but I really like the idea of having that data in DynamoDB as that source of truth.
But I agree with you on that. Choosing a purposeful, or purpose-built, database is certainly a smart move. But anyway, we could probably talk about DynamoDB versus RDS for quite some time. Obviously I'm also solidly in the DynamoDB camp, but I do greatly appreciate and have always loved the ability to write queries in SQL, as it's quite easy.
Although it is quite easy to write them in Athena, so if you're pushing data into S3, or now with the new DynamoDB connector for Athena, you can actually query DynamoDB directly using Athena, which uses regular SQL syntax as well.
James: I think one of the greatest things I did as a developer before joining AWS was taking the time to learn how DynamoDB worked. A couple of times I almost gave up because the model is so completely different. But in recent months you've seen all these new tools coming to DynamoDB, like the NoSQL Workbench, and there's a lot of material coming through that shows how to use things. So I think it's easier to pick up now.
There's a Rick Houlihan video that's become legendary at this point for training on DynamoDB, but once it clicks and you realize how it works, and you see it work at scale through applications, it really is just an amazing service to have in your toolbox.
Jeremy: Yes. And speaking of toolboxes, I do have the DynamoDB Toolbox that I'm working on. Right now it's mostly focused on writing data to DynamoDB, and that is actually one of the complex things you have to deal with: there's a different query syntax for pulling data out and for doing these complex updates. A put item is simple, but when you do an update item there's a bunch of syntax to deal with, and that's what my project helps with.
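To make the UpdateItem complexity concrete, here is a sketch of the kind of helper such a library provides: turning a plain dict into an UpdateExpression with placeholder names and values. This is a simplified illustration of the general technique, not the DynamoDB Toolbox's actual API:

```python
def build_update(item):
    """Build a SET UpdateExpression from a dict of attribute updates."""
    sets, names, values = [], {}, {}
    for i, (attr, val) in enumerate(item.items()):
        name_ph, value_ph = f"#a{i}", f":v{i}"  # placeholders sidestep reserved words
        sets.append(f"{name_ph} = {value_ph}")
        names[name_ph] = attr
        values[value_ph] = val
    return {
        "UpdateExpression": "SET " + ", ".join(sets),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
```

The three returned fields map directly onto the arguments DynamoDB's UpdateItem call expects.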
All right, so there have been some changes with some of the runtimes. We went to Node 10, then we went to Node 12. Those all seem to be pretty stable; everything sort of worked out there. I've been pretty happy with the performance of Node 10, and I've started using 12, and that's great. But there are some things changing with some of the SDKs, and there's something changing with the Python SDK that's sort of important, right?
James: Yeah, so what's happening is that we've changed the way botocore works so that the requests module is no longer part of it. We've unvendored it, which enables some additional flexibility in the way botocore operates. Now, from the point of view of using Python in Lambda, what this means is that if you're already bundling your own version of the SDK into your function, which is the best practice (and keep doing that, please), you don't need to do anything. That works just fine.
If you're not doing that, if you're relying on the version included in the execution environment, then when that changes, you will have to make some changes too. So what we've done is publish some layers that give you the option of continuing to use the requests module within your function.
I've just written a blog post about this that went out on the AWS Compute Blog with step-by-step instructions, but we just wanted to make sure everybody using Python in Lambda who relies on the requests module is aware of these upcoming changes.
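One more alternative worth noting alongside bundling or using the layer: if a function only makes simple HTTP calls, Python's standard library can replace requests with nothing extra to package at all. A sketch using urllib (the URL and parameters are illustrative):

```python
import json
import urllib.parse
import urllib.request

def build_get(url, params):
    """Construct a GET request with an encoded query string (no network I/O)."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(f"{url}?{query}",
                                  headers={"Accept": "application/json"})

def get_json(url, params):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(build_get(url, params)) as resp:
        return json.load(resp)
```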
Jeremy: Awesome. All right, so, last thing, then. 2020 serverless, what are your general thoughts? Is this going to be the year?
James: Yeah, it's really snowballing in terms of popularity, and we're certainly seeing a sheer number of people from all these different companies, startups and enterprises and so many different types of industry, all starting to pick up serverless tools. And a lot of the things we talked about just a year ago, which really seems an incredibly long time ago now, are conversations that don't necessarily matter that much anymore.
There was a discussion about what serverless is, and all these sorts of things. Now we're starting to talk about architectural patterns, and about how it's not just Lambda anymore. Serverless is this concept of taking different services from different providers and combining them. So we see people building things where you connect API Gateway, DynamoDB, and S3, but also services like Stripe or [Orsero 00:42:41], and Lambda is just connecting things in the middle.
There's just a lot changing in the way people are building very sophisticated applications at scale, and I think it's finally gotten to that tipping point where it's becoming generally adopted.
Jeremy: Yeah, that's awesome. All right, so, James, thank you so much for being here. If people want to get ahold of you, how do they do that?
James: So I'm available on Twitter at @jbesw, and I'm on LinkedIn if you look up my name, James Beswick; people often send me questions there. I'm also available through email at jbeswick, that's B-E-S-W-I-C-K, @amazon.com, and I'm on Slack and everything in between. Essentially, anytime you send me a message, I'll do my best to get back to you as quickly as possible.
Jeremy: Awesome. All right well I will get all that into the show notes. Thanks again, James.
James: Great. Thanks so much, Jeremy. Take care.