Episode #91: Streaming Data at Scale Using Serverless with Anahit Pogosova (PART 1)

March 8, 2021 • 44 minutes

In this two-part episode, Jeremy chats with Anahit Pogosova about where and why we'd use Kinesis, how Lambda helps you supercharge it, how to embrace (and deal with) the failures, common serverless misconceptions, and much more.

About Anahit Pogosova

Anahit is an AWS Community Builder and a Lead Cloud Software Engineer at Solita, one of Finland’s largest digital transformation services companies. She has been working on full-stack and data solutions for more than a decade. Since getting into the world of serverless she has been generously sharing her expertise with the community through public speaking and blogging.

Twitter: @anahit_fi
LinkedIn: https://www.linkedin.com/in/anahit-pogosova/
Solita: https://www.solita.fi/en/
"Mastering AWS Kinesis Data Streams, part 1": https://dev.solita.fi/2020/05/28/kinesis-streams-part-1.html
"Mastering AWS Kinesis Data Streams, part 2": https://dev.solita.fi/2020/12/21/kinesis-streams-part-2.html
AWS Community Day Nordics 2020: https://youtu.be/gtE2o8qsq-4

Watch this episode on YouTube: https://youtu.be/U4snzWHMrtU

Thanks to our episode sponsor, Epsagon.

Transcript

Jeremy: Hi, everyone. I'm Jeremy Daly and this is Serverless Chats. Today I'm chatting with Anahit Pogosova. Hi, Anahit, thanks for joining me.

Anahit: Hi, Jeremy. Thanks so much for having me.

Jeremy: So you are an AWS community builder and also a lead cloud software engineer at Solita. So I would love it if you could tell the listeners a little bit about your background, and what it is you do at Solita.

Anahit: Right. So yes, so I have been working at Solita for a pretty long time. So it's a digital transformation company. It originated in Finland over 25, 26 years ago, and out of those years, I have been on board for 11 years. Which sounds extraordinary nowadays, I suppose, because everybody gets surprised. But during those years, I've had several roles as a backend and full stack developer. And then I moved to the cloud, to AWS, and started doing all the cool stuff with serverless. And I have been also working as a data engineer for several years with one of our customers, so a lot of different stuff.

And we actually have offices in six countries in Europe, of course, they are empty at the moment. And I'm based here in Finland. And yeah, we focus on software development and cloud integration services, analytic services, some consultancy, and service design. So if you're interested, we are hiring. And yeah, that's about Solita and me.

Jeremy: Well, any company that can retain someone for 11 years, sounds like a good place to work at.

Anahit: Right? I think so too. No, apparently, it sounds suspicious to many people. Why exactly?

Jeremy: I don't know. That's a conversation for another podcast, I think, about the job-hopping thing. But anyways, well, I'm glad that you're here. And thank you very much for taking the time to talk to me. I'm super, super excited about this topic, actually, because I came across this blog post that you wrote. Now, this was actually the first version of this that you wrote was, or the first part of this, I think was maybe almost a year ago now or something like that.

Anahit: Yeah, something like that.

Jeremy: But then you had a second part of it that came out in maybe November. And this was two posts, they were called "Mastering AWS Kinesis Data Streams." And now the cool thing about Kinesis is, it's a super powerful service. I think we learned from a recent outage at AWS that Kinesis pretty much powers everything, every backend service at AWS is powered by Kinesis, which is pretty cool, but also scary at the same time. But it's a fascinating service. And I want to warn the listeners, because I want to get super technical with you. I want to get into some of these different details about how this service works, some of the limitations, some of the use cases for it and things like that.

And I would absolutely suggest that people read the two posts that you wrote, now they are very, very long, it took me a long time to get through them. But they are excellent, they're really well written. And it reads a lot easier than the documentation, and you give some good examples in there and some good reasoning behind it, which the documentation doesn't always do. So first of all, I want to start with why you wrote this post in the first place because there is a lot of documentation out there. But why did you write these two posts?

Anahit: Yeah, these two very long posts, as you said. So maybe to give some background, I've been working with Kinesis a bit over three years now with one of my customers, which is the Finnish national broadcasting company called YLE. I always bring this example, you can think of it as the BBC of Finland, a highly respected company with a lot of content and a lot of viewers as well. So our team is responsible for streaming the user interaction data to the cloud. And at the moment, we have something over 0.6 terabytes of data per day. At the time of writing the first blog post, it was half a terabyte, so it's growing constantly.

And yeah, so we did it with Kinesis. And when I started like three-plus years ago, I basically had no production experience with it, just the "Hello, World!" kind of a thing. And most of the things I learned, or most of the things that are in the blog post, I actually learned the hard way, so by making the mistakes, and by seeing the failures, and that kind of thing. And I actually wish that a blog post, or two blog posts, like that had existed back then when I started, because as you said, there's a lot of documentation on AWS, of course, but for example, in the case of Kinesis and Lambda, you have to read the Kinesis documentation, and then you have to read the Lambda documentation, then you have to marry them together. And it's a lot of reading and not necessarily too clear.

So, in short, I wrote it for myself three years ago, that kind of thing. And I hope it will help others not to make the same mistakes that I had to make myself. So maybe it will help somebody who has already started their Kinesis journey or is just thinking about it. And the thing is that while I was writing those blog posts, or before that, working with Kinesis, I learned so much when I started to dig under the hood of how the service actually works. So I have learned so much about how the AWS services work in general. So digging deep into, or understanding deeply, just one service, in my opinion, gives you a wider understanding of all the other services. So even if you're not that interested in using Kinesis, I would still recommend reading my blog post.

And I actually point out some of the common issues or things that are common for other services as well, and distributed services in general, things like idempotency, and timeouts, and error handling and that kind of stuff. And to tell the truth, I do use my own blog post as a reference manual pretty often myself, because I have a horrible memory, especially when it comes to exact numbers. So it's nice to have one place where I go to look for stuff. And yeah, so to help myself and to help others is the short answer to your question.

Jeremy: Well, no, I think that's great that, first of all, that you did that to help others, but the fact that you did it to help yourself, that is not an uncommon thing. I know, for me, most of the blog posts that I wrote were just ways for me to make sure that I wrote something down, and it would actually live out there that I would be able to go back and reference myself, just like you said. Because I figured out things my own way, and then it's really helpful for me to go back and see how I did it, as opposed to try to find a needle in a haystack somewhere else. So yeah, so awesome. So again, I think that's amazing. And I'll say it again, I read those blog posts, and I learned so much about Kinesis that I thought I already knew, but just seeing it in that different way was really, really helpful to me.

Anahit: Oh, great to hear that, especially from you, because I assume you do know quite a bit about Kinesis already.

Jeremy: I know a little bit. Yeah, no, I've used it quite a bit, but I mean, just in terms of like failure modes and some of these other things and the different caveats you run into, which is something that the documentation doesn't capture as well as it needs to. And that's one thing I find about AWS documentation, but documentation in general is, it's very easy to get to that, "Hello, World!" phase, like you mentioned, but then to get over that hump and bring it into production, I mean, that's a whole other beast.

Anahit: Yeah. And maybe the simplicity of the serverless, and the managed services nowadays is also quite deceiving in that sense, because nobody reads the documentation from start to finish anymore. You just go skim through it, and then it's like, "Okay, I will try out and see how this works." And then you try out.

Jeremy: And you can get going.

Anahit: Yeah, you get going. And I said, "Okay, this thing is working. I know how it works." Yeah, you do until something fails, because it will.

Jeremy: Exactly, exactly. All right, well, so let's start. Let's take a step back, because I know what Kinesis is, you know what Kinesis is, but I'm not sure everyone knows exactly what Kinesis is. So let's start there. Why don't you give a quick overview of what is Kinesis, and why would you use it?

Anahit: Yeah, so Kinesis is a massively scalable and fully managed service in AWS, which is meant for streaming data, huge amounts of data, really. And what they say is that it actually scales pretty much endlessly, not unlike Lambda functions. And it has a lot of service integrations, like other services can send events to Kinesis, for example, AWS IoT Core has that functionality, CloudWatch Events and Logs. Even some more exotic options, like Database Migration Service, also have some sort of integration with Kinesis. So it's pretty common to use them in that combination.

And then as you mentioned in the beginning, it's actually a pretty crucial service in AWS itself. And not everybody realizes that a lot of services use Kinesis under the hood, like CloudWatch Events themselves use it under the hood, the Logs, IoT services use it, and even Kinesis Firehose uses Kinesis as its underlying service. And as far as I know, they're one of the biggest customers for the Kinesis team, so it's cross-pollination in that sense. And yeah, that outage last November, it actually showed. I would say that many people didn't know about Kinesis before that outage, I suppose, or not too much, at least.

And yeah, so Cognito failed, CloudWatch failed. And then there was this chain of failures that they experienced for an entire day because Kinesis didn't work the way it was supposed to work. So it's a pretty important service, no matter whether you use it or not in your everyday life.

Jeremy: Right, right. Yeah. So in terms of what it actually does, you mentioned it's a data streaming service for high volumes of data. And AWS is famous for creating a bunch of services that do very similar things. We've got SQS, EventBridge exists now, SNS is a pub-sub type thing, which I guess you could think of Kinesis that way as well. So, I guess maybe why not use SQS or EventBridge or SNS? What specific reasons would you use Kinesis over those?

Anahit: Yeah, that's a really great question. And I think it's a question a lot of people struggle with, especially when they just start their AWS journey or messaging service journey or whatnot. Because there are so many services that look alike, and it's very difficult to distinguish which one of them you actually need to use and how to actually choose from them. I have a feeling that those services have been converging lately. I think they are becoming even closer together than they used to be. For example, with, like, the SQS and SNS FIFO support that they added recently. So those are more similar than they used to be, than back in the day when they added the SQS Lambda trigger that wasn't there. So it used to be the SQS, SNS, Lambda pattern, and now you can do it directly. So that went in that direction as well.

And now, especially when they added, I think it was before re:Invent this year, or at re:Invent, I don't remember anymore, they added to SQS that batch window support, exactly the same, actually, as Kinesis has. So in that sense, they are exactly the same now, so the same amount of time, or the same amount of records, that you can batch before sending them to a Lambda function, which is quite cool. But then they are getting even closer. And the question is, what would you actually choose?

And I think that the truth of the matter is that in many cases, you can go with many of those services. It wouldn't necessarily be a wrong choice. But there probably is going to be one particular service that is going to be better tuned for your particular use case. And in that case, you basically ... what it comes down to, there are like several questions that you need to ask, for example, the throughput requirements. So how much throughput are you going to handle? Is it like individual events every now and then? Or is it a stream of events and huge volumes of events? And then again, what's the size of the events? So for example, SQS or SNS can support payloads of up to, it's like 256 kilobytes or something. And with Kinesis it's one megabyte, so that kind of thing.

Then you should think about the data retention requirements, because some services can store data for a longer time and others can't. Ordering: how do you want to write data to the stream? Do you want to batch the records, do you want to write individual records, do you want to have direct integrations or custom code that writes to the stream, and how do you want to consume the records? So do you want to do pub-sub, as you said, or do you want to poll? What do you want to do? Or do you want to batch these once again, or do you want to be ... So there are several questions you can go through before deciding.

And actually, Kinesis in that sense stands separately from all the other services, because it's not even in the same part of the service list in the console. It's considered to be an analytics service, as opposed to an application integration service. So they made that distinction quite clear. And basically, with Kinesis, as I said, you have virtually limitless scaling possibilities, using the shards, so you can have more shards. And you can scale more, depending on how much data you need to accommodate. And one record can be as much as one megabyte. So it's a huge chunk of data that you can pretty much send to any other service for that matter.

Jeremy: And so, I want to talk about shards, but let me interrupt you for a second. The thing that is interesting about, like you mentioned with SQS, is SQS right now, with FIFO, the first in, first out, you can do ordered records. So that's one of the things that Kinesis has always done. I know that one of the big differences, I think, though, is that SQS can really only have one subscriber. Once you take the message off of that queue, it's gone. Whereas with Kinesis, you can actually have multiple subscribers. And as you said, with the data retention, you can go back in time, right? So I think that's another big thing. But it's funny, you mentioned the analytics versus application integration, because I know way back in the beginning, Kinesis was a really great choice for application integration and people were using it almost like EventBridge essentially, to do like eventing and stuff like that, or as a common thing, but of course you had to have multiple subscribers and it was sort of a pain.

Anahit: That's an interesting piece of information. I didn't even know about it actually. Because now the distinguish ... they are trying to make the difference, I think, bigger now between Kinesis and the other services. Like the analytics services, they stand separately from the AWS point of view. But of course, it doesn't mean you can't use it. And actually, you can use it pretty successfully as a messaging service.

Jeremy: Right, yep.

Anahit: And, yes, so I was, you actually mentioned yourself that the big difference with SQS and Kinesis is that you can have multiple consumers for the same stream. But then again, SNS has that as well, and I think EventBridge as well?

Jeremy: Right.

Anahit: But for Kinesis, you can't actually even do the filtering that you can do with SNS and EventBridge, so it's ...

Jeremy: That's also true.

Anahit: You have to send all the events or the same events to ...

Jeremy: Can't they build one service that just does everything for me?

Anahit: Right? That's what I'm thinking. And this message retention is actually pretty funny that you mentioned because they have announced, I think, once again, before re:Invent, this extended message retention. So before, it used to be that you could keep your messages in Kinesis from 24 hours up to seven days if you need to. And now you can have it up to one year, which makes a database out of it all of a sudden. And I think it will bring all sorts of new use cases with it, because if you can just put your data in Kinesis and then do whatever you want with it for an entire year, in many, many cases, you don't even need to deliver it to any destination after that, it's just fine like that. Of course, you have to pay extra for that, but that's a different conversation.
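As a rough illustration of the retention range mentioned here (24 hours up to one year), the following is a minimal sketch of extending a stream's retention. It assumes a boto3-style Kinesis client is passed in as `client`; the helper name `extend_retention` and the stream name in the usage note are made up for the example.

```python
MIN_RETENTION_HOURS = 24          # default retention mentioned above
MAX_RETENTION_HOURS = 365 * 24    # one year, i.e. 8760 hours


def extend_retention(client, stream_name: str, days: int) -> int:
    """Raise a stream's retention to `days` days and return the value in hours.

    `client` is assumed to expose increase_stream_retention_period, as a
    boto3 Kinesis client does. This call only increases retention.
    """
    hours = days * 24
    if not MIN_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
        raise ValueError(f"retention must be between 1 and 365 days, got {days}")
    client.increase_stream_retention_period(
        StreamName=stream_name,
        RetentionPeriodHours=hours,
    )
    return hours
```

With boto3 this could be called as `extend_retention(boto3.client("kinesis"), "my-stream", 30)`, where "my-stream" is a placeholder. Retention beyond the default is billed extra, as Anahit notes.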

But yeah, that's a pretty big difference from pretty much any other of the messaging services, because you can't do that with those. And even with ordering, though SQS and SNS also have ordering. But at least with SQS, the FIFO queues, they have lower throughput than the normal queues. So there is already this limit. And with Kinesis you don't have that, because ordering comes pretty much out of the box.

I think the main difference for me personally is how they work with Lambda functions, because I think Lambda has wonderful support, and it's improving every year. And this year, they again added new possibilities or functionality there. It has great support for handling Kinesis records or batches and errors, which is always an interesting aspect for me. So, that's a big difference. But of course, the big pink elephant in the room here is the cost. That's what everybody is concerned about. And I have heard so many times that Kinesis is too expensive. And I think it's still a bit more of an enterprise product rather than a smaller company, startup thing, I think mainly because it doesn't have a free tier, that's my opinion. Because you just start to pay immediately from the get-go, like SQS at least has those free messages per month. And we had this interesting conversation with Yan Cui a while ago, who was talking about the sweet spot between SQS and Kinesis, that there is ...

Jeremy: Yes, I remember that.

Anahit: ... actually a point after which Kinesis actually costs you less than SQS. If you have a big enough amount of incoming data or your data is large enough in its volume, then SQS will start to cost you much, much more than Kinesis, not to speak about how difficult it will be to manage, really, like the consumption and all those things, so. Yeah, but here are a few differences for you to consider, but I think each service has its strong suit. And as you said, we don't have one service that has all of the features that we would like them to have. So every one of them is suited better for a particular use case, I'd say so.

Jeremy: Right. So with Kinesis, another thing, again, that I think separates it very much so from your SQS and your EventBridge is that you do have to set up the shards. So you have to actually provision something in order to send data to it, so it's not like just an endpoint where you send data and it'll accept as much as you want. So explain shards and then partitions because this is something we could go super deep on, but FIFO queues in SQS, for example, have a group ID or a message group or whatever that allows you to do sharding there as well, but without provisioning it. But let's keep the conversation focused on Kinesis here. So shards and partition keys, what are those all about?

Anahit: Yeah, so as you said, unlike SQS, Kinesis does need to have provisioning. And you can think of a shard as some sort of ordered queue within the stream. So your Kinesis stream is basically composed of a set of these queues, and each queue comes with its own throughput limitations. So you can send 1,000 records or one megabyte of data per second to each shard, and then on the way out, you can get like two megabytes per second. So if you have more data, you basically need to add more shards to your stream and that's the way your stream is going to scale. So of course, each shard is going to cost you, so that's why you have to consider how many shards you are actually adding to your stream.
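The per-shard limits quoted here (1,000 records or 1 MB per second in, 2 MB per second out) translate into a simple sizing calculation. The sketch below is only an illustration of that arithmetic; the function name and the traffic figures in the usage note are assumptions, not something from the conversation.

```python
import math

# Per-shard limits as quoted in the conversation.
MAX_RECORDS_IN_PER_SEC = 1000
MAX_MB_IN_PER_SEC = 1.0
MAX_MB_OUT_PER_SEC = 2.0


def required_shards(records_per_sec: float,
                    mb_in_per_sec: float,
                    mb_out_per_sec: float) -> int:
    """Return the smallest shard count that satisfies all three limits."""
    by_records = math.ceil(records_per_sec / MAX_RECORDS_IN_PER_SEC)
    by_ingress = math.ceil(mb_in_per_sec / MAX_MB_IN_PER_SEC)
    by_egress = math.ceil(mb_out_per_sec / MAX_MB_OUT_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_records, by_ingress, by_egress)
```

For example, `required_shards(4995, 2.0, 2.0)` is driven by the record rate rather than the byte rate, which is why small, frequent records can need more shards than their total size suggests.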

And the way your data is spread across the shards in the stream is by using the partition key that you mentioned. So it's basically just a string that you add to every single data payload that you send to your stream. You just add a separate string called partition key. And what Kinesis does is it calculates a hash function of that string, and based on that hash value, it decides which shard the record belongs to. So each shard is assigned a range of hash values which don't overlap. So basically, when you send your record to a stream, it ends up in exactly one shard in that stream. So that's the mechanism, it's a pretty simple mechanism, but it's pretty powerful as well. And yeah, and the records, as I said, they are ordered inside each of the shards. So you have this ordering out of the box on the shard level.
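To make the partition key mechanism concrete, here is a small local simulation. Kinesis documents that it maps partition keys to 128-bit integers with MD5; the evenly sized hash ranges and the helper names below are assumptions for illustration, since a real stream's ranges come from its shard list.

```python
import hashlib
from typing import List, Tuple

HASH_SPACE = 2 ** 128  # MD5 produces 128-bit values


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer via MD5 (big-endian)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, byteorder="big")


def even_hash_ranges(shard_count: int) -> List[Tuple[int, int]]:
    """Split the hash space into contiguous, non-overlapping inclusive ranges."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key: str, ranges: List[Tuple[int, int]]) -> int:
    """Return the index of the one range that contains the key's hash."""
    value = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= value <= end:
            return index
    raise ValueError("hash ranges do not cover the key space")
```

Because the mapping is deterministic, every record with the same partition key lands in the same shard, which is what gives per-key ordering; a constant key would send all traffic to a single shard.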

Jeremy: Right. And then the sharding itself, so if you have five shards set up, the algorithm will split that into ... then, of course, the partition keys have to be different enough, right?

Anahit: Yes.

Jeremy: So that it can actually split them, but you can't send just like one or whatever, just send like a single-digit or something ...

Anahit: No, you can, but you probably shouldn't.

Jeremy: Right. And actually, you could probably control which shard it goes into by doing that. But then if you want to expand, so let's say that you're writing 4995 records per second across five different shards, and you say, "Okay, now I need to add a sixth shard, or a seventh shard," or whatever and keep adding shards, how does that rebalancing work?

Anahit: Yeah, so you can do it by doing so-called resharding, so you can add new shards. And there's actually two ways to add shards. You can split the existing shards as far as I remember, and then you can add a separate shard. So when you split a shard, the partition keys are split between those two shards. It's more or less equally, because the idea is that as you said, you have to have, or it's better to have a random partition key, because in that case, your records will be distributed equally or uniformly across all the shards instead of sending all of them to the first shard and overwhelming the first shard and then the rest will be just idle, and not using the capacity it could have been using. So a random enough distribution of partition keys is very important.

But then if for some reason, for example, you have a shard which is overwhelmed, so it has more records coming in than the others, then you can split that particular shard and make it into two, and then you just have to take care of spreading the records between those two based on the partition key.
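Splitting a hot shard comes down to choosing where in its hash range the second child should start. The sketch below picks the midpoint and calls SplitShard through a boto3-style client passed in as `client`; the helper names are hypothetical, and `shard` is assumed to be an entry shaped like the ones ListShards returns.

```python
def midpoint_hash_key(starting_hash_key: str, ending_hash_key: str) -> str:
    """Return the middle of an inclusive hash range as a decimal string."""
    start, end = int(starting_hash_key), int(ending_hash_key)
    if end <= start:
        raise ValueError("range is too small to split")
    return str((start + end + 1) // 2)


def split_shard_evenly(client, stream_name: str, shard: dict) -> str:
    """Split one shard in half and return the new starting hash key.

    `shard` needs "ShardId" and "HashKeyRange" with "StartingHashKey" and
    "EndingHashKey" (decimal strings).
    """
    hash_range = shard["HashKeyRange"]
    new_start = midpoint_hash_key(
        hash_range["StartingHashKey"], hash_range["EndingHashKey"]
    )
    client.split_shard(
        StreamName=stream_name,
        ShardToSplit=shard["ShardId"],
        NewStartingHashKey=new_start,
    )
    return new_start
```

A midpoint split only balances the load if the keys hashing into that shard are spread evenly; if one very popular key causes the heat, both halves cannot share it, which is why a random enough partition key matters.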

Jeremy: Right, right. And the fact that you have to do that manually, that you have to say, "Okay, this is a hot shard, or my numbers are going up, I have to add something separately." The big question is, is this really serverless?

Anahit: Yeah, that's a big question indeed. And in my blog post, I actually argued that it's not entirely. So ...

Jeremy: I have this ongoing thing with Chris Munns at AWS where I think it's serverless, and he thinks it's not but ...

Anahit: Okay, so I'm more on ...

Jeremy: So kind of serverless?

Anahit: ... his side. Yeah, but getting there. In my blog post, I actually compare it to DynamoDB in a sense, in the early days, because DynamoDB also started out without auto scaling, without on-demand capacity. So you provision capacity, not unlike shards, and then you pay for what you provision, even if you don't use it at all. So it's pretty much the same. And then you have to use API calls to add some capacity and remove capacity, and everybody was not too happy about it. But still, it was assumed to be a serverless service, right? DynamoDB ...

Jeremy: Right.

Anahit: ... always from the get-go. So in that sense, Yeah, kind of, but here as well, we have the same fully managed service. But again, we need to take care of the throughput ourselves. So there is no mechanism that would take into account the incoming and outgoing records and decide, "Okay, now I scale up." You can build it. And there is actually a blog post about Kinesis auto scaling, which uses like five other components to do that. So you can automate it, but it's still something not supported by the service itself. Though everybody's holding their breath for it to come any moment now. I actually was hoping it will come at re:Invent, but well, what can you do? I guess the outage was a bigger thing to concentrate on.
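Since the service does not scale shards by itself, any automation has to decide on a target count and call the API. The following is a minimal sketch of that decision only, not the multi-component solution from the blog post mentioned above. The `headroom` factor and the peak inputs (which would typically come from CloudWatch metrics) are assumptions; the clamp reflects the UpdateShardCount rule that a single call cannot more than double or less than halve the shard count.

```python
import math


def target_shard_count(current_shards: int,
                       peak_mb_in_per_sec: float,
                       peak_records_per_sec: float,
                       headroom: float = 0.8) -> int:
    """Pick a shard count that keeps peak load under `headroom` of the limits."""
    needed = max(
        1,
        math.ceil(peak_mb_in_per_sec / (1.0 * headroom)),     # 1 MB/s per shard
        math.ceil(peak_records_per_sec / (1000 * headroom)),  # 1,000 records/s
    )
    upper = current_shards * 2             # cannot more than double per call
    lower = math.ceil(current_shards / 2)  # cannot go below half per call
    return min(max(needed, lower), upper)


def rescale(client, stream_name: str, current_shards: int,
            peak_mb_in_per_sec: float, peak_records_per_sec: float) -> int:
    """Call UpdateShardCount only when the target differs from the current count."""
    target = target_shard_count(
        current_shards, peak_mb_in_per_sec, peak_records_per_sec
    )
    if target != current_shards:
        client.update_shard_count(
            StreamName=stream_name,
            TargetShardCount=target,
            ScalingType="UNIFORM_SCALING",
        )
    return target
```

Something still has to run this on a schedule or from an alarm, and scaling down too eagerly means paying for repeated resharding, so a real setup would likely add cooldowns.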

Jeremy: That was a bigger thing they had to deal with. Yeah, maybe they were pushing the auto scaling functionality and they broke it, but ...

Anahit: Actually, that's exactly what I thought when they broke it. I was like, "Yes, auto scaling is coming," but then it turned out there were some other issues with that.

Jeremy: Right, right. Yeah. So well, anyways, so alright, so Kinesis though in terms of getting data into it, right, there's a number of different ways to send data into Kinesis. And another thing that's fascinating too, I think just about the Kinesis service in general is Lambda. We think of Lambda, because it's function as a service, as a very serverless service, it sits perfectly in the serverless ecosystem. So whether or not Kinesis is 100% serverless, it is used in a lot of applications, right, applications that have nothing to do with serverless applications or anything like that. Just it is a really good service that powers a lot of things, as we said. So, what are some of the different ways that you can get data into Kinesis? Because you mentioned batching, and some of those other things?

Anahit: Yeah, sure. So, of course, it wouldn't be too useful if we couldn't get data into it, right?

Jeremy: Right.

Anahit: So ...

Jeremy: And quickly.

Anahit: Yeah, that as well. So there's actually many different ways to do that. And, for example, one useful way, if you are going to stream your data from outside the cloud to the cloud, is the Amazon Kinesis Agent, which is a standalone application that you run on your server, for example, that can stream files to Kinesis. So for example, if you want to stream your logs from your server to the cloud, that can be done with the Kinesis Agent. So, that's one way.

Then as I said, there are some direct integrations, some services actually can push events directly to Kinesis, like CloudWatch, and stuff like that. One interesting service of those is, of course, API Gateway, because it does require some work from you, because it basically acts like a proxy over the API calls for Kinesis. So you need to do some VTL magic and stuff. And it comes with some throughput limitations, of course, as with API Gateway in general, but it's very useful for many cases.

And then there are tons of community-contributed tools and libraries that you can use to do it, but I think the mainstream, or the most common ways to write data to the stream are actually, either using the Kinesis Producer Library, so KPL in short. And it's basically another level of abstraction above the API calls. And it gives you some extra functionality, but it also runs asynchronously in the background. So you need to have a C++ daemon, right, running on your system all the time. But it will collect the records and send them to Kinesis asynchronously, which means that you might have some delay, or latencies that come with it, so it won't push them immediately. But the biggest issue with it is that it's actually only available in Java. So a bit limited use case.

And then my favorite, because it gives you the most flexibility in how to write data and how to handle the errors and stuff, is the AWS SDK, which is basically the API calls. And luckily, there is a lot of SDK language support. So you don't have to be bound to just Java. Though I don't have anything against Java, I worked with it for like, eight, seven years. I don't remember anymore.
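The flexibility in error handling that the SDK gives is easiest to see with PutRecords, which can succeed as a call while some individual records fail. The sketch below resends only the failed entries; it assumes a boto3-style client passed in as `client`, and the helper name, retry count and delays are illustrative choices.

```python
import time

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def put_records_with_retry(client, stream_name, records,
                           max_attempts=3, base_delay=0.1):
    """Send records in chunks and resend only the entries that failed.

    Each record is a dict with "Data" (bytes) and "PartitionKey" (str).
    Returns the records that still failed after `max_attempts` tries.
    """
    failed = []
    for i in range(0, len(records), MAX_RECORDS_PER_CALL):
        pending = records[i:i + MAX_RECORDS_PER_CALL]
        for attempt in range(max_attempts):
            response = client.put_records(StreamName=stream_name, Records=pending)
            if response["FailedRecordCount"] == 0:
                pending = []
                break
            # Results come back in request order; an entry with ErrorCode
            # was not stored and needs to be sent again.
            pending = [
                record
                for record, result in zip(pending, response["Records"])
                if "ErrorCode" in result
            ]
            if attempt < max_attempts - 1:
                time.sleep(base_delay * (2 ** attempt))  # exponential backoff
        failed.extend(pending)
    return failed
```

Retrying like this can reorder or duplicate records, for instance when a timeout hides a successful write, so consumers generally need to be idempotent, which ties back to the idempotency point made earlier in the conversation.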

Jeremy: Yeah, I'm not a big fan of Java anymore. But I write a lot of Lambda functions, so every time you ... I've never been able to get them to boot up quickly. The cold start has always been horrible with Java. So I've stuck to mostly Node and Python, just to ...

Anahit: Same.

Jeremy: ... keep things simple. But so all right, so you mentioned the Kinesis Producer Library, which I actually remember way back in the day, we had like a Ruby ETL thing, and we were using the consumer library, and it was a mess, it was just a lot of things that had to happen. So it's easier if you can just have a nice simple SDK, or even better, have the service just natively push it into Kinesis for you, and then have another native service consume off of that, which is super easy. But so there are a lot of use cases with Kinesis. I think people can probably use their imagination for high throughput streaming data, click tracking, ad network type stuff for all kinds of things that you would need to see. Your sensor data, you mentioned IoT integrations and some of those things. So I think that makes a lot of sense. But what are some of the less common use cases? I know you have some ideas around how you can manipulate the system in a way to use it for your benefit, that's not super fast, high streaming data?

Anahit: No, not really with my customer. We do use it mainly for the big data and streaming all that user interaction data, like the classical way of using it. So, that's what we do. And then there is actually one more use case nowadays, you can use it with DynamoDB as the event stream. They added it just, again, just recently. So that's another cool thing, because I think DynamoDB Streams had some extra limitations which Kinesis doesn't, so.

Jeremy: Yep.

Anahit: But yeah, actually, as I mentioned in the beginning, the difference between the application integration services and analytics services doesn't necessarily exist, it's more likely in our heads. We don't really have to use Kinesis with huge amounts of data. And one use case that I personally found extremely useful is that when you, for example, have a Lambda function that needs to consume events from some stream or queue, and you want to invoke exactly one Lambda function at all times. So you want to process the events in order, or basically consecutively, not in parallel.

So with SQS, what you can do is use the Lambda reserved concurrency for that purpose. So you can say that, "Okay, I only allow one Lambda execution of this particular Lambda at all times." But it will mean that all the others will be throttled, then you have to take care of the SQS visibility timeout and make sure that the retry attempts are high enough, so your valid messages don't end up in a dead letter queue, and all that kind of extra worrying, I would even say, that is not necessary.

And what I found very useful is that with Kinesis, the way Lambda works with Kinesis is that it gives you one concurrent Lambda execution per shard. So, if you have a Kinesis stream which is attached to a Lambda function, there are going to be as many concurrent Lambda executions at any given time as you have shards. So each Lambda execution will be reading from its own dedicated shard.
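As a minimal sketch of what such a consumer might look like, here is a Lambda handler that walks through one batch of Kinesis records in the order they were delivered. The handler name and the assumption that every payload is JSON are illustrative and not from the conversation; the event layout (`Records`, `kinesis.data` as base64, `kinesis.sequenceNumber`) is the shape Lambda uses for Kinesis events.

```python
import base64
import json


def handler(event, context):
    """Process one batch of Kinesis records in delivery order."""
    processed = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        # Record payloads arrive base64-encoded inside the Lambda event.
        # Assumption: producers wrote JSON documents to the stream.
        payload = json.loads(base64.b64decode(kinesis["data"]))
        # Placeholder for real business logic: just collect what was seen.
        processed.append((kinesis["sequenceNumber"], payload))
    return processed
```

With a single-shard stream and the default settings, only one invocation of this handler would be running at a time, so the loop sees the events one after the other.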

Jeremy: Right.

Anahit: So basically, if your throughput requirements are okay, and you can have a stream with just one shard where you push all your events, then you have a Lambda consuming from it, then out of the box, you are getting a situation where just one Lambda function is reading from the stream at all times, and you don't have concurrent executions, you don't have to take care of or worry about all the throttling and stuff. And then out of the box, you get all this nice functionality for error handling that Kinesis comes with. So I actually love it for that use case and it won't cost you millions, it will probably cost you like a couple of hundred per year. And I think it's pretty much well worth it if you think of all the management costs that you are avoiding that way.
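To make the cost remark concrete, here is a rough back-of-the-envelope sketch. The shard-hour price of 0.015 USD is an assumption (prices vary by region and change over time), and the calculation ignores the per-record PUT charges, so treat the result as an order of magnitude only.

```python
HOURS_PER_YEAR = 24 * 365


def yearly_shard_cost(shards, price_per_shard_hour=0.015):
    """Estimate the yearly cost of provisioned shards.

    price_per_shard_hour is an assumed price in USD, not an official figure.
    PUT payload charges and data retention extras are not included.
    """
    if shards < 1:
        raise ValueError("a stream needs at least one shard")
    return shards * price_per_shard_hour * HOURS_PER_YEAR


one_shard = yearly_shard_cost(1)
print(f"One shard for a year: about {one_shard:.0f} USD")
```

Under that assumed price, a single always-on shard lands in the low hundreds of dollars per year, which is in line with the figure mentioned above.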

Jeremy: Right. Yeah. And actually, the SQS, the reading off of the queue, the Lambda trigger for that, I believe that you need to set a minimum of five, concurrent ...

Anahit: Yep, yep.

Jeremy: ... for that, because that works that way. But I think and again, I could be wrong about this, because again, how can you possibly know all the services in AWS? But I believe if you use SQS FIFO queues with a single message group ID, that will also only invoke one Lambda function. I'm not 100% sure of that, but yeah, but either way, no matter which service you use to do that, that is a really cool use case. Because I can think of some cool things like, I don't know, maybe you were billing, you were doing like shipping labels, and something where you needed it to be like one after the other, they needed to be sequential, there could be some cool, definitely some cool use cases for that type of stuff.

Anahit: Yeah, we have found it very useful in one of our use cases. And the fun part was that I was struggling with SQS, like, "How do I do this properly? And I don't like this," and like ... then it was like, "Okay, I have been talking about Kinesis for like two years now to everybody around, so why didn't I think about it in the first place?" But yeah, it's a fun way to do that.

Jeremy: Right. All right. So let's move on to consuming data off of the stream. So there are a bunch of different ways to do this. I mentioned the Kinesis Consumer Library, which I think is also Java-based, and you need to run it in. But anyways, the easiest way to consume data off of a Kinesis stream, you've mentioned this, I think most people would agree would just be to use Lambda because it is a really, really cool integration. So what's the Lambda Kinesis story?

Anahit: Yeah, so I've mentioned it several times, because it's really my favorite way of consuming data from Kinesis. You don't need all that extra headache of keeping track of where you are exactly in each shard and each stream at any given moment of your life. So it's very nice. And I think it takes care of a lot of heavy lifting on your behalf for reading from the stream.

Jeremy: Right.

Anahit: And well, as I said, error handling is one thing, one big thing that Lambda also makes much easier for you with a Kinesis stream. Then batching, so Lambda can read batches of records from the stream, up to 10,000 records in a single batch. And then keeping track of ... that's probably the most important, keeping track of where exactly you are in the stream, because otherwise, you have to have some external ways to do it. And, for example, the Kinesis Consumer Library uses a DynamoDB table to do that, which it actually spins up behind the scenes without you even probably knowing about it.
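The batching and error handling mentioned here are configured on the Lambda event source mapping. As a hedged sketch, the helper below only builds the settings as a dictionary; the function name, default values, and ARNs are hypothetical, while the keys (`BatchSize`, `BisectBatchOnFunctionError`, `MaximumRetryAttempts`, `DestinationConfig`) are parameters of the Lambda `CreateEventSourceMapping` API.

```python
def kinesis_mapping_params(stream_arn, function_name, batch_size=100,
                           on_failure_arn=None):
    """Build settings for a Kinesis-to-Lambda event source mapping.

    Defaults are illustrative assumptions, not recommendations.
    """
    # Lambda accepts at most 10,000 records per batch for Kinesis.
    if not 1 <= batch_size <= 10_000:
        raise ValueError("batch size must be between 1 and 10,000")
    params = {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        # Split a failing batch in two to isolate a bad record.
        "BisectBatchOnFunctionError": True,
        # Stop retrying after a few attempts instead of blocking the shard.
        "MaximumRetryAttempts": 3,
    }
    if on_failure_arn:
        # Send details of discarded batches to an SQS queue or SNS topic.
        params["DestinationConfig"] = {
            "OnFailure": {"Destination": on_failure_arn}
        }
    return params
```

Assuming boto3 is available, the result could be passed on as `boto3.client("lambda").create_event_source_mapping(**params)`.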

Jeremy: And it's provisioned, too.

Anahit: And it's provisioned, and ...

Jeremy: Not on demand.

Anahit: ... it's pretty low. And then one day somebody from your team comes knocking on the door and saying, "Hey, I'm getting these weird DynamoDB provisioned throughput exceeded errors. We don't have a DynamoDB table." Hmm, where does that one come from? So yeah, it's much easier. But then there are, of course, other services that we actually want to mention here: Kinesis Firehose and Kinesis Analytics, because those two are the other services in the Kinesis family. And they both have a very nice integration with Kinesis streams, so they both can be attached to a Kinesis stream as a stream consumer. In the case of Kinesis Analytics, it actually can be a stream producer as well.

So, Firehose is a service that is used for streaming data to a destination. So if Kinesis streams is just for streaming the data, and then you have to consume it somehow, the entire purpose of Firehose is to deliver data to the destination. So you can connect the two, to stream the data and then deliver it to the destination, and the destination can be S3, Redshift, Elasticsearch. And I think that the coolest recent one is the HTTP endpoint. So basically, you can deliver it anywhere you want. And then Firehose also has some pretty neat features like batching, and transforming the data, and converting the format.

Jeremy: Yeah, transforming.

Anahit: Converting from like, for example, JSON to Parquet, that's what we use a lot, in our case, compressing the data ...

Jeremy: And then you can query it from Athena, for example.

Anahit: Yep, from Athena, Spectrum, and all those things. Yes, so it's very, very useful. And you can connect it directly to Kinesis streams, and it is truly serverless because you don't have to provision it. So it scales.
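The data transformation feature of Firehose mentioned above is carried out by a Lambda function that receives a batch of records and returns them, transformed, under the same record IDs. The following is a minimal sketch under the assumption that records are JSON; the added `processed` field is a made-up example of an enrichment, while the request and response shape (`records`, `recordId`, `result`, base64 `data`) follows the Firehose transformation contract.

```python
import base64
import json


def transform_handler(event, context):
    """Transform a batch of Firehose records and hand them back."""
    output = []
    for record in event["records"]:
        # Assumption: each record is a JSON document.
        payload = json.loads(base64.b64decode(record["data"]))
        payload["processed"] = True  # hypothetical enrichment
        # A trailing newline keeps delivered objects line-delimited.
        data = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],  # must match the input ID
            "result": "Ok",  # other valid values: "Dropped", "ProcessingFailed"
            "data": base64.b64encode(data).decode("utf-8"),
        })
    return {"records": output}
```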

Jeremy: But there are limits, though, right? I know I can just get a Firehose, like if you're choosing between Kinesis Data Streams and Kinesis Data Firehose, there's an upper limit to the Firehose, right?

Anahit: Yes, that's true. I don't remember the exact limits. But then, the scary thing about Firehose several years ago was that there was no mention anywhere that any of the operations can fail, like writing to the stream, or to Firehose, can actually fail. Because Kinesis had all these metrics for exceeding the throughput, for example. So you see, you have a metric that says that something bad happens, so you know that something bad can happen. With Firehose, they didn't even have the metric that would tell you that, "Hey." So what they had is documentation that says that it's an endlessly scaling service or something like that. And then there is the fine print with, "But if you use it with this, and this and this ..."

But as far as I know, if you're using it with Kinesis streams, it actually adjusts to the throughput of the Kinesis stream. So those limits don't apply anymore. So there's ...

Jeremy: Ah, interesting.

Anahit: ... yeah, there's this separation.

Jeremy: Interesting.

Anahit: Yeah. And then the other service from the Kinesis family is Kinesis Data Analytics, which is one of my favorite ones, really, because it seems small, but you can do a lot of neat things with it. So what you can do is analyze your streaming data in near real time. So you can basically write SQL queries with Kinesis Data Analytics, and it will perform joins, filters, and aggregates over, for example, some time-based window. And then it can send the results of those aggregates to either another stream or another Firehose, or actually, it can send them to Lambda, so you can do whatever you want with them. So there's a lot of cool use cases that come with Kinesis Analytics. And they both integrate very nicely with Kinesis streams, but the "gotcha" moment here, which apparently not many people realize, is that both Firehose and Kinesis Analytics act as normal consumers of the stream.

So I mentioned that there is this throughput limit for each shard, right? So there is only one megabyte per second that you can write, and two megabytes per second that you can read. So this in practice means that you can have two consumers reading from each shard at the same time. And Kinesis Analytics and Firehose are both considered consumers. So if you have a Kinesis Analytics application and a Firehose attached to the same stream, and then you want to add a Lambda function, for example, then you might end up exceeding that throughput. So you have to be careful about that, so yeah.
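The arithmetic behind "two consumers per shard" can be sketched as follows. The function name is hypothetical, and the model is deliberately simplified: it only compares the shared 2 MB/s read limit with the ingest rate and ignores other limits such as the number of read calls per second.

```python
SHARD_WRITE_KB_PER_SEC = 1024  # 1 MB/s write limit per shard
SHARD_READ_KB_PER_SEC = 2048   # 2 MB/s read limit per shard, shared


def max_shared_consumers(ingest_kb_per_sec):
    """How many standard consumers can each keep up with one shard.

    Every consumer has to read everything that is written, so the shared
    read limit is divided by the ingest rate. Integer kilobytes are used
    to avoid floating-point rounding surprises.
    """
    if not 0 < ingest_kb_per_sec <= SHARD_WRITE_KB_PER_SEC:
        raise ValueError("ingest rate must be within the shard write limit")
    return SHARD_READ_KB_PER_SEC // ingest_kb_per_sec


# A shard written at full speed leaves room for two consumers,
# so Firehose plus Kinesis Analytics already use up the budget.
consumers_at_full_speed = max_shared_consumers(1024)
```

A shard that is written at a lower rate leaves room for more consumers, which is why a third consumer does not always cause trouble in practice.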

Jeremy: Yeah. But so then with that, though, so again, that makes sense. You can have two consumers, but they added something called enhanced fan-out. So how does that come into play?

Anahit: Right. So enhanced fan-out is funny in the sense that it's very difficult to understand what actually happens by reading the documentation. I think that part took me actually the longest time to figure out because I personally don't use it at work, so it was a research project for me, mostly, trying to figure out what is happening there, like all the combinations, like enhanced fan-out and how it works with other features. But what it basically is, is that instead of sharing this two-megabyte outgoing throughput with all the other consumers, you can have a separate, kind of your own dedicated elite highway that you get with the stream, and then you get your own two megabytes per second of throughput. And you can have up to 20 consumers at the moment, I think, and each of them will get the full two megabytes. So you can basically consume a lot of data with that.

And the nice part is that the latency here is also much lower than with the standard throughput. So I think they claim it's 70 milliseconds of latency versus a minimum of 200 milliseconds for the standard throughput, which is a big, big deal. And it actually stays the same, in contrast with the standard throughput, where it goes up with each added consumer. So it's a really nice feature. And partly how they achieve it is by using a persistent HTTP/2 connection instead of HTTP. And instead of the consumer polling for records, as it does with standard throughput, Kinesis actually pushes the records through that persistent connection to the consumer. So in that way, we avoid all the limitations that come with polling the records from the stream with the GetRecords API and that kind of thing. So that removes all the headache, but you have to pay for it.
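The difference between the shared and the dedicated read throughput can be illustrated with a small sketch. The function name is made up, and the numbers are the ones quoted in the conversation (2 MB/s per shard, up to 20 enhanced fan-out consumers), which may have changed since.

```python
SHARD_READ_MB_PER_SEC = 2.0
MAX_ENHANCED_CONSUMERS = 20  # limit as quoted in the conversation


def read_mb_per_sec_per_consumer(consumers, enhanced_fan_out):
    """Read throughput each consumer gets from a single shard."""
    if consumers < 1:
        raise ValueError("need at least one consumer")
    if enhanced_fan_out:
        if consumers > MAX_ENHANCED_CONSUMERS:
            raise ValueError("too many enhanced fan-out consumers")
        # Each registered consumer gets its own dedicated pipe.
        return SHARD_READ_MB_PER_SEC
    # Standard consumers split the same 2 MB/s between them.
    return SHARD_READ_MB_PER_SEC / consumers


shared = read_mb_per_sec_per_consumer(4, enhanced_fan_out=False)
dedicated = read_mb_per_sec_per_consumer(4, enhanced_fan_out=True)
```

With four consumers, the shared model leaves each of them a quarter of the shard's read limit, while with enhanced fan-out each keeps the full amount.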

Jeremy: Of course, of course.

Anahit: So, that's the problem. And the thing is that it sounds very cool, and you might think, like, "Why wouldn't you use it all the time?" Well, you have to pay for it. The truth of the matter is, in most cases, you don't need it. So if you have just up to three consumers for your stream, you're probably going to be just fine with a normal shared throughput model.