In our latest webinar, we got together with some very special guests, Bart Farrell from the DoKC (Data on Kubernetes Community) and Chris Jones from Platform9, to discuss the burning question – can data on Kubernetes become declarative, just like its infrastructure?
We had some very insightful discussions that we wanted to share with you. You can either watch the recording or read the transcript below.
(00:00) Roey Libfeld: Today we’re going to talk about data on K8s, and whether it can be a declarative resource. So what is declarative data? It’s data that is mounted on declarative, agile infrastructure and can be mounted anywhere. It’s supported by cloud-independent storage, which means that it can be placed and made available on whatever public cloud we would like. It’s globally available on demand, meaning that it comes with the ability to replicate or mirror itself across distances. And finally, it shares the same declarative user experience as the stateful Kubernetes deployment that we all came to love.
The DoK foundation did a big survey about the challenges that are stopping the adoption of Kubernetes when it comes to data, and stateful sets in general.
Bart, can you explain a little bit about that?
(01:28) Bart Farrell: Absolutely. First of all, very nice to be here. Thanks a lot for having me. As Roey just mentioned, last year, as a community, we interviewed over 500 organizations to see how they saw the challenges and opportunities of running stateful workloads on Kubernetes.
And as you can see right here in the results, one of the primary ones is the lack of integration with existing tools. How can we sort of shoehorn these things in to get the right fit? Matched up with that is the lack of interoperability, not having that operational flexibility.
Also, when we talk about things like portability, that is a particular challenge, as is vendors not being able to meet the needs that end users have. Things very much have to be custom made, and in general there is a lack of standardization when it comes to tackling this problem. Then, as we get further down the list, you’ll see talent. It’s no secret that every organization seems to be hiring. The challenge is finding the right people.
So getting organizations up to speed is one of the reasons why our community started in 2020, so that there would be a common meeting point for people to exchange ideas and best practices. Then, touching once again on the point of interoperability, there is vendor lock-in: the feeling that there isn’t that flexibility and that you have to be very much attached to one public cloud vendor.
So in conclusion, with both of these being mentioned, there is this lack of standardization, which is, once again, why the community got started.
(03:38) Roey Libfeld: So Uri, what needs to happen to give stateful workloads the same experience as stateless? What’s your opinion as our DevOps engineer?
(03:48) Uri Zaidenwerg: That’s a good question. I think we first need to answer what is this stateless experience that we have been talking about? It has two main parts.
One is that stateless applications are declarative. So when you deploy a stateless application, all you have to do is name the container that you want to run. As long as your cluster has access to the Docker or container registry where this container is hosted or registered, you can be sure that your Kubernetes cluster can run it, no matter where the cluster is.
And the other part is that once the Kubernetes cluster has access to this resource, this container, and runs it, if something happens to the physical infrastructure that is currently running this app, it can always move the application from one node, from one piece of physical infrastructure, to another.
So these are the two things that we need to bring into stateful applications. We need to consume data as code, regardless of its physical location. We need to make it available as a declarative resource that will be available to the application no matter where it’s running.
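[Editor's note: the "name the container and let the cluster run it" idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not something shown in the webinar. It builds a standard Kubernetes Deployment manifest as a plain dictionary; the function name, the application name "web" and the image reference "registry.example.com/web:1.0" are made up for the example.]

```python
import json


def stateless_deployment(name: str, image: str, replicas: int = 2) -> dict:
    """Build a minimal Kubernetes Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                # The only application-specific input is the container image.
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical application name and image reference, for illustration only.
manifest = stateless_deployment("web", "registry.example.com/web:1.0")
print(json.dumps(manifest, indent=2))
```

[Nothing in the manifest refers to a particular node, zone, or cloud, which is what lets the scheduler place the pods anywhere and move them later. The printed JSON could be saved to a file and applied with kubectl, which accepts JSON as well as YAML.]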
And this is why we partnered with Platform9. Together we provide a complete solution for stateful applications and make them as easy as stateless applications. Platform9 takes care of the application layer, meaning they can deploy your Kubernetes application on any cloud provider and on any infrastructure, whether it’s on-prem, AWS, Azure, or GCP. And together with Statehub, we provide Kubernetes native storage that can be accessed from anywhere. So you can deploy your applications, whether they are stateless or stateful.
And together with these two solutions, you can just move your applications around with a single control plane, moving from one point to another, without having to worry about infrastructure or cloud-specific skills.
These two solutions together make things easier from the design and architecture stages of your application, because you don’t need to overthink a single provider, which features come with this vendor, or whether you will be able to do this or that in the future if you want to. You’re going to be able to move your application anywhere at any point.
As for Day 1, our solutions are effortless to deploy. Both of these solutions come with maximum automation and are made to be as K8s native as possible.
And as for Day 2 and maintenance, we also take care of a lot of things so that you don’t have to worry about them anymore, like recovery copy management.
(08:43) Roey Libfeld: Thanks a lot Uri. Michael, our head of product, will now present the demo.
(08:50-17:02) Product Demo
(17:03) Roey Libfeld: Chris, how do you think this will work when it comes to your clients and the challenges that they have in Platform9, dealing with stateful workloads?
(17:15) Chris Jones: That’s a good question. It’s interesting seeing the survey results at the top of the session. That resonates a hundred percent with basically everyone we interact with at Platform9. We have a free platform. People jump on, try things out, get on Slack, ask questions. We’ve been running that since March 2020. And I can attest that nearly all users that come in have questions about storage: How is it working? Why isn’t it there? What do I need to do?
And that isn’t even getting up to the level where they’re asking, how do I do this with a real-world application, and how do I then keep it available, let’s say, if Azure or AWS does have an outage, right? There are a lot of businesses that learned at the end of last year that running in just one region was a big problem. And then you have these Platform9 customers, for example, that also run in AWS with EC2 machines that are using volume mounts, and they’re availability-zone locked. So to me, it’s a pervasive problem. I think the survey really calls out that people have no idea how to even begin solving this. How do we see our users and customers solving it? They don’t have a solution to this, right? Some businesses might just say, I’m just going to go and consume another public cloud service and try to outsource the database problem. That’s not necessarily going to give you flexibility.
It might give you some single cloud availability, but it’s still relying on another service. And if that region’s got an outage, you’ve got a problem. Or if RDS is having a problem, that’s pervasive. You still have a problem. I don’t think anyone’s really solved this.
Specifically, how I would see our customers using this is getting out of the trap where people are dependent on what’s available in an availability zone to their EC2 instances, stepping away from that, and moving directly into recoverability. So giving themselves the ability to say, we’re going to run in this region, anywhere in the world, and if there is a problem, we’re not going to wait to see how long this goes on for, we’re going to immediately fail over to another region and off we go. That’s the biggest strength that I’ve seen here.
This is a great demonstration of doing it across multiple clouds, so AWS and Azure. I don’t see it as being any different from an AWS-to-AWS perspective, right? There are still things that need to be protected. There are still things that need to be replicated. And I think it’s significantly easier and potentially safer than trying to stand up your entire environment and then start backing up all the individual applications with their own type of tooling. So if it’s running Mongo, using a Mongo backup, having Mongo running in your other cluster, and replicating it over, that’s going to drive up costs, when really what you’re looking at doing is making sure that you’ve got the entire stack well-formed and declarative. So when the data’s there and Mongo comes up, it just continues operating like it was on the other cluster.
(21:01) Roey Libfeld: Cool, thanks a lot, Chris. Michael, can you walk us through the architecture of our joint solution?
(21:12) Michael Greenberg: There are two main components, right? There’s the application stuff, the Kubernetes clusters that are provisioned and managed by Platform9 in the customers’ environments, and the data component is handled completely by Statehub.
So we have the control plane for the application. The knowledge of what’s running where at any given moment on Platform9 and the ability to basically have the data waiting for you wherever you’re going to start your application provided by Statehub. Every bit that’s being written to disk by Mongo or MySQL or any other data application, is getting replicated in real-time to all of your other locations.
So when the time comes to bring up your application there, you can just switch ownership and you will see the application coming up on the other side with all of its data and without any sort of gaps in the data, nothing is missing. You can resume your entire operation somewhere else with all of your data immediately.
And to tie it all together – we haven’t mentioned explicitly yet that Statehub is a service. So the way you would consume EBS or Azure disk or a GCP disk is pretty much the same way you would consume Statehub from a Kubernetes standpoint.
What we give you is this: upon creation of a persistent volume claim from your Kubernetes cluster, we provision volumes that are replicated to all of your locations, right? So the way you work with Statehub is that you register your clusters and then they connect to our data fabric.
And we have a presence in all of the public cloud regions, currently AWS and Azure, with GCP coming very soon. And we take complete care of all the networking and data operations that come into play when you need to replicate that data. So the data is being replicated automatically between all of your locations, and you don’t need to solve a networking problem when it’s a replication problem that you’re trying to solve.
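[Editor's note: a persistent volume claim of the kind Michael describes is an ordinary Kubernetes object. The sketch below, in Python, builds one as a plain dictionary. The fields are the standard PersistentVolumeClaim fields; the storage class name "example-replicated-storage" and the claim name "mongo-data" are placeholders, since the transcript does not give the actual storage class names that Statehub installs.]

```python
import json


def volume_claim(name: str, storage_class: str, size: str = "10Gi") -> dict:
    """Build a minimal PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # The storage class decides which provisioner creates the volume.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Placeholder names: the real storage class is an assumption, not from the webinar.
claim = volume_claim("mongo-data", "example-replicated-storage")
print(json.dumps(claim, indent=2))
```

[The application only states how much storage it needs and which class should provide it. Where the bytes physically live, and how they are replicated, is left to whatever provisioner backs that class.]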
(24:33) Roey Libfeld: Just to sum up what everyone said, the problem is that stateful Kubernetes is complicated. And what we usually do is go to whoever provides us the storage or the infrastructure, and we stay put. We need to think a bit ahead and try to enjoy cloud services that don’t close doors, but actually open doors. We can reduce the level of complexity by consuming two very simple managed services that you almost don’t have to maintain, without worrying about storage and replication.
(25:44) Chris Jones: You touched on a pretty important part at the end. Setting up a cluster, deploying your app, and then not really worrying about the storage, replication, and all those 101 things. There are a lot of customers that we’ve been talking to recently that have been using Kubernetes for a while, went through a painful journey, and are now running at scale in the public cloud, right? These are people that have lots of clusters, or even a low number of clusters but thousands of nodes. And they’re worried about the blast radius – when something happens in one of those clusters, it takes out everything, or it has the potential to take out everything.
So people with these large singular clusters are trying to decompose them into smaller pieces, and people that have the smaller clusters are worried about how to get it all back up and running and working if there is a problem, right? And this comes down to the original purpose: how do you make data more declarative in the world of Kubernetes?
What we’re hearing is that people want entire clusters, the entire application stack, and everything related to it to be 100% declarative. So instead of saying, I’ve got GitOps managing stuff in a cluster once the cluster is up, it’s saying, I want to be able to one-click and have a cluster come up from start to finish, in a declarative way. This is taking it to another level. It’s saying, I don’t just want the cluster to be able to come up, I need everything to be there and be available, because I have stateful applications and data running in Kubernetes. That is what people are moving on to more and more, especially as newer enterprises move on to using Kubernetes. Or probably a better way of saying that is: enterprises that are just getting started, and are containerizing their existing applications, a lot of which are stateful, are moving on to Kubernetes. They’re going to be like, that’s great, you can stand up a cluster as fast as it can be built, but how do I get my data from A to B, and how do I know it’s going to be there, be available, and work?
And I think this fits in with solving that problem in a very elegant way. The data can be in that “to be” state, the cluster doesn’t even have to be there, right? It’s already just replicated by Statehub to where I want to be, which means I’m not spending any money on anything, just the storage.
And then in the event that I need to recover, I can use my preferred GitOps workflows or the tool that we’ve been working on at Platform9 that’s about to be open-sourced and say, go, and we’ll go and rebuild that entire cluster from the ground up, install everything.
And then once that cluster comes up, those apps are going to say, hey, I’ve got access to this volume. It’s been mounted through Statehub and it’s there, and they’re going to continue running. And there’s been no manual intervention from an operations side. It’s all been that design and architecture upfront. And everything’s been done declaratively. So the state isn’t going to be a problem here. Things will work because it’s all running identically in every single place it is.
(29:30) Uri Zaidenwerg: So together we can create a fully automated from scratch solution, in a single command. That’s very powerful automation.
(29:47) Michael Greenberg: You don’t have to reference storage anymore. You don’t reference your PVs, because they don’t exist when you try to move your application from one place to another.
They don’t exist when you try to move your application from one AZ to another, let alone a region or a cloud vendor. Referencing storage volumes is like referencing a C drive, which is always different on each machine. But here you can have an abstract DB for your application and data.
What we’ve created is a global PVC (global persistent volume claim), which references your data and not the physical storage on which it is located at a certain point in time.
(31:06) Roey Libfeld: What if we stop associating storage with data and start thinking about a Dropbox type of experience? I don’t know where Dropbox is holding my data, I just know that it’s available when I need it.
And that’s the most important thing. Storage is a physical product you buy from an infrastructure provider. Data is what your application needs, and giving it this global persistent volume claim that allows the data to follow the application and give it the same agility as the application layer is a game-changer when it comes to how we look at data.
(31:44) Michael Greenberg: But unlike Dropbox, when we talk about Statehub, the data is physically in the same regions as your Kubernetes clusters are. When you register your clusters with Statehub, it builds a private storage network for you, that’s available at your locations, wherever your clusters are.
(32:48) Roey Libfeld: Bart, do you have anything to add?
(32:31) Bart Farrell: We’ve been around for almost two years, and when I first got to talk to the folks at Statehub, I saw that they got it.
A lot of times people might get a little bit reserved or not so confident. And they’ve been told, just do everything statelessly. It’s very comforting for me and energizing to hear Chris’s comments about how we’re getting to the point where organizations will be like, what about my data?
And what if I want to move it from here to there and approach things with that stateful mentality. For me, it’s really good to be here, having been in the community now for over a year and a half to see us getting to this point where we’re doubling down on these kinds of conversations and seeing that they’re being valued by customers.
(33:51) Chris Jones: Let’s say I’m Chris and I want to learn Kubernetes. What do I do? I have access to AWS and I’m going to make an EKS cluster that comes with storage. It’s just there, right? EKS is free, but all the resources underneath, that’s what Amazon wants you to use.
And Google and Azure do the same thing. But then as soon as people start trying to get to the next level, they have to start thinking critically about storage. Not only is it complicated, but it’s also changed, right? The CSI driver is relatively new in the world of K8s. So no wonder people are getting stuck.
(34:35) Uri Zaidenwerg: I can’t imagine the poor people who deployed their stateful applications on Kubernetes using the in-tree storage and persistent volume drivers and have to migrate all their data to CSI.
Even redeploying your existing application, just because your K8s cluster crashed or malfunctioned, is very complicated.
(35:32) Michael Greenberg: Especially if you forgot to change the reclaim policy.
(35:58) Chris Jones: There’s a flip side to that. Let’s say you did change it. You did it in development. And all of a sudden you’ve got clusters that are creating all these volumes. You’re not paying attention.
So then you’ve got a cost problem, so you can get burned on both sides.
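[Editor's note: the two sides of the reclaim policy trade-off mentioned here can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not something shown in the webinar. `persistentVolumeReclaimPolicy` and the `Released` phase are standard PersistentVolume fields; the helper names and the sample volumes are made up for the example.]

```python
def retain_patch() -> dict:
    """Patch body that switches a PersistentVolume to the Retain policy,
    so the backing volume survives deletion of its claim."""
    return {"spec": {"persistentVolumeReclaimPolicy": "Retain"}}


def retained_but_released(volumes: list) -> list:
    """Names of volumes that are kept (Retain) but no longer claimed.
    These are the ones that keep costing money unnoticed."""
    return [
        pv["metadata"]["name"]
        for pv in volumes
        if pv["spec"].get("persistentVolumeReclaimPolicy") == "Retain"
        and pv.get("status", {}).get("phase") == "Released"
    ]


# Made-up sample data, shaped like PersistentVolume objects from the API.
SAMPLE_PVS = [
    {"metadata": {"name": "pv-prod-1"},
     "spec": {"persistentVolumeReclaimPolicy": "Retain"},
     "status": {"phase": "Bound"}},
    {"metadata": {"name": "pv-dev-2"},
     "spec": {"persistentVolumeReclaimPolicy": "Retain"},
     "status": {"phase": "Released"}},
    {"metadata": {"name": "pv-dev-3"},
     "spec": {"persistentVolumeReclaimPolicy": "Delete"},
     "status": {"phase": "Bound"}},
]

leftovers = retained_but_released(SAMPLE_PVS)
print(leftovers)
```

[With the Delete policy, removing a claim also removes the data, which is the risk Michael points at. With Retain, the data is safe, but released volumes pile up until someone cleans them, which is the cost problem Chris describes.]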
(36:15) Roey Libfeld: Yeah, we hear about a lot of stateful challenges from our customers. The main challenge that we see is that people don’t want to deal with this. It’s not even about “I don’t have the time to learn” or “I don’t have the necessary skill set.” They just don’t want to deal with it. It’s 2022, and it’s about time that the experience of stateful and stateless evened up, so we can enjoy a singular workflow, CI/CD support, and fewer vendor dependencies.
(37:10) Chris Jones: We’d like to call that “democratizing the cloud”, making it equal. These are generic services. It’s infrastructure. Thanks for setting it up. Let’s just consume it. Let’s not care which one of them we’re paying for it, let’s just make sure the cost is within our budget.
(37:26) Uri Zaidenwerg: I’ve been waiting for this day, for the Uber of clouds, since I first heard of K8s. But back then, I wasn’t thinking about it the way I’m thinking about it now: as an abstraction layer above all the providers.
(38:35) Chris Jones: So what’s the best way for people to get going next? Where should we go?
(38:40) Bart Farrell: Everyone has a home in the Data on Kubernetes community. We’re very active in terms of the content we’re putting out, also in terms of blogs. And we also have open source projects so that people can get hands-on experience, regardless of their experience level.
When we got started in July of 2020, it felt like we were just the nerdy kind of outcasts that no one wanted to talk to. That changed by reaching out, interacting with more and more folks, and then also getting end-users to come to the table. It’s one thing if it’s someone who is just working on the technology, but when you can say you’re going to bring in a bank or the largest e-commerce provider in Europe, that makes people start to take these things seriously because, as we say in the United States, “if it doesn’t make dollars, it doesn’t make sense.”
Technologies can be wonderful, but if people aren’t willing to pay, if it’s not adding business value, it’s going to be tough to get people to listen.
I also recommend checking out the research report. It’s not long, it’s very intuitive. We used part of it today, but there’s a lot of other stuff in there as well. We’ll be repeating it this year, so it’ll be really interesting to compare and contrast the findings that we got from the first round with the second round that will start in September.
(40:10) Uri Zaidenwerg: You have the Platform9 Fortnights if you want to learn a little bit more about Kubernetes, and you have our webinars, which we’ll be posting regularly.
(40:24) Michael Greenberg: Go to Statehub.io/start to create your account with 100GB of free multi-cloud multi-region replicated storage. How do we get started with Platform9, Chris?
(40:40) Chris Jones: Go to Platform9.com/signup, it’s pretty straightforward. It’s free, off you go. There’s some cool stuff coming out in a couple of weeks so stay tuned.
(40:53) Roey Libfeld: You can start your Kubernetes journey for free, with storage that is completely Kubernetes native and ready to deploy on whatever infrastructure you would like, using the Platform9 Kubernetes as a service platform. So I want to thank everyone that joined us from the DoK foundation and Platform9. If you have any questions, feel free to shoot us an email or a chat on our website.