What is Self-Healing in Kubernetes?


One of the great benefits of Kubernetes is that it allows your infrastructure to self-heal. By that, I mean that if something goes wrong, Kubernetes will do whatever is necessary to bring things back to normal. A containerized application or component will automatically be redeployed in its intended state whenever a failure occurs.

Kubernetes implements self-healing at the application layer. This means that if a pod crashes, Kubernetes will work to restart or reschedule it as soon as possible.

As Atul Jadhav nicely put it in a CNCF guest blog post, self-healing in Kubernetes is just like Bruce Banner’s ability to turn into the Hulk:

Captain America asked Bruce Banner in the Avengers to get angry to transform into ‘The Hulk’. Bruce replied, “That’s my secret Captain. I’m always angry.” You must have understood the analogy here. Let’s simplify – Kubernetes will self-heal organically, whenever the system is affected.


Why is Self-Healing in Kubernetes Important?


For instance, if I have a certain number of containers with a specific job to do, Kubernetes will vigilantly monitor them. If they fail, it will restart them, and if their node goes down, it will try to recreate them on other available nodes. The same principle extends to the other resources that Kubernetes manages.

The neat part about Kubernetes is that it is declarative. You explain to Kubernetes what you want your environment to look like, and Kubernetes tries to bring the actual situation to the way it is described in the configuration. So, no matter what goes wrong, Kubernetes is trying to take the necessary steps to bring your infrastructure to its desired state.
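
To make the declarative idea concrete, here is a minimal Python sketch of a reconciliation loop. It is a toy model under stated assumptions, not how Kubernetes controllers are actually implemented: the `Cluster` class, the `reconcile` function, and the pod naming scheme are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Cluster:
    """Toy stand-in for a cluster: just a set of running pod names (hypothetical)."""
    running: set = field(default_factory=set)
    _ids: count = field(default_factory=count, repr=False)

    def start_pod(self, app: str) -> str:
        # Give every new pod a unique name such as "web-0", "web-1", ...
        name = f"{app}-{next(self._ids)}"
        self.running.add(name)
        return name

    def stop_pod(self, name: str) -> None:
        self.running.discard(name)


def reconcile(cluster: Cluster, app: str, desired_replicas: int) -> int:
    """Compare desired and actual state, then act on the difference.

    Returns the number of pods started (positive) or stopped (negative).
    """
    actual = sorted(p for p in cluster.running if p.startswith(app + "-"))
    diff = desired_replicas - len(actual)
    if diff > 0:
        for _ in range(diff):
            cluster.start_pod(app)
    elif diff < 0:
        for name in actual[:-diff]:
            cluster.stop_pod(name)
    return diff


cluster = Cluster()
reconcile(cluster, "web", 3)   # declare three replicas; three pods are started
cluster.stop_pod("web-1")      # simulate a crash
reconcile(cluster, "web", 3)   # the next pass notices the gap and starts a replacement
print(sorted(cluster.running))
```

The point of the sketch is that nobody tells the loop what went wrong. It only compares what was declared with what is running and closes the gap, which is why the same mechanism covers crashes, node failures, and updates alike.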

An intended byproduct of this approach is self-healing – the idea that an application will maintain its operation regardless of a technical glitch, update, or disaster. That is great when you’re running microservices on Kubernetes, but what happens for stateful applications?


Auto-Healing Doesn’t Protect Your Data


You may be wondering how this self-healing works with your applications’ state. The bottom line – it doesn’t. The self-healing property applies only to Kubernetes resources, not to data.

No matter how you look at it, persistent data almost always lives on storage outside of the cluster. Kubernetes can attach that storage, but it doesn’t control what is written to it.

This kind of self-healing will not work for any stateful workload that needs persistent data. The self-healing capabilities are limited to things that Kubernetes has control over. Kubernetes can create a runtime environment and a network configuration, and it can fetch your container images and run them, but it can’t recreate the data that your business has accumulated over the years.

Kubernetes’s self-healing property works to keep the cluster in its declared state. But it only ensures that the containers themselves are running – not that the data they stored is still there. So if something goes wrong with a disk or a database outside of Kubernetes that your applications rely on, there’s nothing that Kubernetes can do. It can claim a new disk, but that disk will be empty. You will have your infrastructure restored exactly as declared, but your business data won’t be there, rendering the entire exercise moot.
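
The gap can be illustrated with another small Python sketch. Again, this is a toy model rather than real Kubernetes behavior: `Volume`, `Pod`, `run_pod`, the `example-db:1.0` image name, and the sample record are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Volume:
    """Toy persistent disk: a dict of stored records (hypothetical)."""
    records: dict = field(default_factory=dict)


@dataclass
class Pod:
    """Toy stateful pod: a container image plus the volume it mounts."""
    image: str
    volume: Volume


def run_pod(image: str, volume: Optional[Volume] = None) -> Pod:
    """Start a pod as declared; claim a fresh, empty volume if none is given."""
    return Pod(image=image, volume=volume if volume is not None else Volume())


db = run_pod("example-db:1.0")
db.volume.records["order-1001"] = "paid"

# Case 1: only the pod fails. The surviving disk is reattached, so the data is still there.
restarted = run_pod(db.image, volume=db.volume)
print(restarted.volume.records)

# Case 2: the disk itself is lost. The declared state is restored with a new, empty claim.
replacement = run_pod(db.image)
print(replacement.image == db.image)
print(replacement.volume.records)
```

In the second case the replacement matches the declaration perfectly, yet the order record is gone. Nothing in the declared state describes the contents of the disk, so there is nothing for self-healing to restore.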


Why Extending Self-Healing to Your State is a Must


Your business IS your data; it’s not the configuration of your containers. Data is the lifeblood of modern businesses. If your critical databases and stateful applications are migrating to Kubernetes, watch out for accidental data loss.

Imagine the moment you realize that you just lost critical customer data, for instance, the records for all of your e-commerce sales in the last 24 hours. “Lost” as in gone forever.

Or, in a slightly better scenario, your data might not really be lost, just inaccessible for a while. But for some business applications, this is not acceptable either. Imagine that hundreds of thousands of your customers cannot access their bank accounts to withdraw money.

Either way, data loss is something you want to avoid if at all possible.

As you can see, Kubernetes self-heals resources automatically. But stateful applications and databases require special care to ensure that the data is not lost when a container, node, cluster, or even a cloud region fails or gets deleted. We explore how to protect your K8s data further in this blog post.


From Self-Healing to Foolproof Kubernetes Clusters with Statehub


Kubernetes does well with automating container deployment and ensuring they are running in an optimal state, but it cannot do anything for the data. Stateful applications like databases need special care to prevent accidental loss of critical customer data – a scenario no company wants to face.

Without a solution for data, self-healing for your containers is meaningless when it comes to stateful applications. You can restore your infrastructure, but what about the data? In short, your infrastructure gets restored, but your critical data is simply not there.

Wouldn’t it be great if we could extend Kubernetes’ self-healing capability to include the data layer of your application stack? Imagine having the ability to replicate your entire application, including the data, across regions on demand. Finding a way to safeguard your data is key when designing resilient systems that can recover from failure automatically every time there is an issue.

With Statehub, you get self-healing for your actual business, not just your infrastructure. Statehub offers a simple way to store your data outside the cluster and make it available to any cluster anywhere in the world at any time. This ensures resilience and application mobility across different geographical regions or cloud providers without sacrificing data accessibility.


Statehub Enables Self-Healing for Your Entire Business, Not Just Your Compute Infrastructure


  • Provides a simple way to store data outside the cluster while making it available and accessible anywhere in the world
  • Keeps all of your Kubernetes clusters in sync with persistent volumes, assuring that they are always configured and ready to be used
  • Enables migration between regions and cloud providers with the click of a button, regardless of location or managed Kubernetes provider
  • Delivers the persistent application data to any location and keeps the data in all of your Kubernetes clusters in sync

Just as Kubernetes delivers configurations to the application layer, ensuring that each container’s configuration and code are in sync, Statehub makes your entire business self-healing, including the data.