Attacking and Defending a Kubernetes Cluster with Divyanshu Shukla

Host: Hi everyone. Thanks for tuning into another episode of Scale to Zero. I'm Pursottam, co-founder and CTO of Cloudanix. Today's topic is Kubernetes security, both from an attack perspective and from a defense perspective. We'll be doing this slightly differently than our usual podcast; it's more of a workshop-style session today, and to run the workshop we have Divyanshu Shukla with us. Divyanshu is a senior security engineer with over six years of experience in cloud security, DevSecOps, web application pen testing, and mobile pen testing.

He has reported multiple vulnerabilities to companies like Airbnb, Google, Microsoft, AWS, Apple, and many more, and for reporting these issues he has received many CVEs as well. He has authored Burpomation and a vulnerable serverless application. He is an AWS Community Builder for security and a DEF CON Cloud Village crew member as well. He has also conducted many trainings and seminars at events like BSides, Nullcon, IIT Dharwad, GirlScript, Chandigarh University, and the Null community. Divyanshu, it's wonderful to have you on the show. For our viewers, can you share a bit of background and how you became interested in security and cloud-native security in general?

Divyanshu: Yeah, sure.

So I started when I went into engineering; I was not from a coding background, so when I started I was not able to grasp the basics of how coding works. Before that, when I was in college, I used to do basic surfing, and I was able to understand how phishing works, or the back-button vulnerability, at that point of time. I was not sure of what I was doing, but I knew this was not the regular stuff and that this should not usually be allowed. Before that I was from a commerce background, not into computers, so I was not very confident. But when I started my engineering, I saw that apart from software development there are other fields as well, like networking, Linux administration, and then security.

So when I started working and learning, I was able to understand how Linux works. Windows was always a tough OS for me, but when I put my hands on Linux, I was able to move around smoothly and understand how things were working. When I went to my first training, in my first year of college, I was able to solve a basic CTF and the Packet Tracer exercises. All those things helped me understand that there is something beyond software development. I realized this was something I knew from the past, once I was able to connect the dots looking backward, and at that point of time I decided to pursue it as a career.

So then I went into networking trainings and did my CEH training. I just did the training because I always think that certifications are not that important; it's about the skill. So I started picking up these skills and getting my hands dirty on local ISPs and switches. I was able to move around things and find issues in my own ISPs. That's how I started. Then I did a couple of internships as a DevOps intern. Once I was very confident about Linux and networking, I started working as a full-time security engineer, and from there I picked my path: did all the pen testing, did DVWA and other vulnerable applications, and so on.

Because I had previous knowledge of DevOps and Linux administration, I quickly picked up the cloud. The cloud was new in 2016-17; companies were just moving to it. I knew that one day or another security would come into the picture, and I had that knowledge: I was able to troubleshoot, I was able to understand how the cloud infra was working. That's why I started learning, and meanwhile I was doing my pen testing and bug bounty hunting. While working as a full-time pen tester, I got the opportunity in one of my previous organizations to look into cloud security as well; basically it was ELK, monitoring the logs and creating use cases for threat hunting of cloud attacks. That helped me build the basics of cloud security. Then in my last organization I did it as a full-time engineer, performing scans with open source tools, doing CS-Suite scans, and understanding different CSPM tools.

Because I had a pen testing background and now had the idea of scanning the cloud network, I was good to go to perform red team assessments and take care of the entire infra. That's how I landed in my current role and started leading cloud security for my current organization. That's how my journey started, and this is where I am today. Now I am moving towards Kubernetes and GKE and other clouds as well, because it again depends on the organization. In a lab you can do a couple of things, but in reality, until you are doing scans or audits of a real environment, you won't get an idea of how the infra is actually set up, because of the scale and the size. So these are the things I've been doing till now. Thank you, and over to you.

Host: Yeah, thank you for sharing your journey. There are two things that struck me. One is that you were always a hands-on guy; that's how you started, right? Instead of going by the books, you have always played with the system in a way. The second thing is you have picked an amazing domain, which is security. I was at fwd:cloudsec and AWS re:Inforce this week, and the amount of innovation I see makes me very happy that we are in the right domain.

So I'm looking forward to learning more as part of the podcast. Before we start, one question that I ask all of our guests: what does a day in your life look like? How does it look for you today?

Divyanshu: For me, currently, the day starts with looking into any issue which has come up in the cloud; right now it's mostly AWS. It can be an attack on the web or any DDoS kind of scenario.

This is just one part. Sometimes it is a developer trying to set up an application and they need IAM access, so I review those accesses and generate the IAM roles for them. These are common things which I have done today itself. The next part mostly involves doing an automated scan via an open source tool or Cloudanix itself to see what the issues are, then going and validating those issues in the environment to create a POC for the DevOps teams and the developers, and then helping them fix it. Just before this discussion, I was discussing with our team and a developer what things are required when they are creating a user in the environment.

There was a specific scenario. So these are the things I do in my day-to-day life, and most of them I've done today itself.

Host: Okay, so again, very hands-on work, looking at different attack vectors and how you can both attack and prevent as well. That's a good segue to the workshop, right? So let's get started. If you want to share your screen, maybe we start with Kubernetes, the different attack vectors, and the different ways you can attack a Kubernetes cluster, and we can go from there.

Divyanshu: Yes, sure. In this workshop I'll be showing you documentation where I have written out these steps, so that if anyone is looking at the screen and doesn't know a command, they are well aware what exactly the command looks like and how I am moving around. If I just copy and paste directly, whoever is watching won't be able to understand which command comes next, like out of the blue. That is why I'll follow the approach from the trainings I have given. So I'm starting to share my screen.

Host: Okay. Yeah, that's lovely, that's lovely.

At least our audience can go through the documentation as well and follow along when they watch the podcast.

Divyanshu: So, this is the name of my training, which I usually give, so I'm not changing it here. It's the workshop on Defending and Attacking Kubernetes.

Here we'll have a demo, basically. Usually it's more hands-on, but right now it will be a demo-style presentation, and I'll do most of the stuff myself, and there is a vulnerable application. If I get time, I'll try to show what the application looks like, and if not, I'll show you the YAML and the source code. I'll try to cover things based on the time. Before I start, here is the disclaimer which I have put in: these are some real-life attacks which I have learned over time, so do not perform these attacks against environments where you are not authorized. This is just an educational session for security engineers or anyone interested in understanding Kubernetes security. Over to you.

Host: Yeah. So one question I had: the URL that you are at, is that publicly available all the time, so that we can share it with our audience?

Divyanshu: No, it is shareable for this session. Once our session is complete, this will go down, because it's proprietary content. Right now I'm still working on setting these things up, so that is why I can't share it. Maybe in the future I'll do that. I'll paste something in the comment section from where folks can pick it up.

Host: All right, makes sense.

Divyanshu: So that's pretty much it; let's get started. First I'll go to my EC2 instance. It is Cloud9. I've set it up, and I'll start explaining the configuration which I've used, and I'll also explain the basics of how I have set up the cluster. This cluster has been set up with Kind. Kind is a tool which helps deploy a Kubernetes cluster into a local environment using Docker. When we talk about deploying a real cluster, we need a master and worker nodes, like one or two worker nodes which can connect to the master.

In a single lab that is not possible, because then we would need three EC2 instances and would have to set them all up, which takes time. That's why we'll use Kind and set up the cluster using Kind.

Host: Yeah, I have used Kind in the past, and it's fairly straightforward to use as well.

Divyanshu: So I've tried using K3s, Kind, minikube, and kubeadm. For this lab, Kind was the most appropriate because of the underlying Docker it uses, plus the kinds of attacks I wanted to show were possible in the Kind cluster, although for some I needed kubeadm. My main lab, the one you'll see running on this free ngrok URL, is deployed using kubeadm.

Because that is the actual way of deploying a Kubernetes cluster.

That is why, in the CTF section, the main lab, I've used kubeadm. But for demonstrating multiple attacks where I have to delete the cluster or redeploy multiple apps, I just wanted a seamless way of doing it, so that's why I picked Kind. Right, let's just go through the basics of this Kind config. It says kind: Cluster, so basically we are deploying a cluster. Then it talks about the version; we are using this v1alpha4 version. And then in the nodes we have a control plane, and if you look, there are two worker nodes.

In the control plane, these extraMounts are there. Usually these are not required; if I remove them, my cluster will still work. But I wanted to demonstrate the attacks, and for a couple of attacks this docker.sock mount point was required, so that is why it is mounted. And then, if you see at the end, in the networking section disableDefaultCNI is true.

That says I don't want Kind to deploy its default CNI. Instead, I have deployed Cilium. If you have heard about Cilium, it is an advanced networking fabric which uses eBPF technology to provide much more granularity in terms of network security, if we talk in layman's language. That is why I'm using Cilium with Kind.
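A Kind config along the lines described above (a control plane with a docker.sock extraMount, two workers, and the default CNI disabled) might look like the following. This is an illustrative reconstruction, not the lab's actual file:

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    # Deliberately over-exposed for the breakout demos; not needed for a working cluster
    extraMounts:
      - hostPath: /var/run/docker.sock
        containerPath: /var/run/docker.sock
  - role: worker
  - role: worker
networking:
  # Skip Kind's default CNI (kindnet) so Cilium can be installed instead
  disableDefaultCNI: true
```

With a file like this saved as `kind-config.yaml`, the cluster would be created with `kind create cluster --config kind-config.yaml`.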

Host: One question that I have: the attacks and processes that you will show, will they work with the default CNI as well? If somebody doesn't want to use Cilium, or doesn't have Cilium in production, let's say, will these attacks work?

Divyanshu: Yes, these attacks would work in all environments, except the part where I'll be showing Cilium itself, how Cilium blocking works in the defense section. For that we would explicitly need Cilium. Other than that, Flannel or Weave or anything would work, whichever network fabric the users have.

Host: Okay. Makes sense.

Divyanshu: Let me jump in; I'll see if I can start with some basics. Let's start with the container first, because Kubernetes comes after that. Let me zoom in before that; my apologies.

So, basically, a container is a kind of virtualization where we have an application running on containerd or another runtime. It came after VMware and VirtualBox came into the picture, where we had a host, and VirtualBox was taking memory from the host itself; it was a separate operating system which consumed our RAM and disk space. Then containers came into the picture. It is like a ship, as you can see, where we have different containers, and inside each container we have different things.

Let me give you an easy example. When we talk about containers, think of a ship moving from, say, the US to India, carrying fish and paper. In one container we have fish, and for the fish we basically require ice and a colder environment. For the paper, maybe newspapers or books, I need a hot, non-humid environment so that the books are not spoiled. The requirements are different, but we don't need two different captains or two different teams to handle them, right? Now, when we talk about virtualization, imagine we have a ship and inside that we have two more ships, each managed by a different captain. We basically have three captains consuming resources from the same ship. It is kind of an overburden. When containers came into the picture, we got a container box where we have provided electricity for the refrigerator to run, it's all cold and freezing inside, and our fish are safe.

In another container we have books and newspapers in maybe a hot environment, so now our papers are also safe, but the captain is the same, the team is the same, and my ship is traveling, right? I can use one single vessel and, based on the environment inside each container, do whatever I want. Now I can stack containers and have more of them, based obviously on the size of the ship, and do much more with the same team and the same captain. It's similar with the resources we have on the host operating system.

Host: I have never heard this analogy before, but I love it, because in real life it maps to containers and how shipping works. So yeah, thank you for sharing that.

Divyanshu: Thank you. So basically this diagram shows the same thing. We have a virtual machine on the left: there is the hardware, the host OS, and the hypervisor. The technology used in the past was the hypervisor, with apps and a guest OS installed on top. Those guest OSes shared all the resources down to the hardware level, so your host OS would be very slow and there would be sharing of RAM. If I give eight GB of RAM to my guest OS for app one, it will take all of it, right? That is not required when we talk about running an application with a specific requirement, maybe a Python application or a .NET application.

In those cases we have very specific requirements. Suppose I want a Python application running on Ubuntu. I can have an Ubuntu image which uses the container engine, a container with minimal Ubuntu requirements, and inside that a specific Python version. Everything is again shared, the same as in the analogy I gave: resources are shared in the same way a normal process works on the host OS. It is similar to a process; it doesn't take a dedicated slice of the hardware, so it won't be segregated as in the case of a virtual machine.

That is pretty much containers. Then we talk about Docker. Docker is an open source platform for building, shipping, and running containers. It is just a platform, and because it is easy to use and readily available, everyone is using it. That's why it is very common and everyone is deploying applications in containers using Docker itself.

My next image is a Dockerfile. A Dockerfile is used to create container images. First we have a Dockerfile from which the image is created, and the image is uploaded into a repository.

It is very similar to GitHub for code, but in the case of a container repository like Docker Hub or ECR, we are uploading the images we have created. The Dockerfile contains a set of commands. As you see, FROM ubuntu sets the base operating system we were talking about, COPY is just copying my code, RUN pip install is running pip and installing the requirements, EXPOSE is exposing port 1990, and then CMD is running my command. It will create the image, as you can see on the right side.
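A minimal Dockerfile matching that description might look like the following sketch. The port comes from the walkthrough; the file and script names are assumptions:

```dockerfile
# Base operating system for the image
FROM ubuntu:22.04

# Install Python and pip (assumption: the base image ships without them)
RUN apt-get update && apt-get install -y python3 python3-pip

# Copy the application code into the image
COPY . /app
WORKDIR /app

# Install the Python dependencies
RUN pip3 install -r requirements.txt

# Port the application listens on, as in the walkthrough
EXPOSE 1990

# Command run when the container starts (app.py is a placeholder name)
CMD ["python3", "app.py"]
```

Running `docker build -t myapp .` against this file produces the image, which can then be pushed to Docker Hub or ECR.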

That is pretty much how images work. I'm not going deep into control groups and namespaces because that is a very wide topic, but in brief: control groups are used to limit and prioritize CPU, memory, and I/O bandwidth. They provide a kind of separation so that one process, or one group of processes, does not interfere with others. Similarly, we have namespaces. There are several namespaces, like the network namespace and the mount (file system) namespace, which are used to isolate the host from the container which is running. Every time my container runs, it runs in separate namespaces, so it won't interfere with my host processes or my host network.

This is where the vulnerability, or I would say the misconfiguration, arises: when the container is configured in a way that it requires something from the host, like a file system path or maybe the process namespace, and we over-expose those things from the host. An attacker who has compromised the container, or any malicious process running in the container, can just come outside and try to take over the entire host, or attack that specific file system, performing the breakout kind of scenarios we will see in a later section of this workshop.

Host: It is somewhat similar to overprivileged IAM permissions, let's say, right? Using those, you can attack an AWS, GCP, or Azure account and then get access to the whole account, in a way.

Divyanshu: That's correct.

Host: Makes sense. Yeah, let's continue.

Divyanshu: Yeah, so that is pretty much it. I've listed a couple of namespaces, like the PID namespace, mount namespace, and user namespace. Out of these, we will look at the mount namespace, where I will be mounting the file system of our host into the container.

If my container wants to access a specific file system or something present on my host, in those cases I'll mount it there, and we'll see how it is possible for an attacker to perform a container breakout. These things actually help in the real world: when clusters are misconfigured, or when the teams are not very mature or don't know Kubernetes or container security, they will have misconfigured clusters. The same applies in black box pen testing, where you have found an RCE and you are inside the container, or in a gray box pen test, where you found an internal way in, maybe during an audit or an internal red team assessment, and got access to one of the applications running internally.

In those cases, once you are inside the container, it would be easy to break out and take over the host. And once you have the host, well, these days the containers are running in AWS environments, so those hosts will have some PEM keys, or maybe overly permissive roles, or hard-coded keys, which can lead to a complete AWS takeover.
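As an illustration of the kind of over-exposure being described, a pod spec like the following (a hypothetical manifest, not one from the lab) hands the container both the host's Docker socket and the root of the host file system, which is usually enough for a breakout:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: overexposed-app
spec:
  containers:
    - name: app
      image: ubuntu:22.04
      command: ["sleep", "infinity"]
      securityContext:
        privileged: true                    # full device and capability access on the node
      volumeMounts:
        - name: docker-sock
          mountPath: /var/run/docker.sock   # lets the container drive the host's Docker daemon
        - name: host-root
          mountPath: /host                  # host file system visible inside the container
  volumes:
    - name: docker-sock
      hostPath:
        path: /var/run/docker.sock
    - name: host-root
      hostPath:
        path: /
```

From inside such a container, `chroot /host` or talking to the mounted Docker socket is typically all it takes to act as root on the node.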

So that is containers, and then we talk about orchestration. We have been talking about one or two containers, but in a microservices architecture we can have more than 1,000 or maybe 10,000 containers. In those cases, how will we manage them? The deployment, management, scalability, and networking all need to be managed, right? If my container dies while I'm trying to deploy it in production, how will I bring it back up, how will I even get to know that it has died? For these things we have something called Kubernetes.

It is an orchestration engine that solves this problem, and it is backed by Google. Using Kubernetes, we can orchestrate the deployment of these containers onto the infrastructure.

It is very easy to restart a pod, check whether a container is not running or has some issue, do a deployment, or update an application. In those cases we can use Kubernetes, and it makes it very easy for us to understand how our infra is working. For this workshop, I'll be using Kubernetes YAMLs, which are used to deploy infra in the real world as well. Lab scenarios might be a little different, but they overlap with real scenarios, as the YAMLs are created in a similar way and uploaded into Git repositories. From a security engineer's perspective, we perform scans via tfsec or Checkov on this infra-as-code, which includes the YAML as well. From there we also get an idea of what kinds of misconfigurations are present in the deployment itself, so that we know before it is deployed and can ask the developers or DevOps team for a fix. In the next section we'll talk about deploying a vulnerable application, what the YAML looks like, and how the security testing is done.
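The kind of static check that tools like Checkov or tfsec run over manifests can be sketched in a few lines of Python. The rule wording, the helper function, and the sample pod spec below are illustrative, not any real tool's rules or output:

```python
# Illustrative sketch of a static manifest check, the style of rule engine
# that IaC scanners implement. All names here are hypothetical.

RISKY_HOST_PATHS = {"/", "/etc", "/proc", "/var/run/docker.sock"}

def audit_pod_spec(spec):
    """Return a list of misconfiguration findings for a parsed pod spec."""
    findings = []
    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        security = container.get("securityContext", {})
        if security.get("privileged"):
            findings.append(f"{name}: container runs privileged")
        # allowPrivilegeEscalation defaults to true when unset
        if security.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: privilege escalation not disabled")
    for volume in spec.get("volumes", []):
        path = volume.get("hostPath", {}).get("path")
        if path in RISKY_HOST_PATHS:
            findings.append(f"volume {volume.get('name')}: sensitive hostPath {path}")
    return findings

# Sample over-exposed pod spec, like the breakout scenarios in the workshop
pod_spec = {
    "containers": [
        {"name": "app", "securityContext": {"privileged": True}},
    ],
    "volumes": [
        {"name": "docker-sock", "hostPath": {"path": "/var/run/docker.sock"}},
    ],
}

for finding in audit_pod_spec(pod_spec):
    print(finding)
```

Real scanners parse the YAML first (for instance with a YAML library) and ship hundreds of such rules; the value is the same, though: the misconfiguration is flagged before anything reaches the cluster.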

There were a couple of places where I wanted to talk about RBAC and how RBAC works, but for this workshop I'll go through it within the lab itself. I'll explain authentication and authorization and what we mean by them in terms of the Kubernetes infra itself. Then I'll show the YAML used to deploy an application with overly permissive permissions, and we'll see how it increases the attack surface. I'll also try to get a reverse shell, and I'll try to show what we mean by overly permissive RBAC in a real-world scenario. So let's get started.

As we have discussed, Kubernetes testing is basically testing of the Kubernetes cluster, because all the applications will be deployed on pods. A pod is the smallest unit in Kubernetes. When we talk about Docker we usually say container, but when we talk about Kubernetes it is the pod; a pod is essentially a wrapper around one or more containers. Usually we assume a pod has only one container running, but in some scenarios we can have two or three containers running within a pod. Let me show you; I'll run kubectl get pod.

You will see right now I have this 1/1; this is our pod. We can also have 2/2, which means that in one of our pods we have two or more containers running, based on the number.

Before that, I just want to explain kubectl. kubectl is a CLI tool used to communicate with the Kubernetes cluster. Let me also show you the nodes, how many nodes we have and how our cluster is deployed. Okay, so you'll see we have this control plane and two worker nodes. All of these are running inside Docker, since Kind is basically using Docker as its underlying technology.

That is why the workers and the control plane are all deployed in Docker itself; it is a Docker-in-Docker scenario. Whatever these worker nodes are, they are running on top of containers. In a real scenario, we have a control plane, or the master, which all the worker nodes talk to and where everything is coordinated. It is basically the brain behind our Kubernetes; it is the master. In EKS or GKE, in those clouds, we don't have the master, the control plane, exposed.

If you are in EKS and you run kubectl get node, you won't get the control plane, because you can't access it directly. In a demo cluster, or in clusters deployed on EC2 instances where we are deploying the master ourselves, we can definitely see it. But when we talk about the cloud, we don't have access to the master nodes or the control plane; in those cases we will only see the worker nodes.


Host: That is the managed piece of the cloud, right? They manage the control plane so that you just focus on your nodes. So it makes sense.

Divyanshu: That basically helps developers and DevOps. But as a security engineer, if we could find a way to take over the control plane in a cloud environment, that would be, I would say, a big misconfiguration or vulnerability. Obviously the providers have their protections in place, and although people keep trying, it is not really possible for anyone to come along and directly hack into the master or the control plane of the cloud.

So we can assume that the control plane itself is safe, and we just have to focus on taking care of worker node security.

These are two worker nodes, so it is a three-node cluster. When we run kubectl get pod, the request reaches the API server, which is inside the control plane, and the control plane checks etcd, a key-value store, kind of a DB, which keeps the state: how many pods are running and those kinds of things, for maintaining and tracking. That is how we got the pod listing. This is a layman's architecture of Kubernetes. Since we are talking more about security, I just wanted everyone to go through the basic Kubernetes architecture, although I'm not explaining these things in detail.

Host: No, thank you for doing that. It helps somebody who is starting out in Kubernetes; they can understand the basics and then get into the attack side of things.

Divyanshu: These topics are very big in themselves. When I started, it took me around one month to understand how these deployments happen in the cloud, and how the master works, because in a managed cluster you won't be able to see it, right? And right now folks are mostly starting out on GKE, so they won't see all these things.

There are a couple of things which change, right? That is why this is an easy way of understanding the basics. Then, when we move towards EKS in the future, or if someone goes through the documentation, they will be aware that they won't see the control plane, because it is managed by AWS.

Now, when we talk about security risks: since we have containers running and anyone can deploy containers, there is an increased attack surface. The second thing, if you remember from our images, where we used a Dockerfile to create the image: if someone has created a malicious base image and uploaded it publicly, on Docker Hub or anywhere public, and we are pulling it from there, it can be a backdoor, or it can be an image vulnerable to an old CVE, which can create havoc. Then the next thing is threat actors. It can be an internal or external person who has access, maybe to connect to the master or talk to the API server, or any internal user, maybe a disgruntled employee, who tries to get data or exploit the Kubernetes cluster for some kind of benefit. Those threat actors can have some permissions with them, and then they can try to exploit it. And then we have malicious containers themselves, like backdoored containers. There is a very thin line between a vulnerable container image and a malicious container, because images themselves can have CVEs or vulnerabilities.

That would be more or less a vulnerable container image. Malicious containers are those which directly contain the backdoor: the moment you deploy those backdoored containers, those backdoored images, they give the attacker a reverse shell. That is more of a malicious container; or a container that, once run, starts scanning the internal network. Those are malicious containers: they are not directly vulnerable and cannot be exploited by any user, but they will exploit something or scan the internal network, right? So there is a very thin line, I would say, between the two. And then insider threats and malicious threat actors lie somewhere between them. Threat actors can be anyone: if I am performing an external scan for some bug bounty program, or any malicious user is doing that, we are considered threat actors because we are trying to find an issue. As a bug bounty hunter or a security engineer, out of good faith, I would go and report it. But those not acting in good faith, who want to exploit something or gain from those scenarios, will go and start exploiting them and maybe try to exfiltrate data.

Host: Right, makes sense.

Divyanshu: Next are the security best practices.

First, authentication and authorization. In Kubernetes, when a user is created in the cluster, by default there won't be any access. If we talk about authentication and authorization in general: imagine we are boarding a plane. We are authenticated to get inside the plane, but we are not authorized to get inside the cockpit where the captain sits and flies the plane. Or, if we talk about AWS, which everyone can relate to: I have a user created, but I have not attached any policy to that user, so the user can't do anything.

Attaching the policy is the authorization part, and the creation of that user in the AWS environment is the authentication part. That is how authentication works. In Kubernetes, whenever a pod is deployed, that pod needs some kind of permission to talk to the Kubernetes cluster, or to other pods in the network, based on its requirements. Right now I was able to run kubectl get pods because I was admin; I had all the privileges. But if I create a user for you, say Purusottam, then until I grant that user permission to list or create pods, you are not authorized to perform those activities. The verbs we allow as part of authorization in Kubernetes are similar to HTTP verbs.

So that is what authentication and authorization is. Yeah, I will also show that. The next part is network security, where two pods are communicating within a namespace. I just wanted to add here, before I move to the next part, that by default namespaces do not block any type of communication. I've seen people assume that applications running in different namespaces are separated and segregated, but that is not true. A namespace is just a logical separation when we are running pods or anything else in the Kubernetes cluster.
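To actually block cross-namespace traffic, you need a NetworkPolicy. A minimal default-deny sketch might look like this (the namespace name is an illustrative assumption, and a CNI plugin that enforces NetworkPolicy must be installed):

```yaml
# Hypothetical default-deny policy: blocks all ingress traffic to every pod
# in the "team-a" namespace. The namespace name is a placeholder.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a
spec:
  podSelector: {}      # empty selector = applies to every pod in the namespace
  policyTypes:
    - Ingress          # no ingress rules listed, so all inbound traffic is denied
```

With this in place, traffic would only be allowed again by adding further NetworkPolicies that explicitly whitelist sources.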

And the next thing is the CIS security scan. CIS is just a benchmark of best practices: a group of like-minded Kubernetes security folks came together and created a set of practices that are easy to follow, so everyone has a baseline idea of how to secure their Kubernetes cluster, how to find misconfigurations, and which common things to check. Then self-assessment of security in the cluster is about pen testing the cluster and doing Red Team assessments of it: what are the entry points, what are the exit points, what are the vulnerabilities? And then trying to exploit those scenarios, or maybe finding something that can lead to a DoS or, in the worst case, a complete cluster compromise.

And the last part is infrastructure pen testing and patching. When we deploy Kubernetes, we usually do it on EKS or GKE, right? So apart from the Kubernetes cluster itself, we have to make sure AWS or the cloud itself is safe. If I'm a user with full permission to create or deploy the cluster itself, I have access to everything: I could create my own user and do whatever I want, or simply go and delete the cluster, causing a DoS without even having access inside the cluster.


So these are some common best practices and some of the scenarios we will see in the next lab. Before that, Purusottam, if there is anything you want to discuss, I would be happy to answer before we move to the hands-on scenario.

Host: Sure. One thing that I want to double-click on is what you highlighted: just because you're deploying your workloads into two different namespaces doesn't mean the communication between them is blocked; by default it's allowed, which a lot of folks may not know. So thank you for highlighting that, and I love your plane analogy for authentication and authorization as well.

So, yeah, I'm all good. Let's move to one of the scenarios of attacking a cluster.

Divyanshu: Okay, so just let me see where we are in terms of the application itself. Let me just check whether my application is running or not before we move.

Host: Sure. Although you have already deployed it, what does this application do? If you can briefly share what this application is?

Divyanshu: Before that, I'll just go into the hands-on part as well, checking where I deployed this application so that I can show the explanation in the flow. Yeah, so just let me start my application. Sorry, it is in this folder. I'll explain the YAMLs first, then the application, and then I'll show you how our application is working. Let me open these files first.

I'll obviously come back to that part. So in Kubernetes we have Role and RoleBinding, and ClusterRole and ClusterRoleBinding. A Role is typically scoped to a namespace. So when I run kubectl get pods, you will see I can only see one single pod, but watch what happens if I run kubectl get pods with an extra flag.

So you will see, with -A, which means all namespaces, I was able to see multiple pods, right? With the get pods command the output is segregated by namespace: different namespaces give different results. But when I do, sorry, let me zoom in here also, when I do kubectl get nodes, you will see I still get three nodes. This is free from any namespace; it is not namespace-specific. Whether I am in namespace A or the kube-system namespace or any other namespace, it would show me the same three nodes. So there are a few resources that live within a namespace boundary and a few that do not, right? And also you have permissions to access some resources and not others. Based on that, we decide how the permissions need to be given. For permissions where we want an application or user to access a specific namespace or a specific part of the cluster, we would use a Role and RoleBinding; but when we want to give access across the entire cluster, we would move to a ClusterRole and ClusterRoleBinding.
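A namespace-scoped Role and its binding might be sketched like this (the role name and the "purusottam" user are illustrative assumptions, not values from the lab):

```yaml
# Hypothetical Role: allows only getting/listing pods in kube-system,
# bound to a single user. All names here are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: kube-system
rules:
  - apiGroups: [""]          # "" is the core API group, where pods live
    resources: ["pods"]
    verbs: ["get", "list"]   # the allowed verbs, similar to HTTP verbs
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: kube-system
subjects:
  - kind: User
    name: purusottam         # illustrative user name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Because both objects are namespaced, the user gets these permissions only inside kube-system and nowhere else.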

Host: So in which scenario, in which case, would you use just the Role?

Divyanshu: We would use a Role when, suppose, I want to give access to a specific namespace. Say I want a user, like Purusottam's user, to see only the kube-system namespace, for a specific set of pods. In those cases I would use a Role and RoleBinding, because I just want the user to access that specific namespace, right? If I wanted you to access all the nodes and all the namespaces, and I created a Role for every namespace, it would take ten YAML files to be created and deployed, right? We could do it a different way, but this is how it should be done. A ClusterRole is where we are giving cluster-wide permission, not full permission but cluster-wide. And when we talk about a misconfigured or overly permissive ClusterRole, it always starts with a star. You'll see a star everywhere, like you can see here: the apiGroups are a star.

That means all the API groups within Kubernetes are allowed. Then the kinds of resources, like pods or namespaces, everything is again a star. And the verbs, like list, delete, or create, are a star too. So everything you see here is a star, right? And the Kind is ClusterRole, which means we are creating a ClusterRole, and this is the name of our ClusterRole, just a name, like your name or my name.

So this is just the name itself. Next is the binding part. What do we mean by binding, and why do I keep saying RoleBinding or ClusterRoleBinding? We have to tell Kubernetes that this ClusterRole is associated either with a person or a service account. A binding just says that this Role or ClusterRole is associated with, say, the Divyanshu user or the Purusottam user, so that every time the Kubernetes API server sees that Divyanshu is trying to access something, it knows, based on the permissions in this ClusterRole which is bound here, that the user Divyanshu can access it.

And in the case of machines we don't have a username and password; machines can't go and log in, and they can't create certificates. So for that we have a service account. A service account token is mounted inside a container, and based on the permissions that service account has, my container would be able to communicate: talk to the node, create more pods, list pods, or do whatever it needs, whether that's some communication or some creation. That is why permissions are given.

Host: So binding sounds like, let's say in AWS, you have a policy, which is very much like your ClusterRole, and then you have a user or access key, and you are attaching the policy to that user. So that is the binding in the Kubernetes context, correct?

Divyanshu: Correct. It is even closer to Google Cloud, where Google itself uses service accounts for machines.
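The overly permissive pattern being described, a wildcard ClusterRole bound to a service account, might be sketched like this (the object names and namespace are illustrative placeholders, not the exact lab values):

```yaml
# Sketch of an overly permissive ClusterRole: wildcard ("*") API groups,
# resources, and verbs, bound cluster-wide to a service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: overly-permissive-role
rules:
  - apiGroups: ["*"]   # every API group
    resources: ["*"]   # every resource type
    verbs: ["*"]       # every action: get, list, create, delete, ...
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: overly-permissive-binding
subjects:
  - kind: ServiceAccount
    name: app-sa       # placeholder service account name
    namespace: demo    # placeholder namespace
roleRef:
  kind: ClusterRole
  name: overly-permissive-role
  apiGroup: rbac.authorization.k8s.io
```

Any pod that mounts this service account's token effectively becomes cluster admin, which is exactly what makes this misconfiguration so dangerous after a compromise.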

If you know how machines in Google Cloud get service accounts, this is exactly similar. In this scenario, we are creating an overly permissive ClusterRole, you can see the star star, and I'm binding it to a service account, because I will run an application, and then we will try to exploit that application and get inside it via RCE. Once we have RCE, I'll try to do a couple of things which I'll show you. And this is the deployment of my application: the Kind is Deployment. There are multiple types of workload objects, multiple ways to deploy a pod.

Pods are there, StatefulSets are there, ReplicaSets are there, Deployments are there: multiple ways by which you can deploy a pod or a container. When I create just a Pod and that container dies or exits, it won't come back up automatically; it will just stay in that state. But in the case of a Deployment, you can see my replicas field is set to one. This is the desired count, and if my application or my container or my pod dies, it will come up again: the Kubernetes control plane will try to bring it back up every time the container dies.

So that is why we are using the Deployment. This is where Kubernetes helps, right? If you have defined replicas, Kubernetes will make sure that many instances are available. That is how Kubernetes comes into the picture: if we had just a plain container, we wouldn't be able to do this, and our pod wouldn't come up every time it dies.

Correct. Then we have labels. A label is like a tag: if we want to manage our pods based on a label, we can do that. It is similar to any tagging system, like in AWS or anywhere else. And then we have this serviceAccountName field; you will see the service account name is attached here.

So this is the service account. If you look at my binding, I have created this service account and I am adding it here. So now my application will use this service account, and whatever permissions the service account has, my application will have the same set of permissions.

And the last part is the image, which is the insecure Python app. It is on Docker Hub, and this is the name of my repository. The main part is this insecure Python app running on port 8000; I just mentioned that the container port is 8000. Kubernetes will pull this image, create a container, mount the service account into it, and the container will have that same set of permissions, right? Then I'll show a couple of other YAMLs very quickly.
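Pulling the pieces just discussed together, the Deployment might look roughly like this (image path, names, and namespace are placeholders, not the exact lab values):

```yaml
# Rough sketch of the Deployment being walked through: one replica,
# a service account mounted into the pod, and a container on port 8000.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: insecure-python-app
  namespace: demo
spec:
  replicas: 1                      # desired count; Kubernetes restarts the pod if it dies
  selector:
    matchLabels:
      app: insecure-python-app
  template:
    metadata:
      labels:
        app: insecure-python-app   # labels act like tags for selecting pods
    spec:
      serviceAccountName: app-sa   # this SA's token gets mounted into the container
      containers:
        - name: web
          image: example/insecure-python-app:latest   # placeholder image path
          ports:
            - containerPort: 8000
```

Whatever RBAC permissions are bound to app-sa become the permissions of any code running inside this container, which is the crux of the attack scenario.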

The namespace YAML just creates the namespace: you can see the Kind is Namespace. Notice that in every YAML I deploy, the Kind changes. Where I deploy a ClusterRoleBinding, the Kind is ClusterRoleBinding; where I deploy the ClusterRole, the Kind is ClusterRole; for the namespace, the Kind is Namespace. Whatever object I'm deploying, I have to mention it in the Kind, along with the name I'm going to give it. And then we have the Service and the ServiceAccount.

Don't get confused between Service and ServiceAccount. Services are the networking part of Kubernetes: if I want my container to be accessed from outside the cluster or within the cluster, I'll use different Service types. The Kind stays Service, but the type changes. There are a couple of types, like ClusterIP, NodePort, and LoadBalancer, and there is Ingress as well. LoadBalancer is mostly for cloud environments; it won't work in bare clusters like this.

Then we have ClusterIP, where we want communication within the cluster itself; as the name says, a cluster IP, so one pod can communicate with that specific pod within the cluster. When we want to expose our application and have it accessed from outside, we could use a LoadBalancer, but because we are running inside Docker, we use a NodePort here. It is Docker in Docker: this application runs on top of a Docker container, and that container communicates with my host, so we won't be able to reach the NodePort directly. In a real scenario, we would just enter the node port and the application would be accessible directly.

That won't happen in this scenario because it is a kind cluster, so we do a small port forward: all the traffic hitting my external IP is redirected to port 8000 on the specific Kubernetes node, and inside that, to the container. Because there are two layers of containers running, that is why we are doing the port forwarding.
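A minimal NodePort Service for the app might be sketched like this (names, namespace, and the chosen nodePort are illustrative assumptions):

```yaml
# Sketch of a NodePort service: exposes the app's port 8000 on a fixed
# port of every node. All names and the nodePort value are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: insecure-python-app
  namespace: demo
spec:
  type: NodePort            # reachable at <node-ip>:<nodePort> from outside the cluster
  selector:
    app: insecure-python-app   # matches the pod labels from the Deployment
  ports:
    - port: 8000            # service port inside the cluster
      targetPort: 8000      # container port the traffic is forwarded to
      nodePort: 30080       # must fall in the default 30000-32767 range
```

On a cloud cluster you would browse straight to the node's IP on the nodePort; in the Docker-in-Docker lab the extra port forward bridges the host to that port.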

And the last part is the service account itself, which we have been mentioning again and again. The ServiceAccount YAML is just the Kind, the name of the service account, and the namespace the service account lives in. These are the only YAMLs required when we deploy an application. Instead of a Deployment you could have a ReplicaSet or a StatefulSet, so it depends, but whenever you want to deploy an application or create a cluster, these are the only things required. This is the minimum.
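The two remaining small manifests are nearly trivial; with placeholder names they might look like:

```yaml
# Namespace and ServiceAccount sketches; "demo" and "app-sa" are
# illustrative names, not the exact ones used in the lab.
apiVersion: v1
kind: Namespace
metadata:
  name: demo
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
  namespace: demo    # the namespace this service account lives in
```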

These are the bare minimum that you need, correct? If I have these things, I can deploy the application wherever I want. This is how we can start. And if I have an idea about all these values, then when I see something, maybe in a GitHub repo or on a host operating system during a pen test, I'd have an idea, right? If there is a ClusterRole, there should also be a service account or a user. It gives us an idea of how the infra has been created. That is pretty much it; let me just close these tabs very quickly.

To deploy the application, we don't have to do anything. I'll just go into the folder itself.

Okay. Because I've already deployed it, I'm not redeploying, but I'll show you the command just to give the idea. I have kubectl create -f and the name of the folder itself. Since I'm inside the folder, I can just use a dot.

This kubectl create -f command takes a set of files, or the folder where all my files are, and creates everything. Suppose I go outside this folder: then I have to mention the full path to the folder, and it will take all the files and create everything for me. Let me try to create that, and I'll show something interesting here.

kubectl create -f .

I'll do this. Yes, so you will see it says that everything already exists: the Role, the ClusterRole, the ClusterRoleBinding. Because I've already created them, that is why it is showing this error. If these were not created, it would create everything for me.

Or if I make any changes, it will show "configured". So this is our basic sample application. Now I'll quickly do a port forward so that we can see what is running on top. Hmm, I think I have missed something; let me see what namespace I have given here. Yeah, the namespace had a different name. Yes.

Actually I was deploying it using a different name. Right.

Anyway, I'll just copy this namespace and do a port forwarding again.

Let me see what has happened. Sure. Yeah, I think the service name also needs to be fixed. Actually, there are multiple applications running; I have multiple clusters, right? So I think I've deployed this on another cluster. Basically, I'm using a different lab for this, and the cluster name I'm using is different.

That is why kubectl is showing the error. Okay.

I've basically copied the wrong command. That is pretty much it. I'm just seeing what is the name of the service.

So I think, if you remember from day one when we started deploying the cloud lab, nothing was ever fully working in my cluster either; something was always down.

Yeah, I think we just need to change the service name here. Correct. This actually gives a great view of the debugging process, so I'll just explain it. You can see it was showing an error that the service and the namespace were not found, so I used the get svc command in that namespace to see the name of my service, and I've mentioned that name again along with the namespace.

And now when I try, it should work. Right, you can see it has not shown me any error. So I'll just copy this public IP and enter port 8000 here.

This is the insecure password manager. You can see the application is working fine, right?

Host: Thank you everyone for joining us in this insightful workshop. This is part one of a multi part series focusing on Kubernetes learning, attack and defense. We hope you found Part One informative and engaging. Stay tuned for Part Two, where we'll dive deeper into practical strategies of attacking a Kubernetes cluster, which will help in understanding an attacker's mindset and areas that attackers exploit in general. See you soon. Thank you.
