Network Security

Hello everyone. My name is Muoi Tran, I am a post-doc in Laurent's group. I have been here for two years. I'm moving away, actually, to another country, becoming an assistant professor. I'm very happy to be here with you today, talking about what I have been doing in the last several years, about network security.

Slide 1

Before we begin, I'd like to ask you one question. What comes to mind when you think of network security? Come on, throw me some keywords. Anyone? TLS. Yeah, that's a good one. Anything else? Firewall. That's one thing that we are going to talk about today. DDoS. Yeah, exactly what we are going to talk about today.

Slide 5

So today we are going to focus on one of the very long-lasting and open challenges in network security, which is DDoS, which stands for Distributed Denial-of-Service.

Slide 6

Slide 7

So, for DDoS, today we are going to talk about a couple of things. First, we are going to try to understand what the popular DDoS attack vectors are, what the real things happening on the Internet nowadays are. We will also look at the common DDoS defense mechanisms. And because we learned P4 a couple of weeks ago, we will also learn how to apply network programmability for DDoS defense.

And for the exercise session, you will have some hands-on labs, where you actually defend against kind of real DDoS attacks. And just keep in mind that the exam questions will cover all of them. So I hope that you stay with me throughout the lecture and the exercise as well.

Slide 8

Slide 9

Slide 10

Slide 11

Slide 12

And continue with the logistics. Thanks to Laurent, we are already running behind. But then what I will propose is that we will use the first hour to talk about the attacks and the defenses. Then you have 15 minutes to break. I will give my signal again, and clock these three. And then we spend about half an hour to talk about the third part. And for the exercise, it's a standalone session. Sounds good? Okay, good.

Slide 13

So first, why should we care about DDoS? So there are many, many cyber attacks nowadays on the Internet. And DDoS is basically one of the most popular vectors out there.

So I captured this snapshot less than a month ago, showing real DDoS attacks as they were happening. The time frame is about 10 seconds, and it records more than 200 DDoS attacks occurring during that time frame. And if you do the math, it turns into about 50,000 DDoS attacks happening every day.

Slide 14

Slide 15

Slide 16

DDoS attacks are very popular, and they are also very harmful for all the enterprises and companies out there, including the big names like Twitter, GitHub, and PayPal, for example, that you are seeing on the screen right now.

And not just the big companies: DDoS attacks affect a lot of individuals as well. So if you are into gaming, you will know how important latency is to gaming performance. And a DDoS attack was actually recorded to affect a pro CSGO gamer a couple of years ago. And you may wonder why this may affect you. Actually, if you are into gaming, a DDoS attack can be launched against your game as well, because it is so cheap to launch.

Slide 17

Slide 18

So the attack cost actually starts from only $10 USD for a simple target, like a trivial, easy, or simple website. And this is one very worrying thing about DDoS attacks: they are super cost-effective.

So the cost of launching a DDoS attack is super cheap, but the cost to the enterprise, basically the cost of defense and the loss of value, is way higher than that. Some statistics show that you can lose up to $6,000 USD when being under DDoS attacks.

Slide 19

Slide 20

Slide 21

So I have one quick question for you. Who do you think are the perpetrators of DDoS attacks?

I have three options right here. Unhappy customer, like the losing guys in the CSGO games.

State level attackers mean that some countries attack some other countries.

And competitors mean, what does it mean? You are trying to kill the opponents, kill the competitors.

What do you think?

Soak it in?

Okay, who thinks the answer is A?

Okay, we have some answers.

Who thinks it is B?

Good, good.

How about C?

Okay, we see the winners here.

The answer is actually C, the competitors.

Slide 22

Slide 23

So what happened is that Cloudflare actually surveyed their customers, and asked them, who do you think launched the attack against you?

And most of them are very, very confident that the majority of the DDoS perpetrators are the competitors.

And one interesting thing they pointed out is that not all the DDoS attacks are intentional.

Actually, more than 3 percent of the DDoS attacks were launched by the customers themselves.

Basically they want to test out whether Cloudflare is as effective as they say.

And it turns out, yeah, it's a case of DDoSing yourself.

Slide 24

Slide 25

Okay, so to answer the question I raised earlier, why should we care about DDoS?

The short answer is that DDoS is an open problem, and it is not going to go away any time soon. So it's important for you to learn about the attacks and learn about the defenses, which I hopefully will tell you about in this lecture.

Slide 26

Alright, so let's jump right into the first bullet point, which is understanding popular DDoS attack vectors.

Slide 27

Alright, so the first question we need to answer is, what are distributed denial-of-service, or DDoS, attacks? And like anyone in 2024, I asked ChatGPT, what is DDoS? And it turns out that, after a couple of tries, ChatGPT came back to me with a pretty good answer, which I will read aloud for you.

So DDoS attacks are cyber attacks that overwhelm the target's network, service, or server, with a flood of traffic from multiple sources, aiming to render the target unavailable to legitimate users. And the reason why this is such a good answer is that it basically tells all the important factors or properties of DDoS attacks.

First, it talks about the attack focus, which is the network, the service, or the server of the targeted victim. It also talks about the attack strategy, which is basically trying to flood the target from multiple sources. This is where the "distributed" comes from. And of course, there is the attack goal, which is basically to deny the service to legitimate users.

Slide 28

Slide 29

Slide 30

Slide 31

Slide 32

Alright, so let's look at it more closely with one example, which I actually showed earlier. So this is one of the very famous DDoS attacks, from 2016: a DDoS attack against DynDNS. Back then, DynDNS was one of the major DNS providers on the Internet, and it was targeted by a Mirai-based botnet.

So what happens from the user perspective is that, basically, if you try to use DynDNS to resolve a domain name into some IP address, you simply cannot do so: the packets sent to the DNS servers are dropped, or at least most of them. And from Dyn's perspective, they have multiple DNS servers, and I will show one of them on the right side for now.

So what they see is that there are a few hundred thousand new IP addresses making requests to their servers. They trace them back, and it turns out that more than a hundred thousand of them are Mirai-infected devices. In case you haven't heard of Mirai, it's a botnet: basically IoT devices that are compromised with malware, which enables the attackers to control these IoT devices and make them send certain traffic to some destinations.

And another thing is that they also see a lot of TCP and UDP traffic coming to port 53, so basically it looks like DNS requests. And they also see huge spikes in traffic volume: 50x more than normal traffic.

Slide 33

Slide 34

Slide 35

Slide 36

Slide 37

So besides this attack, there have been several other high-profile DDoS attacks in the last couple of years or so. If you follow the news, you know this one from a long time ago: Spamhaus, an anti-spam organization that maintains blocklists used by email providers, was attacked by a very sophisticated type of attack called a link flooding attack, and it took them several days to resolve it.

And you already know the Mirai botnet. Then there is the one in 2015, an attack on GitHub, which was one of the smart DDoS attacks as well.

So you know that in China they have the Great Firewall. What happened is that anyone trying to access Chinese websites through this firewall was used as a reflector to attack GitHub. So it's a one-of-a-kind attack, I would say. And then I think one of the biggest and more recent ones is the attack against Google last year, called HTTP/2 Rapid Reset.

And it's one of the attacks that hit Google, and even Cloudflare and Amazon, and they say that at the moment there is no solution yet. So basically all they do at the moment is just absorb the traffic.

Slide 38

So all these attacks have the same attack strategy and the same attack goal. The strategy is multiple sources. The goal is denial of service, right? But the ways they do it are different.

Each of these ways is called a DDoS attack vector, and there are several of them. So the challenge here is: how do you remember all of them? How do you understand all of them? I will actually go into a couple of popular DDoS attack vectors.

Slide 39

So one of them is DNS amplification. So you have the attacker on the left-hand side, and the victim on the right-hand side. What happens is that you have a lot of DNS resolvers around the world, and the attacker can simply make DNS requests to these resolvers, but with the spoofed IP address of the victim. What happens after that is that the DNS resolvers will send their replies back to the victim.

And one key part of DNS amplification is that it amplifies the attacks, meaning that the attacker will need to send a very small amount of traffic, but then the victim will receive way more in terms of volume.
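To make the amplification idea concrete, here is a minimal arithmetic sketch. The packet sizes and the attacker's sending rate are hypothetical numbers chosen for illustration, not measurements from the lecture, and the function names are made up for this example.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Bytes delivered to the victim per byte the attacker sends."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


def victim_traffic_bps(attacker_bps: float, factor: float) -> float:
    """Traffic arriving at the victim if every spoofed request is answered."""
    return attacker_bps * factor


# Hypothetical sizes: a 60-byte query that triggers a 3,000-byte reply.
factor = amplification_factor(60, 3000)

# Hypothetical attacker sending 10 Mbit/s of spoofed queries.
print(f"factor: {factor:.0f}x")
print(f"victim receives: {victim_traffic_bps(10e6, factor) / 1e6:.0f} Mbit/s")
```

The point of the sketch is only the ratio: the attacker pays for the small requests, and the reflectors pay for the large replies.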

Slide 40

Slide 41

Slide 42

Slide 43

Another popular DDoS attack vector is SYN flooding.

So again we have the attacker and the victim. Normally, a TCP handshake has three steps. You have SYN, SYN/ACK, and ACK, right? But what happens in the SYN flooding attack is that the attacker sends a lot of SYN packets to the victim directly. And of course the attacker can also spoof the IP addresses. So normally the victim will need to send a SYN/ACK back to the source of each SYN packet. And in this case the attacker will simply not respond and not complete the TCP handshake. And because of that, the victim will run out of connections.
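The state exhaustion described here can be illustrated with a toy model of the victim's table of half-open connections. This is a simulation sketch only: the class name, the capacity of 128 entries, and the 30-second timeout are assumptions for illustration, not values from the lecture or from any particular TCP stack.

```python
from collections import deque


class SynBacklog:
    """Toy model of a server's table of half-open TCP connections."""

    def __init__(self, capacity: int, timeout: float):
        self.capacity = capacity   # max half-open connections (assumed value)
        self.timeout = timeout     # seconds before a half-open entry expires
        self.pending = deque()     # (expiry_time, source), in arrival order

    def _expire(self, now: float) -> None:
        # Drop entries whose handshake was never completed in time.
        while self.pending and self.pending[0][0] <= now:
            self.pending.popleft()

    def on_syn(self, source: str, now: float) -> bool:
        """Return True if the SYN is accepted (a SYN/ACK would be sent)."""
        self._expire(now)
        if len(self.pending) >= self.capacity:
            return False           # table full: the SYN is dropped
        self.pending.append((now + self.timeout, source))
        return True

    def on_ack(self, source: str) -> None:
        """Handshake completed: free the half-open entry."""
        self.pending = deque(e for e in self.pending if e[1] != source)


backlog = SynBacklog(capacity=128, timeout=30.0)

# The attacker sends 1,000 SYNs from spoofed sources within one second
# and never answers the SYN/ACKs.
accepted = sum(backlog.on_syn(f"spoofed-{i}", now=i / 1000) for i in range(1000))
legit_ok = backlog.on_syn("legit-client", now=1.0)
print(accepted, legit_ok)
```

In this model the legitimate client is refused until the stale entries time out, which is the "run out of connections" effect from the victim's point of view.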

Slide 44

Slide 45

Slide 46

So another popular DDoS vector is called Slowloris. So basically you try to overwhelm the victim with HTTP connections. So as usual you have the normal user: you just send the HTTP GET, and then you get the response.

What the attacker does in this case is basically send a lot of requests, but never complete them. Just like legitimate users, but very slowly. Just wait for me. But then when the attacker does this, the victim just runs out of open or available connections for the legitimate users.

Slide 47

Slide 48

Slide 49

Alright, so as I said, how do we understand all these types of DDoS vectors? These are just three of them. There are way more than that. How do we understand all of them? And it turns out that you can look at DDoS from multiple perspectives: you can see it from the victim's perspective, and you can also see it from the attacker's perspective.

Slide 50

Slide 51

Slide 52

So what do we mean by that? From the victim perspective, you can ask a simple question. What components of my connection are affected by DDoS attacks?

So as you see from the OSI model, you have seven layers. But DDoS attacks normally target the network layer, the transport layer, and the application layer.

For the DDoS attacks that affect the network layer, as the victim, you can see that all of your bandwidth is consumed. And a couple of DDoS vectors that can be categorized as network layer are UDP flood, ICMP flood, and DNS amplification, which I showed you a couple of slides ago.

For attacks affecting the transport layer, you can see that all available state on network devices, such as firewalls or load balancers, is consumed.

So basically, let's say there are some connection tables, and they are all full. Some of the examples include TCP connection floods, SYN floods that I just showed you, or even ACK flooding.

For the application layer, the attacks try to exhaust the memory or the CPU of the victim, and as a victim you can see that as well. A couple of examples include HTTP floods, Slowloris, and even DNS flooding.

Slide 53

Slide 54

Slide 55

Slide 56

Slide 57

Slide 58

Slide 59

Slide 60

Alright, so let's see. I showed you the attack against DynDNS. Which layer do you think was affected from the victim's point of view? So, is it network, transport, or application? Anyone?

Okay, let's try raising hands. Raise your hand if you think it's A, the network layer. Raise your hand if you think it's B, the transport layer. Okay, we still have one, two. Good. How about C, the application layer?

Alright, so we have a split between A and C, network and application. So, the correct answer here is C, the application layer. Because although the attack involved DNS, the attack here was a flood of DNS requests, so it affects the application layer of the servers.

So, if you read the report about this attack, you will see that the bandwidth of Dyn's DNS servers was not congested. But they simply could not handle any extra DNS requests from the users.

Okay, so it's kind of a trick question, but the correct answer is C, the application layer.

Any questions? Okay, so just keep in mind that you can always stop me, raise your hand, and ask questions.

Slide 61

Slide 62

Alright, so let's look at DDoS from the perspective of the attackers. So, let's say again we have the attackers and we have the victim. The question is: how is the attack traffic generated? And it turns out there is some very recent literature about how you categorize the attacks.

Basically you can categorize them first as direct path, meaning the attacker directly sends the traffic to the victim. And in this category, the traffic can be spoofed or not spoofed. For spoofing, you can think of SYN flooding, UDP flood and ICMP flood.

And the idea of spoofing is basically to hide the identities of the attackers, making it harder for the victim to trace back. That's it.

And some direct DDoS attacks require no spoofing. Actually, with spoofing they don't work. Examples are HTTP flood and Slowloris, so all of these are application-level DDoS attacks. They require connections to be established between the attackers and the victim. So, no spoofing in this category.

Okay, so the second category of DDoS attacks from the attacker's point of view is called reflection-amplification. So basically you send spoofed traffic to the reflectors. In the DNS amplification attack, the DNS server or the DNS resolver acts as the reflector. And as we can see now, there are a couple of examples. It can be DNS, NTP (which stands for Network Time Protocol), and Memcached.

So you have all these public servers on the Internet, to which the attacker can send some spoofed traffic, and they respond with a lot of traffic back to the victim.

Alright, so one thing to note about DDoS attacks: I have been showing one single attacker, but most of the time attackers actually use a botnet to either send the direct traffic or send the spoofed traffic to the reflectors. So botnets are a real pain on the Internet right now.

Slide 63

Slide 64

Slide 65

Slide 66

Slide 67

Slide 68

Alright, so now I have another question. What do you think? Was the DDoS attack against Dyn generated via reflection-amplification? So it's very simple, yes or no? Who thinks it's a yes, meaning the attack was generated by amplification? Yes, we see some hands. What about no? Alright, good.

So we can see that there are more hands for the no answer. And actually you are right, the answer is no. It's not reflection-amplification. There were no reflectors being used in this DDoS attack. The attack traffic was sent directly from the botnet to the servers. So there was no reflection at all.

Slide 69

Slide 70

Alright, so now if we look back at all the DDoS attack vectors that we mentioned, we can have a better perspective, seeing them both from the victim side and the attacker side. So for example, we have DNS amplification. It's a network-level vector, and the strategy is reflection-amplification. And you can classify SYN flooding and HTTP flooding attacks in the same way.

And there are a couple more DDoS attack vectors that you may see in practice as well, including UDP flooding and NTP and Memcached amplification. So these are the layer 3 DDoS attacks. And you can also see a lot more. The attack against Dyn is actually the last one. So it's not reflection-amplification. It's a direct path.
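The two perspectives can be written down as a small lookup table. The sketch below encodes only the classifications stated in this lecture; the class and field names are made up for illustration, and `spoofed` is left as `None` where the lecture does not say whether spoofing was used.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Layer(Enum):
    NETWORK = "network"
    TRANSPORT = "transport"
    APPLICATION = "application"


class Generation(Enum):
    DIRECT_PATH = "direct path"
    REFLECTION_AMPLIFICATION = "reflection-amplification"


@dataclass(frozen=True)
class Vector:
    name: str
    layer: Layer               # victim's perspective: which layer is affected
    generation: Generation     # attacker's perspective: how traffic is generated
    spoofed: Optional[bool]    # None where the lecture does not say


R, D = Generation.REFLECTION_AMPLIFICATION, Generation.DIRECT_PATH
VECTORS = [
    Vector("DNS amplification", Layer.NETWORK, R, True),
    Vector("NTP amplification", Layer.NETWORK, R, True),
    Vector("Memcached amplification", Layer.NETWORK, R, True),
    Vector("UDP flood", Layer.NETWORK, D, True),
    Vector("ICMP flood", Layer.NETWORK, D, True),
    Vector("SYN flood", Layer.TRANSPORT, D, True),
    Vector("HTTP flood", Layer.APPLICATION, D, False),
    Vector("Slowloris", Layer.APPLICATION, D, False),
    Vector("DNS flood (Dyn attack)", Layer.APPLICATION, D, None),
]


def select(layer: Optional[Layer] = None,
           generation: Optional[Generation] = None) -> List[Vector]:
    """Filter the table by victim layer and/or traffic generation."""
    return [v for v in VECTORS
            if (layer is None or v.layer == layer)
            and (generation is None or v.generation == generation)]


for v in select(layer=Layer.APPLICATION):
    print(v.name, "|", v.layer.value, "|", v.generation.value)
```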

Slide 72

Slide 73

Slide 74

I hope that now you have a better understanding of a lot of DDoS attack vectors. So is that it? Now we understand how the attack vectors work. It turns out that in practice, the attackers will use multiple attack vectors, not just one. And there are actually some statistics for that, from NetScout. They showed that about half of the attacks recorded were using multiple attack vectors, from 2 to 5. So it basically makes the life of the victim harder.

Slide 75

Slide 76

Slide 77

And also in practice, if you look at the network traces, in very recent years you see a new trend of DDoS attacks called pulse wave attacks. What happens here is that the attackers control the traffic being generated and directed to the victim in such a way that it arrives at the victim in waves. And each wave is actually quite short, ranging from a couple of seconds to a minute. By doing so, you don't need to send a lot of traffic. But at the same time, you can be super effective against the victim.

Slide 78

Slide 79

So to summarize the first part about DDoS attack. Whenever you see DDoS attacks on the news or in practice, you happen to catch a DDoS attack, ask the three following questions. What are the affected victim layers? How is the attack generated? And are there any advanced strategies, like Pulse Wave, or multiple vectors being used?

Slide 80

And answering these questions actually helps you defend against the DDoS attack as well. And this is what we are going to talk about now. So let's talk about the defense.

Slide 81

So DDoS defense, by definition, basically tries to keep the service accessible to the users. And normally it includes three phases, in general.

So provisioning means that when you design a system, you try to avoid any single point of failure, which would make it a target for a DDoS attack. Try to also plan the resources for the peak hours. For example, you run a website and you are planning for the Black Friday sale. So try to have more resources, way more than the normal load. And also try to have regular backups, try to have fail-overs. This is not the focus of today's talk though.

And the second part of the defense involves monitoring. Because without monitoring, you don't even know that there's something wrong going on. And monitoring involves collecting data from the bandwidth, from the CPU, from the memory. You can try to detect the congestion failures. And of course, it helps you to actually trigger the mitigations. Basically, I would refer you back to the measurement lectures that we had a couple of weeks ago on this.

I think the focus of today's lecture will be on mitigation. Mitigation basically means that as a victim or as an infrastructure, you should reduce the traffic or prevent traffic from being generated in the first place. Again, this will be the focus of today's lecture.

Slide 82

Slide 83

Slide 84

Slide 85

Slide 86

Slide 87

Slide 88

So, the question is, when it comes to DDoS defense, where do we implement the DDoS mitigation? If you look back at the path from the attackers to the victim, the mitigation can actually be implemented anywhere on the attack path. So now, look at the model that I showed you a couple of slides ago. We have the attacker on the left-hand side, we have the victim on the right side, and we have the botnet and the reflectors in the middle.

So what options do we have? We can have the defense near the attack source, which is the botnet or the reflectors. These are the originators of the attack traffic. You can have mitigation implemented at the transit network between the traffic originators and the victim. And also you have the option of implementing it at the target network.

Slide 89

Slide 90

Slide 91

Slide 92

Slide 93

Slide 94

So let's talk about the mitigation near the attack source. One good thing about mitigating DDoS attacks early is that it can prevent the attack traffic from polluting the entire Internet.

And you have a couple of options for doing so. You can try to prevent IP spoofing, which is a major source of DDoS attacks. You can try to disable the reflectors. And you can also try to basically contain the botnet, which is again a major source of DDoS attacks. And we are going to talk about all these three.

Slide 95

Slide 96

Slide 97

Slide 98

So to prevent IP spoofing: a lot of networking vendors, when they sell routers, ship them with something called Unicast Reverse Path Forwarding, or uRPF. Basically it allows you to filter packets without a valid source IP address.

And how does this work? So basically now you have the botnet here, you have the routers, and before the traffic is sent to the Internet, the routers will do a very, very simple check. Whenever they see a packet, they ask a very simple question. Is the source IP address in my forwarding table? Very simple. If it is, then you forward the packet. And if it isn't, it means that you don't have any route to this IP address in your network, so you simply drop it. That's good.

And uRPF also has two well-known modes, called strict mode and loose mode. In strict mode, you also check for the matching interface. So let's say you have a router with two interfaces, Fast Ethernet 0/0 and Fast Ethernet 0/1.

And now let's say that you reach the prefix over there, 2.2.2.0/24, via Fast Ethernet 0/0. And let's say packets with the source IP address 2.2.2.2 reach you through two different interfaces. You let the one received on the matching interface go through, and you drop the rest. So this is how strict mode works.

And of course, when you are a multi-homed customer, this may hurt you, because you may send traffic through different interfaces. So actually, many ISPs nowadays run uRPF in loose mode. Basically it only checks for a matching source IP address, and it allows any interface to be the receiving end.
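The strict and loose checks can be sketched in a few lines. This is an illustrative model, not vendor code: the forwarding table below is hypothetical (only the 2.2.2.0/24 via Fast Ethernet 0/0 entry comes from the example above; the 3.3.3.0/24 entry is added for illustration), and the function names are made up.

```python
import ipaddress
from typing import Optional

# Hypothetical forwarding table: prefix -> interface used to reach it.
FIB = {
    ipaddress.ip_network("2.2.2.0/24"): "FastEthernet0/0",
    ipaddress.ip_network("3.3.3.0/24"): "FastEthernet0/1",  # assumed entry
}


def lookup(src: str) -> Optional[str]:
    """Longest-prefix match on the source address; None if no route exists."""
    addr = ipaddress.ip_address(src)
    matches = [net for net in FIB if addr in net]
    if not matches:
        return None
    return FIB[max(matches, key=lambda net: net.prefixlen)]


def urpf_accept(src: str, in_interface: str, strict: bool) -> bool:
    """Return True if the packet passes the uRPF check."""
    route_interface = lookup(src)
    if route_interface is None:
        return False                            # no route to the source: drop
    if strict:
        return route_interface == in_interface  # must arrive on the route's interface
    return True                                 # loose mode: any interface is fine


print(urpf_accept("2.2.2.2", "FastEthernet0/0", strict=True))   # matching interface
print(urpf_accept("2.2.2.2", "FastEthernet0/1", strict=True))   # wrong interface
print(urpf_accept("2.2.2.2", "FastEthernet0/1", strict=False))  # loose mode
print(urpf_accept("9.9.9.9", "FastEthernet0/0", strict=False))  # no route at all
```

The second and third calls show the trade-off: strict mode drops a packet that loose mode lets through, which is exactly what hurts multi-homed customers.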

Slide 99

Slide 100

Slide 101

Slide 102

Slide 103

Slide 104

Slide 105

So another way of mitigating the DDoS traffic early. Like, let's say we can try to disable the reflectors. And basically there are multiple reflectors on the internet right now. And we have to have a different and specific defense for each of them.

For NTP servers, one of the very popular ways of using them for amplification is that you just send them a command called monlist, to get the monitor list. And as far as I remember, the amplification factor was several hundred. So what they do nowadays is basically have the server disable this kind of command.

For Memcached servers, operators try to disable UDP traffic. This is just the current best practice. And for DNS servers, normally you should try to have some rate-limiting, and try to prevent some very dangerous queries, like ANY requests. Or if you run a DNS server, try to limit it to your own customers or your private network, instead of having it accessible to everyone. And basically it's an open problem at the moment. There is no known solution yet. It seems very challenging to address this kind of reflector.
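As one illustration of the rate-limiting idea for DNS servers, here is a minimal per-source token bucket. It is a sketch under assumptions: the class names, the rate of 5 responses per second, and the burst of 10 are hypothetical, and real DNS servers implement response rate limiting with more nuance than this. Note that in a reflection attack the "source" the server sees is the spoofed victim address, so limiting per source caps what one reflector sends toward one victim.

```python
from collections import defaultdict


class TokenBucket:
    """Allow `rate` events per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # timestamp of the previous refill

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ResponseRateLimiter:
    """One token bucket per (apparent) source address."""

    def __init__(self, rate: float, burst: float):
        self.buckets = defaultdict(lambda: TokenBucket(rate, burst))

    def allow(self, source: str, now: float) -> bool:
        return self.buckets[source].allow(now)


limiter = ResponseRateLimiter(rate=5.0, burst=10.0)

# 100 queries arrive at once, all claiming to come from the same address.
answered = sum(limiter.allow("198.51.100.7", now=0.0) for _ in range(100))
print(answered)  # only the burst allowance is answered
```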

Slide 106

Slide 107

Slide 108

So another thing is to mitigate the botnet itself. And this is done using a tarpit, to trap the bots. This is one of the pretty cool things in the literature, from the academic point of view.

So what I show now is that, as you can see from these figures, you have the potential victim, which is an IoT device, on the top left. You have the already compromised device on the top right. So what they do is that they use a honeypot and a telescope to figure out which devices are compromised. And the next thing is that they run some sort of decoy victims, or decoy devices.

And when they have that, basically all the compromised devices will try to make connections to these decoy devices. And as a decoy, you just keep the connections open all the time, just keep them alive, so that the infected devices cannot reach the potential victims anymore. So in a way, you are trying to deny service to the botnet, so that it cannot spread further. It turns out that it can be quite cheap and effective to do so. In this paper, they say that it costs about 30 euros a month to contain about half of the real-life Mirai botnet out there on the Internet. Surprisingly good.

So these are the mitigations that you can implement near the attack source.

Question? Yes? This setup, does it have to be specific for each botnet? Or do you have to reverse-engineer each botnet?

That's a good question. So in the paper, they tried it with the Mirai botnet, which is one of the most popular ones, and they did exploit a very specific behavior of the Mirai botnet. So I'm not quite sure if this can be extended to other variants of Mirai or other types of botnets. But it's a good start. It's a one-of-a-kind paper, I would say.

Slide 109

Slide 110

Slide 111

Slide 112

Slide 113

So moving on to one other question. What are the downsides of having mitigations near the attack source?

Any ideas? Yes?

It might be kind of expensive to do that. The attacker can just quickly switch to another attack source, and then you kind of...

So you're like chasing the wild goose. So I think that's a very good point. I didn't think of that, to be honest.

So there are a couple of downsides that I can think of. Actually, the mitigation here only helps out other networks. Meaning that, let's say, you are the network containing the botnet, right? You are not under attack. You are only helping out other people. So in practice, there is a lack of incentive for you to implement this type of mitigation.

And another downside is that it's quite difficult for the victim to initiate this type of mitigation. When you are under attack, let's say you want to trace back using the source IP addresses, and then ask those networks to stop the traffic, which is practically impossible. But that's a very good point. Thanks a lot.

Slide 114

Slide 115

Slide 116

So another option is that we can try to mitigate the DDoS attacks at the transit networks. And of course, the good thing about this is that you can handle DDoS attacks at scale. And we have multiple options actually.

You can try to filter the traffic with dedicated hardware. You can use scrubbing centers. Or, the most popular option nowadays, you can redirect the traffic to cloud-based services for them to clean the traffic for you.

Slide 117

Slide 118

Slide 119

Slide 120

And for the hardware, there are multiple options on the market. We have ones from Radware, Arbor, Corero, A10. So we have all the big names in DDoS defense.

And for what these big devices do, you can just consider them as black boxes. They take the traffic as input, and they return the clean traffic as output. What they do is that they try to identify the attacks based on signatures. And the signature here can be a traffic pattern, like the traffic rate or the volume. Some of the devices can also filter the traffic based on known malicious IP addresses, or on whether the traffic comes from a country with a bad reputation.
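The black-box view (traffic in, clean traffic out) can be sketched with two of the signatures mentioned here: a list of known malicious source addresses and a per-source volume threshold. This is a toy model, not how any commercial appliance works internally; the function name, the addresses, and the 1,000-byte threshold are assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Packet:
    src_ip: str
    size: int  # bytes


def scrub(packets: Iterable[Packet], blocklist: Set[str],
          max_bytes_per_source: int) -> List[Packet]:
    """Return the packets that pass both signatures within one time window."""
    sent = Counter()
    clean = []
    for p in packets:
        if p.src_ip in blocklist:
            continue                      # signature 1: known malicious source
        sent[p.src_ip] += p.size
        if sent[p.src_ip] > max_bytes_per_source:
            continue                      # signature 2: source exceeds volume budget
        clean.append(p)
    return clean


traffic = [
    Packet("198.51.100.1", 500),
    Packet("198.51.100.1", 500),
    Packet("198.51.100.1", 500),   # pushes this source over the budget
    Packet("192.0.2.66", 100),     # blocklisted source
    Packet("203.0.113.5", 100),
]
clean = scrub(traffic, blocklist={"192.0.2.66"}, max_bytes_per_source=1000)
print(len(clean), "of", len(traffic), "packets forwarded")
```

A coarse signature such as "drop everything from one region" would be a third check of the same shape, which also shows why such filters can hurt legitimate users.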

So in case you don't know, I'm Vietnamese, and it actually turns out that Vietnam has a really bad reputation for DDoS attacks. Because we simply run a lot of IoT devices, and many of them are vulnerable. So what I can guess, and I don't know precisely, is that whenever they see huge traffic from Vietnam, they will simply block it.

And basically, all these devices, although they only look this big, can handle large traffic volumes. Depending on the setup, this can range from 10 Gbps to even 400 Gbps. But it comes with the downside that this can be super expensive. The device that I'm showing you here costs $100k, but the more advanced ones can easily cost $1 million or $2 million. It can be super expensive.

Slide 121

Slide 122

Slide 123

Slide 124

Slide 125

So the second option is the scrubbing center. And basically they work in the same way. Within the ISP network, for example, they will redirect all the traffic through the scrubbing center, and return the clean traffic back. The scrubbing center consists of lots of servers and devices with large bandwidth capacities. And here, they can also use the specialized devices that I showed you in the previous slide. Again, they also filter the traffic based on signatures, and most of the time they also have a network operations center to do that.

And of course, in order to build the scrubbing center, you need to be a pretty large ISP, or a large network provider.

Slide 126

Slide 127

Slide 128

Slide 129

Slide 130

All right, so I would say the most popular option that you can see on the Internet right now is using some sort of cloud-based defense. And it's very suitable for small or medium-sized organizations. Some of the big names you can see are Cloudflare, Radware, and Akamai, for example. And the way they work is that normally they have a CDN, and they also have multiple scrubbing centers themselves.

And let's say you have a victim on the right side again. What the CDN does is announce the victim's prefix, of course with permission from the victim. And they will also have DNS servers, so that whenever someone tries to reach the victim's prefix, the traffic will be distributed to different scrubbing centers.

And then they will send the clean traffic back to the victim through some GRE tunnels.

Slide 131

Slide 132

Slide 133

Slide 134

Slide 135

Slide 136

Slide 137

All right, so let's discuss what are the downsides of the cloud-based DDoS mitigations. What do you think? Any ideas?

Yes? It adds latency to the path. Very good, exactly.

Yes? Not scalable, to lower company and organization. For the bigger ones? For the smaller ones?

For the smaller ones? Yeah, so when you say it's not scalable, I don't think that is the case, because they are huge, right? Basically, CloudFlare at the moment can absorb any kind of traffic that you can see. But you are right about the second point, which is it can be very costly, especially for small-sized organizations or individuals.

Let's say I run a website, which is not a problem, but then if DDoS attacks happen, I can do nothing about it. And of course, as you mentioned, it can introduce some latency, because basically the traffic doesn't go directly to you. So these are the two downsides of cloud-based mitigations.

Slide 138

Slide 139

Slide 140

All right, so we still have one more part to go, which is the mitigation at the target network. Basically, it gives you more control over the incoming traffic. There are a couple of options. Of course, you can do this with a firewall or an access control list. You can also do it with BGP, and there are two BGP-based options.

So you can do it with remotely triggered black hole, or BGP flowspec.

Slide 141

Slide 142

Slide 143

Slide 144

Okay, so filtering with firewalls is basically, I would say, the oldest trick in the book here.

So now you have the victim. Here, in the network where you are trying to implement the mitigation, you have the traffic coming in. What they did, actually, at the beginning of the DDoS attack era is that they tried to trace back the source of the traffic.

And then they basically tried to add firewall rules on all the routers. And this actually worked quite well. That's what you can see in the exercises as well. You can really block the traffic based on the 5-tuple of the packets, so it's very precise.

So one thing that you can also see in the exercise is that when you try to add more rules to different routers, it can be quite time-consuming, and it can be error-prone as well. And of course, "firewall" here is just a term. It can run in hardware, it can run in software, and it can run virtually if you use a cloud as well.
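To make the 5-tuple filtering concrete, here is a minimal sketch in Python. It is not how any particular firewall is implemented; the `Rule` class, the `filter_packet` function, and the addresses (taken from the documentation ranges) are all assumptions made up for illustration. It shows first-match semantics, which is also why the order of the rules matters.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical 5-tuple rule; None means 'match anything' for that field."""
    action: str                      # "drop" or "accept"
    src: str = "0.0.0.0/0"
    dst: str = "0.0.0.0/0"
    proto: Optional[str] = None
    sport: Optional[int] = None
    dport: Optional[int] = None

    def matches(self, pkt: dict) -> bool:
        return (
            ip_address(pkt["src"]) in ip_network(self.src)
            and ip_address(pkt["dst"]) in ip_network(self.dst)
            and self.proto in (None, pkt["proto"])
            and self.sport in (None, pkt["sport"])
            and self.dport in (None, pkt["dport"])
        )


def filter_packet(rules, pkt, default="accept"):
    """Return the action of the first matching rule (first-match semantics)."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


# Assumed example: block UDP/53 traffic from one source prefix to the victim.
rules = [
    Rule("drop", src="203.0.113.0/24", dst="198.51.100.10/32",
         proto="udp", dport=53),
]
attack = {"src": "203.0.113.7", "dst": "198.51.100.10",
          "proto": "udp", "sport": 4444, "dport": 53}
benign = {"src": "192.0.2.55", "dst": "198.51.100.10",
          "proto": "tcp", "sport": 50000, "dport": 443}

print(filter_packet(rules, attack))
print(filter_packet(rules, benign))
```

Every extra attack source means another rule like this on every router, which is where the time-consuming and error-prone part comes from.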

Slide 145

Slide 146

Slide 147

Slide 148

Slide 149

Slide 150

Alright, so another option is RTBH, or remotely triggered black hole. What happens here is that you have a network, you have a victim, right? And you have multiple edge routers that are receiving the DDoS traffic. The idea is that you want to install rules so that all these edge routers drop the traffic towards the victim.

And of course, we have something called the trigger routers, and they maintain iBGP sessions with the edge routers. First of all, all these edge routers need to do some setup. They need to route some unused IP address to the null0 interface on their routers. This can be any IP address, but in this example, we use 192.0.2.1. So you want to point an unused IP address to a dropping interface.

And let's say the victim, which has some IP address, is under DDoS attack. Okay? The victim will request the trigger routers to black-hole this IP address. The trigger routers then send out BGP messages that announce the /32 prefix of the victim, with the next hop set to the unused IP address that the edge routers configured in the previous step.

And of course, when all the edge routers receive that, they will route all the traffic going to the victim's IP address to the dropping interface that they configured in the setup phase.
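The two RTBH steps can be sketched as a toy routing table in Python. This is only a model of the idea, not router configuration: the `EdgeRouter` class, the interface names, and the victim address 198.51.100.10 are assumptions for illustration, while 192.0.2.1 and null0 come from the example above.

```python
from ipaddress import ip_address, ip_network


class EdgeRouter:
    """Toy router: longest-prefix match plus recursive next-hop resolution."""

    def __init__(self):
        self.routes = {}  # network -> next hop (interface name or IP string)

    def add_route(self, prefix, next_hop):
        self.routes[ip_network(prefix)] = next_hop

    def lookup(self, dst):
        addr = ip_address(dst)
        candidates = [net for net in self.routes if addr in net]
        if not candidates:
            return "no route"
        best = max(candidates, key=lambda net: net.prefixlen)
        hop = self.routes[best]
        try:
            ip_address(hop)          # next hop is an IP: resolve it again
        except ValueError:
            return hop               # next hop is an interface name
        return self.lookup(hop)


edge = EdgeRouter()
edge.add_route("198.51.100.0/24", "to_customer")   # assumed victim network

# Setup phase: the unused address points to the dropping interface.
edge.add_route("192.0.2.1/32", "null0")
before = edge.lookup("198.51.100.10")

# Trigger phase: BGP announcement for the victim's /32, next hop 192.0.2.1.
edge.add_route("198.51.100.10/32", "192.0.2.1")
after = edge.lookup("198.51.100.10")
neighbour = edge.lookup("198.51.100.20")

print(before, after, neighbour)
```

Only the victim's /32 ends up on null0; other hosts in the same /24 still follow the normal route.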

Slide 151

Slide 152

Slide 153

Slide 154

Slide 155

Slide 156

And what makes this black hole technique quite popular is that it can also be used with upstream providers. So now you have upstream providers on the left-hand side. And basically what they do is the same thing. You have the edge routers. You route an unused IP address to the null0 interface on your edge routers. You also need to set up communities. In this case it can be 666. Just set up the community for the black hole. And let's say the victim network wants to black-hole this IP address: 100.64.6.65.

So it sends out a BGP message which is tagged with the community set by the upstream providers. And it is sent via the eBGP session to the upstream providers. When this happens, all the edge routers will basically route the victim's traffic, or basically any traffic to that prefix, to the null0 interface.

So in the null0 interface, it is dropped directly? Yes, it will drop directly.

Can't the victim send the BGP message directly? Most of the time you need routers for that. So the trigger routers are mostly run by the network operators. So as a victim, you contact them to do so. But if you have a router, you can do so as well. But of course, we need to have the BGP sessions already. So I guess that is the point.

Slide 157

Slide 158

Slide 159

Slide 160

Alright, so this is quite useful. But then we also have a couple of downsides. For example, as you can see, it requires a lot of setup in advance. And it actually helps the attacker, in the sense that now the victim is not receiving any traffic at all. If you remember the definition of DDoS attacks, the goal is to prevent the legitimate users from using the services. So in a sense, it actually completes the attack.

And of course, as the victim, you need to change the IP address, for example, so that you can continue the services for the legitimate users. And of course, the attacker can try to DDoS the new address again. So these are a couple of downsides of this technique.

Slide 161

Slide 162

Slide 163

So the last option that we have is BGP FlowSpec. So this is something proposed more recently, I would say.

Okay, so the promises here are quite nice. It gives you the same precise control as a firewall. But at the same time, it gives you the same automation as RTBH.

So how does this work? Think about it as an extension of BGP messages. So now you have two routers, and it needs to be enabled on both sides of the BGP session.

And this is what the announcement looks like. It's based on the match and the action. So for matching, actually you can match the traffic based on multiple types of fields. It can be based on the destination prefix, source prefix, protocol, so on and so forth, all things that you can think of.

And the action can be the traffic rate, which is basically the most relevant that you can see. So if you want to drop the entire traffic, like the remotely triggered black hole, you just set the traffic rate to zero.

But you can do many other things. For example, redirecting, which is super useful when you have a cloud-based DDoS defense, because you want to redirect the DDoS traffic to the cloud. But then when the cleaned traffic goes back to your network, you want to redirect it to another path so it doesn't go around in a loop.

So now I have one example of an announcement, where you match the destination, which is the victim, the protocol, which is UDP, and the port numbers, and the action here is that I want to drop everything: traffic rate zero.

And then you just announce this to the upstream, and it is propagated to the routers of the upstream providers, which install the firewall rules. So any incoming traffic that matches the rule will get this action. So this is how BGP flowspec works.
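The match/action structure of such an announcement can be sketched in Python as follows. This is a simplified model, not the BGP wire format: the `FlowSpecRule` class, the field names, the victim address 198.51.100.10, and UDP port 123 are assumptions for illustration. Only the idea that traffic rate zero means "drop" comes from the lecture.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class FlowSpecRule:
    """Hypothetical flowspec-style rule: a match part and a rate action."""
    dst_prefix: str
    protocol: Optional[str] = None    # None means any protocol
    dst_port: Optional[int] = None    # None means any port
    traffic_rate: float = 0.0         # 0 means drop everything that matches

    def matches(self, pkt: dict) -> bool:
        return (
            ip_address(pkt["dst"]) in ip_network(self.dst_prefix)
            and self.protocol in (None, pkt["proto"])
            and self.dst_port in (None, pkt["dport"])
        )


def action_for(rule, pkt):
    """What a router that installed the rule would do with this packet."""
    if not rule.matches(pkt):
        return "forward"
    return "drop" if rule.traffic_rate == 0 else "rate-limit"


# Assumed announcement: victim /32, UDP, destination port 123, rate 0.
rule = FlowSpecRule(dst_prefix="198.51.100.10/32", protocol="udp", dst_port=123)

udp_flood = {"dst": "198.51.100.10", "proto": "udp", "dport": 123}
web = {"dst": "198.51.100.10", "proto": "tcp", "dport": 443}

print(action_for(rule, udp_flood))
print(action_for(rule, web))
```

Compared with RTBH, the web traffic to the same victim is still forwarded, because the match is more specific than just the destination address.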

Slide 164

Slide 165

Slide 166

Slide 167

Slide 168

Slide 169

Okay, so of course it also has a few drawbacks. First of all, there is no acknowledgement mechanism, meaning that once you send out an announcement, you don't really know whether the rule has been applied, or whether the new firewall rules have been installed. And also you don't have the ordering that you have when you install firewall rules yourself. Actually, that's one thing that you can also see in the exercise.

And of course, the adoption is pretty slow at the moment. That's why you don't hear about it as much as the other options.

Slide 170

Slide 171

Slide 172

All right, so here is a very big summary. All these mitigations have different pros and cons. You can see them on the screen; we are running out of time, so I'm not going to read them out.

Slide 173

Slide 174

Slide 175

So we have reached the last slide of the first session of the lecture. So what is the best DDoS mitigation?

So the answer is that the DDoS attacks have multiple vectors, and you should have multiple layers of defenses.

And with that, we are done with the first half of the lecture. Thanks so much for sticking with me.

Slide 176

Slide 177

Slide 178

All right, so we talked about DDoS attacks and DDoS defenses, and in the second half of the lecture, we will talk about how we can apply network programmability, which you learned a couple of weeks ago, to defend against DDoS attacks.

Slide 179

So in the previous half of the lecture, we saw that there are multiple kinds of purpose-built hardware or servers used for DDoS filtering, right? And there are multiple limitations with this kind of device.

They have a high initial cost, so hundreds of thousands or millions of dollars. Also, they have limited capabilities. Actually, for some of the devices over here, even when you have bought them, you still need to pay for a subscription so that they get updated with the new attack signatures.

And clearly, one of the limitations of this kind of device is that it's very difficult to upgrade. Once it's built, it's there. It's very difficult to add any new functionality, for example.

Slide 180

Slide 181

And you may already know from the network programmability lectures that we now have programmable switches. They can provide an alternative solution, especially for DDoS defense. The figure here shows the switch that we physically have in our lab.

So programmable switches, overall, guarantee line-speed processing. They give you some flexibility when you implement things in P4. And I should say sufficient efficiency, because by now you may know that implementing any kind of program in P4 is kind of a pain. And of course, they have a low cost compared to all these purpose-built devices that I just mentioned.

Slide 182

Slide 183

Let's try to build a kind of naive DDoS defense using the programmable switches. So we have control planes on the top side, we have data planes on the bottom side. And let's say we have a network over here, and we have a P4 switch at the edge, and you have multiple other switches in the network.

So what you would do is that when the traffic is coming in, you, as a controller, pull some statistics about the traffic. And then let's say you implement some sort of signature identification, and you see that, hey, a DDoS is going on, I know this is a SYN flooding attack, for example. And basically you push the filter rules back to the switches. And from that, clean traffic comes out, all right? So this is how a naive DDoS defense using a P4 switch can be implemented.
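One round of this naive control loop could look like the following Python sketch. Everything in it is an assumption for illustration: the per-destination SYN counters stand in for whatever statistics the controller pulls from the switch, the threshold of 1000 SYNs per interval is arbitrary, and the "filter" is just a tuple rather than a real switch table entry.

```python
def detect_syn_flood(syn_counts, threshold):
    """Signature check: destinations with too many SYNs in the last interval."""
    return sorted(dst for dst, syns in syn_counts.items() if syns > threshold)


def control_step(syn_counts, installed_filters, threshold=1000):
    """One controller round: read statistics, detect, push filter rules."""
    for dst in detect_syn_flood(syn_counts, threshold):
        installed_filters.add(("drop", "tcp-syn", dst))
    return installed_filters


# Assumed statistics pulled from the data plane for one interval.
polled = {"198.51.100.10": 50_000, "198.51.100.20": 12}
filters = control_step(polled, set())
print(filters)
```

Note that the filter only appears after the controller has read the counters and written the rule back, and only if the counter crosses a threshold that someone had to choose in advance.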

Slide 184

Slide 185

Slide 186

Slide 187

Slide 188

Slide 189

So what are the issues with this naive DDoS defense? Anyone? Any ideas, any guesses? So you think this is the end game already? You think this will work, no matter what?

You have to identify the attack signature. Yes, that's one of the biggest issues.

Slide 190

All right, so other than that, we actually have multiple other issues with this kind of naive design.

So as you said, it only covers known attack signatures. If new signatures or new attacks emerge, then this kind of defense cannot defend against them, right?

And of course, there is something else, something more technical with that. When the traffic is coming in, you also need to have some sort of threshold so that you can start to trigger the mitigations.

And setting this kind of threshold is actually quite difficult in practice, because if you set it too low, then there can be a lot of false positives. If you set it too high, then the attack may already have caused damage.

Another issue is that it actually has quite a slow reaction time, because as you can see, it requires a round trip: you have to pull the statistics from the data plane to the control plane, and then you have to push the filter rules back. So it's somewhat slow in terms of reaction time, which is quite important when you want to defend against DDoS.

And another issue is that when you add these kinds of filter rules, you may actually drop the production traffic. So false positives, where benign traffic is misclassified as attack traffic, can also happen.

Slide 191

Slide 192

Slide 193

Slide 194

Slide 195

All right, so in our lab, we actually used programmable switches to implement a DDoS defense called ACC-Turbo. This was published at ACM SIGCOMM 2022. And in case you don't know, this is the flagship networking conference, like the best networking conference out there.

All right, so all these challenges are addressed by a system called ACC-Turbo. And this is what we are going to learn in the next like 20 minutes or so.

So the idea of ACC-Turbo is to use congestion control based on aggregates of the traffic. This makes the defense very generic, in the sense that you don't really need to know about the attack signatures.

Also, it has some sort of traffic clustering: you divide the traffic into different clusters, and this can be implemented in P4, which makes it run at line speed in the data plane.

And also, it has some sort of scheduling algorithm. Basically, it doesn't drop any traffic, but only deprioritizes it based on the scheduling algorithm. So the risk of dropping benign traffic is lower. And of course, it has something called always-on clustering, meaning that you don't really need to set any kind of threshold for it to be activated. It runs all the time. So these are the very high-level ideas of ACC-Turbo.

Slide 196

Slide 197

Slide 198

Slide 199

Slide 200

And in order to understand ACC-Turbo, you need to understand what it is built on. It's called ACC, and it was proposed more than two decades ago. What happens here is that you have the traffic coming in, in the data plane, and you have a RED queue, or random early detection queue. This queue drops packets based on the queue size, that is, how full the queue is. If the queue is quite full, it drops more packets. And when it's emptier, it drops fewer packets. But some packets will be dropped by this queue. And all these packet drops can be reported back to the control plane.

And basically, the control plane will infer which aggregates of packets are, how do you say, responsible for most of the packet drops. An aggregate here is basically a /24. So the control plane will see which /24 prefixes have more packets being dropped, right?

So the idea is that, hey, this aggregate is kind of a heavy hitter, so they try to rate limit this aggregate. For that, the control plane sends the rate-limiting policies back to the data plane, to try to relieve the congestion in the network.
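The two pieces of ACC described above can be sketched in Python. This is a simplified illustration, not the original ACC algorithm: the thresholds and `max_p` are arbitrary assumed values, classic RED uses an averaged queue size rather than the instantaneous one, and keying the aggregates on the destination /24 is an assumption here.

```python
from collections import Counter
from ipaddress import ip_network


def red_drop_probability(queue_len, min_th=20, max_th=80, max_p=0.1):
    """Simplified RED: the fuller the queue, the higher the drop probability."""
    if queue_len < min_th:
        return 0.0
    if queue_len >= max_th:
        return 1.0
    return max_p * (queue_len - min_th) / (max_th - min_th)


def aggregate_of(addr):
    """Map an IPv4 address to its /24 aggregate."""
    return str(ip_network(f"{addr}/24", strict=False))


def heaviest_aggregates(dropped_addrs, top=1):
    """Control-plane step: which /24s account for the most reported drops?"""
    counts = Counter(aggregate_of(a) for a in dropped_addrs)
    return counts.most_common(top)


# Assumed drop reports sent from the data plane to the control plane.
drops = ["198.51.100.10", "198.51.100.77", "203.0.113.5"]
print(red_drop_probability(10), red_drop_probability(50))
print(heaviest_aggregates(drops))
```

The aggregate returned by `heaviest_aggregates` is the one the control plane would then rate limit.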

All right, so is that ACC? Yep.

- So the RED packets are those that actually are dropped, or like the packets are random and you just drop them?

- No, it's actually, so you know, you actually drop them randomly. It's not because the queue is full. I guess that answers the question, yeah. Good.

Slide 200

Slide 201

Slide 202

Slide 203

Slide 204

Yeah, so that's about ACC. And actually, ACC is already quite generic, in the sense that you don't really need any attack signatures. You see that there is congestion in the network based on the packet drops at the queue, so you know that some attack is going on. You don't really need to care what attacks are actually going on, or what kind of attack vectors are being used. And also, it's safe, because you don't drop all the traffic, you only rate limit it. I mean, they sound similar, but you don't actually drop all of the traffic. It's just rate limited. So in some way, it's safe.

Slide 205

Slide 206

All right, so, and of course, it has a couple of the issues that we mentioned at the beginning. ACC is quite slow, again, because it takes a lot of time going back and forth between the control plane and the data plane. And also, it's not very automated: the inference is done in the control plane, and it only triggers when the drop rate is above a certain threshold that was set in advance.

Slide 207

Slide 208

Slide 209

All right, so, ACC-Turbo basically addresses these kinds of issues. It does so with online clustering and programmable scheduling, blah, blah, blah. But what it does is that when you have the packets coming in, it clusters them in the data plane, okay?

And then the statistics are pulled to the control plane not all the time, but only after a specific amount of time. You just do it periodically. And you also set the scheduling policy for each of the clusters, so that you apply the rate limiting without actively dropping any traffic. Does this sound good? This is just an overview, okay?

- So are the clusters based on /24?

No, that is something new, and I'm going to talk about it. So the cluster is different from the aggregate. ACC uses /24 aggregates, but the cluster in ACC-Turbo is something different, and something that we will really dive into, okay?

Slide 210

Slide 211

Slide 212

Slide 213

Slide 214

All right, so for clustering the packets, one of the settings of ACC-Turbo is the number of clusters: let's say I want five clusters for my arriving packets. And the question now is, for any arriving packet, which cluster will you assign it to, right?

So what happens here is that for each packet, you grab the headers, and it really depends on the setup, but from the headers you can have five different fields, right?

You can have the source IP address, the destination IP address, the source and destination ports, and the protocol, for example. And of course, this can also come with the TTL. There's really no limit in terms of what parameters you can use.

But if you think about it, with all these parameters and all these fields, you can always represent the packet in what we call the header space.

So in this example, the space consists of two axes, right? The source IP address and the destination IP address. With that, you can literally just represent the packet as one point in this space. And then you have the clusters. A cluster here is basically represented by a range of values. In this example, we have two clusters, cluster one and cluster two. For cluster one, for one dimension, you just need to keep the minimum value and the maximum value. And the same again for the rest of the dimensions. With this, you can represent a cluster. Okay?

And now for each arriving packet, you basically just measure the distance. Based on its headers, you can calculate the distance to each of the clusters. And the packet is simply assigned to the closest cluster.

And ACC-Turbo actually uses the so-called Manhattan distance to calculate the distance. This is purely an engineering choice, because it can be implemented in P4.

Slide 215

Slide 216

Slide 217

Slide 218

Slide 219

Slide 220

And what it actually does is that, I have one example.

And of course, for the dimensions, you can use IP addresses, but for simplicity, I will use the length of the packet. So now we have two clusters. Cluster number one ranges from 500 to 900. Cluster number two from 100 to 300.

And another numeric dimension that I can find is actually the TTL. So again, 150 to 255 for cluster one, and 60 to 128 for the second cluster. Let's say we have these two clusters.

And the arriving packet has a length of 100 and a TTL of 240. How do we assign this packet to one of these clusters? We compute the Manhattan distance.

So is there anyone who doesn't know about the Manhattan distance? I can give a quick recap. Okay, so you don't know. Okay, so for the Manhattan distance, let's take a look at the distance between the packet and cluster number one, okay? You take the length first. The length here is 100, and it's outside the range of this cluster, right? So you just subtract it. So it's 500 minus 100, right?

Let's take a look at the TTL. The TTL of this packet is 240. It's within the range of this cluster. So there's no distance at all, yeah? Then you add these two values, and the distance is 400.

And you do the same thing for the distance between the packet and cluster number two. The length, 100, is within the range, so that's zero, and the TTL is 240 minus 128, so the distance is 112. When you look at this, the math is easy. The closer cluster is cluster number two, and basically you need to add this packet to cluster number two.

Okay, so here is one important thing, very, very important. When you add the packet to cluster number two, the cluster needs to be updated as well, right? The length is within the existing range, so there is no change there. But now you have a different maximum TTL. The TTL of cluster number two was from 60 to 128, right? Now it is extended to 60 to 240. It's very important to remember: you match the packet to the closest cluster, but then the cluster also needs to be updated, okay? Very important.
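The worked example above can be checked with a few lines of Python. This is a sketch of the idea only, not the P4 implementation from the paper: the function names and the dictionary layout are assumptions, while the ranges and the packet values are the ones from the example.

```python
def distance(packet, cluster):
    """Manhattan distance from a packet to a cluster's ranges.

    A field inside the cluster's [min, max] range contributes 0; otherwise
    it contributes the gap to the nearest end of the range.
    """
    total = 0
    for field, value in packet.items():
        low, high = cluster[field]
        if value < low:
            total += low - value
        elif value > high:
            total += value - high
    return total


def assign(packet, clusters):
    """Assign the packet to the closest cluster and grow that cluster."""
    best = min(clusters, key=lambda name: distance(packet, clusters[name]))
    for field, value in packet.items():
        low, high = clusters[best][field]
        clusters[best][field] = (min(low, value), max(high, value))
    return best


clusters = {
    "cluster1": {"length": (500, 900), "ttl": (150, 255)},
    "cluster2": {"length": (100, 300), "ttl": (60, 128)},
}
packet = {"length": 100, "ttl": 240}

d1 = distance(packet, clusters["cluster1"])   # 400 + 0
d2 = distance(packet, clusters["cluster2"])   # 0 + 112
chosen = assign(packet, clusters)

print(d1, d2, chosen)
print(clusters["cluster2"])
```

After the assignment, cluster two keeps its length range and its TTL range becomes 60 to 240, which is the update step emphasized above.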

Slide 219

Slide 220

Slide 221

Slide 222

Slide 223

All right, so that's it about the clustering, and now let's say you have the clusters. You also want to do something with these clusters.

So now that you have the clusters, there are multiple options, multiple different algorithms, for ranking them. You can rank them based on the throughput of all the packets in the cluster. You can count the number of packets in the cluster, and you can do multiple other things. And with different kinds of ranking algorithms, you get different ranks for each cluster.

And once you have the ranks, you have different priorities. Let's say you want to rank the clusters based on the number of packets within the cluster. If a cluster has more packets, I give it the lowest priority, right? In this case, let's say, the cluster containing packet number six over here has three packets, so it has the lowest priority.

And then you get clusters with priorities, right? Any arriving packet will be sorted into one of these clusters, and then it gets the pre-set priority of that cluster.

And of course, as a controller, you pull the statistics, calculate the ranking, and rank the clusters again periodically in the control plane. You don't really need to do it for every packet, you know? So that makes the reaction time faster. All right?
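Here is a small Python sketch of ranking by packet count followed by strict-priority scheduling. It is only one of the possible ranking policies mentioned above, and the cluster names, the packet counts, and the strict-priority scheduler are assumptions for illustration rather than the exact mechanism of ACC-Turbo.

```python
from collections import deque


def rank_clusters(packet_counts):
    """More packets means lower priority; priority 0 is served first."""
    ordered = sorted(packet_counts, key=packet_counts.get)
    return {cluster: prio for prio, cluster in enumerate(ordered)}


def dequeue(queues, priorities):
    """Strict priority: serve the non-empty queue with the best priority."""
    for cluster in sorted(queues, key=priorities.get):
        if queues[cluster]:
            return queues[cluster].popleft()
    return None


# Assumed statistics pulled periodically by the controller.
counts = {"A": 1, "B": 3, "C": 2}
priorities = rank_clusters(counts)

queues = {
    "A": deque(["a1"]),
    "B": deque(["b1", "b2", "b3"]),
    "C": deque(["c1", "c2"]),
}
order = []
while (pkt := dequeue(queues, priorities)) is not None:
    order.append(pkt)

print(priorities)
print(order)
```

Nothing in cluster B is dropped here; its packets are simply served last, which is the deprioritization idea.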

Slide 224

Slide 225

Slide 226

Slide 227

Slide 228

In the paper, they run a lot of experiments, and the takeaway is that it can mitigate DDoS quite effectively. So this is when we have no defense, with the benign traffic on the green line and the attack traffic on the red line. This is exactly the pulse-wave DDoS that I mentioned in the previous half of the lecture. You can see that when the attack happens at the 20th second, the production traffic actually drops quite a lot.

But with ACC-Turbo, basically, they save a lot of the production traffic, and there are two things that you should notice about this result.

The first thing is that they are not dropping the attack traffic, they are only deprioritizing it, meaning that the attack traffic will, ideally, all be put into one cluster. So let's say you have five clusters: you have four of them for the benign traffic and one of them for the attack traffic. The attack packets are not dropped, but they get way lower throughput. So that's one interesting thing.

And the second interesting thing here is about the reaction time. As I mentioned, it's very, very fast, because it's always on, and the experiments show that it can take less than one second to react to this kind of attack.

Slide 229

Slide 230

Slide 231

Slide 232

All right, so, in summary, ACC-Turbo basically shows that it's possible, feasible, to use network programmability to defend against DDoS attacks. It provides a generic DDoS defense at line rate, it provides safe mitigation, and it provides a fast reaction time.

All right, so that's it about ACC-Turbo. And, yep, question?

- With the Manhattan distance, can the clusters overlap?

- Yes, so that's a good question. When they overlap, there is the option for you to merge the clusters, and then you create a new one.

- Does ACC-Turbo reset and create new clusters?

- Yes, so there are multiple design choices for that. You can keep the clusters all the time, which is not advisable. But, of course, you can merge them. That's another option. And another option, of course, is to basically shrink the cluster after some amount of time. So, let's say you have the cluster over here: you don't keep it as it is, because it will grow all the time, right? So, one option is that you shrink it to, like, the central point, or to some certain gap. But, basically, in the paper, they just say that it's possible, and it's up to the operators to implement whatever they like, but these are the options for that.

- The graph that you showed, was that with the 5-tuple? - That's a good question. I cannot recall precisely, but I believe it's the 5-tuple.

- What happens if you launch the attack with a botnet with different IP addresses? - I think one of the assumptions ACC-Turbo works on is that the attack traffic, or the attack packets, will look quite similar. So, what I could imagine in the case of a botnet attack, and just to mention, this is a direct attack, it's not a spoofing attack. No, sorry, it can be spoofed, but it's direct, not amplification. So, when this happens, it will group the IP addresses from the different bots into the same cluster, but you don't have one packet per bot, right? You have multiple of them. Actually, that's how they flood the network. So, the idea is that now you have the aggregate: the packets from the one single botnet, and they will be heavily limited because they are now in the same cluster.

Yeah? If there are no other questions, then, yeah, I think we are good. We reached the end of the lecture.

And now we have a 15-minute break. Let's try to be back here after that.

Yeah, let's just say 4:15. So, keep in mind that in the next one, we are going to defend against real DDoS attacks. You will need a laptop for that. You will log in to the VM.

One thing is that I already printed out the cheat sheet that will be super helpful when you do the exercises. So, come down here, grab one if you need it, because otherwise, you don't want to switch between the terminals and switch back to see the topologies, okay?

So, just grab this. I hope to see you in the exercise. It will be in the exam as well. So, just so you know.

Slide 233

Slide 234

Slide 235

Slide 236