[MUSIC] All right. So let's see what some of the problems are with this TCP congestion control scheme. >> First, the multiplicative decrease can be quite aggressive. >> Perhaps the rate the sender was sending at was only marginally larger than the available capacity. In this case, cutting the rate in half is a drastic reduction that's totally unnecessary, and it's inefficient because TCP will take a while to ramp the rate back up. >> Another problem is that loss itself can be a poor signal of congestion, or of how much congestion there is. So although a packet may be lost because the queue did fill up, which is the likely cause in a data center (we're not talking about a wireless network with a high rate of packet corruption at the physical layer), it may have been only a very brief, transient overrun of the queue. Backing off substantially is fixing a problem that, going forward, isn't really there. >> Brad mentioned long queues. So let's look back at our example scenario. We'll zoom in on this buffer here. As the buffer fills up, the packets at the tail of the buffer have to wait for the rest of the packets to drain out before they can reach the port. This increases the latency that's observed end to end. This is a long queuing delay. >> In fact, in data centers, queuing delay can be the primary cause of end-to-end latency within the network. To illustrate this, let's borrow a demonstration from Rear Admiral Grace Hopper. So, what is this here, Anca? >> This is about a nanosecond: it's how far a signal travels in a nanosecond at half the speed of light in vacuum, or in about a nanosecond at the speed of light in fiber. Now, in a data center, about a thousand times this distance might separate two distant machines. >> Perhaps. >> On the floor. >> Mm-hm. >> Right, so a signal covers that distance in about a microsecond. That's still only a millionth of a second. >> And in comparison, how long does it take to send a packet through the network? Well, imagine we have a jumbo frame that's about nine kilobytes, and we're sending it through a ten-gigabit-per-second network. If you multiply that out, we're talking about eight microseconds, which is already something like eight-ish times longer than the speed-of-light propagation delay through the actual wires in the data center. >> Right. Working with that example, 41 microseconds. >> Right. >> Although there are other delays in the network, all the devices take time to process packets and such. But, in practice, tens of microseconds end to end are doable. These are numbers that have been reported from production data centers. So, tens of microseconds end to end when the queues are almost empty. >> So, in comparison, we had that eight-microsecond number for waiting behind one packet, potentially. Now we've got multiple hops, and we might have multiple packets to wait behind at each hop along, say, a six-hop path in a pretty large data center. We're already substantially longer than that end-to-end latency that's been shown to be achievable without queuing delay. >> As a side note, Grace Hopper, from whom we borrowed that demonstration, was a computer scientist in the Navy. She was also cool enough to have a missile destroyer named after her, as well as a supercomputer. Public service announcement: were I to do something awesome, I would pick the supercomputer to be named after me. She also popularized the term debugging. Back to congestion control in data centers.
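To pin down what that multiplicative decrease looks like, here is a minimal Python sketch of the textbook AIMD rule (the function names and the starting window of ten segments are illustrative assumptions, not code from the lecture): the congestion window grows by roughly one segment per round trip, but a single loss halves it, even if the sender was only marginally over the available capacity.

    # Textbook AIMD update for TCP's congestion window (illustrative sketch).
    def on_ack_round(cwnd: float) -> float:
        # Additive increase: about one extra segment per round-trip time.
        return cwnd + 1.0

    def on_loss(cwnd: float) -> float:
        # Multiplicative decrease: halve the window on a loss, even if the
        # sender was only slightly above the available capacity.
        return max(cwnd / 2.0, 1.0)

    cwnd = 10.0
    cwnd = on_loss(cwnd)     # one dropped packet -> sending rate cut to ~half
    rounds_to_recover = 5    # ~5 round trips at +1 segment/RTT to get back to 10
    print(cwnd, rounds_to_recover)

Because recovery proceeds at only about one segment per round trip, a deep cut takes many round trips to undo, which is the inefficiency noted above.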
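To make those back-of-the-envelope numbers concrete, here is a small Python sketch of the same arithmetic (the 200-meter distance, six-hop path, and ten queued frames per hop are assumptions chosen for illustration, not measurements from the lecture): propagation across a data center is on the order of a microsecond, serializing one 9 KB jumbo frame at 10 Gb/s works out to roughly seven microseconds (the eight-ish figure quoted above, give or take rounding), and a modest backlog at each hop quickly dominates both.

    # Back-of-the-envelope data-center latency (assumed, illustrative numbers).
    C_FIBER_M_PER_S = 2e8           # light in fiber travels at roughly 2/3 c
    DISTANCE_M = 200                # assumed span between two distant machines
    LINK_BPS = 10e9                 # 10 Gb/s link
    FRAME_BITS = 9000 * 8           # ~9 KB jumbo frame
    HOPS = 6                        # assumed path length through the fabric
    QUEUED_FRAMES_PER_HOP = 10      # assumed backlog waiting at each hop

    propagation_s = DISTANCE_M / C_FIBER_M_PER_S     # ~1 microsecond
    serialization_s = FRAME_BITS / LINK_BPS          # ~7.2 microseconds
    queuing_s = HOPS * QUEUED_FRAMES_PER_HOP * serialization_s

    for label, s in [("propagation", propagation_s),
                     ("serialization, one frame", serialization_s),
                     ("queuing, 6 hops x 10 frames", queuing_s)]:
        print(f"{label:28s} {s * 1e6:7.1f} us")

With even this modest backlog, the queuing term runs into the hundreds of microseconds, an order of magnitude above the tens of microseconds reported for lightly loaded production data centers.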
>> So what we just saw in that digression is that if you have large queues that are backed up, then you can easily have an order-of-magnitude latency increase in the data center, which can significantly affect latency-sensitive applications. >> Also, buffer occupancy is bad for isolation across different applications. In many switches, the buffer is shared across the ports of that switch. So what that means is, if one flow has occupied the buffer, other flows will start dropping packets even though they're not headed to the same port. >> Now, certain applications that we run in modern data centers exacerbate those latency problems. So let's take the example of a web service that we're hosting in the cloud. This web service is going to receive queries from clients, and these queries will eventually make their way to some web server which is facing the client directly. Now, to fulfill that query, we're going to have to aggregate results from many back-end queries to other databases. Like, you know, on my website I might have advertising results, I might show items for sale from various databases, I might show results that are specific to the user. There can easily be hundreds of queries to multiple different kinds of back-end databases to produce the result that I actually want. Which means there's this fan-out, or scatter, going to many different servers in the data center before the responses are gathered back at that aggregation point. >> For such applications, low latency is crucial. Web search really depends on serving users in a very time-limited manner. User experience and the provider's bottom line both depend on providing timely service. >> At the same time, sending all of that network traffic to many different servers is going to potentially increase queues, increase latency, and cause a severe condition known as incast. >> So, incast has been measured both in the laboratory and in actual deployed data centers. One such measurement, for synchronized reads on a storage cluster, is shown here. On the x-axis of this plot is the number of servers sending data to this one read client, and on the y-axis is goodput in megabits per second; the maximum in this network is 1,000, or about one gigabit per second. >> As you can see, as the number of servers increases on the x-axis, the throughput drops sharply. At about five servers, that is, five senders to this one client, the throughput has collapsed. This is characteristic of the incast condition. >> Okay, so what we've seen is a short overview of TCP and some of its problems. >> And in the context of running a cloud, running a data center, latency is very important, and queuing dominates it because the physical distance is very short. So now, if we back up the queues with all of the big data that we're sending, this can significantly increase latency for latency-sensitive applications, especially if there's a human waiting for many results to be finished. And that sort of traffic pattern, where we're waiting for many results to finish, is very common: the scatter-gather traffic pattern, which can lead to a specific problem known as TCP incast. Next time we're going to see how some of these problems can be solved. [MUSIC]
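As a rough sketch of the scatter-gather pattern described above, here is a minimal Python example (the query_backend coroutine, the fan-out of 100, and the simulated service times are hypothetical placeholders, not anything from the lecture): the aggregator fans a request out to many back ends at once and can only reply when the slowest of them has answered, and all of those responses converge on the aggregator's link at nearly the same moment, which is the synchronized burst that triggers incast.

    import asyncio
    import random

    async def query_backend(backend_id: int, request: str) -> str:
        # Stand-in for an RPC to one back-end database or index shard.
        await asyncio.sleep(random.uniform(0.001, 0.010))  # simulated service time
        return f"backend {backend_id}: partial result for {request!r}"

    async def aggregate(request: str, fan_out: int = 100) -> list:
        # Scatter: issue every back-end query at the same time.
        tasks = [query_backend(i, request) for i in range(fan_out)]
        # Gather: the user-visible latency is set by the slowest response,
        # and all the replies arrive at this one node almost simultaneously.
        return await asyncio.gather(*tasks)

    if __name__ == "__main__":
        results = asyncio.run(aggregate("web search query"))
        print(f"aggregated {len(results)} back-end responses")

This is only the application-level traffic pattern; the network-level consequence is the goodput collapse shown in the incast measurement discussed above.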