[SOUND]. So here in lecture 12.5, we're going to finish our analysis of the logic side, the gate-level delay analysis side of static timing analysis, and we're going to look at exactly how we compute the ATs, the RATs, and the slacks. In a realistic kind of design I've got a million gates or five million gates or ten million gates. I need to be able to do one quick pass through this logic, or a couple of quick passes through this logic, and calculate everything. So we're going to show you a nice recursive kind of algorithm which in a forward pass through the graph can label the arrival times, and in a backward pass through the graph can label the required arrival times, and can then use the ATs and the RATs to calculate the slacks, and get you immediately the worst path, the most violating path, if you have a timing problem. The next thing we're going to do, which is kind of surprising and kind of cool, is that we're going to show you that the exact same idea from maze routing, the wavefront expansion, the heap, cost-based expansion, can be used to actually report worst-case paths, worst-delay paths, in path order. So, you know, the sort of thing that happens in a real design is you've got ten million gates and you've got 30,000 paths that are violating your timing. What do you want to know? You want to know the worst one first. What do you want to know next? You want to know the second worst one, and then the third worst one, and on to the 29,999th worst one. You want them in order. How do you find them in order? And the answer is an algorithm that looks exactly like cost-based maze routing. So let's go look at how you calculate the ATs, how you calculate the RATs, how you calculate the slacks, and how you report the paths, how you enumerate the paths in worst-case order, to finish up our discussion of logic-side timing. So let's look for a second at what the, you know, most typical kind of static timing analysis problem there is.
So, you know, you give me your logic, you tell me your clock period. So let's say it's a nanosecond; so this is a gigahertz clock. And you ask me to answer this question: what are all the too-slow paths, the paths that violate timing? And the most useful answer is that I report the paths in order from the slowest to the fastest. Right? So, in other words, I'm enumerating those paths in delay order. So, not only do I have to be able to tell you the worst path, I need to be able to tell you all the worst paths, because it doesn't help, right, if you just tell me one of them. Right, so, you know, in a big logic network with, you know, millions and millions of gates, the first time you design it, the first time you synthesize it, the first time you optimize it, you know, I mean, you could have tens of thousands of paths that are in violation. And just as an interesting aside, remember we were talking about the fact that we assumed everything was statically sensitizable, and there are lots of false paths. Really, in the real world, it's often the case that the designers, the people who are putting the system specifications together, are the ones who actually know where a lot of the false paths are. So if you were to really look at the input to a static timing analysis tool, you know, you might actually have a list of 30,000 paths that you are telling the tool: don't find this, please, or if you find this, and I know you'll probably find this, don't report this to me, please, because honestly I don't care, it's not a real path. Please move on and give me the next path. So, you know, that sort of happens. But you have millions of logic gates in your, you know, in your network. You have tens of thousands of paths in violation. The most useful thing to do is to report them to me in the order of most to least violating. So that's the other thing we need to keep in mind: how to enumerate the problems. So what do we need?
We need to calculate all of the arrival times, the ATs. We need to calculate all the required arrival times, the RATs. We need to calculate all the slacks. And we need to do this very efficiently, because these graphs are gigantic. They're really enormous. And we also have to figure out how to enumerate the violating paths in worst delay order. Okay? So this is what we want to figure out how to do. And it's actually all sort of interesting that the delay graph, and a little bit more sort of mechanics on this graph, will actually let us do all of this stuff. So, one approach, an easy approach, and I like this approach because it is easy to explain, is that we do topological sorting on the delay graph. And the way to think about this is that I am only going to do one recursive thing on the graph, which is I am going to sort the nodes into a special order, and then pretty much all the processing, the actual calculation stuff I am going to do on the graph, is really not going to be recursive. It's going to be sort of manipulating the nodes on a list. So, a topological sort sorts the vertices in the delay graph onto one single ordered list, with the essential property that if there's an edge from a node P to a node S, P has to appear before S in the sorted order. So I've got a little graph example here. There's a source node, and then there are nodes B and C, D and E, and a sink. The source has an edge of 3 to B and an edge of 4 to C. B connects to D with a 5 and E with an 11. C connects to E with a 9, D connects to the sink with a 6, E connects to the sink with a 15. A topological sort is a list of the nodes, source, B, C, D, E, sink, in an order so that if there's an edge from one node to another, the first node appears before the other. So for example, B has an edge that goes to D, so B has to appear before D in the order. And D has an edge that goes to the SNK, so D has to appear before the SNK in the order.
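To make that concrete, here is a minimal Python sketch of a topological sort, done as a depth-first-search walk with the nodes emitted in reverse post-order. The graph is the example from the slide, stored as an adjacency list with edge delays; the names `GRAPH` and `topo_sort` are just made up for this sketch.

```python
# A minimal topological sort by depth-first search (reverse post-order).
# The graph is the lecture's example: SRC, B, C, D, E, SNK, with edge
# delays as labeled on the slide.
GRAPH = {
    "SRC": {"B": 3, "C": 4},
    "B":   {"D": 5, "E": 11},
    "C":   {"E": 9},
    "D":   {"SNK": 6},
    "E":   {"SNK": 15},
    "SNK": {},
}

def topo_sort(graph):
    order, visited = [], set()
    def dfs(node):
        visited.add(node)
        for succ in graph[node]:
            if succ not in visited:
                dfs(succ)
        order.append(node)           # post-order: all successors already listed
    for node in graph:
        if node not in visited:
            dfs(node)
    return list(reversed(order))     # reverse post-order = topological order

print(topo_sort(GRAPH))
```

For the dictionary order above this returns SRC, C, B, E, D, SNK, one of the legitimate orders listed on the slide; any of the others would be equally valid.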
And a topological sort is just a very simple algorithm that comes up with this order. So for example, SRC,B,D,C,E,SNK is a legitimate order; SRC,B,C,D,E,SNK is a legitimate topological sort order; SRC,B,C,E,D,SNK, SRC,C,B,D,E,SNK, SRC,C,B,E,D,SNK, those are all legitimate topological sorting orders. And this is a really simple algorithm. You can go type topological sorting, and Wikipedia comes back immediately and says, oh, this is a famous thing, and Wikipedia actually comes back with a little bit of pseudocode, and honestly this is nothing more than a depth-first-search walk on the graph with a little bit of processing. So this is a very, very, very easy thing to do. So, let's assume that you've actually topologically sorted the graph. Right? How do you compute the arrival times? So, I've got my little picture again, from a couple of lectures ago, of computing the arrival time. There's a source node. There's a set of highlighted nodes called the predecessors, Pred(n) of node n. One of them is distinguished with a p, right, and each of them has a delay Δ from, say, node p to node n. And then there is the remaining side of the graph. So there are the successor nodes to node n, one of which is highlighted with an s, and the successor nodes all have paths to the sink, but those are all in gray. What I really care about is node n and its predecessors, the predecessors of n. All right, so, if I have topologically sorted the graph, there is a very simple algorithm: you just walk through the list, right? So to compute the ATs, the first thing you do is you set the AT of the source to zero, because that's what it is, and then for each node n, in topologically sorted order, the first thing you do is you set the AT to negative infinity, because you're calculating a max and you just need the initial value to, you know, never be the right answer.
And then for each node p in the predecessors, right, so you look at all the nodes in the predecessor set of node n, and you just calculate: the AT of node n is the max of the AT of node n, this is the running maximum, and the AT of the predecessor node p plus the delay Δ from node p to n. You just simply walk through the nodes from the source to the sink, knowing that you're visiting the nodes in an order such that every time you get to a node like n, you know you've already processed the arrival times of its predecessors, and you can calculate straightforwardly. This is pretty close to the definition of the AT we gave, you know, 20 slides earlier. This is really simple, and this is why we like topological sorting: it makes it very simple. How do you calculate the RATs? Well, you know, more or less the same thing, but, you know, you sort of pretend that you're going in the opposite direction. So it's kind of like you're pretending all the edges are reversed, that they point from the sink to the source. So you, you know, you're just going to walk the graph backwards, which means you walk the topological sorting order in the other direction. So how do you calculate the RATs? Again, assuming you have a topological sorting, you set the RAT of the sink to be the cycle time, because that's the right answer. And then for each node n in reverse topologically sorted order, okay, you walk backwards through the graph, and you look at, in this case, all the successors, right? So, you look at node n. You look at all the successor nodes, which you know have been visited already, because you've got the topological sorting order correct. You look at each successor node s and you calculate the RAT with the formula: the RAT of n is the minimum of whatever the RAT of n currently is, because, you know, this is the running RAT computation, and the RAT of the successor s minus the edge delay Δ from n to s. So all pretty simple, all pretty straightforward.
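Putting the two formulas together, here is a small Python sketch of both passes, assuming the example graph and a cycle time of 29 from these slides. The names (`GRAPH`, `ORDER`, `compute_ats`, `compute_rats`) are just illustrative, and `ORDER` is one of the valid topological orders listed earlier; the slack at each node is then just RAT minus AT.

```python
# Given a topological order, one forward walk computes every AT and one
# backward walk computes every RAT; slack = RAT - AT at each node.
# Graph, cycle time, and order are the lecture's example.
GRAPH = {
    "SRC": {"B": 3, "C": 4},
    "B":   {"D": 5, "E": 11},
    "C":   {"E": 9},
    "D":   {"SNK": 6},
    "E":   {"SNK": 15},
    "SNK": {},
}
ORDER = ["SRC", "B", "C", "D", "E", "SNK"]    # one valid topological order
CYCLE_TIME = 29

def compute_ats(graph, order):
    at = {n: float("-inf") for n in order}    # initial value can never win the max
    at[order[0]] = 0                          # AT(source) = 0
    for n in order:                           # predecessors of each successor are
        for succ, delay in graph[n].items():  # done, so push the running max forward
            at[succ] = max(at[succ], at[n] + delay)
    return at

def compute_rats(graph, order, cycle_time):
    rat = {n: float("inf") for n in order}    # initial value can never win the min
    rat[order[-1]] = cycle_time               # RAT(sink) = cycle time
    for n in reversed(order):                 # successors are already done
        for succ, delay in graph[n].items():
            rat[n] = min(rat[n], rat[succ] - delay)
    return rat

at = compute_ats(GRAPH, ORDER)
rat = compute_rats(GRAPH, ORDER, CYCLE_TIME)
slack = {n: rat[n] - at[n] for n in ORDER}
print(at, rat, slack)
```

On this graph it reproduces the numbers from the upcoming slide: ATs of 0, 3, 4, 8, 14, 29; RATs of 0, 3, 5, 23, 14, 29; slacks of 0, 0, 1, 15, 0, 0 for SRC, B, C, D, E, SNK respectively.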
If you can topologically sort the graph, and you can do that with a very simple depth-first search, then you get this list with, you know, several million nodes in it. You just walk it straight in order. And you know that every time you visit a node in the forward order for the AT, you've already calculated the predecessors, so you can just use the formula. And you know that when you walk the list in the reverse order, you've already calculated the RAT for the successors, and you can just use the formula. It's beautiful and simple and easy. Now, another thing that's really not at all obvious, and is kind of cool, is that you can use the slack for path reporting. Right, so remember one of the things I said was I really want to enumerate the paths in worst delay order, because then I could actually go ask a question like, what are all the paths where the delay is greater than 12, right? In order. I can focus on sort of the biggest problems first. You can use the slack for path reporting. The slack property is that all the nodes on the longest path have the same slack, right? So the surprising result is you can actually use this to find the N worst paths, even though you didn't trace them all. You can actually enumerate them, one at a time, in order, with a really easy algorithm. So, let's actually just sort of put all the values on this little graph. This is the same graph I showed previously: source, B, C, D, E, sink. The source connects with a 3 to B and a 4 to C; B goes to D with a 5, B goes to E with an 11, C goes to E with a 9, D goes to the sink with a 6, E goes to the sink with a 15. The cycle time is 29; just made it up. The ATs are: the AT for the source is zero, the AT for B is three, the AT for D is eight, the AT for C is four, the AT for E is 14, and for the sink the AT is 29. What about the RATs? I'm going to do this backwards. The RAT for the sink is 29. The RAT for D is 23. The RAT for E is 14. The RAT for B is three. The RAT for C is five.
The RAT for the source is zero. And what are the slacks? The slack is zero at the source, zero for B, one for C, 15 for D, zero for E, zero for the sink. Right, so, we can do a forward pass to get the ATs, a backward pass to get the RATs, and while we're doing the RATs we can also do the slacks. So, one forward pass through the graph, one backward pass through the graph: all the ATs, all the RATs, all the slacks. Now, there's a surprising and simple algorithm for actually doing the N worst path reporting. And what's really amazing is that it's basically maze routing. And one of the reasons for talking about the maze routing stuff first is that you get the sort of conceptual framework, and I can do this in one slide. Right, so we're going to find the N worst paths by evolving partial paths, just like we evolved partial paths in the maze router, and each partial path stores three things: the path itself, the delay on this path, and the slack of the final node on the path. And we're going to store the partial paths on a heap, just like maze routing, only the thing that we're going to use to index the heap is the slack value. Right, so in maze routing we were indexing on either the length, you know, the path length, or the estimated path length if we had a predictor; we're just going to do this on the slack. So we're going to sort so that the path with the worst slack endpoint is always on the top. That means, you know, the heap is always going to make sure that the partial path with the smallest slack, and that might even mean a negative slack, the most negative number, is on the top. And initially the heap just contains the source, just like in two-point maze routing. And the rest of the algorithm looks very simple and familiar, like maze routing. There's an expansion step: you pop the partial path off the top of the heap. It has the most negative, smallest slack, or at least the smallest slack. You ask the question: have I reached the target?
Is this the end, the sink? If so, congratulations: print out the path and throw it away. Otherwise, you expand the path, which means you pull it out of the heap and you reach all the other things that are one edge away. So you add each successor node to make new partial paths, for all of the nodes that are connected by an edge, and you push them back on the heap with a little data structure, that is, the path, the delay, and the slack value of that new node, the node at the end of the partial path. And then you just keep going. Really, it is just that simple. Let's do a little simple example. So I've got my little graph again: source, B, C, D, E, sink. And I've got all the slack values labeled. The slack is zero for the source, zero for B, one for C, 15 for D, zero for E, and zero for the sink. And the heap starts with, if you will, the source cell, just like the source cell for a maze router. It starts with the source cell, the delay to the source cell, which in this case is zero, and the slack at the source cell, which is zero. And so we're just going to push that thing in the heap. It's really very much like maze routing, and I've got this highlighted in the graph, and we're going to see the highlights as these little green boxes that are going to be trying to show the paths. So this is a path with exactly one cell in it, which is the source. Right? The delay on this path is 0, because it's the source, and the slack on this path is 0. And what we're going to do is evolve paths. So we're actually going to use this, you know, just like maze routing: we're going to be reaching our neighbors. And you know, you shouldn't be surprised that we're going to be reaching B and reaching C in the next slide. We're going to be calculating some stuff. We're going to be calculating some partial paths. We're going to be shoving things back in the heap. The heap is going to be driving this stuff. We pop things off the heap.
We use this to sort of expand, to define new paths, and we throw the cell away. So, right now it's just like maze routing. There's one cell in the heap; it's the source. The delay is zero. The slack is zero. And let's remember that what we do to sort of make this process go forward is we index on the slack. Okay, so what happens next? Well, we take the source path and we pop it off of the heap, and we use that to reach B and C. And so we actually get two new paths, right. We get the source-to-B path, and we get the source-to-C path. And the source-to-B path has a delay of 3, and the source-to-C path has a delay of 4, but we put them in the heap indexed on the slack. And so it is the zero slack of the B node, which is the end of the source-to-B path, that's the minimum of the heap. That's the thing that's at the top of the heap. So there are two paths in the heap right now, source to B and source to C, but the thing that's on the top of the heap is the zero-slack path, source to B. So, what are we going to do? We are going to take that path off, we are going to pop it, we are going to expand it, we are going to use it to reach its neighbors. Okay. So, going forward, we are expanding the source-to-B path and we are going to use that to reach its neighbors. So, what are we going to get? We are going to reach D, and so we are going to get a source-to-B-to-D path. And then we're going to also reach E, so we're going to get a source-to-B-to-E path, and I'm still highlighting these things with kind of green so you can get a sense. There's a source-to-B-to-D path, and the slack value on the end of that one is 15. There's a source-to-B-to-E path, and the slack value is 0, and there's a source-to-C path, and the slack value on that is one. And so we put those paths in the heap, source to B to E, source to B to D, source to C, again indexed on the slack value. So what's on the top? The source-to-B-to-E path, because it has the smallest slack. So what are we going to do?
We're going to take the thing off the top of the heap, the source-to-B-to-E path, we're going to pop it, we're going to expand it. That's what we're going to do next. So going forward, the source-to-B-to-E path is being expanded. We've popped it out of the heap, and we're going to reach what? We're going to reach its neighbor, which is the sink, and so we get a source-to-B-to-E-to-sink path. Oh hey, the sink is like the target in maze routing. Hooray. Only I haven't routed a wire, I have found a path. I print it out. I print out SRC B E SNK with a delay. That delay is 29, and then I throw that away. And I go back, and I look at, okay, what's the next smallest slack path in the heap? And the answer is, it's the source-to-C path, right, this path, the source-to-C path. So I'm going to pop that off the heap and I'm going to expand it, just like maze routing; I'm going to use it to reach its neighbors. All right, so that's going to happen next. What happens? I'm expanding the source-to-C path, and I reach E. Now, one of the things to note is that, in this particular algorithm, I'm going to reach E again. So there's none of this, you know, bit that you mark that says, oh, I've reached it, and I don't need to reach it again, the way there was in maze routing with a consistent cost function. Look, we're looking for longest delay paths in the graph. We don't care if a bunch of them go through the same gate. In fact, you know, a bunch of them might go through the same gate. That's just the way it works. So we pop the source-to-C path, and we expand it. What do we expand it to? We expand it to E, okay. And so we take it, and we push it back in. What do we push it back in on? We push it back in on the slack value, all right. The slack value is zero. So we get a source-to-C-to-E path with a slack value of zero. What else is sitting around in the heap?
What else is sitting around in the heap is a source-to-B-to-D path, with a slack of 15. Okay, what do we do? Go back to the heap, pop the smallest cell off the top. The smallest cell, you know, the cell with the worst slack value; the slack is zero. We pop the source C E path, and we expand it. So, we pop the source C E path, and what happens? We expand it. When we expand it, we reach the SNK. Oh, hey! It's like routing the net again. We have a source-to-C-to-E-to-SNK path; that's like reaching the target. We print the delay out. What's the delay on this one? The delay on this one is 28. And then we throw it away, just like you expand a cell and throw it away in maze routing. Okay. Then we go back, and what's still sitting around in the heap? And the answer is, what's sitting around in the heap is a source-to-B-to-D path, and that has a slack of 15. It turns out that's the only thing in the heap right now. So we are going to expand that. We're going to pop it off the heap and we're going to expand it and see what happens. So, we expand it: we pop the source B D path and we reach the sink again. Hooray! Okay, what happens? Now, we reach the sink with a path delay of 14. Great. We print out the source B D sink path with a delay of 14, and we go back, and we say, all right, what else is in the heap? And the answer is, oh, nothing. The heap is empty. And in this particular case, like, oh, okay, we've printed all the paths for this particular graph, which, if you look at it, makes sense. There are only three ways you can get from the source to the sink in this graph. Look at it: SRC B D SNK, SRC B E SNK, SRC C E SNK. And look, my little maze-routing kind of algorithm, where I evolve the partial paths, I index them by the slack on the end node of the partial path, I push them into the heap, like reaching things, I pop them out and expand them just like maze routing. I acknowledge the fact that I will reach the same node multiple times. I don't worry about it.
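The whole enumeration fits in a few lines of Python; this is a sketch, not a production timer. One caveat on the heap key: this sketch indexes on the slack of the partial path itself, the RAT at its end node minus the delay accumulated so far, which equals the node slack whenever the partial path is a worst path into that node. All names here (`n_worst_paths`, `GRAPH`, `RAT`) are made up for the illustration, with the example graph and its RATs from the slides.

```python
import heapq

# The lecture's example graph and its RATs (sink RAT = cycle time = 29).
GRAPH = {
    "SRC": {"B": 3, "C": 4},
    "B":   {"D": 5, "E": 11},
    "C":   {"E": 9},
    "D":   {"SNK": 6},
    "E":   {"SNK": 15},
    "SNK": {},
}
RAT = {"SRC": 0, "B": 3, "C": 5, "D": 23, "E": 14, "SNK": 29}

def n_worst_paths(graph, rat, source, sink):
    # Each heap entry: (key, delay, path). The key is the slack of the
    # partial path -- RAT at its end node minus the delay accumulated so
    # far -- so the most violating partial path is always on top.
    heap = [(rat[source], 0, [source])]
    while heap:
        key, delay, path = heapq.heappop(heap)       # pop worst partial path
        node = path[-1]
        if node == sink:
            yield path, delay                        # reached the target: report it
            continue                                 # ...and throw it away
        for succ, edge_delay in graph[node].items(): # reach everything one edge away
            new_delay = delay + edge_delay
            heapq.heappush(heap,
                           (rat[succ] - new_delay, new_delay, path + [succ]))

for path, delay in n_worst_paths(GRAPH, RAT, "SRC", "SNK"):
    print(" ".join(path), delay)
```

Running this reproduces the walkthrough: it prints SRC B E SNK with delay 29, then SRC C E SNK with delay 28, then SRC B D SNK with delay 14, the three paths in worst delay order.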
With this particular algorithm, I can print out for you all of the paths in worst delay order, and believe me, when you have millions and millions of gates and, you know, zillions of paths, and you know that probably the first 35,000 paths are screwed up, right, because your timing is not optimized yet, and you just want to print them out in the order of worst to least worst, the fact that you can do this super fast is wonderful. Right? It's a beautiful, simple algorithm. And it's one of the reasons that I talk about maze routing before I talk about this stuff, because it really feels just like maze routing. So, with that, you know, we basically come to the end of our static timing analysis. Static timing analysis is a very important step in the design of complex ASICs. It's what's called a sign-off step. And what that means is you don't get to fabricate, or you don't get to go to the next step of the design process, until you pass this. So people spend a lot of time qualifying their design, optimizing their design, changing their design, fixing their design so that it passes the static timing analysis sign-off. You can do it early in the process, when you just have the logic gates and you either know nothing about the wires or have really lousy models for the wires. The more information you have about your gates, and the more information you have about your wires, the more accurate your static timing analysis can be. Many big ideas, you know, to get to this point in our lecture. Gate-level delay models matter, and I'm sorry to say, they can just be really complex in the real world. It's just the way the physics works. Logical analysis of the graph is not what we're doing. We're doing topological path analysis. We're assuming all paths are sensitizable; no paths are false. Honestly, in real static timing, somebody who's really smart and knows a lot about the logic will tell you that there are just some false paths.
And they'll put them in a file, and they'll say, please don't report these; we both know they're false. You build the delay graph, and you calculate the ATs, the RATs, and the slacks recursively. I gave you a really simple algorithm where you topologically sort the graph, walk from the source to the sink to calculate the ATs, walk from the sink to the source to calculate the RATs and the slacks, and then you can do this cool path reporting stuff. The concept of slack is very, very big. It's, you know, an incredibly, incredibly important idea, because it drives the path enumeration process, and because it tells you how much trouble you're in on a gate-by-gate basis in a gigantic network. All right, so it lets you locate the worst paths and the problem gates, and an idea surprisingly like maze routing lets you enumerate the paths in worst delay order, which is really a very, very pretty algorithm. Finally, the static timing world is so big, and the world of timing is so complicated, I have to admit to just, you know, a few of the things I didn't tell you. I didn't tell you anything about static timing analysis with sequential elements. I pretended the flip-flops didn't exist, and that's not really good enough. How do you model the flip-flops or latches so you can verify, for example, that the setup and the hold times are not being violated, you know, for a design that has a million flip-flops in it? It turns out there are just some more tricks with the delay graph. It's very interesting. You can take the delay graph, and you add some edges to it that represent the timing constraints. It turns out you add some loops to the graph, which is a little disconcerting, but you add some loops to the graph, you calculate some ATs and some RATs and some things, and then you check the values on some edges, and sort of like the slacks, if something bad happens to the number, you know you have a problem. There's a whole other kind of analysis that I didn't show you.
What I showed you is technically called late mode timing analysis; that is, I worry about the longest path through the graph. I worry about the fact that my clock is a gigahertz, so I have one nanosecond, one thousand picoseconds, to get, you know, from the input of the logic to the output of the logic, and that's what we worried about. There's early mode timing, if you have things like transparent latches, where you're worried about the shortest paths and, you know, maybe going around the loop from the storage element through the logic more than once in one clock period. There are some fancy kinds of timing that use these sorts of sophisticated analyses. There's a whole other set of formulas. They look very much like the late mode ones, but they're just a little bit harder to understand if you're not working with some of the more fancy timing problems. There's also incremental static timing analysis. So, you know, in practice you have a million logic gates, and you change 10,000 of them because some logic designer fixes something; you really don't want to redo the whole static timing analysis. You just want to go in and change the stuff that changes. There are some very, very nice algorithms that can incrementally look at what changed in your logic, and so, what's going to change in your static timing analysis? You know? What are the ATs that can change? What are the RATs that can change? What are the slacks that can change? Don't update all 20 million of them if you only have to update 500,000 of them. So there are very pretty algorithms that do that, and again, I just didn't have time to talk about it. Static timing analysis is a huge, huge part of the way people actually build chips, and so you now know the logic side of things. What we need to talk about next is the geometry side, because you know what's between all those gates: wires. And wires have their own complicated timing behavior.
So let's go look at those next. [SOUND]