Hello and welcome to this course, in which we're talking about using Python for credential access. In this video, we're going to be talking about the use of network sniffing to access credentials that are exposed over the network. We're going to start by taking a look at a packet capture file in Wireshark so that we can understand the structure of a few of the different protocols that can leak information over the network. This file is called merged.pcap. It's essentially a mix of three capture files available from the Wireshark sample captures page. There's one for FTP, one for Telnet, and one for SMTP. The reason why we care about these particular protocols is that they are examples of insecure protocols that leak credentials over the network. We're going to take a look at what the network traffic for these different protocols looks like before we look at using Python to extract credentials from this traffic automatically.
In Wireshark, since all of these are printable protocols, the easiest way to take a look at how they work is to go to Follow TCP Stream, which will pull out something like this. This shows the printable portions of the traffic for the entire TCP connection. We care in this particular case because we can see some login information. Obviously this isn't real login information, with a username of fake and a password of user. But it shows us that login information for the Telnet protocol essentially shows up in plain text. There's a prompt for login, a response of fake, a prompt for password, and a response of user. If you take a look at the colors here, you can see that the blue is coming from the server and the red is coming from the client. We can easily see in this particular network traffic that credentials are exposed on the wire.
Moving on to our second protocol, we're looking at an example of FTP traffic. This is both similar to and different from Telnet.
With FTP, we see that everything we care about is in red, meaning that it comes from the client, and a single packet is going to have both a command, USER and then PASS, and its data, which is ftp in both cases. When we're looking for credentials in FTP traffic, we mainly need to look for the USER and PASS commands that are built into the packets, and whatever else is in that particular packet is the actual user credentials.
Finally, we have an instance of SMTP traffic, where the credentials are a little less visible. In fact, our user credentials show up right here. We can tell because there's a request for an authentication login, and then there's authentication succeeded at the other end. In between these two has to be the authentication traffic. In this case, it's username and password based. However, in SMTP that traffic is a little bit more difficult to see, and the reason why is that it's Base64 encoded.
If you're not familiar with Base64, this is an encoding algorithm that's primarily designed to make sure that any type of data can be made printable. For example, you might have data that has any particular byte value, 0-255, and most of those are non-printable characters, or at least a lot of them are. For protocols that are designed to be wholly printable, maybe like HTTP traffic, or in this case, SMTP traffic, we use Base64 encoding so that any non-printable characters are transformed into something that's guaranteed to be printable for transmission over the network. As we see here, it also has the side effect of slightly obfuscating the traffic, because you can't directly see the data that's being sent. However, since Base64 is a known encoding and there's nothing like a password involved in it, it's easily reversible and you can extract the original data. In fact, that's what we're going to be doing with our Python code for this particular video. Let's take a look at our Python code now.
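To make the point about Base64 being reversible concrete, here is a minimal sketch using Python's built-in base64 module. The string being encoded is a made-up placeholder, not a value from the capture.

```python
import base64

# Hypothetical value for illustration; not taken from merged.pcap.
secret = "user@example.com"

# Encoding turns arbitrary bytes into printable ASCII characters.
encoded = base64.b64encode(secret.encode("utf-8"))

# Decoding needs no key or password; anyone who sees the encoded
# form on the wire can recover the original data.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

Because no secret is involved, Base64 hides the value from a casual glance only and offers no protection against anyone capturing the traffic.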
Starting at the bottom here, in our main function, we're going to be using Scapy for reading our network packets and parsing them so that we can look for indications of the traffic that we care about. We're going to use rdpcap, or read pcap, to read in merged.pcap, which is going to read in that packet capture and present it as a list of packets in Python, which we'll store in the variable packets. We're then going to iterate over them. For each packet in packets, we're going to use Scapy's packet parsing capabilities to make it a lot easier to identify the particular packets in this traffic that contain user credentials. We're going to test to make sure that it has a TCP layer, because in all three cases here we have TCP traffic, and also that it has this layer called Raw. All that means is that it has some payload. We don't really care about packets like a TCP SYN or SYN-ACK. They're not actually carrying any data that we care about. We care about the ones where you have Telnet, FTP, or SMTP traffic embedded in the payload of the TCP packet.
Once we're certain that we have a TCP packet and it has a layer that we're interested in, we're going to split these different types of traffic based on the ports associated with them. FTP commands are going to be over port 21, and so we're looking for something that is a TCP packet with a destination port of 21, because that means it's coming from the client to the server, and that means that it's the type of packet that will have the user credentials. Those credentials are always going to flow from the client to the server, not from the server to the client. Similarly, when we're talking about SMTP, we're looking for packets coming from the client to the server. When we talk about Telnet, it gets a little bit more complicated. Remember when we were looking at that traffic capture, we saw that the server will send a prompt in Telnet, and the user will send a response.
This is unlike FTP, where the user provides a command and data. We'll see that SMTP works on the server prompt, user response model as well, but the use of Base64 makes it easier to find the username and password in the traffic. For our Telnet traffic here, we need to be able to detect both the prompt, so we know that we're looking for a username or password, and the packet that actually contains that authentication information.
Let's start out at the top here with FTP, because this is one of the simpler ones. In our FTP packets, we saw that the traffic was structured as a command followed by data: USER ftp, PASS ftp. This makes it very easy for us to identify the packets that we care about and extract the credentials from them. Here we're going to pull out the actual payload of the packet and interpret it in a way that's useful to us. If we use packet[Raw], relying on the check lower down that the packet has a Raw layer, we're accessing the payload layer of the packet. Then .load is the actual payload contained there. We're going to decode it using the UTF-8 encoding, and then rstrip just to remove trailing whitespace from it. All of these packets will have a line ending at the end, and we don't want to interpret that as part of the password.
Once we've got our payload, we can then test to see what the command is. If the first four characters of the payload match the string USER, then we know that we've got a username. Similarly, if the first four characters of the payload match the string PASS, then we've got a password. Under those circumstances, we're going to print out this authentication information on the terminal. We're going to use a couple of format string specifiers. We're going to have one up here at the beginning saying it's a string, then the text FTP username, and then another one at the end. These are paired up with the items in this list that we pass as well.
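The FTP check can be sketched roughly as follows. This is a minimal illustration, not the script from the video: the function names, the exact output wording, and the example address are assumptions, and it works on plain bytes so it runs without a capture file. In the actual script the bytes would come from Scapy's packet[Raw].load and the server address from the packet's destination IP.

```python
def parse_ftp_credential(load: bytes):
    """Return ("username" | "password", value), or None for other payloads.

    Hypothetical helper; `load` stands in for packet[Raw].load.
    """
    try:
        # Decode as UTF-8 and drop the trailing line ending.
        payload = load.decode("utf-8").rstrip()
    except UnicodeDecodeError:
        return None

    # The first four characters carry the FTP command. The comparison is
    # made case-insensitive here, which is an assumption of this sketch.
    command = payload[:4].upper()
    if command == "USER":
        return ("username", payload[5:])
    if command == "PASS":
        return ("password", payload[5:])
    return None


def report_ftp(server_ip: str, load: bytes):
    """Format a finding with two %s specifiers: server address and value."""
    found = parse_ftp_credential(load)
    if found is None:
        return None
    kind, value = found
    return "%s FTP %s: %s" % (server_ip, kind, value)


# 192.0.2.10 is a documentation address used purely as a placeholder.
print(report_ftp("192.0.2.10", b"USER ftp\r\n"))
print(report_ftp("192.0.2.10", b"PASS ftp\r\n"))
```

Returning None for anything that is not a USER or PASS command keeps the caller's loop simple: it can just skip packets that produce no finding.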
For the first one, we're going to provide the destination IP address of the packet, so this is going to be the IP address of the server that the user was logging into. Now we'll know that at the IP address printed, we should be able to log in to a particular user account using the remainder of that payload, which is the username. Similarly for the password, we'll print out the IP address of the server and the password as well. This gives us login information for a particular user on a particular server. FTP is the easiest of the three options we'll be looking at here.
The next one we'll be looking at is SMTP. Recall that when we looked at it, the information in the packet was Base64 encoded. We couldn't really see what the contents of that packet were. However, the only things that were really Base64 encoded in that traffic were the username and password, and the rest of the authentication session. If we focus on things that are Base64 encoded in the traffic, we've got a pretty good chance of identifying the username and password. Once again, we're loading the packet payload into a variable, payload, and then we're going to use b64decode, which, as we see up here, is part of the base64 library. As I mentioned earlier, Base64 is easily decoded, and there are functions for it built into Python.
We're going to Base64 decode the payload of every packet that we receive. In a lot of cases, this is going to fail. The reason for that is that most of the packets in the SMTP traffic won't be valid Base64. They might have characters that are outside Base64's character set, they may not be the proper length, etc. So we're handling that down here in the except part of the try-except block that we wrap this in. If something goes wrong, we know we have the wrong packet, so we continue onward. However, if we successfully decode the payload of the packet, there's a chance that what we're looking at is a username or password. However, that's not guaranteed.
It's entirely possible that there's other data there that just happens to be a valid Base64 encoding. So we have a little bit more work to do. Once we've decoded the payload using Base64, we're going to interpret it using the UTF-8 encoding, and then we're going to set up something that'll allow us to track the current state of the session. This isn't strictly required, but what it's doing is essentially grabbing the client's IP address and port number, because there's the potential that we might have multiple different SMTP sessions in the same traffic capture. If that's the case, they should all be using a different combination of IP address and port. So we can use the IP address and port to determine whether or not we're looking for a certain piece of information on that connection.
By default, we're not waiting for anything in particular, so we're looking for a username. We just know that someone might try to authenticate, and they're going to provide a username. However, with SMTP, that username is probably going to be formatted in a certain way. It's going to be an email address. How are we going to identify usernames? Using regular expressions. Python has a regular expression library built in, and you can define regular expressions that match certain types of strings. In this case, we have a regular expression defined here called email regex, and all this is supposed to do is match a string that looks like an email address. It'll start with a letter or a number, then maybe have one or more letters or numbers, possibly with periods or underscores, and eventually you'll hit an at symbol, then there might be something like gmail, then there will be a period, and then there will be a top-level domain such as .com, et cetera. This email regex will only identify strings that match that particular pattern. When we get to this print statement here, we know that we have something that matches the email regex and was Base64 encoded.
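The two checks just described, a successful Base64 decode followed by an email match, can be sketched like this. The regular expression is an assumed pattern written to fit the spoken description, not the one from the script, and the helper names are hypothetical. Passing validate=True is a choice made for this sketch so that payloads containing characters outside the Base64 alphabet fail instead of being silently cleaned up.

```python
import base64
import binascii
import re

# Assumed pattern: starts with a letter or digit, then letters, digits,
# periods or underscores, an at symbol, a domain, and a top-level domain.
EMAIL_REGEX = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._]*@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"
)


def decode_base64_text(load: bytes):
    """Return the decoded text, or None if the payload is not Base64/UTF-8."""
    try:
        raw = base64.b64decode(load.strip(), validate=True)
        return raw.decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None


def smtp_username(load: bytes):
    """Return the decoded payload only if it also looks like an email."""
    text = decode_base64_text(load)
    if text and EMAIL_REGEX.match(text):
        return text
    return None


# Hypothetical payloads for illustration.
print(smtp_username(base64.b64encode(b"alice@example.com") + b"\r\n"))
print(smtp_username(b"EHLO client.example.com\r\n"))  # not Base64 at all
print(smtp_username(b"QUIT\r\n"))  # valid Base64 by chance, but not an email
```

The QUIT case shows why the decode alone is not enough: a four-letter command happens to be well-formed Base64, so the email check is what filters it out.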
The odds of us getting a false positive there are pretty slim, and we can be fairly confident in saying that we've got an SMTP username at this point. We're using the same format specifiers and other information as we did with FTP to print out the credentials that we've just extracted. At this point, with a username, we've only got half of the authentication stream. We need a password as well. Now we know that we're waiting for something on this particular connection, which we can identify using an IP address and a port number, and we know that the last thing that someone sent was a username. So the next piece of Base64 encoded data should be the password. Later on, when we get another packet that we successfully Base64 decode and that doesn't match the email regex, we check to see if we're waiting for a password from this particular connection. If so, we know that we have the SMTP password, we print that out, and then we remove the state tracking information from our unmatched variable here to say, okay, we've got what we needed. Any other Base64 data that happens to be transmitted over this connection can safely be ignored, because we've got the user credentials.
This is how we're going to take advantage of passwords being transmitted in SMTP traffic. As we see, it's a bit more complicated than FTP, because we don't have an easily matched string, and the prompt and the response, the actual credentials, are in different packets in the network traffic.
Finally, let's take a look at Telnet. Like SMTP, Telnet's a little bit more complicated. When we were looking at the traffic capture, we saw that it's a call-and-response or challenge-response protocol. You'll get a prompt for, say, login, and you need to provide the username. Then you'll get a prompt for password, and you need to provide the password.
We need to be able to look at both the server and the client packets in this case, because we don't have something convenient like Base64 encoding to help call out the authentication data for us. What we're going to do is start out by trying to decode the contents of the packet, and notice that now we're in a try-except block. The reason why is that Telnet traffic isn't necessarily all going to be decodable using UTF-8. However, everything that we care about, those usernames and passwords, will be. If we cannot successfully decode the payload with UTF-8, we can safely ignore it. If we do succeed, we're then going to have our tracking information stored in ConData temporarily, and then we're going to test the payloads of the packets we receive.
Right now we're assuming that we have a packet coming from the server to the client. If that payload starts with the string login or with the string password, we've discovered that the server is prompting for a username or password. The next thing that we receive that's Telnet traffic, contains a printable payload, etc., should be the username or password. Like SMTP, we've got state tracking here. However, it's a bit more complex, because we need to track whether we're waiting for a login or waiting for a password. We've got two different variables doing this same state tracking. The state tracking is also a little bit more complicated here because with SMTP, we knew we were waiting for a password because we had already seen a username. Both of those packets came from the same computer. They were both client to server. In this case, the prompt telling us we're waiting for something is server to client, and then the actual data we're waiting for is client to server. If we're not looking at a prompt packet here, we're probably looking at the response to that prompt.
We need to flip the data that we're tracking, because the source of a prompt packet is the destination of a response packet and vice versa. Now we're looking at the destination IP address and destination port, because we assume that this is a client to server packet. We then test to see if we're waiting for a username. If so, we print out the username and then we remove the tracking information from our variables. Similarly, if we're waiting for a password, we print out the password and then remove that tracking information again.
This example was designed to demonstrate extracting credentials from a few vulnerable protocols, namely ones that leak user credentials in plain text. If we open up a terminal prompt and run this particular Python code, so python and the network credential sniffing .py file, and hit Enter, we see that we get the same results that we saw when we were looking at the Wireshark capture. A username of fake and a password of user for Telnet, a username and password that are both ftp for FTP, and then we actually get to see the username and password for SMTP this time. We see an email address and then a password, because we Base64 decoded that data so we can see it in plain text.
This is an example of extracting credentials and information from network traffic using Python and Scapy. It's certainly possible to extract other data, and there may be other protocols that you can extract credential information from, such as traffic to an HTTP web page that stores that data in HTTP header information, URL queries, etc. It's definitely possible to modify this code to use Python and Scapy to pull more sensitive information that's leaked in network traffic. Thank you.
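As a recap of the Telnet handling walked through earlier, the prompt and response tracking might be sketched as below. This is an illustration under several assumptions rather than the script from the video: the function and variable names are hypothetical, state is keyed on the client's address and port, addresses are documentation placeholders, and each response is assumed to arrive in a single packet, as it does in the capture shown. It takes plain values instead of Scapy packets so it can run without a capture file.

```python
# Clients for which the server has asked for a login or a password.
awaiting_login = set()
awaiting_password = set()


def handle_telnet(src, sport, dst, dport, load: bytes):
    """Process one Telnet payload; return a finding string or None."""
    try:
        payload = load.decode("utf-8").strip()
    except UnicodeDecodeError:
        return None  # e.g. Telnet option negotiation bytes
    if not payload:
        return None

    # Server-to-client prompt: the client is the packet's destination.
    client = (dst, dport)
    lowered = payload.lower()
    if lowered.startswith("login"):
        awaiting_login.add(client)
        return None
    if lowered.startswith("password"):
        awaiting_password.add(client)
        return None

    # Otherwise assume client-to-server, so the roles flip:
    # the client is now the source and the server is the destination.
    client = (src, sport)
    if client in awaiting_login:
        awaiting_login.discard(client)
        return "%s Telnet username: %s" % (dst, payload)
    if client in awaiting_password:
        awaiting_password.discard(client)
        return "%s Telnet password: %s" % (dst, payload)
    return None


server, client_ip = "192.0.2.10", "192.0.2.20"  # placeholder addresses
handle_telnet(server, 23, client_ip, 40000, b"login: ")
print(handle_telnet(client_ip, 40000, server, 23, b"fake\r\n"))
handle_telnet(server, 23, client_ip, 40000, b"Password:")
print(handle_telnet(client_ip, 40000, server, 23, b"user\r\n"))
```

Removing the client from the set once a response is seen means later traffic on the same connection, such as shell commands, is not mistaken for credentials.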