I need to extract the names of all the protocols that appear in PCAP files, which means parsing them. After some research I was told that dpkt is very efficient for this. I am writing the script in Python, and my code so far is below:
    import socket
    import time

    import dpkt


    def inet_to_str(inet):
        # First try ipv4 and then ipv6
        try:
            return socket.inet_ntop(socket.AF_INET, inet)
        except ValueError:
            return socket.inet_ntop(socket.AF_INET6, inet)


    def read_packet(pcap):
        with open('/XYZ/XYZ/XYZ/XYZ/XYZ/' + str(pcap), "rb") as f:
            pcap = dpkt.pcap.Reader(f)
            for timestamp, buf in pcap:
                # Not printing out the timestamp for now
                # print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))

                # Unpacking the ethernet frame
                eth = dpkt.ethernet.Ethernet(buf)
                # Not printing the ethernet frame
                # print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)

                # Making sure the ethernet packet contains an IP packet
                if not isinstance(eth.data, dpkt.ip.IP):
                    print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
                    continue

                # Now unpack the data within the Ethernet frame (the IP packet)
                # Pulling out src, dst, length, fragment info, TTL, and Protocol
                ip = eth.data
                # dp = ip.data
                # proto = type(udp.data)
                # print(proto)
                time.sleep(3)

                # Pull out fragment information (flags and offset all packed into off field, so use bitmasks)
                do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
                more_fragments = bool(ip.off & dpkt.ip.IP_MF)
                fragment_offset = ip.off & dpkt.ip.IP_OFFMASK

                # Print out the info
                print('IP: %s -> %s (len=%d ttl=%d DF=%d MF=%d offset=%d) Protocol=%s\n' %
                      (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl,
                       do_not_fragment, more_fragments, fragment_offset,
                       ip.get_proto(ip.p).__name__))
                time.sleep(5)
The problem is that this code gives me the transport-layer protocol (TCP/UDP) but not the application-layer protocol (SSH, DHCP, DNS, etc.). I read the documentation and found that there are modules for analyzing specific packet types if you already know what they are, but I have millions of PCAP files and want to do this automatically: identify the application-layer protocol and then call the appropriate function to analyze it. Is there a way to at least get the names of the protocols?
You can look at the ports of the transport-layer packets. From the port number you can usually work out which application-layer protocol is in use. Here is a list of port numbers and the application-layer protocols that use them: https://en.wikibooks.org/wiki/A-level_Computing/AQA/Paper_2/Fundamentals_of_communication_and_networking/Standard_application_layer_protocols
Using dpkt you can add to your code:
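The snippet that belonged here did not survive, so what follows is only a minimal sketch of the port-lookup idea, not the original code. The names `WELL_KNOWN_PORTS` and `app_protocol_name` are hypothetical, and the table is deliberately incomplete. It assumes the object passed in is what dpkt leaves in `ip.data` for TCP and UDP, i.e. something with `sport` and `dport` attributes. Keep in mind that a port number is only a hint: a service can run on a non-standard port.

```python
# Illustrative, incomplete table of well-known ports (hypothetical name).
WELL_KNOWN_PORTS = {
    20: "FTP-DATA",
    21: "FTP",
    22: "SSH",
    23: "Telnet",
    25: "SMTP",
    53: "DNS",
    67: "DHCP",
    68: "DHCP",
    80: "HTTP",
    110: "POP3",
    123: "NTP",
    143: "IMAP",
    443: "HTTPS",
}


def app_protocol_name(segment):
    """Guess the application-layer protocol from a TCP/UDP segment's ports.

    `segment` is expected to expose `sport` and `dport`, as dpkt's TCP and
    UDP objects do. Anything else (ICMP, raw bytes) yields None.
    """
    sport = getattr(segment, "sport", None)
    dport = getattr(segment, "dport", None)
    # Try the destination port first, then the source port, because a
    # reply travels *from* the server's well-known port.
    for port in (dport, sport):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return None
```

In the loop from the question, this could be called right after `ip = eth.data`, for example `print(app_protocol_name(ip.data))`.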
If you are looking for a specific protocol, you can set up if statements like so:
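The original if statements are missing here too, so this is a hedged reconstruction of the idea rather than the answer's own code. The function name `handle_segment` is hypothetical, and it again assumes the argument is the TCP or UDP object dpkt puts in `ip.data`. Each branch only returns a label. That is the place where a protocol-specific parser could be called instead.

```python
def handle_segment(segment):
    """Branch on well-known ports and return a protocol label, or None."""
    sport = getattr(segment, "sport", None)
    dport = getattr(segment, "dport", None)
    ports = (sport, dport)

    if 53 in ports:
        # DNS: the payload in segment.data could be handed to a DNS parser.
        return "DNS"
    elif 22 in ports:
        return "SSH"
    elif 67 in ports or 68 in ports:
        # DHCP uses UDP port 67 for the server and 68 for the client.
        return "DHCP"
    elif 80 in ports:
        return "HTTP"
    # No branch matched: unknown, or a service on a non-standard port.
    return None
```

With dpkt, the DNS branch could for instance pass `segment.data` to `dpkt.dns.DNS`, and the DHCP branch to `dpkt.dhcp.DHCP`. Wrap such calls in a `try` block, since a payload on a well-known port is not guaranteed to be that protocol.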