I need to process a 7 GB pcap file to extract the packet sizes and payload sizes. I initially used scapy's PcapReader to extract these sizes, but scapy is very slow on a 7 GB file. So I switched to the dpkt library; however, I don't know how to get the TCP payload size with it.
    import dpkt

    payload_size = []
    packet_size = []
    for ts, buf in dpkt.pcapng.Reader(open('pcap file', 'rb')):
        eth = dpkt.ethernet.Ethernet(buf)
        if eth.type == dpkt.ethernet.ETH_TYPE_IP:
            ip = eth.data
            if ip.p == dpkt.ip.IP_PROTO_TCP:
                packet_size.append(ip.len)
                payload_size.append(?)
            else:
                pass
Looking at the source for dpkt's IP class:

    def __len__(self):
        return self.__hdr_len__ + len(self.opts) + len(self.data)

They calculate the length as the fixed header, plus the options, plus the length of the data. So I think you can get the IP payload length (everything after the IP header) with:

    payload_size.append(len(ip.data))

Update:
OP wanted the TCP payload. The TCP class's source is similar:

    def __len__(self):
        return self.__hdr_len__ + len(self.opts) + len(self.data)

So the length of the TCP payload should be len(ip.data.data):

    if ip.p == dpkt.ip.IP_PROTO_TCP:
        tcp = ip.data
        payload_len = len(tcp.data)
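
As a cross-check that does not depend on dpkt at all, the same number can be derived from the header fields: IP total length, minus the IP header length, minus the TCP header length. Below is a minimal sketch using only the standard library. It assumes a raw IPv4 packet with the Ethernet header already stripped, and the function name `tcp_payload_len` and the hand-built sample packet are made up for illustration.

```python
import struct


def tcp_payload_len(ip_packet: bytes) -> int:
    """Return the TCP payload size of a raw IPv4 packet (no Ethernet header)."""
    ihl = (ip_packet[0] & 0x0F) * 4                     # IP header length in bytes
    total_len = struct.unpack("!H", ip_packet[2:4])[0]  # IP total length field
    if ip_packet[9] != 6:                               # protocol number 6 is TCP
        raise ValueError("not a TCP packet")
    data_offset = (ip_packet[ihl + 12] >> 4) * 4        # TCP header length in bytes
    return total_len - ihl - data_offset


# Hand-built sample: 20-byte IP header + 20-byte TCP header + 5 payload bytes.
ip_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 45, 0, 0, 64, 6, 0,
    bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
)
tcp_header = struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, 0x18, 8192, 0, 0)
sample = ip_header + tcp_header + b"hello"

print(tcp_payload_len(sample))
```

Because this relies on the length fields rather than on the captured bytes, it should still report the on-the-wire payload size when the capture was truncated by a snap length. Inside the dpkt loop, the equivalent expression would presumably be `ip.len - ip.hl * 4 - tcp.off * 4`, assuming the `hl` and `off` header fields are exposed under those names in your dpkt version.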