I am trying to retrieve 5-tuple information from a list of pcap files using the dpkt library. To parse PPPoE packets with VLAN tags, I wrote the following code (for testing only):
import dpkt
import socket

def decode(pc):
    for ts, pkt in pc:
        eth = dpkt.ethernet.Ethernet(pkt)
        pppoe = dpkt.pppoe.PPPoE(eth.data)
        ip = pppoe.data
        if ip.p == dpkt.ip.IP_PROTO_UDP:
            udp = ip.data
            yield(ip.src, udp.sport, ip.dst, udp.dport, ip.v)
        else: pass

def test():
    pc = dpkt.pcap.Reader(open('epon.pcap','rb'))
    for src, sport, dst, dport, ip_version in decode(pc):
        print "from", socket.inet_ntoa(src),":",sport, " to ",socket.inet_ntoa(dst),":",dport

test()
It fails with the following error, which suggests the parsing is wrong:
AttributeError: 'str' object has no attribute 'p'
So what should the correct code look like? I'm a Python beginner, and the dpkt source code really puzzles me...
Your capture has a VLAN within a VLAN (stacked VLANs, also known as Q-in-Q).
Without modifying the dpkt library, you will need to parse the second VLAN tag manually.
Another problem is that the payload of PPPoE is PPP, not IP, so the IP packet is one layer deeper than your code assumes.
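To make the layering concrete, here is a minimal sketch that walks a raw frame with only `struct`, independent of dpkt. It skips any number of stacked 802.1Q tags, then reads the 6-byte PPPoE session header and the PPP protocol field. The function name `strip_to_ppp_payload` is hypothetical, and the sketch assumes every tag uses EtherType 0x8100 and that the PPP protocol field is the uncompressed 2-byte form.

```python
import struct

ETH_TYPE_8021Q = 0x8100   # 802.1Q VLAN tag
ETH_TYPE_PPPOE = 0x8864   # PPPoE session stage
PPP_PROTO_IP = 0x0021     # PPP protocol number for IPv4


def strip_to_ppp_payload(frame):
    """Return (vlan_ids, ppp_protocol, payload) for a PPPoE session frame.

    Hypothetical helper for illustration; not part of dpkt.
    """
    # Ethernet header: 6-byte dst MAC, 6-byte src MAC, 2-byte EtherType.
    (eth_type,) = struct.unpack('>H', frame[12:14])
    offset = 14
    vlan_ids = []
    # Each 802.1Q tag is a 2-byte TCI followed by the next EtherType.
    while eth_type == ETH_TYPE_8021Q:
        tci, eth_type = struct.unpack('>HH', frame[offset:offset + 4])
        vlan_ids.append(tci & 0x0FFF)  # low 12 bits hold the VLAN ID
        offset += 4
    if eth_type != ETH_TYPE_PPPOE:
        raise ValueError('not a PPPoE session frame: 0x%04x' % eth_type)
    # PPPoE header: version/type, code, session ID, payload length.
    _ver_type, _code, _session, length = struct.unpack(
        '>BBHH', frame[offset:offset + 6])
    offset += 6
    # The PPPoE length covers the PPP protocol field plus the payload.
    # Assumption: the protocol field is the uncompressed 2-byte form.
    (ppp_proto,) = struct.unpack('>H', frame[offset:offset + 2])
    payload = frame[offset + 2:offset + length]
    return vlan_ids, ppp_proto, payload
```

If `ppp_proto` equals `PPP_PROTO_IP`, the returned payload is the IP packet. That is the same object the dpkt version reaches through the PPP layer.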
You can change your code to something like this:
import struct
...
def decode(pc):
    for ts, pkt in pc:
        eth = dpkt.ethernet.Ethernet(pkt)
        # If the type is still 802.1Q after dpkt's own parsing, a second
        # (inner) VLAN tag remains; strip its 4 bytes by hand.
        if eth.type == dpkt.ethernet.ETH_TYPE_8021Q:
            eth.tag, eth.type = struct.unpack('>HH', eth.data[:4])
            eth.data = eth.data[4:]
        pppoe = dpkt.pppoe.PPPoE(eth.data)
        # The PPPoE payload is a PPP frame; the IP packet sits inside it.
        ppp = pppoe.data
        ip = ppp.ip
        if ip.p == dpkt.ip.IP_PROTO_UDP:
            udp = ip.data
            yield (ip.src, udp.sport, ip.dst, udp.dport, ip.v)
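For printing the tuples that `decode` yields, a small formatting helper may be easier to read than the chained `print` in the question. This is only a sketch: `format_flow` is a hypothetical name, and it uses `socket.inet_ntop` instead of the question's `socket.inet_ntoa` so that the same helper could also render IPv6 addresses.

```python
import socket


def format_flow(src, sport, dst, dport, ip_version=4):
    """Render one decoded flow as 'src:sport -> dst:dport'.

    Hypothetical helper; src and dst are packed addresses such as
    the ip.src and ip.dst values yielded by decode().
    """
    family = socket.AF_INET6 if ip_version == 6 else socket.AF_INET
    return '%s:%d -> %s:%d' % (
        socket.inet_ntop(family, src), sport,
        socket.inet_ntop(family, dst), dport,
    )
```

Inside `test()` you would then print `format_flow(src, sport, dst, dport, ip_version)` for each tuple.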