tcp, network-programming, application-layer

How to detect packets using HTTP or other application-layer protocols?


I am writing a program that detects the protocols used in network packets. For every packet it receives, it tries to detect the protocol at each layer, such as the network and transport layers. Detecting the protocols in these two layers was very easy, because some bytes in the packet tell us which protocol is in use. For the application layer it is much harder: as far as I know, nothing in an HTTP packet explicitly names the protocol. Another difficulty is that a whole HTTP request or response may span more than one packet, and concatenating multiple packets is a lot harder.

I want to know, in theory, how I can detect these protocols.


Solution

  • Unfortunately there's no simple answer. While there aren't that many network and transport layer protocols and the ones that exist are well standardized, the application layer is a lot messier.

    One way to guess the application protocol is to look at various "hints" such as the port number, the presence of certain strings in specific locations (e.g. "HTTP"), packet lengths, etc. But it's not bulletproof: I can easily run some custom protocol on port 80 which happens to contain "HTTP" in the payload but isn't HTTP. This is why even dedicated tools like Wireshark sometimes fail to detect the correct protocol. By the way, you could consult Wireshark's source code for the specifics of how it dissects different protocols.

    Regarding protocols being sent in more than one packet: that's easier. Your parser has to treat TCP as the stream protocol it is. Packet boundaries are meaningless in TCP, so your parser has to track each stream across multiple packets and inspect the reassembled bytes.
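
To make the two ideas above concrete, here is a minimal Python sketch, not a definitive implementation. `StreamBuffer` and `guess_protocol` are hypothetical names invented for this example. The reassembler assumes you have already parsed the TCP header and know the first expected sequence number for one direction of a flow; it ignores sequence-number wraparound and partially overlapping retransmissions. The classifier only checks a few well-known payload prefixes (an HTTP/1.x start line, a TLS handshake record, an SSH banner) and falls back to a port-based guess marked with "?", so it can be fooled in exactly the way described above.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: a small, incomplete list of HTTP/1.x methods.
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"HEAD ",
                b"DELETE ", b"OPTIONS ", b"PATCH ")

# Weak fallback hints; a "?" marks a guess based on the port alone.
PORT_HINTS = {80: "http?", 443: "tls?", 22: "ssh?"}


@dataclass
class StreamBuffer:
    """Reassembles one direction of a TCP flow from (seq, payload) segments."""
    next_seq: int                                   # next byte we expect
    data: bytearray = field(default_factory=bytearray)
    pending: dict = field(default_factory=dict)     # out-of-order segments

    def add(self, seq: int, payload: bytes) -> None:
        if not payload or seq < self.next_seq:
            # Empty segment or data already consumed (simplification:
            # partially overlapping retransmissions are dropped too).
            return
        self.pending[seq] = payload
        # Append every segment that has become contiguous with the stream.
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            self.data += chunk
            self.next_seq += len(chunk)


def guess_protocol(data: bytes, port: int) -> str:
    """Best-effort guess from the start of a reassembled stream."""
    first_line, sep, _ = data.partition(b"\r\n")
    if sep:
        is_request = (first_line.startswith(HTTP_METHODS)
                      and first_line.endswith((b" HTTP/1.0", b" HTTP/1.1")))
        is_response = first_line.startswith((b"HTTP/1.0 ", b"HTTP/1.1 "))
        if is_request or is_response:
            return "http"
    # TLS record header: content type 0x16 (handshake), major version 0x03.
    if len(data) >= 3 and data[0] == 0x16 and data[1] == 0x03:
        return "tls"
    # SSH peers open with an identification string such as "SSH-2.0-...".
    if data.startswith(b"SSH-"):
        return "ssh"
    return PORT_HINTS.get(port, "unknown")


if __name__ == "__main__":
    buf = StreamBuffer(next_seq=1000)
    # The second half of the request arrives first and is held back.
    buf.add(1008, b"ex.html HTTP/1.1\r\nHost: example\r\n\r\n")
    buf.add(1000, b"GET /ind")
    print(guess_protocol(bytes(buf.data), port=8080))
```

In a real sniffer you would keep one such buffer per direction of each flow, keyed by the (source address, source port, destination address, destination port) tuple, and rerun the classifier as more bytes arrive, since the first segment alone may not contain a complete start line.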