c sockets network-programming

Understanding socket() function args


I need to use socket(), but the arguments passed to the function confuse me.

I have to do a school exercise in which I use a socket to intercept Ethernet frames (more specifically, ARP spoofing).

First, I need to specify the domain, which must correspond to an address family. In my case, that would be AF_PACKET (which isn't really an address family, right?), and from what I understand, it provides direct access to Ethernet frames.

Second, I need to set the type of protocol, in my case SOCK_RAW, which, as far as I understand, gives access to all raw frames without being tied to a specific protocol like TCP/UDP. But doesn’t this parameter basically do the same thing as the first one?

Finally, there’s the third argument: the protocol to use. Why does this one, unlike the others, need to be converted to network byte order (and what is this parameter for)?

Reading the manuals doesn't help me at all. The Cisco "network basics" course did (I stopped at the ARP exercise).

This is all quite confusing to me, so thank you for any help you can provide.

If someone is French and can explain it to me in French, that might help.


Solution

  • Raw sockets are a "special case" of the socket API, and their parameters won't necessarily make complete sense on their own, although they do make sense in the context of the regular usage.

    So in order to explain the parameters of socket() in raw-sockets usage, you first need to understand their regular use for stream and datagram sockets, and expand from there. For example, the 3rd parameter is easier to understand in the context of AF_INET – so you should be familiar with TCP and UDP first, then I'd expand to e.g. SCTP and raw IP, and finally go from raw IP to raw 'packet'.

    Also: use Wireshark. Any other packet capture tool will do the job (tcpdump, tshark, or Microsoft Network Monitor), but Wireshark is the most commonly used. It is much easier to understand protocols when you can actually see the packets being sent and received, e.g. to see the correspondence between the 'protocol' API parameter and the respective Ethernet header field, or between a close() call and a TCP FIN packet being sent.

    First, I need to specify the domain, which must correspond to an address family. In my case, that would be AF_PACKET (which isn't really an address family, right?), and from what I understand it provides access directly to Ethernet frames.

    Sure, it's not a "definite" address family, just a pseudo-family that indicates "all kinds of link layer" networking. (Most other AF_ constants correspond 1:1 to a protocol, and yes, it would probably make more sense to have individual AF_ETHER and AF_TOKENRING and such, but someone decided to make raw packet sockets somewhat an exception and it's what we have now.)

    Of course it still involves addresses; the actual address format simply depends on what kind of link you're dealing with. If you bind the socket to an Ethernet interface, you will be using Ethernet addresses (the 48-bit "MAC" addresses), and so on.

    Second, I need to set the type of protocol, in my case SOCK_RAW, which, as I understand it, gives access to all raw frames without being tied to a specific protocol like TCP/UDP. But doesn’t this parameter basically do the same thing as the first one?

    Not necessarily. AF_PACKET only says "work at the link layer" (Ethernet, in your case), but doesn't say how to work there.

    For example, you also have the option of using AF_PACKET with SOCK_DGRAM, which makes the socket behave a lot like any other datagram socket: the OS will process the Ethernet header for you, and you will be specifying addresses through 'struct sockaddr_ll' instead of crafting the raw headers yourself – just like you would with IP/UDP for example.

    And there actually was a connection-oriented transport protocol for Ethernet, called "LLC2" and used largely for IBM mainframes (and I think also for X.25). Although I don't know of any real implementations, conceptually it would certainly have fit into a SOCK_STREAM slot at the same AF_PACKET layer.

    You will not encounter LLC2 today, but the socket API dates back to an era where there was a huge variety of network protocols as well as physical network types.

    And side note: When you say "without being tied to TCP", you might be thinking of AF_INET-level SOCK_RAW, which is different from AF_PACKET-level. Both are "raw" but at different layers – with AF_PACKET it would be more correct to say "without being tied to IPv4".

    Finally, there’s the third argument: the protocol to use. Why does this one, unlike the others, need to be converted to network byte order

    The value here is compared directly against the value in the packet without conversion, and Ethernet uses the 'network' byte order, so you need to pre-convert it yourself. (Just like you need to do with port numbers in TCP/UDP sockets, if I remember correctly.)

    I don't know if there is a good reason that the BSD socket API was designed in this way, but that's how it was designed.

    the protocol to use. […] (and what is this parameter for) ?