
Reverse DNS lookup in Python returns wrong host


I am trying to use Python's socket module to get the hostname for an IP address. But for some reason, when I run this:

import socket
result = socket.gethostbyaddr('216.58.207.46')  # google ip address
print(result[0])

it prints this: fra16s24-in-f14.1e100.net.

How do I get it to print out google.com?

Update: I am trying to parse a range of IP addresses and convert them to hostnames (where they can be converted). Obviously, when I convert 3.126.***.192, I want to see an actual hostname, and not ec2-3-126-***-192.eu-central-1.compute.amazonaws.com.

How do I do that?


Solution

  • The result you are getting from Python's standard socket.gethostbyaddr() is correct, even though it is not what you expected. Let's start by looking at what is happening and then adjust expectations accordingly.

    DNS utilities such as dig are your friend when working through problems like this, so I'll show output from dig as I work toward an answer for you.

    Reverse DNS lookup is not necessarily a one-to-one relationship, as others have indicated in the comments. It relies on a convention: for an IPv4 address, you issue a DNS query for a PTR record on the name formed by reversing the address's octets and appending .in-addr.arpa. In your case, this is 46.207.58.216.in-addr.arpa:

    ▶ dig 46.207.58.216.in-addr.arpa in ptr 
    
    ; <<>> DiG 9.10.6 <<>> 46.207.58.216.in-addr.arpa in ptr
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23082
    ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    
    ;; QUESTION SECTION:
    ;46.207.58.216.in-addr.arpa.    IN  PTR
    
    ;; ANSWER SECTION:
    46.207.58.216.in-addr.arpa. 77579 IN    PTR fra16s24-in-f14.1e100.net.
    
    ;; Query time: 85 msec
    ;; SERVER: 192.168.0.1#53(192.168.0.1)
    ;; WHEN: Thu May 28 13:17:17 PDT 2020
    ;; MSG SIZE  rcvd: 83
    

    The DNS response here is a PTR record pointing to fra16s24-in-f14.1e100.net, which is exactly what you see when you use Python's socket.gethostbyaddr().
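
    The reversed name used in the dig query can also be built in Python. The following is a minimal sketch using the standard ipaddress module; the helper name reverse_name is made up for illustration and is not part of any library.

```python
import ipaddress


def reverse_name(ip):
    """Build the domain name that a PTR query for `ip` asks about.

    `reverse_name` is a hypothetical helper; the real work is done by the
    standard library's `reverse_pointer` attribute.
    """
    return ipaddress.ip_address(ip).reverse_pointer


print(reverse_name("216.58.207.46"))  # 46.207.58.216.in-addr.arpa
```

    This only constructs the query name. It performs no DNS lookup, so it works without network access.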

    While you may conceptually associate this host with google.com, nothing requires its PTR record to point to a name under the google.com domain as far as DNS is concerned. (1e100.net is a domain Google uses for its own infrastructure.)

    Edit (based on your update):

    Obviously, when I convert 3.126.***.192, I want to see an actual hostname, and not ec2-3-126-***-192.eu-central-1.compute.amazonaws.com.

    In the example you give here, note that ec2-3-126-***-192.eu-central-1.compute.amazonaws.com is an actual hostname: it is the name published in the PTR record for that address. Take a look at the article I linked above about reverse DNS lookup, and also at a primer on how DNS works. Keep in mind that many address records may exist for a single name (this is often the case for load-balanced resources).

    Here is an example of this for stackoverflow.com:

    ▶ dig stackoverflow.com in a
    
    ; <<>> DiG 9.10.6 <<>> stackoverflow.com in a
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51644
    ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0
    
    ;; QUESTION SECTION:
    ;stackoverflow.com.     IN  A
    
    ;; ANSWER SECTION:
    stackoverflow.com.  106 IN  A   151.101.193.69
    stackoverflow.com.  106 IN  A   151.101.129.69
    stackoverflow.com.  106 IN  A   151.101.1.69
    stackoverflow.com.  106 IN  A   151.101.65.69
    
    ;; Query time: 81 msec
    ;; SERVER: 192.168.0.1#53(192.168.0.1)
    ;; WHEN: Thu May 28 13:25:55 PDT 2020
    ;; MSG SIZE  rcvd: 99
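
    Roughly the same forward lookup can be done from Python. This is a sketch, assuming the system resolver is used through socket.gethostbyname_ex; the function name ipv4_addresses and its resolver parameter are hypothetical and exist only to keep the example testable without network access.

```python
import socket


def ipv4_addresses(name, resolver=socket.gethostbyname_ex):
    """Return every IPv4 address the resolver reports for `name`.

    `resolver` defaults to socket.gethostbyname_ex, which returns a
    (hostname, aliases, addresses) triple. It is a parameter here only so
    that a stub can be passed in for testing.
    """
    _canonical, _aliases, addresses = resolver(name)
    return list(addresses)


# Example call (needs network access; results vary by time and resolver):
# ipv4_addresses("stackoverflow.com")
```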
    

    Should all of these IPs have PTR records that point back to stackoverflow.com? No, not necessarily... Let's look at one for an example:

    ▶ dig 69.193.101.151.in-addr.arpa in ptr
    
    ; <<>> DiG 9.10.6 <<>> 69.193.101.151.in-addr.arpa in ptr
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 418
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
    
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 512
    ;; QUESTION SECTION:
    ;69.193.101.151.in-addr.arpa.   IN  PTR
    
    ;; AUTHORITY SECTION:
    151.in-addr.arpa.   3125    IN  SOA pri.authdns.ripe.net. dns.ripe.net. 1586415572 3600 600 864000 3600
    
    ;; Query time: 79 msec
    ;; SERVER: 192.168.0.1#53(192.168.0.1)
    ;; WHEN: Thu May 28 13:28:03 PDT 2020
    ;; MSG SIZE  rcvd: 116
    

    In this case, the reverse lookup returns NXDOMAIN: no PTR record exists for this IP address. So, how does Python's gethostbyaddr handle this case?

    >>> socket.gethostbyaddr('151.101.193.69')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    socket.herror: [Errno 1] Unknown host
    

    It raises socket.herror. If the DNS response contains no PTR record, Python has no further useful information to offer.
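
    For the range-scanning case in your update, one possible approach is to catch that exception and record None for addresses without a PTR record. The sketch below is an assumption about how you might structure it, not the only way: the names reverse_lookup and reverse_lookup_range are hypothetical, and the resolver parameter exists only so a stub can replace the real DNS call.

```python
import ipaddress
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for `ip`, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        # No PTR record (or another resolver error): nothing to report.
        return None
    return hostname


def reverse_lookup_range(cidr, resolver=socket.gethostbyaddr):
    """Map each host address in a CIDR block to its PTR hostname or None."""
    network = ipaddress.ip_network(cidr)
    return {str(ip): reverse_lookup(str(ip), resolver) for ip in network.hosts()}


# Example call (needs network access; the block is a documentation range):
# reverse_lookup_range("192.0.2.0/30")
```

    Whatever name comes back is still the PTR name, such as the 1e100.net or amazonaws.com names above. DNS has no record type that maps an IP address back to the website name you might expect.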

    If this answer is not clear enough, write back in a comment and I will amend this answer until everything is clear.