regex url hostname tld sld

How to compute a hostname's root name


I am looking for a way to extract what I call the "hostname root" from a given hostname, i.e.:

f('stackoverflow.com') -> 'stackoverflow.com'
f('www.stackoverflow.com') -> 'stackoverflow.com'
f('www.stackoverflow.co.uk') -> 'stackoverflow.co.uk'

My first approach was (of course) a RegExp, but second-level domains (SLDs) are an issue because there are a considerable number of them.

Maybe an SLD database would be a good approach.

EDIT

I am working with node.js, and for now I am using the tldjs module.


Solution

  • You need the entire SLD/TLD database to do this. There is no other general-purpose way, especially because some edge cases involve third- or fourth-level domains.
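
To illustrate the database-lookup idea, here is a minimal sketch in Node.js. It assumes a tiny hard-coded suffix set (`SAMPLE_SUFFIXES`, a hypothetical stand-in) instead of a real database such as the Public Suffix List, and the function name `hostnameRoot` is made up for this example. It also ignores the wildcard and exception rules that real suffix lists contain, so treat it as an outline of the technique rather than a complete implementation.

```javascript
// Tiny stand-in for a real suffix database (hypothetical sample data).
// A real list, such as the Public Suffix List, has thousands of entries.
const SAMPLE_SUFFIXES = new Set(['com', 'uk', 'co.uk', 'org.uk', 'jp', 'co.jp']);

// Return the longest known suffix plus the one label to its left,
// or null when no root can be determined from the given suffix set.
function hostnameRoot(hostname, suffixes = SAMPLE_SUFFIXES) {
  // Normalize case and drop a trailing dot (fully qualified form).
  const labels = hostname.toLowerCase().replace(/\.$/, '').split('.');

  // Scan from the longest candidate suffix to the shortest,
  // so that 'co.uk' is matched before 'uk'.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (suffixes.has(candidate)) {
      // i === 0 means the hostname is itself a suffix, so it has no root.
      return i === 0 ? null : labels.slice(i - 1).join('.');
    }
  }
  return null; // no known suffix matched
}

console.log(hostnameRoot('stackoverflow.com'));
console.log(hostnameRoot('www.stackoverflow.com'));
console.log(hostnameRoot('www.stackoverflow.co.uk'));
```

Matching the longest suffix first is what lets the same code handle third- or fourth-level cases: the depth comes from the data, not from a fixed pattern, which is why a RegExp alone cannot do the job.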