regex bash fqdn

Fully qualified domain name validation


Is there a quick and dirty way to validate if the correct FQDN has been entered? Keep in mind there is no DNS server or Internet connection, so validation has to be done via regex/awk/sed.

Any ideas?


Solution

  • It's harder nowadays, with internationalized domain names (IDN) and more than a thousand (!) new TLDs.

    The easy part is that you can still split the components on ".".

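    You need a list of the suffixes under which names can be registered. The Public Suffix List provides one:

    https://publicsuffix.org/list/effective_tld_names.dat

    Since you have no Internet connection, you will have to keep a local copy of that file. You only need to check the entries in its ICANN section. Note that a registrable suffix can have more than one component, such as "co.uk".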

    Then there's IDN and punycode. Domain names can contain Unicode now. For example, "xn--nnx388a" is equivalent to "臺灣". Both of those are valid TLDs, incidentally.

    For punycode conversion code, see "http://golang.org/src/pkg/net/http/cookiejar/punycode.go".

    The syntax of each domain component has new rules, too. See RFC 5890 at https://www.rfc-editor.org/rfc/rfc5890

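    Components can be either ASCII labels or Unicode labels (U-labels). ASCII labels either follow the old syntax (letters, digits, and hyphens, at most 63 characters, no leading or trailing hyphen) or begin with "xn--", in which case they are A-labels: the punycode encoding of a Unicode string.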

    The rules for Unicode labels are very complex. RFC 5890 defines the terms, and its companion documents (RFC 5891 through RFC 5893) give the details. They are designed to prevent such things as mixing characters from left-to-right and right-to-left scripts in one label.

    Sorry there's no easy answer.
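
For the quick-and-dirty case the question asks about, the ASCII-only part of the checks above can be sketched in plain shell. This is a minimal sketch under stated assumptions, not a complete validator: the function names valid_label and valid_fqdn are hypothetical, Unicode input is assumed to have been converted to its "xn--" form already, at least two labels are required, labels with "--" in positions 3 and 4 are rejected unless they start with "xn--", and the optional suffix check only compares the last label against a local copy of the suffix list (it matches plain and "*." entries and ignores exception rules and IDN entries, which the list stores in Unicode).

```shell
#!/bin/sh
# Offline, syntax-only FQDN check for ASCII names.
# Assumption: Unicode names were converted to their "xn--" form beforehand.

# valid_label LABEL: 1-63 letters/digits/hyphens, no leading or trailing hyphen.
valid_label() {
    case $1 in
        ''|*[!A-Za-z0-9-]*) return 1 ;;   # empty, or a character outside letters/digits/hyphen
        -*|*-) return 1 ;;                # leading or trailing hyphen
        [Xx][Nn]--*) ;;                   # punycode-encoded label: accepted as-is
        ??--*) return 1 ;;                # any other "--" in positions 3-4 is treated as reserved
    esac
    [ "${#1}" -le 63 ]
}

# valid_fqdn NAME [SUFFIX_FILE]
# The body runs in a subshell "( ... )" so IFS, LC_ALL and "set --" stay local.
valid_fqdn() (
    LC_ALL=C                 # byte-wise matching, so A-Z means ASCII only
    name=${1%.}              # tolerate a single trailing dot (the DNS root)
    suffix_file=${2:-}       # optional local copy of the suffix list

    [ "${#name}" -le 253 ] || exit 1
    case $name in
        .*|*.|*..*) exit 1 ;;    # an empty label somewhere
        *.*) ;;                  # assumption: require at least two labels
        *) exit 1 ;;             # single label or empty string
    esac

    IFS=.
    set -f                       # no filename globbing while splitting
    set -- $name                 # split the name on "."
    for label in "$@"; do
        valid_label "$label" || exit 1
        tld=$label               # ends up holding the last label
    done

    case $tld in
        *[!0-9]*) ;;             # contains a non-digit: fine
        *) exit 1 ;;             # all-numeric last label looks like an IP address
    esac

    if [ -n "$suffix_file" ]; then
        # Whole-line, case-insensitive, fixed-string match: "tld" or "*.tld".
        grep -qixF -e "$tld" -e "*.$tld" -- "$suffix_file" || exit 1
    fi
    exit 0
)
```

A call such as valid_fqdn "$input" ./effective_tld_names.dat returns 0 when the name passes and 1 otherwise, so it can be used directly in an if statement. Leaving out the second argument skips the suffix check and validates syntax only.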