javascript algorithm url string-matching brute-force

Find the nearest matching domain for a URL in JavaScript


Example: if the domains are:

1. google.com

2. go.google.com

3. pro.go.google.com

And the input urls are:

hello.go.google.com //will return domain 2 (go.google.com as it is the nearest)

abc.hello.go.google.com // will return domain 2 (go.google.com as it is the nearest)

qwert.pro.go.google.com // will return domain 3 (pro.go.google.com as it is the nearest)

pro1.go.google.com // will return domain 2 (go.google.com as it is the nearest)

xyz.google.com // will return domain 1 (google.com as it is the nearest)

I have written a brute-force algorithm that works for the inputs above, but it fails in one case:

Input URL: xgo.google.com // returns domain 2 (go.google.com) but should return domain 1 (google.com)

Below is the source code I tried:

const domainArr = ["google.com", "go.google.com", "pro.go.google.com"];
const pureHostName = "xgo.google.com";

let maxLength = 0;
let selectedDomain = "";
for (let i = 0; i < domainArr.length; i++) {
    const domain = domainArr[i];
    if (pureHostName.includes(domain)) {
        if (i === 0) {
            maxLength = domain.length;
            selectedDomain = domain;
        }
        if (domain.length > maxLength) {
            maxLength = domain.length;
            selectedDomain = domain;
        }
    }
}

console.log("selectedDomain--->", selectedDomain); //returning go.google.com instead of google.com

Solution

  • First, sort the domains by their number of dot-separated labels in descending order, so that the most specific domains (those with the most subdomain levels) come first.

    Then check the hostname against each domain in that order. If the hostname equals the current domain exactly, or is a subdomain of it (tested by prepending a period to the domain, so that xgo.google.com is not mistaken for a subdomain of go.google.com), we have found the match. Otherwise, continue with the next domain.

    const domainArr = ["google.com", "go.google.com", "pro.go.google.com"];
    const testCases = [
      "hello.go.google.com",
      "abc.hello.go.google.com",
      "qwert.pro.go.google.com",
      "pro1.go.google.com",
      "xyz.google.com",
      "xgo.google.com"
    ];
    
    domainArr.sort((a, b) => b.split('.').length - a.split('.').length);
    
    for (let i = 0; i < testCases.length; ++i) {
      let match = 'No match found!';
      for (let j = 0; j < domainArr.length; ++j) {
        if (testCases[i] === domainArr[j] || testCases[i].endsWith("." + domainArr[j])) {
          match = domainArr[j];
          break;
        }
      }
      console.log(testCases[i] + " => " + match);
    }
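
The same idea can be packaged as a reusable function. The following is only a minimal sketch: the name `findNearestDomain` and the choice to return `null` when nothing matches are assumptions made for illustration, not part of the original answer. It uses `endsWith` rather than `includes` so that a domain appearing in the middle of a hostname (for example `go.google.com.example.org`) is not treated as a match.

```javascript
// Hypothetical helper (name and null return value are assumptions):
// returns the most specific configured domain that `hostname` equals
// or is a subdomain of, or null when no domain matches.
function findNearestDomain(hostname, domains) {
  // Copy before sorting so the caller's array is not reordered.
  // Domains with more dot-separated labels are tried first.
  const bySpecificity = [...domains].sort(
    (a, b) => b.split(".").length - a.split(".").length
  );

  for (const domain of bySpecificity) {
    // Exact match, or a suffix match that starts on a label boundary.
    if (hostname === domain || hostname.endsWith("." + domain)) {
      return domain;
    }
  }
  return null;
}

const domains = ["google.com", "go.google.com", "pro.go.google.com"];
console.log(findNearestDomain("xgo.google.com", domains));          // google.com
console.log(findNearestDomain("qwert.pro.go.google.com", domains)); // pro.go.google.com
console.log(findNearestDomain("example.org", domains));             // null
```

Sorting a copy on every call keeps the sketch short; if the domain list is fixed, sorting it once up front and reusing the result avoids the repeated work.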