Input 1 - "English"
Input 2 - "French"
Expected Output - 46.9 (Based on this website - http://www.elinguistics.net/Compare_Languages.aspx)
Is there any Python library that supports this kind of language similarity request? There are plenty of options for checking the similarity of two sentences, but what about two whole languages?
The website in the question is the best one I could find, but I'm unable to use it through an HTTP request. Selenium is not an option here either.
import requests
from bs4 import BeautifulSoup

lang1 = "Hindi"
lang2 = "Sanskrit"
link = f"http://elinguistics.net/Compare_Languages.aspx?Language1={lang1}&Language2={lang2}&Order=Calc"
response = requests.get(link)
if response.status_code == 200:
    soup = BeautifulSoup(response.text, 'html.parser')
    #print(soup)
    # The result is read from the first table with class "tblTextActive"
    table = soup.find_all('table', class_="tblTextActive")
    #print(len(table))
    tr_elements = table[0].find_all('tr')
    #print(len(tr_elements))
    for tr_element in tr_elements:
        # The row containing "The distance" holds the score after the colon
        if "The distance" in tr_element.text:
            print(tr_element.text)
            # Swap a decimal comma for a dot before converting to float
            genetic_similarity = float(tr_element.text.split(":")[1].strip().replace(",", "."))
            print(genetic_similarity)
else:
    print(f"Failed to retrieve the page. Status code: {response.status_code}")
This prints the genetic similarity score between the two languages as a float. The code is by no means optimized and should have try/except blocks, but it gets the job done as long as your input language names are present in the available language list at http://www.elinguistics.net/Compare_Languages.aspx
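As for the missing error handling, here is a minimal sketch of how the number-parsing step could be made safer. It only covers turning a row's text into a float. The helper name parse_distance and the sample row text are my own assumptions for illustration, and the real wording on the page may differ.

```python
import re

# Matches a number that uses either "," or "." as the decimal separator.
_NUMBER = re.compile(r"\d+(?:[.,]\d+)?")


def parse_distance(row_text: str) -> float:
    """Return the first number found after the first ':' in row_text.

    Hypothetical helper, not part of requests or BeautifulSoup.
    Raises ValueError if there is no colon or no number after it.
    """
    _, sep, tail = row_text.partition(":")
    if not sep:
        raise ValueError(f"no ':' in row text: {row_text!r}")
    match = _NUMBER.search(tail)
    if match is None:
        raise ValueError(f"no number after ':' in row text: {row_text!r}")
    # Normalise a decimal comma to a dot before converting.
    return float(match.group().replace(",", "."))


# Made-up row text for illustration only.
sample = "The distance between Hindi and Sanskrit is: 12,3"
try:
    print(parse_distance(sample))
except ValueError as err:
    print(f"Could not parse the result row: {err}")
```

In the loop above you would call parse_distance(tr_element.text) in place of the split/strip/replace chain, so that a change in the page layout gives a readable error instead of an IndexError.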