python macos python-webbrowser

webbrowser.open(site) doesn't handle Korean characters


I'm quite an infrequent coder, so I hope my question isn't too obvious.

I have a very simple script that opens several websites for a given word (each URL is built from a string). It works on Windows but not on my new macOS computer. The tricky part is that I'm using the Korean alphabet (I'm learning the language and look words up on these websites to create flashcards), and the Korean characters don't end up correctly in the website's URL when it is opened by this script.

Example:

If I run python3 flashcard.py 가다 in my terminal, I would expect it to open (among others): https://en.dict.naver.com/#/search?query=가다

Unfortunately, it opens: https://en.dict.naver.com/#/search?query=???

This means the Korean characters are not recognised and are replaced with question marks. I tested different parts of the code with print statements, and everything up to the for loop works fine, so the culprit seems to be webbrowser.open(). I tried encoding the strings, but I usually got errors, so apparently I'm not doing it right. I have Korean installed as a language in both the system and the browser.

Has anyone experienced a similar issue and resolved it?

import sys
import webbrowser
import pyperclip

# Get search word from command line
search_word = sys.argv[1]


# Sites to search for search word
sites = [
        f'https://en.dict.naver.com/#/search?query={search_word}',
        f'https://search.naver.com/search.naver?where=image&sm=tab_jum&query={search_word}',
        # f'https://ko.dict.naver.com/#/search?query={search_word}',
        f'https://forvo.com/word/{search_word}/#ko',
        # f'https://translate.google.com/#view=home&op=translate&sl=ko&tl=en&text={search_word}',
        # f'https://papago.naver.com/?sk=ko&tk=en&st={search_word}',   
        # f'https://ko.wiktionary.org/wiki/{search_word}#%ED%95%9C%EA%B5%AD%EC%96%B4'
        ]

# Search for search word in each site
for site in sites:
    webbrowser.open(site)

# Copy search word to clipboard
pyperclip.copy(search_word)

Solution

  • I think the problem is that the word isn't being URL-encoded (i.e. '가다' needs to be converted to '%EA%B0%80%EB%8B%A4' for use in a URL). Browsers handle unencoded non-ASCII characters differently, so you are probably seeing a difference between the browser you use on Windows and the one on macOS. To encode it, you can use:

    import urllib.parse
    url_search_word = urllib.parse.quote(search_word)
    

    ...and then use {url_search_word} instead of {search_word} in the sites strings.
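
    For completeness, here is a minimal sketch of how the URL list from the question might be built with the encoded word. The helper name build_urls is hypothetical (it is not part of the original script), and passing safe="" is my own assumption: it also encodes "/", so the word cannot be mistaken for a path separator in the forvo.com URL.

```python
import urllib.parse


def build_urls(search_word):
    """Return the lookup URLs with the search word percent-encoded as UTF-8."""
    # safe="" is an assumption: it makes quote() encode "/" as well, so the
    # word is always treated as a single path segment or query value.
    encoded = urllib.parse.quote(search_word, safe="")
    return [
        f"https://en.dict.naver.com/#/search?query={encoded}",
        f"https://search.naver.com/search.naver?where=image&sm=tab_jum&query={encoded}",
        f"https://forvo.com/word/{encoded}/#ko",
    ]
```

    In the original script, this would replace the hand-written list, e.g. sites = build_urls(sys.argv[1]), with the for loop and the pyperclip call left unchanged. Note that pyperclip.copy() should still receive the unencoded search_word, so the clipboard holds the readable Korean text.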