(Windows, Python 3.9.6, yt-dlp 2025.06.09)
I have a YouTube playlist whose titles contain both Korean and Latin characters. I can print the titles of the videos in the playlist with the following command:
yt-dlp --flat-playlist -i --print title PLySOINx0fqvYr6s8aGdqaK9j8_CAWcP5U
I receive the following output:
WARNING: [youtube:tab] YouTube said: INFO - 1 unavailable video is hidden
Vague (feat. Hey)
새벽 한 시
천 개의 태양
Wish
나쁘게
[온스테이지] 11. 캐스커 - 향
캐스커 (Casker) - 고양이와 나 (Cat and Me)
고양이와 나 (Acoustic Version)
...
(dots not part of output)
I am using the standard Windows cmd terminal. When my terminal's code page is set to 437 (the default) or 65001 (UTF-8), the Korean characters are not displayed correctly. With code page 949, both the Korean and Latin characters display properly, as above. With any code page, the characters display properly when the output is copied to Notepad or anywhere else.
However, when I run this:
yt-dlp --flat-playlist -i --print title PLySOINx0fqvYr6s8aGdqaK9j8_CAWcP5U > out.txt 2>&1
I receive the following in out.txt:
WARNING: [youtube:tab] YouTube said: INFO - 1 unavailable video is hidden
Vague (feat. Hey)
Wish
[] 11. -
(Casker) - (Cat and Me)
(Acoustic Version)
...
The Korean characters disappear, and I can verify that they are indeed gone: only the spaces are left.
Using subprocess.run in Python yields the same issue. Running the code below:
import subprocess
p = subprocess.run("yt-dlp --flat-playlist -i --print title PLySOINx0fqvYr6s8aGdqaK9j8_CAWcP5U".split(), capture_output=True, text=True, encoding='cp949')
print((p.stdout, p.stderr))
I receive the following output:
('Vague (feat. Hey)\n \n \nWish\n\n[] 11. - \n (Casker) - (Cat and Me)\n (Acoustic Version)\n - .mp4\nHidden Track\n (feat. Of My Aunt Mary)\n (Song By From Wanted)\nCasker - fragancia\nCasker () "Scent ()" [ ]\n \n (Casker) - \n (Casker) - Fragile Days\n\nSmall One\n (Casker) - Air Trip\nCasker - \nCasker - (Just This Much)\nP\n[Vietsub] - Casker - A thousand suns (Live in Viewzic Session)\n\n (Casker) - Hidden Track\n47\n47 (ver. 2)\nMocha\nPolyester Heart\n\ntender - (Casker)\n (feat. )\n\n\n \n (Casker) - 7 (The Ipanema Girl in July)\n NANJANG ; casker ; \n\n (Piano ver.)\n (Casker) - Midnight Moment\n (Casker) - Skip\nUndo\n[MV] (Casker) - (The Smiler)\n Toothbrush (Acoustic Version)\n1103 (feat. )\n \nNowhere\n \n (Casker) - (One Day) Pt. 2\n(Casker) - \n8 \nCasker - \n (Casker ) - Cactus\n', 'WARNING: [youtube:tab] YouTube said: INFO - 1 unavailable video is hidden\n')
The code page used does not matter.
As far as Python goes, I have tested all combinations of:
import subprocess
enc = "cp949"
import os
os.environ['PYTHONIOENCODING'] = enc
# import sys
# sys.stdout.reconfigure(encoding=enc)
# comm = "echo 철갑혹성"
comm = "yt-dlp --flat-playlist -i --print title PLySOINx0fqvYr6s8aGdqaK9j8_CAWcP5U"
# p = subprocess.Popen(comm, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, shell=True, encoding="utf-8")
# print(p.communicate())
p = subprocess.run(comm.split(), capture_output=True, text=True, shell=False, encoding=enc)
print((p.stdout, p.stderr))
In all cases, "echo 새벽 한 시" gives the correct output when routed to terminal, file, or string, and the yt-dlp command does not.
I cannot tell whether this is a yt-dlp issue, a Python issue, a terminal issue, an OS setting issue, or some combination of them all. I suspect the terminal or the OS, because output to the terminal behaves differently from output sent anywhere else. Additionally, yt-dlp works and names my files perfectly when simply downloading videos.
The redirection operator > is converting the text to the console's code page.
On wineconsole
Z:\home\lmc\tmp>chcp
Active code page: 437
Z:\home\lmc\tmp>echo 철갑혹성
철갑혹성
Z:\home\lmc\tmp>echo 철갑혹성 > k.out
Z:\home\lmc\tmp>type k.out
????
Z:\home\lmc\tmp>chcp 949
Active code page: 949
Z:\home\lmc\tmp>echo 철갑혹성 > k.out
Z:\home\lmc\tmp>type k.out
철갑혹성
Note: on my console the international characters appear as small empty rectangles, although they are displayed correctly by this site. They look something like this:
echo -e '\u2BD1\u2BD1\u2BD1\u2BD1'
⯑⯑⯑⯑
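The conversion to the console's code page can be imitated in plain Python. This is only a rough model of the effect, not of how cmd.exe or yt-dlp implement it: the helper to_code_page is hypothetical, and the idea that unencodable characters are either replaced or silently dropped is an assumption based on the two outputs shown above (???? in k.out, bare spaces in the question's out.txt).

```python
def to_code_page(text, code_page, errors):
    """Round-trip text through a code page, as a lossy pipe or redirect might."""
    return text.encode(code_page, errors=errors).decode(code_page)

title = "캐스커 (Casker) - 고양이와 나 (Cat and Me)"

# Code page 437 has no Hangul: each syllable becomes "?" or vanishes entirely.
replaced = to_code_page(title, "cp437", "replace")
dropped = to_code_page(title, "cp437", "ignore")

# Code page 949 contains Hangul, so the title survives unchanged.
kept = to_code_page(title, "cp949", "strict")

print(replaced)
print(dropped)
print(kept)
```

With errors="ignore" only the Latin characters and the spaces remain, which matches the pattern in the question's out.txt.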
Making subprocess write to a file and setting the encoding on yt-dlp could work (I don't have Python on Wine to test):
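import subprocess
enc = 'cp949'
# Pass the arguments as a list: after str.split(), quotes around the encoding would reach yt-dlp literally.
cmd = ['yt-dlp', '--encoding', enc, '--flat-playlist', '-i', '--print', 'title', 'PLySOINx0fqvYr6s8aGdqaK9j8_CAWcP5U']
# Binary mode: the bytes yt-dlp writes go to the file unchanged.
with open('out.txt', 'wb') as f:
    p = subprocess.run(cmd, stdout=f)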
Output on wineconsole (cmd.exe):
wineconsole 2>/dev/null
Microsoft Windows 10.0.2600
Z:\home\lmc\tmp>chcp 949
Active code page: 949
Z:\home\lmc\tmp>type out.txt
Vague (feat. Hey)
새벽 한 시
천 개의 태양
Wish
...
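The underlying principle is that the writer and the reader of a pipe must agree on the encoding. Here is a minimal sketch of that, assuming a stand-in child process (a one-line Python script, not yt-dlp) that emits cp949 bytes; the names child_code and capture are hypothetical.

```python
import subprocess
import sys

TEXT = "새벽 한 시"
ENC = "cp949"

# Stand-in child: writes TEXT to stdout as raw bytes in ENC.
# {TEXT!a} embeds the string as ASCII escapes so the command line stays ASCII.
child_code = f"import sys; sys.stdout.buffer.write({TEXT!a}.encode({ENC!r}))"

def capture(decode_as):
    """Run the child, capture its raw bytes, and decode them with decode_as."""
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        check=True,
    )
    return proc.stdout.decode(decode_as, errors="replace")

matched = capture("cp949")     # same encoding on both ends
mismatched = capture("utf-8")  # reader disagrees with the writer

print(matched == TEXT)
print(mismatched == TEXT)
```

Capturing bytes and decoding explicitly keeps the decoding step visible; passing encoding= to subprocess.run does the same decoding implicitly.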
Additionally, yt-dlp can be used as a Python module to avoid using subprocess:
import json
import yt_dlp
URL = 'https://www.youtube.com/playlist?list=PLySOINx0fqvYr6s8aGdqaK9j8_CAWcP5U'
enc = 'cp949'
list_out = ''
# See help(yt_dlp.YoutubeDL) for a list of available options and public functions
ydl_opts = {
    'extract_flat': True,
    'playlist_items': '1-5',
    'encoding': enc
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    info = ydl.extract_info(URL, download=False)
    #print(json.dumps(ydl.sanitize_info(info)))
    for v in info['entries']:
        list_out += f"{v['title']}\n"
# Write the titles as bytes in the chosen encoding; the with block closes the file
with open('out.txt', 'wb') as f:
    f.write(list_out.encode(enc))
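Once the titles are ordinary Python strings, the file can also be opened in text mode with an explicit encoding, so that no console code page is involved at all. This is a sketch with a hypothetical helper save_titles and a few sample titles copied from the output above. UTF-8 as the default is my own choice here; pass encoding='cp949' instead to match the code page used earlier.

```python
import os
import tempfile

def save_titles(titles, path, encoding="utf-8"):
    """Write one title per line; Python encodes the text on the way out."""
    with open(path, "w", encoding=encoding, newline="\n") as f:
        for title in titles:
            f.write(title + "\n")

# Sample data standing in for info['entries'] titles.
titles = ["Vague (feat. Hey)", "새벽 한 시", "천 개의 태양", "Wish"]

path = os.path.join(tempfile.mkdtemp(), "out.txt")
save_titles(titles, path)

# Read the file back with the same encoding to confirm nothing was lost.
with open(path, encoding="utf-8") as f:
    read_back = f.read().splitlines()

print(read_back == titles)
```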