It's possible to print an emoji from its hex code with the u'\uXXXX' escape in Python, e.g.
>>> print(u'\u231B')
⌛
However, if I have a list of hex codes like 231B, just "adding" the strings won't work:
>>> print(u'\u' + ' 231B')
File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape
chr() fails too:
>>> chr('231B')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: an integer is required (got type str)
The first part of my question: given a hex code, e.g. 231A, how do I get the emoji as a str?
My goal is to get the list of emojis from https://unicode.org/Public/emoji/13.0/emoji-sequences.txt by reading the hex codes in the first column.
Some entries are ranges, such as 231A..231B. The second part of my question: given a hex code range, how do I iterate through it to get each emoji as a str? For 2648..2653 it is possible to write range(2648, 2653+1), but if the hex code contains a letter, e.g. 1F232..1F236, using range() is not possible.
Thanks @amadan for the solutions!
To write the list of emojis from https://unicode.org/Public/emoji/13.0/emoji-sequences.txt into a file:
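import requests

response = requests.get('https://unicode.org/Public/emoji/13.0/emoji-sequences.txt')
# Write as UTF-8 explicitly so the output does not depend on the platform default encoding.
with open('emoji.txt', 'w', encoding='utf8') as fout:
    for line in response.content.decode('utf8').split('\n'):
        # Skip blank lines and comment lines.
        if line.strip() and not line.startswith('#'):
            # The first ';'-separated column holds the code point(s).
            hexa = line.split(';')[0]
            hexa = hexa.split('..')
            if len(hexa) == 1:
                # A single code point, or a space-separated sequence forming one emoji.
                ch = ''.join([chr(int(h, 16)) for h in hexa[0].strip().split(' ')])
                print(ch, end='\n', file=fout)
            else:
                # An inclusive range such as 231A..231B: one emoji per code point.
                start, end = hexa
                for ch in range(int(start, 16), int(end, 16)+1):
                    print(chr(ch), end='\n', file=fout)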
Convert the hex string to a number, then use chr:
chr(int('231B', 16))
# => '⌛'
or directly use a hex literal:
chr(0x231B)
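The same conversion applied to a list of hex strings might look like the sketch below. The helper name hex_to_char and the sample list are made up for illustration; only int() and chr() come from the answer above.

```python
def hex_to_char(code):
    """Return the one-character str for a hex code point such as '231B'."""
    # int(code, 16) parses the hex digits; chr() maps that integer to a character.
    return chr(int(code, 16))

# Hypothetical sample data, not read from the emoji file.
codes = ['231A', '231B', '1F232']
emojis = [hex_to_char(c) for c in codes]
print(emojis)
```

The u'\u' + '231B' attempt fails because \u escapes are resolved when the string literal is parsed, before any concatenation happens, so the conversion has to go through int() and chr() at run time.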
To use a range, again, you need an int, either converted from a string or using a hex literal:
''.join(chr(c) for c in range(0x2648, 0x2654))
# => '♈♉♊♋♌♍♎♏♐♑♒♓'
or
''.join(chr(c) for c in range(int('2648', 16), int('2654', 16)))
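(NOTE: you'd get something very different from range(2648, 2654), because those literals are decimal numbers, not hex! Also note that range() excludes its end point, which is why the examples stop at 0x2654 to include 0x2653.)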
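Putting both parts together, a minimal sketch for one first-column field could look like this. The function name expand_field is hypothetical, and it assumes the field is either an inclusive START..END range or one or more space-separated hex code points that together form a single emoji.

```python
def expand_field(field):
    """Turn one first-column field into a list of emoji strings.

    Assumed formats (hypothetical helper, not part of any library):
      'START..END'     -> one string per code point, END included
      'XXXX YYYY ...'  -> a single string built from the whole sequence
    """
    field = field.strip()
    if '..' in field:
        start, end = field.split('..')
        # range() excludes its end point, so add 1 to include END.
        return [chr(c) for c in range(int(start, 16), int(end, 16) + 1)]
    # split() with no argument tolerates repeated or trailing spaces.
    return [''.join(chr(int(h, 16)) for h in field.split())]

print(expand_field('2648..2653'))
print(expand_field('1F232..1F236'))
print(expand_field('0023 FE0F 20E3'))
```

Returning a list in both cases keeps the caller simple: it can loop over the result without checking which kind of field it was.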