Tags: python, python-3.x, pdf, ascii, chr

Python 3 - incorrect decoding of ASCII codes (Python 2.7 works well)


Via an HTTP API I get an array of integers, [37,80,68,70,45] and so on, which represent ASCII codes. I need to save it as a PDF file. In PHP the code is:

$data = file_get_contents($url);
$pdf_data = implode('', array_map('chr', json_decode($data)));
file_put_contents($file_path.".pdf", $pdf_data);

and it works fine.

But in Python 3:

http_request_data = urllib.request.urlopen(url).read()
data = json.loads(http_request_data)
pdf_data = ''.join(map(chr, data))
with open(file_path, 'w') as fout:
    fout.write(pdf_data)

The resulting PDF file is damaged and can't be read.

What could the problem be?

EDIT:

I tried it with Python 2.7: the file opens and is fine. The problem is still not solved, though, because I need this to work in Python 3.6.

EDIT: The solution at https://stackoverflow.com/a/25839524/7451009 works!


Solution

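  • When code works in Python 2 but not in Python 3, the cause is probably a bytes vs. Unicode problem.

    Strings are byte strings in Python 2 and Unicode text in Python 3. Writing a Python 3 str to a file opened in text mode ('w') encodes it with the default encoding, so a code point above 127 may be written as more than one byte (or may fail to encode at all), and on Windows '\n' is translated to '\r\n'. Either effect corrupts binary data such as a PDF. If your code works in Python 2, its Python 3 equivalent should be: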

    import json
    import urllib.request

    http_request_data = urllib.request.urlopen(url).read()
    data = json.loads(http_request_data)   # list of integers, each in 0..255
    pdf_data = bytes(data)                 # construct a bytes string
    with open(file_path, 'wb') as fout:    # file must be opened in binary mode
        fout.write(pdf_data)
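
    To see the difference without calling the API, here is a small sketch. The list below is a made-up sample: the first five values are the "%PDF-" header from the question, and 200 and 10 are hypothetical stand-ins for a binary byte and a newline byte. The file name sample.bin is also just an illustration.

```python
import os
import tempfile

# Made-up sample: "%PDF-" from the question plus two hypothetical bytes.
codes = [37, 80, 68, 70, 45, 200, 10]

as_text = ''.join(map(chr, codes))   # str of 7 Unicode code points
as_bytes = bytes(codes)              # exactly 7 raw bytes

# Encoding the str as UTF-8 turns code point 200 into two bytes,
# so the encoded data no longer matches the original byte values.
print(len(as_text.encode('utf-8')))  # 8
print(len(as_bytes))                 # 7

# Writing the bytes object in binary mode round-trips the data unchanged.
path = os.path.join(tempfile.mkdtemp(), 'sample.bin')
with open(path, 'wb') as fout:
    fout.write(as_bytes)
with open(path, 'rb') as fin:
    round_trip = fin.read()
print(round_trip == as_bytes)        # True
```

    The extra byte in the UTF-8 case is the kind of change that can leave a PDF unreadable when it is written in text mode.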