javascriptpdfblobfilestreamresult

Wrong encoding on JavaScript Blob when fetching file from server


Using a FileStreamResult from C# in a SPA website (.NET Core 2, SPA React template), I request a file from my endpoint, which triggers this response in C#:

var file = await _docService.GetFileAsync(token.UserName, instCode.Trim()
.ToUpper(), fileSeqNo);
string contentType = MimeUtility.GetMimeMapping(file.FileName);
var result = new FileStreamResult(file.File, contentType);
var contentDisposition = new ContentDispositionHeaderValue("attachment");
Response.Headers[HeaderNames.ContentDisposition] = 
contentDisposition.ToString();
return result;

The returned response is handled using msSaveBlob (spesificly for MS, but this is a problem even though I use createObjectURL and different browser (Yes, I have tried multiple solutions to this, but none of them seems to work). This is the code I use to send the request, and receive the PDF FileStreamResult from the server.

if (window.navigator.msSaveBlob) {
  axios.get(url).then(response => {
      window.navigator.msSaveOrOpenBlob(
          new Blob([response.data], {type: "application/pdf"}),
          filename);
  });

The problem is that the returned PDF file that I get has a wrong encoding on it somehow. So the PDF will not open.

I have tried adding encoding to the end of type: {type: "application/pdf; encoding=UTF-8"} which was suggested in different posts, however, it makes no difference.

Comparing a PDF file that I have fetched in a different way, I can clearly see that the encoding is wrong. Most of the special characters are not correct. Indicated by the response header, the PDF file should be in UTF-8, but I have no idea how to actually find out and check.


Solution

  • Without knowing axios it seems though from its readme page that it uses JSON as default responseType. This may potentially alter the content as it is now treated as text (axios will probably bail out when it cannot convert to an actual JSON object and keep the string/text source for response data).

    A PDF should be loaded as binary data even though it can be both, either 8-bit binary content or 7-bit ASCII - both should in any case be treated as a byte stream, from Adobe PDF reference sec. 2.2.1:

    PDF files are represented as sequences of 8-bit binary bytes. A PDF file is designed to be portable across all platforms and operating systems. The binary rep resentation is intended to be generated, transported, and consumed directly, without translation between native character sets, end-of-line representations, or other conventions used on various platforms. [...].

    Any PDF file can also be represented in a form that uses only 7-bit ASCII [...] character codes. This is useful for the purpose of exposition, as in this book. However, this representation is not recommended for actual use, since it is less efficient than the normal binary representation. Regardless of which representation is used, PDF files must be transported and stored as binary files, not as text files. [...]

    So to solve the conversion that happens I would suggest trying specifying the configuration entry responseType when doing the request:

    axios.get(url, {responseType: "arraybuffer"}) ...
    

    or in this form:

    axios({
      method: 'get',
      url: url,
      responseType:'arraybuffer'
    })
    .then( ... )
    

    You can also go directly to response-type blob if you are sure the mime-type is preserved in the process.