cloudflare-workers cloudflare-r2

How do I retrieve a PDF from R2 using a Cloudflare Worker?


I am porting a React SPA from a traditional nginx host to a Cloudflare Worker, using the recently announced Vite Cloudflare plugin. This SPA needs to retrieve images and PDFs from R2. Image retrieval works well with the following code in the client:

  const getR2Data = (r2Url) => {
    const url = "/api?r2Image=" + r2Url;
    fetch(url)
      .then((res) => res.body)
      .then((rs) => {
        const reader = rs.getReader();
        return new ReadableStream({
          async start(controller) {
            while (true) {
              const { done, value } = await reader.read();
              if (done) {
                break;
              }
              controller.enqueue(value);
            }
            controller.close();
            reader.releaseLock();
          }
        })
      })
      .then(rs => new Response(rs))
      .then(response => response.blob())
      .then(blob => URL.createObjectURL(blob))
      .then(binaryData => setBinaryData(binaryData))
      .catch((err) => {
        console.log("R2 fetch error:")
        console.log(err);
      })
  }

and a very slightly modified version of the default fetch() code in /worker/index.js. setBinaryData() simply sets the state variable used in the <img> tag. However, if I use this same function to retrieve a PDF and then try to display it in an <object> tag (which works in the nginx version):

      <object
        data={pdfBinary}
        type="application/pdf"
        width="100%"
        height="100%"
      >
        <p>text</p>
      </object>

it doesn't work: the raw PDF source is shown instead of the rendered document, and it takes a long time to appear, even for a small PDF.

[Screenshot: the PDF source displayed as text]

What do I need to change to get the PDF to display correctly in the <object> tag? Am I doing something wrong? Is the ReadableStream method the only way to retrieve binary data from an R2 bucket? Again, the image retrieval works fine.

Edit: I tried <embed> and <iframe> and both display the PDF source instead of the PDF itself, so I think this is somehow due to how I am retrieving the files from R2.

Edit: So I converted it to a simple file download with this code and the download works fine:

  .then( blob => {
    var fileURL = URL.createObjectURL(blob);
    var fileLink = document.createElement('a');
    fileLink.href = fileURL;
    fileLink.download = `whatever.pdf`;
    fileLink.click();
  })

Really confused as to why this would be different w/r/t the nginx system reading the file from (say) /pdfs/data.pdf...


Solution

  • A coworker more skilled in such matters spotted the error in my processing code above. By manually re-wrapping the body in a ReadableStream and a new Response, I was dropping the MIME type: the new Response has no Content-Type header, so the blob it produces has an empty type and the <object> tag cannot tell that it is a PDF. The <img> tag is apparently not sensitive to this, presumably because browsers detect image formats from the file contents. The working function is provided below.

      const getR2Data = (r2Url) => {
        // use /api simply so the arg can be read correctly by searchParams
        const url = "/api?r2Image=" + r2Url;
    
        fetch(url)
          .then((res) => res.blob())
          .then((blob) => URL.createObjectURL(blob))
          // Update image
          .then((binaryData) => setBinaryData(binaryData))
          .catch((err) => {
            console.log("R2 fetch error:");
            console.log(err);
          });
      };
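To make the MIME point concrete, here is a minimal sketch that compares the two approaches without touching R2 at all. It assumes a runtime with the standard Fetch API globals (a modern browser or Node 18+). The helper names and the placeholder PDF bytes are made up for illustration and are not part of the original code.

```javascript
// Hypothetical stand-in for the Worker's reply: a few placeholder bytes
// served with the Content-Type that a PDF response would carry.
function makePdfResponse() {
  return new Response(new TextEncoder().encode("%PDF-1.4"), {
    headers: { "Content-Type": "application/pdf" },
  });
}

// Original approach: pass the body stream to a brand-new Response.
// The new Response has no Content-Type header, so the blob's type is empty.
async function blobTypeAfterRewrap() {
  const res = makePdfResponse();
  const blob = await new Response(res.body).blob();
  return blob.type;
}

// Fixed approach: call blob() on the fetched Response itself,
// which takes the blob's type from the Content-Type header.
async function blobTypeDirect() {
  const blob = await makePdfResponse().blob();
  return blob.type;
}
```

Under these assumptions, blobTypeAfterRewrap() should resolve to an empty string and blobTypeDirect() to "application/pdf". If the stream really had to be processed manually, another option would be to build the blob explicitly with new Blob([bytes], { type: "application/pdf" }).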
    

    In my original pre-ReadableStream code I forgot to run the blob through URL.createObjectURL() which is what sent me down that rabbit hole in the first place. Hopefully this helps someone!
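Note that res.blob() can only pick up the right type if the Worker sends a Content-Type header in the first place. My Worker code is not shown above, so the following is only a rough sketch of what such a handler could look like, not the actual /worker/index.js. The binding name MY_BUCKET, the function name handleR2Request, and the ".pdf" fallback are assumptions; the r2Image query parameter matches the client code.

```javascript
// Sketch of a Worker fetch handler. MY_BUCKET is an assumed R2 binding name.
async function handleR2Request(request, env) {
  // Read the object key from the same query parameter the client sends.
  const key = new URL(request.url).searchParams.get("r2Image");
  if (!key) {
    return new Response("Missing r2Image parameter", { status: 400 });
  }

  const object = await env.MY_BUCKET.get(key);
  if (object === null) {
    return new Response("Object not found", { status: 404 });
  }

  // Copy the metadata stored with the object (including Content-Type,
  // if it was set at upload time) onto the response headers.
  const headers = new Headers();
  object.writeHttpMetadata(headers);
  headers.set("etag", object.httpEtag);

  // Assumed fallback: if no Content-Type was stored, infer it for PDFs.
  if (!headers.has("content-type") && key.toLowerCase().endsWith(".pdf")) {
    headers.set("content-type", "application/pdf");
  }

  return new Response(object.body, { headers });
}

// In a Worker module this would be wired up as:
//   export default { fetch: handleR2Request };
```

With something like this in place, the client can stay as simple as the working function above, since the blob's type comes straight from the response header.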