I need to convert an MS Word (MS Office 2016) file to a PDF that looks like a "scanned image".
To do that, I print the file to a "virtual" PDF printer (Microsoft PDF printer or Acrobat PDF printer; unfortunately I can't name the exact version).
As a result I get a PDF file, but the text in it is selectable.
Images that were in the MS Word file have "selectable" borders in the PDF file.
But I need a PDF without any selectable objects (as if the document had been scanned).
Why do these selectable objects appear in the PDF?
What determines whether selectable objects appear in a "printed" PDF file (maybe I should change specific settings of the PDF printer)?
Windows print drivers often offer options to save as a file or to save as images. So the common suggestion is to use Microsoft Print as Image (several B&W or greyscale TIFF, PNG or JPG options can be leveraged) and then convert the images to PDF. This route can often be problematic.
Page-shaped JPEGs can easily be converted on the command line into a multi-page PDF, but then you need either complex CMD handling or PowerShell scripting. For one such example see https://stackoverflow.com/a/79617042/10802527
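For comparison, the JPEG-to-PDF step can also be sketched in plain Python with no extra libraries. This is only a minimal sketch and not the method from the linked answer: the names `jpeg_size` and `jpegs_to_pdf` and the `dpi` parameter are made up for illustration, only greyscale and RGB JPEGs are handled, and the JPEG data is embedded unchanged, so nothing on the page is text.

```python
import struct

# Start-of-frame markers that carry the image dimensions
# (0xC4, 0xC8 and 0xCC are not frame headers).
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_size(data):
    """Return (width, height, components) from a JPEG's frame header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            raise ValueError("expected a marker at offset %d" % i)
        marker = data[i + 1]
        if marker == 0xFF:                      # padding byte before a marker
            i += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD9:   # markers with no length field
            i += 2
            continue
        if marker in SOF_MARKERS:
            height, width = struct.unpack(">HH", data[i + 5:i + 9])
            return width, height, data[i + 9]
        i += 2 + struct.unpack(">H", data[i + 2:i + 4])[0]
    raise ValueError("no frame header found")


def jpegs_to_pdf(jpegs, dpi=150):
    """Build a PDF (as bytes) with one page per JPEG, each page sized to its image."""
    bodies = {}          # object number -> object body
    kids = []
    next_id = 3          # 1 = catalog, 2 = page tree
    for data in jpegs:
        width, height, comps = jpeg_size(data)
        if comps not in (1, 3):
            raise ValueError("only greyscale and RGB JPEGs are handled here")
        colour = b"/DeviceGray" if comps == 1 else b"/DeviceRGB"
        pw, ph = width * 72.0 / dpi, height * 72.0 / dpi   # page size in points
        img_id, content_id, page_id = next_id, next_id + 1, next_id + 2
        next_id += 3
        bodies[img_id] = (b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
                          b"/ColorSpace %s /BitsPerComponent 8 /Filter /DCTDecode "
                          b"/Length %d >>\nstream\n" % (width, height, colour, len(data))
                          + data + b"\nendstream")
        # Scale the unit-square image to fill the whole page.
        draw = b"q %.2f 0 0 %.2f 0 0 cm /Im0 Do Q" % (pw, ph)
        bodies[content_id] = b"<< /Length %d >>\nstream\n%s\nendstream" % (len(draw), draw)
        bodies[page_id] = (b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 %.2f %.2f] "
                           b"/Resources << /XObject << /Im0 %d 0 R >> >> "
                           b"/Contents %d 0 R >>" % (pw, ph, img_id, content_id))
        kids.append(b"%d 0 R" % page_id)
    bodies[1] = b"<< /Type /Catalog /Pages 2 0 R >>"
    bodies[2] = b"<< /Type /Pages /Kids [%s] /Count %d >>" % (b" ".join(kids), len(kids))

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num in range(1, next_id):
        offsets.append(len(out))                # byte offset for the xref table
        out += b"%d 0 obj\n%s\nendobj\n" % (num, bodies[num])
    xref_pos = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % next_id
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += (b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
            % (next_id, xref_pos))
    return bytes(out)

# Usage (file names are placeholders):
#   pages = [open(name, "rb").read() for name in ("page1.jpg", "page2.jpg")]
#   open("scanned.pdf", "wb").write(jpegs_to_pdf(pages, dpi=150))
```

The `dpi` value should match the resolution the page images were saved at, otherwise the pages come out at the wrong physical size.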
OR
Simplest of all: use a Print to PDF driver that has an image-only option. Several virtual printers will do that. Shop around for one with good controls over output quality.
A third option (which I support) is the SumatraPDF command line, which can reprint a PDF through Microsoft Print to PDF, but it enforces ONLY Print as Image. I don't claim it is as efficient, so I would always recommend MuPDF conversion, a Ghostscript reprint or Acrobat printing as better options.
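As one possible shape for the Ghostscript reprint, here is a hedged sketch that builds the command from Python. It assumes Ghostscript is installed and on the PATH (the executable names `gswin64c` and `gs` are assumptions about the install), and it uses Ghostscript's `pdfimage24` output device, which renders each page to a 24-bit RGB image and wraps it in a PDF. The function names are made up for illustration.

```python
import subprocess
import sys


def ghostscript_image_pdf_command(src, dst, dpi=300, gs_exe=None):
    """Build the Ghostscript command that reprints src as an image-only PDF."""
    if gs_exe is None:
        # Assumed executable names: 64-bit console build on Windows, "gs" elsewhere.
        gs_exe = "gswin64c" if sys.platform.startswith("win") else "gs"
    return [
        gs_exe,
        "-sDEVICE=pdfimage24",   # rasterise every page to a 24-bit RGB image
        "-r%d" % dpi,            # rendering resolution in dots per inch
        "-o", dst,               # output file (also implies batch, no-pause mode)
        src,
    ]


def rasterise_pdf(src, dst, dpi=300):
    """Run Ghostscript; raises CalledProcessError if the conversion fails."""
    subprocess.run(ghostscript_image_pdf_command(src, dst, dpi), check=True)

# Usage (file names are placeholders):
#   rasterise_pdf("word-output.pdf", "image-only.pdf", dpi=200)
```

A lower `dpi` gives smaller files but softer text; 200 to 300 is a typical range for scan-like output.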
Whichever method you use, Word itself has options to produce a PDF/A-compatible PDF, so you can check whether those options allow an image-only PDF/A-1b.
The difference is shown here: a standard Word PDF printout is selectable on the left, but printed as PDF/A it is ISO-standard "archivable" and only the image can be selected. There is no need to embed fonts. However, beware that fontless text stored as images needs considerably larger files, requiring much larger archives. Also, since PDF restrictions are only optional, a reader can add the selectable text back anyway, so conversion to PDF/A-1a or Section 508 accessibility would restore embedded-font selectability.
Even humble browsers built on PDF.js work with "images with text" layers, so an OCR API can add the text back as a selectable overlay on the image-only PDF.
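To put a rough number on the file-size warning, the sketch below estimates the uncompressed raster size of one page. The helper name `raster_page_bytes` and the A4 at 300 dpi figures are illustrative assumptions; real files are smaller once the images are compressed, but the gap compared with a few kilobytes of text remains large.

```python
import math


def raster_page_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one page rendered as a bitmap."""
    width_px = round(width_mm / 25.4 * dpi)
    height_px = round(height_mm / 25.4 * dpi)
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)   # rows padded to whole bytes
    return row_bytes * height_px


# A4 page (210 x 297 mm) at 300 dpi:
colour = raster_page_bytes(210, 297, 300, 24)   # 24-bit RGB
bilevel = raster_page_bytes(210, 297, 300, 1)   # 1-bit black and white
print(colour, bilevel)   # 26099520 1087480
```

So a full-colour A4 page is roughly 26 MB before compression, and even a 1-bit page is about 1 MB, which is why image-only archives grow so quickly.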