java aspose.words aspose.pdf aspose-slides

How to stop Aspose from rendering image URLs present in HTML content


I am using the code below to convert docx/doc/txt files to an image with aspose-words-15.8.0-jdk16.jar:

com.aspose.words.Document doc = new com.aspose.words.Document(new ByteArrayInputStream(origData));
ImageSaveOptions saveOptions = new ImageSaveOptions(SaveFormat.JPEG);
saveOptions.setPageCount(1);
saveOptions.setPageIndex(0);
doc.save(outputImage, saveOptions);

If a txt file contains HTML content with an img src tag, Aspose tries to render the image. This is a security issue, and I don't want the image to be rendered. What changes do I need to make on my side?


Solution

  • You can explicitly specify the load format as plain text. In that case, HTML tags in your TXT document will not be interpreted:

    LoadOptions opt = new LoadOptions();
    opt.setLoadFormat(LoadFormat.TEXT);
    Document doc = new Document("C:\\Temp\\in.txt", opt);
    

    You can also skip external resources while the document is being loaded by implementing IResourceLoadingCallback:

    LoadOptions opt = new LoadOptions();
    opt.setResourceLoadingCallback(new SkipImgCallback());
    Document doc = new Document("C:\\Temp\\in.html", opt);
    
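    // Returns SKIP for every request, so no external resource
    // (images included) is fetched while the document loads.
    static final class SkipImgCallback implements IResourceLoadingCallback {
        @Override
        public int resourceLoading(final ResourceLoadingArgs args) throws Exception {
            return ResourceLoadingAction.SKIP;
        }
    }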
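
The code in the question loads every upload from a byte stream without naming a format, which is presumably why a .txt file containing HTML ends up being interpreted as HTML. The following is a minimal sketch, assuming the original file name is still available next to the bytes, of deciding when to force plain-text loading. `LoadFormatChooser` and `forcePlainText` are hypothetical names and not part of Aspose.Words, and treating only `.txt` as plain text is an assumed policy.

```java
import java.util.Locale;

public class LoadFormatChooser {

    /**
     * Hypothetical helper: returns true when the upload should be loaded
     * as plain text, judged only by its file extension (assumed policy: .txt).
     */
    public static boolean forcePlainText(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        // No extension, or a trailing dot: leave format detection to the library.
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ext.equals("txt");
    }

    public static void main(String[] args) {
        System.out.println(forcePlainText("notes.TXT"));
        System.out.println(forcePlainText("report.docx"));
    }
}
```

When the helper returns true, the `LoadOptions` with `LoadFormat.TEXT` from the first snippet can be passed along with the input stream when constructing the `Document`. For other file types, the resource loading callback remains the safeguard against fetching remote images.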