java apache-poi docx4j

Removing content controls from Docx


I want to replace the content controls (drop-down lists only) in a docx with their actual text, and then apply some logic to the document to extract its tables using apache-poi. If I don't do this, cells that contain a content control are not extracted. If I save the docx manually as Word 97-2003 (*.doc), Word offers to remove all content controls and replace them with the selected text, so I am planning to convert the docx to doc to get rid of the content controls. What I've explored so far:

It does create a doc file, but it did not remove the content controls as it did with Aspose.

What would be the best way to handle this scenario? Is there any way to replace content controls directly? Thanks!


Solution

  • docx4j can remove content controls

    The essence of the sample code at https://github.com/plutext/docx4j/blob/master/docx4j-samples-docx4j/src/main/java/org/docx4j/samples/ContentControlRemove.java is reproduced below:

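        // Imports needed by the snippet (the statements below belong inside
        // a method that declares "throws Exception")
        import java.io.File;
        import java.io.FileInputStream;
        import org.docx4j.Docx4J;
        import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
    
        // Input docx that contains the content controls
        String input_DOCX = System.getProperty("user.dir") + "/some.docx";
    
        // Resulting docx
        String OUTPUT_DOCX = System.getProperty("user.dir") + "/OUT_ContentControlRemove.docx";
    
        // Load the input docx
        WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(input_DOCX));
    
        // There is no XML data to bind, so pass a null stream
        FileInputStream xmlStream = null;
    
        // FLAG_BIND_REMOVE_SDT removes the content controls and keeps their contents
        Docx4J.bind(wordMLPackage, xmlStream, Docx4J.FLAG_BIND_REMOVE_SDT);
    
        // Save the document
        Docx4J.save(wordMLPackage, new File(OUTPUT_DOCX), Docx4J.FLAG_NONE);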
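
To illustrate what "removing a content control" means at the XML level, here is a rough, JDK-only sketch. It is not how docx4j implements FLAG_BIND_REMOVE_SDT. It only shows the idea of replacing each w:sdt element with the children of its w:sdtContent, so the selected drop-down text stays in the cell as an ordinary run. The class name SdtUnwrapSketch and the method unwrapSdts are made up for this example, and the sketch works on an XML string. In a real docx that XML would come from word/document.xml inside the zip package.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only (not part of docx4j or POI).
public class SdtUnwrapSketch {

    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Replaces every w:sdt element with the children of its w:sdtContent. */
    public static String unwrapSdts(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Copy the live NodeList first, because the tree is modified below.
        NodeList live = doc.getElementsByTagNameNS(W_NS, "sdt");
        List<Element> sdts = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            sdts.add((Element) live.item(i));
        }

        for (Element sdt : sdts) {
            Node parent = sdt.getParentNode();
            Element content = childElement(sdt, "sdtContent");
            if (content != null) {
                // Move the wrapped runs/paragraphs in front of the w:sdt.
                while (content.getFirstChild() != null) {
                    parent.insertBefore(content.getFirstChild(), sdt);
                }
            }
            // Drop the now-empty wrapper together with its w:sdtPr.
            parent.removeChild(sdt);
        }

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Returns the first direct child element w:<localName>, or null.
    private static Element childElement(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && W_NS.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // A paragraph holding a drop-down content control with a selected value.
        String xml = "<w:p xmlns:w=\"" + W_NS + "\"><w:sdt>"
                + "<w:sdtPr><w:dropDownList/></w:sdtPr>"
                + "<w:sdtContent><w:r><w:t>Approved</w:t></w:r></w:sdtContent>"
                + "</w:sdt></w:p>";
        System.out.println(unwrapSdts(xml));
    }
}
```

For real documents the docx4j call above is the safer route, since it also handles the package and its relationships. The sketch is only meant to show why the text survives once the control is removed.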