java apache-poi xwpf textstyle

Is there any way to identify character styles in Apache POI XWPF documents?


Apache POI's "HWPF" API (for binary MS Word 97-2003 .doc files) has a method CharacterRun.getStyleIndex() by which, it appears, you can identify the character style(s) (not paragraph styles) that apply to a run.

But with XWPF (for MS Word 2007+ .docx files), I can't find any way to identify the character style(s) applied to an XWPFRun object.


Solution

  • The following code should read the style ID referenced by each run[1] in the body paragraphs of the XWPFDocument and print the style's XML if that style is a character style:

    import java.io.FileInputStream;
    
    import org.apache.poi.xwpf.usermodel.*;
    
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTRPr;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.STStyleType;
    
    import java.util.List;
    
    public class WordGetRunStyles {
    
     public static void main(String[] args) throws Exception {
    
      // try-with-resources closes the stream and the document when done
      try (FileInputStream fis = new FileInputStream("This is a Test.docx");
           XWPFDocument xdoc = new XWPFDocument(fis)) {
    
       List<XWPFParagraph> paragraphs = xdoc.getParagraphs();
       for (XWPFParagraph paragraph : paragraphs) {
        List<XWPFRun> runs = paragraph.getRuns();
        for (XWPFRun run : runs) {
         // w:rPr holds the run properties; it is absent on runs without any
         CTRPr cTRPr = run.getCTR().getRPr();
         if (cTRPr != null && cTRPr.getRStyle() != null) {
          // w:rStyle references the applied style by its ID
          String styleID = cTRPr.getRStyle().getVal();
          System.out.println("Style ID=====================================================");
          System.out.println(styleID);
          System.out.println("=============================================================");
          // getStyle returns null if the ID is not defined in the styles part
          XWPFStyle xStyle = xdoc.getStyles().getStyle(styleID);
          if (xStyle != null && xStyle.getType() == STStyleType.CHARACTER) {
           System.out.println(xStyle.getCTStyle());
          }
         }
        }
       }
      }
     }
    }
    

    [1] Please don't try it on a document with a lot of content ;-).

    As mentioned in the comment from @mike rodent, if you get java.lang.NoClassDefFoundError: org/openxmlformats/schemas/*something*, then you must use the full ooxml-schemas-1.3.jar instead of the smaller poi-ooxml-schemas jar, as described in https://poi.apache.org/faq.html#faq-N10025.

    For me, this code runs without the full jar, since I don't use Phonetic Guide Properties (https://msdn.microsoft.com/en-us/library/office/documentformat.openxml.wordprocessing.rubyproperties.aspx). I use Office 2007.
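To see what the POI code above is actually reading, it may help to look at the raw XML: a run's character style is stored as a w:rStyle element inside the run's w:rPr, and its w:val attribute is the style ID. The following is only a sketch using the JDK's own zip and XML classes, not POI; the class name RunStyleIds and its method are hypothetical helpers made up for this illustration. It reads word/document.xml only, so headers, footers and footnotes are not covered, and it does not look the IDs up in styles.xml.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not part of Apache POI.
class RunStyleIds {

    // WordprocessingML main namespace (the "w:" prefix in document.xml)
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // True if the node is an element named w:<localName>.
    private static boolean isW(Node node, String localName) {
        return node != null
                && W_NS.equals(node.getNamespaceURI())
                && localName.equals(node.getLocalName());
    }

    // Returns the w:val of every w:rStyle found at w:r/w:rPr/w:rStyle,
    // in document order. A w:rStyle elsewhere (for example in the
    // paragraph-mark properties w:pPr/w:rPr) is skipped.
    static List<String> fromDocumentXml(InputStream in) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for the *NS lookups below
            Document doc = dbf.newDocumentBuilder().parse(in);

            List<String> ids = new ArrayList<>();
            NodeList nodes = doc.getElementsByTagNameNS(W_NS, "rStyle");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element rStyle = (Element) nodes.item(i);
                Node rPr = rStyle.getParentNode();
                Node run = (rPr == null) ? null : rPr.getParentNode();
                if (isW(rPr, "rPr") && isW(run, "r")) {
                    ids.add(rStyle.getAttributeNS(W_NS, "val"));
                }
            }
            return ids;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document XML", e);
        }
    }

    // Usage: java RunStyleIds "This is a Test.docx"
    public static void main(String[] args) throws IOException {
        try (ZipFile zip = new ZipFile(args[0])) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("No word/document.xml in " + args[0]);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                fromDocumentXml(in).forEach(System.out::println);
            }
        }
    }
}
```

Unlike the POI version, this does not check whether each ID refers to a character style; for that you would still need to resolve the ID against the style definitions, as xdoc.getStyles().getStyle(styleID) does above.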