Using the docx4j Java library, I am generating a docx file with an HTML string embedded as an altChunk, but the inline font-size formatting on h1 does not work as expected. When font-size is set to 24pt, the docx file shows it as 14 only.
When I change font-size to a different value such as 23pt, it works as expected. The issue also does not occur with other tags such as p or the other heading levels. In the example below, both Heading1 and Heading2 have a custom font-size as an inline style, but it works only for Heading2.
Example HTML String:
"<html><body><h1 style="font-weight: normal; line-height: 1.1; margin-top: 0.2em; margin-bottom: 0.2em; background-color: transparent; color: #404040; font-family: Calibri; font-size: 24pt;">H1</h1><h2 style="font-weight: normal; line-height: 1.1; margin-top: 0.2em; margin-bottom: 0.2em; background-color: transparent; color: #404040; font-family: Calibri; font-size: 26pt;"> H2 </h2></body></html>"
As seen in MS Word: [image: Heading 1 styling as seen in MS Word]
Code:
String html = "<html><body><h1 style=\"font-weight: normal; line-height: 1.1; margin-top: 0.2em; margin-bottom: 0.2em; background-color: transparent; color: #404040; font-family: Calibri; font-size: 24pt;\">H1</h1><h2 style=\"font-weight: normal; line-height: 1.1; margin-top: 0.2em; margin-bottom: 0.2em; background-color: transparent; color: #404040; font-family: Calibri; font-size: 26pt;\"> H2 </h2></body></html>";
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
byte[] bytes = html.getBytes(StandardCharsets.UTF_8);
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
CTAltChunk ac = new ObjectFactory().createCTAltChunk();
ac.setId("htmlChunk");
wordMLPackage.getMainDocumentPart().addAltChunk(AltChunkType.Html, bais);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
wordMLPackage.save(baos);
byte[] docxFile = baos.toByteArray();
I'm not a docx4j expert, so I can't say whether some default styling is taking precedence over yours or whether this is simply a bug.
The docx4j documentation mentions a separate library for handling XHTML import (docx4j-ImportXHTML).
Adding that library as a dependency and adapting your code as follows, based on the docx4j-ImportXHTML sample, seems to generate the expected result:
var html = "<html><body><h1 style=\"font-weight: normal; line-height: 1.1; margin-top: 0.2em; margin-bottom: 0.2em; background-color: transparent; color: #404040; font-family: Calibri; font-size: 24pt;\">H1</h1><h2 style=\"font-weight: normal; line-height: 1.1; margin-top: 0.2em; margin-bottom: 0.2em; background-color: transparent; color: #404040; font-family: Calibri; font-size: 26pt;\"> H2 </h2></body></html>";
var wordMLPackage = WordprocessingMLPackage.createPackage();
var mdp = wordMLPackage.getMainDocumentPart();
mdp.addAltChunk(AltChunkType.Xhtml, new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)));
mdp.convertAltChunks();
wordMLPackage.save(new File("myFile.docx")); // save(File) avoids leaving an unclosed FileOutputStream
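One thing to watch with this approach: as far as I know, docx4j-ImportXHTML expects well-formed XML, so markup that Word tolerated as AltChunkType.Html (an unclosed <br>, for instance) may fail during convertAltChunks(). Below is a minimal sketch of a pre-check that uses only the JDK's XML parser. The XhtmlCheck class and isWellFormed method are hypothetical names for illustration and are not part of docx4j.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not a docx4j class): checks that a string parses as XML.
class XhtmlCheck {

    // Returns true if the markup is well-formed XML, false otherwise.
    static boolean isWellFormed(String xhtml) {
        try {
            var builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

Calling XhtmlCheck.isWellFormed(html) before mdp.addAltChunk(AltChunkType.Xhtml, ...) lets you fail early with a clear message, or fall back to AltChunkType.Html, instead of hitting a parse error inside the conversion.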