java utf-8 itext7 utf-16 pdf-annotations

How to correctly convert a Java Unicode string to a PDF string object using iText7?


I was trying to set the contents of a PDF annotation using iText7 and ran into a problem. Unlike the rest of a PDF document, where content is shown through PDF stream objects, an annotation's contents can only be set with a PDF string.

But the annotation text shows up as garbled glyphs in Microsoft Edge's reader mode, like this:

<!@8-72...

I also tried opening it in Opera and Chrome, but got this result:

Ё3,Ё»1’¼°¼22ёЁȂ21.

Here is the code snippet:

Rectangle rect = new Rectangle((float)x1, (float)y1, (float)(x2-x1), (float)(y2-y1));
float[] floatArray = new float[] {(float)x2, (float)y1, (float)x1, (float)y1, (float)x2, (float)y2, (float)x1, (float)y2};

PdfAnnotation annotation = PdfTextMarkupAnnotation.createHighLight(rect,floatArray);
annotation.setContents(new PdfString("Привет, использую русский здесь."));

How can I get the text to display correctly?


Solution

  • After some searching I was able to answer this myself. According to plinth's answer, a PDF string can be encoded as UTF-16 instead of the default PDFDocEncoding.

    https://stackoverflow.com/a/163065/16591105

    Also note that not every browser's PDF viewer supports UTF-16 encoded strings, so some of them may still show garbled glyphs.

    Rectangle rect = new Rectangle((float)x1, (float)y1, (float)(x2-x1), (float)(y2-y1));
    float[] floatArray = new float[] {(float)x2, (float)y1, (float)x1, (float)y1, (float)x2, (float)y2, (float)x1, (float)y2};
    
    PdfAnnotation annotation = PdfTextMarkupAnnotation.createHighLight(rect,floatArray);
    annotation.setContents(new PdfString("Привет, использую русский здесь.", "UTF-16"));
    

    Hope this helps someone!
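
To see what the "UTF-16" encoding changes at the byte level, here is a minimal sketch in plain Java, without iText. The class `PdfTextStringDemo` and its methods are hypothetical names made up for this illustration. The `truncateToLowBytes` helper is only an assumption about why the text gets garbled: it keeps the low byte of each character, which is consistent with the Latin-looking output in the question but is not taken from the iText source.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only; it does not use iText.
final class PdfTextStringDemo {

    // Java's "UTF-16" charset encodes as big-endian with a leading
    // byte order mark (FE FF), which is the marker a PDF text string
    // uses to signal that it is UTF-16 rather than PDFDocEncoding.
    static byte[] encodeUtf16WithBom(String text) {
        return text.getBytes(StandardCharsets.UTF_16);
    }

    // Assumption: mimics a single-byte fallback that keeps only the
    // low byte of each UTF-16 code unit, turning Cyrillic into
    // unrelated ASCII-range characters.
    static String truncateToLowBytes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append((char) (text.charAt(i) & 0xFF));
        }
        return sb.toString();
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First word of the sample text from the question, written
        // with Unicode escapes so the source file stays ASCII-only.
        String text = "\u041F\u0440\u0438\u0432\u0435\u0442";
        System.out.println("UTF-16 with BOM: " + toHex(encodeUtf16WithBom(text)));
        System.out.println("Low bytes only:  " + truncateToLowBytes(text));
    }
}
```

The first line of output starts with `FE FF`, followed by two bytes per character. The second line contains characters such as `@` and `8`, the same kind of garbage Edge displayed.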