pdf fonts truetype

How do I inspect the cmap table and subtables in a TrueType font?


The PDF Reference says:

A TrueType font program’s built-in encoding maps directly from character codes to glyph descriptions, using an internal data structure called a “cmap”

It goes on to explain that the behaviour of a PDF processor depends on which cmap subtables are present in the font file.

I am trying to analyze a .ttf font file extracted using fontforge from a PDF that was generated by LibreOffice. The PDF embeds this font file as a simple font, using single-byte codes. When I look at the .ttf file in fontdrop.info, it tells me the "glyphIndexMap" is as follows:

{"0":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":0,"19":0,"20":0,"21":0,"22":0,"23":0,"24":0,"25":0,"26":0,"27":0,"28":0,"29":0,"30":0,"31":0,"32":0,"33":0,"34":0,"35":0,"36":0,"37":0,"38":0,"39":0,"40":0,"41":0,"42":0,"43":0,"44":0,"45":0,"46":0,"47":0,"48":0,"49":0,"50":0,"51":0,"52":0,"53":0,"54":0,"55":0,"56":0,"57":0,"58":0,"59":0,"60":0,"61":0,"62":0,"63":0,"64":0,"65":0,"66":0,"67":0,"68":0,"69":0,"70":0,"71":0,"72":0,"73":0,"74":0,"75":0,"76":0,"77":0,"78":0,"79":0,"80":0,"81":0,"82":0,"83":0,"84":0,"85":0,"86":0,"87":0,"88":0,"89":0,"90":0,"91":0,"92":0,"93":0,"94":0,"95":0,"96":0,"97":0,"98":0,"99":0,"100":0,"101":0,"102":0,"103":0,"104":0,"105":0,"106":0,"107":0,"108":0,"109":0,"110":0,"111":0,"112":0,"113":0,"114":0,"115":0,"116":0,"117":0,"118":0,"119":0,"120":0,"121":0,"122":0,"123":0,"124":0,"125":0,"126":0,"127":0,"160":0,"161":0,"162":0,"163":0,"165":0,"167":0,"168":0,"169":0,"170":0,"171":0,"172":0,"174":0,"175":0,"176":0,"177":0,"180":0,"181":0,"182":0,"183":0,"184":0,"186":0,"187":0,"191":0,"192":0,"193":0,"194":0,"195":0,"196":0,"197":0,"198":0,"199":0,"200":0,"201":0,"202":0,"203":0,"204":0,"205":0,"206":0,"207":0,"209":0,"210":0,"211":0,"212":0,"213":0,"214":0,"216":0,"217":0,"218":0,"219":0,"220":0,"223":0,"224":0,"225":0,"226":0,"227":0,"228":0,"229":0,"230":0,"231":0,"232":0,"233":0,"234":0,"235":0,"236":0,"237":0,"238":0,"239":0,"241":0,"242":0,"243":0,"244":0,"245":0,"246":0,"247":0,"248":0,"249":0,"250":0,"251":0,"252":0,"255":0,"305":0,"338":0,"339":0,"376":0,"402":0,"675":3,"710":0,"711":0,"728":0,"729":0,"730":0,"731":0,"732":0,"733":0,"916":0,"937":0,"960":0,"8211":0,"8212":0,"8216":0,"8217":0,"8218":0,"8220":0,"8221":0,"8222":0,"8224":0,"8225":0,"8226":0,"8230":0,"8240":0,"8249":0,"8250":0,"8260":0,"8364":0,"8482":0,"8706":0,"8719":0,"8721":0,"8730":0,"8734":0,"8747":0,"8776":0,"8800":0,"8804":0,"8805":0,"9674":0,"57374":0,"64257":0,"64258":0}

(the interesting part is "675":3)

I can understand this insofar as the font contains 4 glyphs, and the glyph at index 3 is the ʣ character (Unicode code point U+02A3, decimal 675).

But in the PDF, this character appears in text strings as <01>, and no other encoding is given. According to the PDF Reference, the mapping from <01> to the glyph at index 3 must therefore come from a mapping within the .ttf file:

If no Encoding entry is specified in the font dictionary, the “cmap” subtable with platform ID 1 and encoding 0 will be used to map directly from character codes to glyph descriptions, without any consideration of character names. This is the normal convention for symbolic fonts.

I have confirmed that no Encoding entry is specified within the PDF. Here are the /Font and /FontDescriptor objects extracted using qpdf:

18 0 obj
<<
  /BaseFont /BAAAAA+LiberationSerif
  /FirstChar 0
  /FontDescriptor 20 0 R
  /LastChar 1
  /Subtype /TrueType
  /ToUnicode 21 0 R
  /Type /Font
  /Widths [
    777
    802
  ]
>>
endobj

20 0 obj
<<
  /Ascent 891
  /CapHeight 981
  /Descent -216
  /Flags 4
  /FontBBox [
    -543
    -303
    1277
    981
  ]
  /FontFile2 23 0 R
  /FontName /BAAAAA+LiberationSerif
  /ItalicAngle 0
  /StemV 80
  /Type /FontDescriptor
>>
endobj

So how can I investigate the .ttf file to confirm that "the “cmap” subtable with platform ID 1 and encoding 0" is in place and contains the mappings I think it does?

Edit: the PDF in question


Solution

  • How do I inspect the cmap table and subtables in a TrueType font?

    OT Master Light, from Dutch Type Library, is a free tool that's quite handy for inspecting internal font tables.
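
If a scripted check is preferable to a GUI tool, the cmap table can also be read directly from the file bytes. The following is a minimal sketch, not a full parser: it assumes a single-font sfnt file (not a .ttc collection), lists every encoding record with its platform ID, encoding ID and subtable format, and decodes the code-to-glyph mapping only for formats 0 and 6, the two simplest subtable formats. The function names (find_table, read_cmap_subtables, dump_cmap) are made up for this example.

```python
import struct

def find_table(data, tag):
    """Return (offset, length) of the sfnt table with the given 4-byte tag."""
    # Offset table: sfntVersion (4 bytes), numTables (uint16), then 3 more uint16s.
    num_tables = struct.unpack_from(">H", data, 4)[0]
    for i in range(num_tables):
        # Each table record: tag, checksum, offset, length (16 bytes).
        rec_tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i)
        if rec_tag == tag:
            return offset, length
    raise KeyError(tag)

def read_cmap_subtables(data):
    """Return one dict per cmap encoding record found in the font bytes."""
    cmap_off, _length = find_table(data, b"cmap")
    _version, count = struct.unpack_from(">HH", data, cmap_off)
    subtables = []
    for i in range(count):
        # Encoding record: platformID, encodingID, offset from start of cmap.
        platform_id, encoding_id, rel_off = struct.unpack_from(
            ">HHI", data, cmap_off + 4 + 8 * i)
        sub_off = cmap_off + rel_off
        fmt = struct.unpack_from(">H", data, sub_off)[0]
        mapping = None  # stays None for formats this sketch does not decode
        if fmt == 0:
            # format, length, language (3 x uint16), then 256 one-byte glyph IDs.
            glyph_ids = struct.unpack_from(">256B", data, sub_off + 6)
            mapping = {code: gid for code, gid in enumerate(glyph_ids) if gid}
        elif fmt == 6:
            # format, length, language, firstCode, entryCount, then uint16 glyph IDs.
            first, n = struct.unpack_from(">HH", data, sub_off + 6)
            glyph_ids = struct.unpack_from(">%dH" % n, data, sub_off + 10)
            mapping = {first + k: gid for k, gid in enumerate(glyph_ids) if gid}
        subtables.append({"platform": platform_id, "encoding": encoding_id,
                          "format": fmt, "mapping": mapping})
    return subtables

def dump_cmap(path):
    """Print every cmap subtable found in the font file at `path`."""
    with open(path, "rb") as f:
        for subtable in read_cmap_subtables(f.read()):
            print(subtable)
```

Calling dump_cmap on the extracted .ttf should show whether a record with platform 1 and encoding 0 exists; codes that map to glyph 0 are left out of the printed mapping, so if the font behaves as the question expects, that subtable would list code 1 pointing at glyph 3. If fontTools happens to be installed, running ttx -t cmap on the font dumps the same table as XML.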