I have a PDF that I am trying to scan using CGPDFScanner. While scanning, when the word "file" is encountered, CGPDFStringGetBytePtr returns "\x02le". The PDF uses a Type1 font and has no ToUnicode CMap. No Encoding dictionary is present in the PDF, so I am decoding the bytes with NSUTF8StringEncoding. I have also tried NSMacOSRomanStringEncoding and NSASCIIStringEncoding, but had no luck. What can be the problem?
Thanks.
The code \x02 corresponds to the string 'fi'. The 'fi' sequence is drawn using a ligature, which is a single glyph, and this is why you get only one character code for the two letters.
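The correspondence between the code and the string is defined by the font encoding, not by any text encoding, so no NSString encoding will decode it correctly. The font encoding contains a /Differences array that maps code \x02 to the glyph named 'fi'. If the font dictionary has no /Encoding entry, the mapping comes from the built-in encoding of the embedded Type1 font program instead.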
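To illustrate the idea, here is a minimal sketch in plain C of decoding the bytes through a code-to-glyph-name table instead of an NSString encoding. Everything in it is an assumption for illustration: the table `example_differences` stands in for whatever your font's /Differences array (or built-in encoding) actually contains, the entry for code 0x03 is made up, and the helper names `ligature_text` and `decode_pdf_string` are hypothetical, not part of Core Graphics. In real code you would fill the table by reading the font dictionary and pass in the bytes from CGPDFStringGetBytePtr and CGPDFStringGetLength.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical code -> glyph name table, as it might be built from a
 * /Differences array. Only 0x02 -> "fi" comes from the question; the
 * 0x03 entry is an assumed example. Unlisted codes stay NULL. */
static const char *const example_differences[256] = {
    [0x02] = "fi",
    [0x03] = "fl",
};

/* Expand the standard ligature glyph names to the letters they stand for.
 * Returns NULL for any other glyph name. */
static const char *ligature_text(const char *glyph_name)
{
    static const char *const names[] = { "ff", "fi", "fl", "ffi", "ffl" };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (strcmp(glyph_name, names[i]) == 0)
            return names[i]; /* the glyph name spells out the letters */
    }
    return NULL;
}

/* Decode `len` bytes of a PDF string into `out` as NUL-terminated text.
 * Codes without a known ligature mapping fall back to printable ASCII,
 * or '?' otherwise. Returns the number of characters written. */
size_t decode_pdf_string(const unsigned char *bytes, size_t len,
                         char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        const char *name = example_differences[bytes[i]];
        const char *text = name ? ligature_text(name) : NULL;
        char single[2] = { 0, 0 };
        if (text == NULL) {
            single[0] = (bytes[i] >= 0x20 && bytes[i] < 0x7F)
                            ? (char)bytes[i] : '?';
            text = single;
        }
        size_t tlen = strlen(text);
        if (n + tlen >= out_size)
            break; /* stop early, leaving room for the terminator */
        memcpy(out + n, text, tlen);
        n += tlen;
    }
    out[n] = '\0';
    return n;
}
```

With this table, the bytes 0x02 'l' 'e' decode to "file". A complete solution would map every glyph name to Unicode (for example via the Adobe Glyph List), not just the ligatures handled here.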