character-encoding macos carbon core-services text-encoding-converter

In the Text Encoding Converter API, what on earth is a MIB?


TEC (also known as Text Encoding Conversion Manager) has these APIs, declared in TextEncodingConverter.h:

extern OSStatus 
TECCopyTextEncodingInternetNameAndMIB(
  TextEncoding               textEncoding,
  TECInternetNameUsageMask   usage,
  CFStringRef *              encodingNamePtr,       /* can be NULL */
  SInt32 *                   mibEnumPtr)            /* can be NULL */ __OSX_AVAILABLE_STARTING(__MAC_10_3, __IPHONE_NA);

extern OSStatus 
TECGetTextEncodingFromInternetNameOrMIB(
  TextEncoding *             textEncodingPtr,
  TECInternetNameUsageMask   usage,
  CFStringRef                encodingName,
  SInt32                     mibEnum)                         __OSX_AVAILABLE_STARTING(__MAC_10_3, __IPHONE_NA);

The documentation in the header says that the MIB parameter… takes/returns a MIB enum value. But there's only one such enum defined: kTEC_MIBEnumDontCare. There are no others. (The documentation for the former function also notes that “valid MIB enums begin at 3”.)

Nothing in the header defines a MIB (not even to say what the letters stand for), and none of the remaining Apple documentation of this API explains it either. And while the older functions named TECGetTextEncodingInternetName and TECGetTextEncodingFromInternetName are documented, the newer ones that use CFStrings and take/return a MIB aren't.

So what on earth is this thing?


Solution

  • While “MIB” is used as a noun in these function names, it's used as more of an adjective in the functions' documentation, which refers to “MIB enum value”s. That's a subtle clue.

    Also: “enum” in “MIB enum value” isn't meant in the sense of a C enum. That's why there are no other enums defined (which, in Apple's style, would generally have a typedef associated with them, like TextEncoding for their own encoding constants). It's the more generic meaning of “list of defined numbers”, and in this case, the list lives somewhere else.

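    MIB stands for Management Information Base, as explained by RFC 3808 (“IANA Charset MIB”). The charset MIB enum values themselves are defined by the IANA Character Sets registry, which lists a MIBenum number next to each registered charset name.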

    The RFC implies that these are a subset of a broader Management Information Base that encompasses further categories. That's defined as part of the Structure of Management Information, which appears to be very high-level, taxonomize-all-the-things stuff. The transition from looking at text encodings to reading about SMI is a bit like trading a county map for a star map.

    As the TEC documentation notes, valid charset MIB enum values start at 3. Values less than that are never valid charset MIB enum values.

    So the TEC functions in question can produce a Core Services TextEncoding from either an IANA encoding name or a charset MIB enum value, or inversely give you the IANA name and/or charset MIB enum value that most closely corresponds to a TextEncoding.