Tags: java, base64, one-time-password, base32

TOTP Base32 vs Base64


Every TOTP implementation I have found (even FreeOTP by Red Hat) uses Base32 encoding/decoding for its generated secret. Why is Base64 not used, given that Base32 takes roughly 20% more space and its main advantage is that it is more human-readable? The secret is not shown to the user during generation anyway.

While the comments in every implementation say that it follows RFC 6238 / RFC 4226, I cannot find any mention of Base32 in the RFC documents.

It obviously makes sense to convert the secret to either Base32 or Base64 so that the binary data survives transport safely, but why not just use Base64 then?


Solution

  • Base32 is used to avoid human error; it has nothing to do with space. Base32 is not mentioned in RFC 4226 because it plays no part in the secret key, the HMAC, or token generation. It is only used to deliver the secret key in a human-readable form.

    More details if interested:

    The secret key in TOTP should be a 20-byte (160-bit) value shared between the server and the authenticator. It is used with HMAC-SHA1 to authenticate a counter: the number of time steps (30 seconds each by default) elapsed since Jan 1 1970 (the Unix epoch). A token is then extracted from the resulting 160-bit HMAC by truncation.
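
    The computation described above can be sketched in Java roughly as follows. This is a minimal illustration based on my reading of RFC 4226 / RFC 6238, not a hardened implementation; the class name TotpSketch and the method totp are hypothetical names chosen for this example, and the 30-second step and digit count are passed in as parameters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch class; not taken from any particular TOTP library.
public class TotpSketch {

    // Computes a TOTP code from the raw secret bytes (already Base32-decoded)
    // and a Unix timestamp in seconds.
    static String totp(byte[] secret, long unixSeconds, int stepSeconds, int digits) {
        // The moving factor is the number of whole time steps since the epoch.
        long counter = unixSeconds / stepSeconds;
        // The counter is fed to the HMAC as an 8-byte big-endian value.
        byte[] message = ByteBuffer.allocate(8).putLong(counter).array();

        byte[] hash;
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            hash = mac.doFinal(message); // 20 bytes = 160 bits
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA1 not usable", e);
        }

        // Dynamic truncation: the low 4 bits of the last byte select an offset,
        // then 31 bits are read starting at that offset.
        int offset = hash[hash.length - 1] & 0x0F;
        int binary = ((hash[offset] & 0x7F) << 24)
                | ((hash[offset + 1] & 0xFF) << 16)
                | ((hash[offset + 2] & 0xFF) << 8)
                | (hash[offset + 3] & 0xFF);

        int modulus = (int) Math.pow(10, digits);
        // Left-pad with zeros so the code always has the requested length.
        return String.format("%0" + digits + "d", binary % modulus);
    }

    public static void main(String[] args) {
        // Test secret from RFC 6238 Appendix B (ASCII "12345678901234567890").
        byte[] secret = "12345678901234567890".getBytes(StandardCharsets.US_ASCII);
        System.out.println(totp(secret, 59L, 30, 8)); // RFC 6238 lists 94287082 for T=59
    }
}
```

    Note that Base32 does not appear anywhere in this computation: the method works on raw bytes, which is consistent with the RFCs not mentioning it.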

    BUT entering this 20-byte secret key by hand into a tool like Google Authenticator is not easy. That is why a QR code or an app link scheme is typically provided, e.g.:
    otpauth://totp/Example:alice@google.com?secret=JBSWY3DPEHPK3PXP&issuer=Example
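
    As a rough illustration of how a tool might pull the Base32 secret out of such a URI, here is a minimal Java sketch. The class name OtpAuthUriSketch and the method queryParams are hypothetical; the sketch only relies on java.net.URI and assumes the query contains no percent-encoded "&" or "=" characters.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch class for reading the parameters of an otpauth:// URI.
public class OtpAuthUriSketch {

    // Splits the query part of the URI into key/value pairs.
    // Simplification: getQuery() returns the already-decoded query, so this
    // would mis-split values that contain an encoded '&' or '='.
    static Map<String, String> queryParams(String uri) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(uri).getQuery();
        if (query == null) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        String uri = "otpauth://totp/Example:alice@google.com?secret=JBSWY3DPEHPK3PXP&issuer=Example";
        Map<String, String> params = queryParams(uri);
        System.out.println(params.get("secret")); // the Base32-encoded secret
        System.out.println(params.get("issuer"));
    }
}
```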

    So if you are not using QR codes or app link schemes, you have to read and re-enter this secret key by hand. In this case, the key is invariably shared in Base32 format, i.e. the 20-byte secret key is encoded as a Base32 string.

    So why is Base32 better than Base64 in this case?

    One of the main advantages of Base32 over Base64 is that it uses only the uppercase letters A-Z and the digits 2-7. There are no lowercase letters and none of the digits 0, 1, 8 or 9.

    Just 26 uppercase letters (A-Z) and 6 digits (2-7) = 32 characters.
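
    To make the alphabet concrete, here is a minimal Java sketch of a Base32 encoder using exactly these 32 characters. The class name Base32EncodeSketch is hypothetical, and the sketch deliberately omits the "=" padding defined in RFC 4648, on the assumption that the secret is meant to be typed by hand. The main method also compares the encoded length of a 20-byte key against java.util.Base64.

```java
import java.util.Base64;

// Hypothetical sketch class; emits unpadded Base32 with the RFC 4648 alphabet.
public class Base32EncodeSketch {

    // 26 uppercase letters followed by the digits 2-7: 32 symbols, 5 bits each.
    static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        int buffer = 0;        // holds bits not yet emitted (only low bits matter)
        int bitsInBuffer = 0;  // number of valid low bits in buffer
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bitsInBuffer += 8;
            // Emit one character per complete 5-bit group.
            while (bitsInBuffer >= 5) {
                bitsInBuffer -= 5;
                out.append(ALPHABET.charAt((buffer >> bitsInBuffer) & 0x1F));
            }
        }
        // Left-align any leftover bits in a final 5-bit group.
        if (bitsInBuffer > 0) {
            out.append(ALPHABET.charAt((buffer << (5 - bitsInBuffer)) & 0x1F));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] key = new byte[20]; // placeholder 20-byte key, all zeros
        System.out.println(encode(key).length());                             // Base32 length
        System.out.println(Base64.getEncoder().encodeToString(key).length()); // Base64 length
    }
}
```

    For a 20-byte key this gives 32 Base32 characters versus 28 padded Base64 characters, so the space cost of the safer alphabet is only a handful of characters.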

    So confusion between lowercase "i", lowercase "l", uppercase "I" and the digit "1" is reduced. Similarly, confusion of the letter "B" with the digit "8", and of the digit "0" with the letter "O", is also reduced.

    Base32 reduces human error and ambiguous interpretation of the string. This is not the case with Base64: all of the above confusions between uppercase letters, lowercase letters and digits apply to it, and because Base64 is case-sensitive, a mistyped case changes the decoded value.

    UPDATE for clarity (thanks to Google Authenticator): although RFC 4648 (https://datatracker.ietf.org/doc/html/rfc4648) defines the Base32 alphabet with uppercase letters, Google now publishes at least some of its Base32 codes in lowercase! This is fine as long as the tool knows that a=A, b=B, etc.: you simply convert to uppercase before decoding the secret. And yes, lowercase is more readable and it works, but when a standard is published you do wonder which is better.
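
    The "convert to uppercase before decoding" step could look roughly like this in Java. Again a minimal sketch with a hypothetical class name, Base32DecodeSketch. Stripping spaces and "=" padding before decoding is my own assumption about how hand-typed keys tend to arrive, not something the RFC requires.

```java
import java.io.ByteArrayOutputStream;
import java.util.Locale;

// Hypothetical sketch class; a lenient Base32 decoder for hand-typed secrets.
public class Base32DecodeSketch {

    static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    static byte[] decode(String text) {
        // Normalize first: drop spaces and padding (assumption), then map a=A, b=B, ...
        String normalized = text.replace(" ", "").replace("=", "").toUpperCase(Locale.ROOT);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0;        // holds bits not yet emitted (only low bits matter)
        int bitsInBuffer = 0;  // number of valid low bits in buffer
        for (char c : normalized.toCharArray()) {
            int value = ALPHABET.indexOf(c);
            if (value < 0) {
                // Characters such as 0, 1, 8 and 9 are rejected instead of guessed at.
                throw new IllegalArgumentException("Not a Base32 character: " + c);
            }
            buffer = (buffer << 5) | value;
            bitsInBuffer += 5;
            if (bitsInBuffer >= 8) {
                bitsInBuffer -= 8;
                out.write((buffer >> bitsInBuffer) & 0xFF);
            }
        }
        // Any leftover bits (fewer than 8) are filler from encoding and are dropped.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] upper = decode("JBSWY3DPEHPK3PXP");
        byte[] lower = decode("jbsw y3dp ehpk 3pxp");
        System.out.println(java.util.Arrays.equals(upper, lower)); // both spellings give the same bytes
    }
}
```

    A side benefit of the restricted alphabet shows up here: a mistyped "0", "1", "8" or "9" is detected immediately, whereas in Base64 those would all be valid characters.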