
Javascript string to Base64 UTF-16BE


I'm trying to encode a string as UTF-16 Big Endian and then convert it to Base64, in order to send it through an SMS API.

I haven't been able to do this in JavaScript.

This is the original JS string I want to send in the SMS:

const originalString = 'Teste 5% áàÁÀ éèÉÈ íìÍÌ óòÓÒ úùÚÙ çÇ ãà ?!,;';

Using btoa(originalString) I get VGVzdGUgNSUyNSDh4MHAIOnoycgg7ezNzCDz8tPSIPr52tkg58cg48MgPyEsOw==, which is not what I need. I used an online converter for that purpose, and this is the correct value:

AFQAZQBzAHQAZQAgADUAJQAgAOEA4ADBAMAAIADpAOgAyQDIACAA7QDsAM0AzAAgAPMA8gDTANIAIAD6APkA2gDZACAA5wDHACAA4wDDACAAPwAhACwAOw==

I tested sending an SMS with it and it works fine.


Solution

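  • To get the UTF-16 version of the string, we need to map each of its characters to its charCodeAt(0) value, which is that character's UTF-16 code unit.
    From there, we can build a Uint16Array whose underlying buffer holds the text as UTF-16LE (typed arrays use the platform's byte order, so this assumes a little-endian machine).
    We just need to swap the two bytes of every item in that Uint16Array to get the UTF-16BE version.

    Then it's just a matter of encoding those bytes to base64.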

    const originalString = 'Teste 5% áàÁÀ éèÉÈ íìÍÌ óòÓÒ úùÚÙ çÇ ãà ?!,;';
    const expectedString = "AFQAZQBzAHQAZQAgADUAJQAgAOEA4ADBAMAAIADpAOgAyQDIACAA7QDsAM0AzAAgAPMA8gDTANIAIAD6APkA2gDZACAA5wDHACAA4wDDACAAPwAhACwAOw==";
    
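    // One UTF-16 code unit per element (split('') splits on code units, not code points)
    const codeUnits = originalString.split('').map( char => char.charCodeAt(0) );
    // Swap the two bytes of each code unit; the mask keeps the result within 16 bits
    const swapped = codeUnits.map( val => ((val >> 8) | (val << 8)) & 0xFFFF );
    // Assumes a little-endian host, so the swapped values end up stored as big-endian bytes
    const arr_BE = new Uint16Array( swapped );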
    
    // ArrayBuffer to base64 borrowed from https://stackoverflow.com/a/42334410/3702797
    const result = btoa(
        new Uint8Array(arr_BE.buffer)
          .reduce((data, byte) => data + String.fromCharCode(byte), '')
      );
    console.log( 'same strings:', result === expectedString );
    console.log( result );
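
The byte swap above relies on the host being little-endian, because a Uint16Array stores its values in the platform's byte order. As a possible alternative, here is a minimal sketch that writes each code unit through a DataView, whose setUint16 method takes an explicit endianness flag, so the result should not depend on the host. The helper name toBase64Utf16BE is hypothetical and not part of the original answer; btoa is assumed to be available as a global (browsers, or a recent Node.js).

```javascript
// Hypothetical helper (name chosen for illustration only).
// Encodes a JS string as UTF-16BE bytes, then returns those bytes as base64.
function toBase64Utf16BE(str) {
  // Two bytes per UTF-16 code unit
  const buffer = new ArrayBuffer(str.length * 2);
  const view = new DataView(buffer);

  for (let i = 0; i < str.length; i++) {
    // Third argument `false` requests big-endian byte order explicitly
    view.setUint16(i * 2, str.charCodeAt(i), false);
  }

  // btoa expects a "binary string": one character per byte
  let binary = '';
  for (const byte of new Uint8Array(buffer)) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

console.log(toBase64Utf16BE('Teste')); // "AFQAZQBzAHQAZQ=="
```

Like the original snippet, this works on code units rather than code points, so characters outside the Basic Multilingual Plane are emitted as their two surrogates, which is what UTF-16 requires.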