arrays, c, string, cryptography, char

Is there a way to print Runes as individual characters?


Program's Purpose: Rune Cipher

Note: I am linking to my own GitHub page below, only to show the purpose of the program and what I needed help with (and got help with; thanks once again to all of you!).


Final Edit:

I have now completed the project I've been working on, thanks to the extremely useful answers provided below. For future readers, I am also providing the full code.

Again, this wouldn't have been possible without all the help I got from the people below. Thanks to them once again!

Original code on GitHub

Code

(Shortened down a bit)

#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#define UNICODE_BLOCK_START 0x16A0
#define UUICODE_BLOCK_END   0x16F1

int main(void) {
    setlocale(LC_ALL, "");  // use the environment's locale so wide characters can be printed
    wchar_t SUBALPHA[] = L"ᛠᚣᚫᛞᛟᛝᛚᛗᛖᛒᛏᛋᛉᛈᛇᛂᛁᚾᚻᚹᚷᚳᚱᚩᚦᚢ";
    wchar_t DATA[] = L"hello";

    // Count the characters in DATA (equivalent to wcslen(DATA))
    int lenofData = 0;
    while (DATA[lenofData] != L'\0') {
        lenofData++;
    }

    // Replace each character with the rune at the same position
    for (int i = 0; i < lenofData; i++) {
        printf("DATA[%d]=%lc", i, (wint_t)DATA[i]);
        DATA[i] = SUBALPHA[i];
        printf(" is now Replaced by %lc\n", (wint_t)DATA[i]);
    }
    printf("%ls", DATA);

    return 0;
}

Output:

DATA[0]=h is now Replaced by ᛠ
...
DATA[4]=o is now Replaced by ᛟ
ᛠᚣᚫᛞᛟ

Question continues below

(Note that it's solved, see Accepted answer!)

In Python3 it is easy to print runes:

    for i in range(5794, 5855):
        print(chr(i))

outputs

ᚢ
ᚣ
...
ᛝ
ᛞ

How do I do that in C?

Is there a way to print, e.g., ᛘᛙᛚᛛᛜᛝᛞ as individual characters?

When I try it, the compiler only prints warnings about the multi-character character constant 'ᛟ'.

I have tried storing them as an array of char and as a "string" (e.g. char s1 = "ᛟᛒᛓ";).

One example of how I thought of going about this:

Print a rune as an individual character, the same way one would print e.g. 'A'.

How do I do that (if possible), but with a rune?

I have also tried printing its numeric value as a char, which results in question marks and other "undefined" results.

I do not remember exactly everything I have tried so far, so I will do my best to describe the problem.

If someone spots a very easy (maybe even plainly obvious to them) solution, trick, or workaround, I would be super happy if you could point it out! Thanks!

This has bugged me for quite some time. It works in Python, and (as far as I know) it works in C if you print the literal directly rather than through a variable, e.g. printf("ᛟ");. As I said, though, I want to do the same thing through variables, like char runes[]="ᛋᛟ"; and then: printf("%c", runes[0]); // to get 'ᛋ' as the output

(Or similar: it does not need to be %c, and it does not need to be a char array or char variable.) I am just trying to understand how to do the above; hopefully this is not too unreadable.

I am on Linux, and using GCC.

External Links

Python3 Cyphers - At GitHub

Runes - At Unix&Linux SE

Junicode - At Sourceforge.io


Solution

  • Stored on the stack as a string of (wide) characters

    If you want to store your runes (wchar_t) in a string, you can proceed as follows:

    Assigning each character directly (wcsncpy would be overkill for a single character; thanks to chqrlie for noticing):

    #include <stdio.h>
    #include <locale.h>
    #include <wchar.h>

    #define UNICODE_BLOCK_START 0x16A0 // see wikipedia link for the start
    #define UUICODE_BLOCK_END   0x16F0 // last code point printed here

    int main(void) {
      setlocale(LC_ALL, "");
      // one slot per code point in the inclusive range, plus the terminating L'\0'
      wchar_t buffer[UUICODE_BLOCK_END - UNICODE_BLOCK_START + 2];

      int i = 0;
      for (wchar_t wc = UNICODE_BLOCK_START; wc <= UUICODE_BLOCK_END; wc++)
        buffer[i++] = wc;
      buffer[i] = L'\0';

      printf("%ls\n", buffer);
      return 0;
    }
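
The question asks for runes as individual characters rather than as one string. A minimal sketch of that, assuming a hypothetical helper named `print_each_rune` (not part of the original answer) and assuming the caller has already called `setlocale(LC_ALL, "")` under a UTF-8 locale:

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper (not from the original answer): print every wide
 * character of s on its own line and return how many were visited.
 * Assumes the caller already ran setlocale(LC_ALL, "") under a UTF-8
 * locale; without that, converting the runes for output may fail. */
size_t print_each_rune(const wchar_t *s)
{
    size_t n = wcslen(s);
    for (size_t i = 0; i < n; i++)
        printf("%lc\n", (wint_t)s[i]); /* %lc expects a wint_t argument */
    return n;
}
```

Called as `print_each_rune(L"ᛋᛟ")` after `setlocale`, this should print each rune on a separate line; indexing `s[i]` works because every element of a `wchar_t` array holds one whole character.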
    

    About Wide Chars (and Unicode)

    To better understand what a wide char is, think of it as a value that needs more bits than the 8 originally used for a character, which allow only 2^8 = 256 distinct values (or, written with a left shift, 1 << 8).

    Eight bits are enough when you only need to print what is on your keyboard, but they are not enough for Asian characters or other Unicode characters, which is why the Unicode standard was created. You can find more about the many different and exotic characters that exist, along with their ranges (named Unicode blocks), on Wikipedia; in your case, the Runic block.

    Range U+16A0..U+16FF - Runic (86 characters), Common (3 characters)

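    NB: Your code defines the end of the range as 0x16F1, slightly before 0x16FF. In current Unicode, the block has assigned code points up to U+16F8 (the 89 characters counted above); U+16F9 to U+16FF are unassigned.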
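
The paragraphs above explain that a rune does not fit in 8 bits. On a UTF-8 system, such as a typical Linux setup, every code point in the Runic block is stored as three bytes. A minimal sketch of that encoding, assuming a hypothetical helper named `utf8_encode` (it is not part of the original answer or of the C standard library):

```c
#include <stddef.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * out must have room for 5 bytes; the result is NUL-terminated.
 * Returns the number of bytes written (excluding the NUL), or 0 if cp
 * is not a valid code point (a surrogate, or above U+10FFFF). */
size_t utf8_encode(unsigned long cp, char out[5])
{
    size_t n;
    if (cp < 0x80) {                        /* 0xxxxxxx */
        out[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800) {                /* 110xxxxx 10xxxxxx */
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {              /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not encodable */
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFF) {            /* 11110xxx plus three 10xxxxxx */
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        return 0;
    }
    out[n] = '\0';
    return n;
}
```

For example, `utf8_encode(0x16DF, buf)` fills `buf` with the bytes E1 9B 9F, and `printf("%s\n", buf)` should then show ᛟ on a UTF-8 terminal without any call to `setlocale`, since the bytes are passed through unchanged.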

    You can use the following function to print your wide char as bits:

    void print_binary(unsigned int number)
    {
        char buffer[36]; // 32 bits, 3 spaces and one \0
        unsigned int mask = 1u << 31; // most significant bit (assumes a 32-bit unsigned int)
        int pos = 0;
        for (int i = 0; i < 32; i++) {
            if (i && !(i % 8))
                buffer[pos++] = ' '; // separate the bytes
            buffer[pos++] = '0' + !!(number & (mask >> i));
        }
        buffer[pos] = '\0';
        printf("%s\n", buffer);
    }
    

    You can call it in your loop with:

    print_binary((unsigned int)wc);
    

    It will give you a better understanding of how your wide char is represented at the machine level:

                   ᛞ
    00000000 00000000 00010110 11011110
    

    NB: Pay attention to detail: do not forget the final L'\0', and use %ls to print the string with printf.
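
The attempt from the question, `char runes[]="ᛋᛟ";` followed by `printf("%c", runes[0]);`, cannot work when the source file is UTF-8 (the usual case with GCC on Linux): each rune occupies three `char` elements, so `runes[0]` is only the first byte of ᛋ. As an alternative to `wchar_t`, here is a sketch that extracts whole characters from a UTF-8 `char` string. The helper names `utf8_seq_len` and `utf8_char_at` are hypothetical, and the validation is deliberately minimal.

```c
#include <string.h>

/* Length in bytes of the UTF-8 sequence that starts with this lead byte,
 * or 0 if the byte cannot start a sequence. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0; /* continuation byte or invalid lead byte */
}

/* Hypothetical helper: copy the index-th character (not byte) of the
 * UTF-8 string s into out, which must hold at least 5 bytes.
 * Returns the number of bytes copied, or 0 if the index is past the end
 * or the input is malformed. */
size_t utf8_char_at(const char *s, size_t index, char out[5])
{
    while (*s != '\0') {
        size_t len = utf8_seq_len((unsigned char)*s);
        if (len == 0)
            return 0;                   /* malformed input */
        for (size_t k = 1; k < len; k++)
            if (((unsigned char)s[k] & 0xC0) != 0x80)
                return 0;               /* truncated or malformed sequence */
        if (index == 0) {
            memcpy(out, s, len);
            out[len] = '\0';
            return len;
        }
        index--;
        s += len;
    }
    return 0;                           /* index past the end */
}
```

With `char out[5];`, a call such as `utf8_char_at(runes, 0, out)` followed by `printf("%s\n", out);` should print the first rune on a UTF-8 terminal. Note that it walks the string from the start on every call, so it suits short strings better than long ones.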