I am trying to create my own C function btoi(char *str, int base) that can take any base from 2 to 64. However, after reading a bit, I realise I might be opening a big can of worms.
I'm saying this because the binary, octal, decimal, hexadecimal, base-32 and base-64 alphabets are either universal or well defined in RFC 4648.
However, despite my initial assumption that anything up to base-62 would be a continuation of the 0-9 + A-Z + a-z alphabet, reading section 7 of RFC 4648 took me aback, as "regular" base-32 is A-Z + 2-7.
To complicate things further we also have padding as a problem.
My question is: Is there a standardized way to convert a string to an int in any base (up to 64)?
Or is it however I want to implement it?
You're misunderstanding what RFC4648 is for.
It's not dictating which characters should be used for a number in bases 16, 32, and 64. It's showing three different ways to encode binary data in ASCII text.
In the case of base64, it takes 3 8-bit values, treats them as 4 6-bit values, then outputs ASCII characters. Below is an example of this from the RFC:
Input data: 0x14fb9c03d97e
Hex: 1 4 f b 9 c | 0 3 d 9 7 e
8-bit: 00010100 11111011 10011100 | 00000011 11011001 01111110
6-bit: 000101 001111 101110 011100 | 000000 111101 100101 111110
Decimal: 5 15 46 28 0 61 37 62
Output: F P u c A 9 l +
The above shows how the byte values 0x14 0xfb 0x9c 0x03 0xd9 0x7e are converted to the ASCII string FPucA9l+.
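To make the regrouping concrete, here is a minimal sketch of that 3-bytes-to-4-characters step in C. The function name b64_encode_groups is made up for illustration, and it assumes the input length is a multiple of 3 so that padding never comes up; a real encoder would also have to handle the "=" padding cases.

```c
#include <assert.h>
#include <stddef.h>

/* Standard base64 alphabet from RFC 4648, section 4. */
static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode len bytes, where len must be a multiple
 * of 3 (so no padding is needed). out must have room for
 * len / 3 * 4 + 1 chars, including the terminating '\0'. */
void b64_encode_groups(const unsigned char *in, size_t len, char *out)
{
    assert(len % 3 == 0);

    size_t o = 0;
    for (size_t i = 0; i < len; i += 3) {
        /* Pack three 8-bit values into one 24-bit group. */
        unsigned long v = ((unsigned long)in[i] << 16)
                        | ((unsigned long)in[i + 1] << 8)
                        |  (unsigned long)in[i + 2];

        /* Read the same 24 bits back as four 6-bit values. */
        out[o++] = B64[(v >> 18) & 0x3F];
        out[o++] = B64[(v >> 12) & 0x3F];
        out[o++] = B64[(v >> 6) & 0x3F];
        out[o++] = B64[v & 0x3F];
    }
    out[o] = '\0';
}
```

Feeding it the six bytes 0x14 0xfb 0x9c 0x03 0xd9 0x7e with a 9-byte output buffer should reproduce the string from the RFC example. Note that nothing here interprets the input as a single number; it only re-slices bits.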
As far as what alphabet is considered standard for numbers in bases 2-36, the most common is 0-9 for the values 0-9 and both a-z and A-Z for values 10-35 (i.e. case insensitive).
The standard library function strtol already exists and will do this for you.
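As a rough sketch of how you might wrap it, the helper below (parse_long is a name I made up, not a library function) parses a whole string in a given base and reports failure instead of silently stopping at the first bad character. It assumes base is 0 or in the range 2-36, which is what strtol supports.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse all of str as a long in the given base.
 * Returns true and stores the value in *out on success; returns false
 * if there are no digits, trailing characters, or the value overflows. */
bool parse_long(const char *str, int base, long *out)
{
    /* strtol only accepts base 0 (auto-detect) or 2 through 36. */
    assert(base == 0 || (base >= 2 && base <= 36));

    char *end;
    errno = 0;
    long v = strtol(str, &end, base);

    if (end == str)      return false; /* no digits were consumed */
    if (*end != '\0')    return false; /* junk after the number */
    if (errno == ERANGE) return false; /* out of range for long */

    *out = v;
    return true;
}
```

With this, "ff" and "FF" in base 16 both give 255, "zz" in base 36 gives 1295, and "12" in base 2 is rejected because '2' is not a binary digit. For anything above base 36 there is no alphabet that strtol knows about, so you would have to pick your own.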