python, encoding, binary

Is there a way to get a binary number array without converting it to a string first?


I am working on a problem where I need to manipulate binary data. Arrays would be the easiest way to do this, because binary string representations are not allowed. I can get the bits with the code below, or with bin(), but either way the numbers end up as strings. I need an array of type int or bool. How can this be done?

ascii = list("ABCD".encode('ascii'))
arr = list(map(lambda x: [*format(x, '08b')], ascii))

I tried bin() and format() to get the binary digits in Python, but both return strings.


Solution

  • List comprehension (suggested by @Mark in a comment):

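    codes = list("ABCD".encode('ascii'))  # [65, 66, 67, 68]
    # For each byte, shift bit i (most significant first) into the lowest position and mask it.
    res = [[(byte >> (7 - i)) & 1 for i in range(8)] for byte in codes]
    print(res)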
    

    Prints

    [[0, 1, 0, 0, 0, 0, 0, 1], 
    [0, 1, 0, 0, 0, 0, 1, 0], 
    [0, 1, 0, 0, 0, 0, 1, 1], 
    [0, 1, 0, 0, 0, 1, 0, 0]]
    

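  • Explicit loop, converted to a NumPy boolean array:

    import numpy as np

    codes = list("ABCD".encode('ascii'))  # [65, 66, 67, 68]
    res = []

    for byte in codes:
        for i in range(8):
            # Extract bit i, most significant bit first.
            bit = (byte >> (7 - i)) & 1
            res.append(bit)

    # Flat array of 32 booleans (8 per character).
    print(np.array(res, dtype=bool))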
    

    Prints

    [False  True False False False False False  True False  True False False
     False False  True False False  True False False False False  True  True
     False  True False False False  True False False]

    Comments

    Isn't Numpy kind of overkill just to convert 1 and 0 to True and False? Why not: res.append(bool(bit)) or even just leave them as ints? by @Mark
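
The variant suggested in this comment can be sketched in plain Python, without NumPy. This is only an illustration of the idea, not the answerer's code; the variable names `data` and `bits` are chosen here for the example.

```python
# Plain-Python sketch of the comment's suggestion (no NumPy).
data = "ABCD".encode("ascii")  # bytes; iterating over it yields ints 0..255

# Flat list of bools, most significant bit first, 8 entries per character.
bits = [bool((byte >> (7 - i)) & 1) for byte in data for i in range(8)]

print(bits[:8])  # the eight bits of 'A' (65 = 0b01000001)
# [False, True, False, False, False, False, False, True]
```

Dropping the `bool(...)` call leaves the bits as the ints 0 and 1, which is often enough for further bitwise work.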

    Naive benchmark

    
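    import time
    import numpy

    data = list(range(100000000))  # 100 million integers (memory-heavy)
    arr = numpy.array(data)

    # Time doubling every element with a list comprehension.
    start = time.perf_counter()
    list_mult = [x * 2 for x in data]
    end = time.perf_counter()
    print(f"List: {end - start} seconds")

    # Time the same operation as a single vectorized NumPy expression.
    start = time.perf_counter()
    arr_mult = arr * 2
    end = time.perf_counter()
    print(f"Array: {end - start} seconds")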
    

    Prints

    List: 12.309393882751465 seconds 
    Array: 2.8275811672210693 seconds
    

    Note

    (byte >> (7 - i)) & 1, traced for byte = 65 ('A', 01000001):

    i = 0: (65 >> (7 - 0)) & 1 → (65 >> 7) & 1 → 00000000 & 1 → 0
    i = 1: (65 >> (7 - 1)) & 1 → (65 >> 6) & 1 → 00000001 & 1 → 1
    i = 2: (65 >> (7 - 2)) & 1 → (65 >> 5) & 1 → 00000010 & 1 → 0
    i = 3: (65 >> (7 - 3)) & 1 → (65 >> 4) & 1 → 00000100 & 1 → 0
    i = 4: (65 >> (7 - 4)) & 1 → (65 >> 3) & 1 → 00001000 & 1 → 0
    i = 5: (65 >> (7 - 5)) & 1 → (65 >> 2) & 1 → 00010000 & 1 → 0
    i = 6: (65 >> (7 - 6)) & 1 → (65 >> 1) & 1 → 00100000 & 1 → 0
    i = 7: (65 >> (7 - 7)) & 1 → (65 >> 0) & 1 → 01000001 & 1 → 1
    
    [False  True False False False False False  True  # 'A' -> 01000001
     False  True False False False False  True False  # 'B' -> 01000010
     False  True False False False False  True  True  # 'C' -> 01000011
     False  True False False False  True False False] # 'D' -> 01000100
    

    Note that a bitwise & with 1 keeps only the lowest bit of the shifted value and clears all the others, so the result is that bit itself:

    0 & 1 → 0
    1 & 1 → 1
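
The shift-and-mask steps traced above can be checked with a short sketch. The helper names `byte_to_bits` and `bits_to_byte` are hypothetical, introduced only for this example; the round trip back to an integer is included as a sanity check, assuming every value fits in 8 bits.

```python
def byte_to_bits(byte):
    """Return the 8 bits of an int in 0..255, most significant bit first."""
    return [(byte >> (7 - i)) & 1 for i in range(8)]


def bits_to_byte(bits):
    """Rebuild the integer from a list of bits, most significant bit first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit  # make room on the right, then add the bit
    return value


# Reproduce the trace for 'A' (65): shift right, then mask with 1.
for i in range(8):
    shifted = 65 >> (7 - i)
    print(f"i = {i}: {shifted:08b} & 1 -> {shifted & 1}")

a_bits = byte_to_bits(65)
print(a_bits)                # [0, 1, 0, 0, 0, 0, 0, 1]
print(bits_to_byte(a_bits))  # 65
```

Because both helpers work on plain ints, no binary string is created at any point.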