I'm having trouble implementing the block frequency test in Python to assess the randomness of a binary string. Could anyone help me understand why the code won't run?
Also, are there any statistical tests for the randomness of a binary string available in Python, or possibly MATLAB?

from importlib import import_module
import_module
from tokenize import Special
import math

def block_frequency(self, bin_data: str, block_size=4):
    """
    Note that this description is taken from the NIST documentation [1]
    [1] http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
    The focus of this tests is the proportion of ones within M-bit blocks. The purpose of this tests is to determine
    whether the frequency of ones in an M-bit block is approximately M/2, as would be expected under an assumption
    of randomness. For block size M=1, this test degenerates to the monobit frequency test.
    :param bin_data: a binary string
    :return: the p-value from the test
    :param block_size: the size of the blocks that the binary sequence is partitioned into
    """
    # Work out the number of blocks, discard the remainder
    (num_blocks)= math.floor((1010110001001011011010111110010000000011010110111000001101) /4)
    block_start, block_end = 0, 4
    # Keep track of the proportion of ones per block
    proportion_sum = 0.0
    for i in range(num_blocks):
        # Slice the binary string into a block
        block_data = (101010001001011011010111110010000000011010110111000001101)[block_start:block_end]
        # Keep track of the number of ones
        ones_count = 0
        for char in block_data:
            if char == '1':
                ones_count += 1
        pi = ones_count / 4
        proportion_sum += pow(pi - 0.5, 2.0)
        # Update the slice locations
        block_start += 4
        block_end += 4
    # Calculate the p-value
    chi_squared = 4.0 * 4 * proportion_sum
    p_val = Special.gammaincc(num_blocks / 2, chi_squared / 2)
    print(p_val)

There are three issues that I see with your code. Among them:

The binary data has to be a quoted string, not a bare integer literal, so that it can be sliced and measured with len where necessary (I also made some other minor changes).

The function you need is scipy.special.gammaincc, not tokenize.Special.gammaincc, which doesn't exist anyhow. NIST defines the p-value with igamc, the upper regularized incomplete gamma function, which SciPy exposes as gammaincc; gammainc is the lower one and would return 1 - p.

Putting it all together, try something like:

import math
from scipy.special import gammaincc

def block_frequency(bin_data: str, block_size=4):
    """
    Note that this description is taken from the NIST documentation [1]
    [1] http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
    The focus of this test is the proportion of ones within M-bit blocks. The purpose of this test is to determine
    whether the frequency of ones in an M-bit block is approximately M/2, as would be expected under an assumption
    of randomness. For block size M=1, this test degenerates to the monobit frequency test.
    :param bin_data: a binary string
    :param block_size: the size of the blocks that the binary sequence is partitioned into
    :return: the p-value from the test
    """
    # Work out the number of blocks, discard the remainder
    num_blocks = math.floor(len(bin_data) / block_size)
    block_start, block_end = 0, block_size
    # Keep track of the proportion of ones per block
    proportion_sum = 0.0
    for i in range(num_blocks):
        # Slice the binary string into a block
        block_data = bin_data[block_start:block_end]
        # Keep track of the number of ones
        ones_count = 0
        for char in block_data:
            if char == '1':
                ones_count += 1
        pi = ones_count / block_size
        proportion_sum += pow(pi - 0.5, 2.0)
        # Update the slice locations
        block_start += block_size
        block_end += block_size
    # Calculate the p-value
    chi_squared = 4.0 * block_size * proportion_sum
    p_val = gammaincc(num_blocks / 2, chi_squared / 2)
    return p_val

# The function has to be called for anything to happen
my_binary_string = '101010001001011011010111110010000000011010110111000001101'
print(block_frequency(my_binary_string))
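
On the second part of the question: the docstring mentions the monobit frequency test, which is the simplest test in the same NIST SP 800-22 suite and needs nothing outside the standard library. Here is a minimal sketch, assuming the input contains only '0' and '1' characters. The function name monobit_frequency is my own choice, not something from NIST or SciPy.

```python
import math

def monobit_frequency(bin_data: str) -> float:
    """Monobit frequency test: are ones and zeros roughly balanced?

    Returns the p-value; small values (NIST commonly uses a 0.01
    significance level) suggest the sequence is not random.
    """
    # Assumption: the input is a non-empty string of '0' and '1' only
    if not bin_data or set(bin_data) - {"0", "1"}:
        raise ValueError("bin_data must be a non-empty string of 0s and 1s")
    n = len(bin_data)
    # Map '1' -> +1 and '0' -> -1, then sum: S_n = ones - zeros
    ones = bin_data.count("1")
    s_n = ones - (n - ones)
    # Normalised test statistic
    s_obs = abs(s_n) / math.sqrt(n)
    # p-value from the complementary error function
    return math.erfc(s_obs / math.sqrt(2))

# Short sequence with 6 ones and 4 zeros; the worked example in
# NIST SP 800-22 reports a p-value of 0.527089 for it
print(round(monobit_frequency("1011010101"), 6))
```

A p-value this far above 0.01 means the test gives no reason to reject randomness for that sequence, although ten bits is far too short for the result to mean much in practice.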