I am writing code to generate an NxN matrix of 0s and 1s in xtensor, where the probability of an entry being 1 is 1/N. Additionally, I want to discard all values on the diagonal and above. Finally, I want to find the indices of all the 1s. Hence, I am using the following code:
auto binomial = xt::tril(
    xt::random::binomial(
        xt::shape<uint32_t>({N, N}), 1, 1/N
    ),
    1
);
std::vector<std::array<unsigned int, 2>> vals = xt::argwhere(binomial);
The expected size of vals here should be N. This holds when I try N=100, N=1000, and N=10000, but not when I try N=100000. Are there any limitations to this approach that I am not aware of?
It appears that you have to worry about types here. Internally, xtensor has to be able to represent indices up to N * N. For your N = 1e5 that means 1e10, while the maximum value of a uint32_t is only about 4.3e9, so the index computation overflows.
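You can see the wrap-around in isolation, without any xtensor involved (a minimal sketch; the printed value is just 1e10 modulo 2^32):

#include <cstdint>
#include <iostream>

int main()
{
    std::uint32_t n = 100000;
    // unsigned overflow is well-defined: the product wraps modulo 2^32
    std::uint32_t total32 = n * n;
    // widening one operand first preserves the full product
    std::uint64_t total64 = std::uint64_t(n) * n;
    std::cout << total32 << "\n";  // prints 1410065408, i.e. 1e10 mod 2^32
    std::cout << total64 << "\n";  // prints 10000000000
}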
What does seem to work is the following:
// size_t is wide enough to index all N * N = 1e10 elements
size_t N = 100000;
auto binomial = xt::tril(
    xt::random::binomial(
        // the shape's value type seems to set the type used internally for sizes;
        // the (double) cast avoids the integer division 1/N == 0
        xt::shape<size_t>({N, N}), 1, 1/(double)N
    ),
    1
);
auto vals = xt::argwhere(binomial);
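For completeness, a self-contained version of the above might look as follows (a sketch; header locations may differ between xtensor versions):

#include <iostream>
#include <xtensor/xbuilder.hpp>   // xt::tril
#include <xtensor/xoperation.hpp> // xt::argwhere
#include <xtensor/xrandom.hpp>    // xt::random::binomial

int main()
{
    size_t N = 100000;
    auto binomial = xt::tril(
        xt::random::binomial(
            xt::shape<size_t>({N, N}), 1, 1/(double)N
        ),
        1
    );
    auto vals = xt::argwhere(binomial);
    // roughly half of the N * N entries survive tril, each being 1 with
    // probability 1/N, so this should print a count on the order of N
    std::cout << vals.size() << std::endl;
}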
I wonder if this should be considered a bug, though. Looking at the API, I would expect that xt::shape<uint32_t>({N, N}) only requires a type that can hold the maximum N that you want. Instead, it seems that xt::shape<T> sets T as the type used for internal sizes, so T has to be able to hold N^d, the total number of elements of a d-dimensional array. I wonder whether it is reasonable that you can influence internal types in this way. Please consider opening a bug report (with a link to this question) to resolve this.
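To make the suspected behavior concrete (a sketch based on the observation above, not on any documented guarantee):

uint32_t N = 100000;
// the shape's value type seems to become the type for internal size arithmetic:
auto s32 = xt::shape<uint32_t>({N, N}); // N * N = 1e10 wraps modulo 2^32 internally
auto s64 = xt::shape<size_t>({N, N});   // N * N fits comfortably in 64 bits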