In my deep learning exercise I had to initialize a parameter `D1` with the same shape as `A1`, so what I did was:
D1 = np.random.randn(A1.shape[0],A1.shape[1])
But when I checked the results after computing the further equations, they didn't match. Then, after properly reading the docs, I discovered that they say to initialize `D1` using `rand()` instead of `randn()`:
D1 = np.random.rand(A1.shape[0],A1.shape[1])
But they didn't specify the reason, as the code works in both cases. There was also a doc for that exercise, so I figured out the error. But how, when, and why should I choose between these two?
The difference between `rand` and `randn` is (besides the letter `n`) that `rand` returns random numbers sampled from a uniform distribution over the interval [0, 1), while `randn` instead samples from a normal (a.k.a. Gaussian) distribution with a mean of 0 and a variance of 1.
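You can see the difference empirically with a quick sanity check (a minimal sketch; the seed and sample size here are arbitrary choices, not part of the exercise):

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, just for reproducibility

u = np.random.rand(100000)   # uniform over [0, 1)
n = np.random.randn(100000)  # standard normal: mean 0, variance 1

# rand never leaves [0, 1); randn is centered on 0 with unit spread
print(u.min(), u.max())   # both strictly within [0, 1)
print(n.mean(), n.std())  # close to 0 and 1 respectively
print(n.min(), n.max())   # well outside [0, 1) in both directions
```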
In other words, the distribution of the random numbers produced by `rand` looks like this:

*(plot: histogram of uniform samples, flat over [0, 1))*
In a uniform distribution, all the random values are restricted to a specific interval, and are evenly distributed over that interval. If you generate, say, 10000 random numbers with `rand`, you'll find that about 1000 of them will be between 0 and 0.1, around 1000 will be between 0.1 and 0.2, around 1000 will be between 0.2 and 0.3, and so on. And all of them will be between 0 and 1 — you won't ever get any outside that range.
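That claim is easy to verify with a histogram (an illustrative sketch; the seed is arbitrary):

```python
import numpy as np

np.random.seed(0)  # arbitrary seed for reproducibility
samples = np.random.rand(10000)  # 10000 uniform draws on [0, 1)

# Count how many samples fall into each tenth of the interval
counts, _ = np.histogram(samples, bins=10, range=(0.0, 1.0))
print(counts)  # each of the 10 bins holds roughly 1000 samples

# No sample ever falls outside [0, 1)
print(samples.min() >= 0.0, samples.max() < 1.0)
```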
Meanwhile, the distribution for `randn` looks like this:

*(plot: histogram of standard normal samples, bell-shaped and centered on 0)*
The first obvious difference between the uniform and the normal distributions is that the normal distribution has no upper or lower limits — if you generate enough random numbers with `randn`, you'll eventually get one that's as big or as small as you like (well, subject to the limitations of the floating point format used to store the numbers, anyway). But most of the numbers you'll get will still be fairly close to zero, because the normal distribution is not flat: the output of `randn` is a lot more likely to fall between, say, 0 and 0.1 than between 0.9 and 1, whereas for `rand` both of these are equally likely. In fact, as the picture shows, about 68% of all `randn` outputs fall between -1 and +1, while 95% fall between -2 and +2, and about 99.7% fall between -3 and +3.
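The 68/95/99.7 figures can be checked empirically too (again an illustrative sketch with an arbitrary seed):

```python
import numpy as np

np.random.seed(0)  # arbitrary seed for reproducibility
samples = np.random.randn(100000)  # standard normal draws

# Fraction of samples within k standard deviations of the mean
for k in (1, 2, 3):
    frac = np.mean(np.abs(samples) <= k)
    print(f"within ±{k}: {frac:.3f}")
# roughly 0.683, 0.954, and 0.997 respectively
```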
These are completely different probability distributions. If you swap one for the other, things are almost certainly going to break: even if the code doesn't crash, you're very likely to get incorrect and/or nonsensical results.