I always thought machine learning results differ between runs because the data is shuffled randomly upfront, which leads to different training sets, and that without shuffling the results should therefore be identical every time. That is the case with sklearn.linear_model.LinearRegression(), but sklearn.linear_model.RANSACRegressor() shows different results even though it is fed the same training data in the same order every time. Isn't it just a mathematical function, and shouldn't the results be the same every time? Can someone explain this, or do I have a mistake in my code and am mistakenly feeding it different data?
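According to the documentation, RANSAC fits the model on randomly selected subsets of the data, so the result depends on the random number generator even when the input data and its order are identical.
The random_state parameter points to this: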
random_state : int, RandomState instance or None, optional, default None
The generator used to initialize the centers. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
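To make the effect of random_state concrete, here is a minimal sketch. The synthetic data, the helper name ransac_slope, and the seed value 42 are my own assumptions for illustration, not taken from your code. It fits RANSACRegressor twice with the same integer seed and compares the fitted slopes, and does the same for LinearRegression, which has no random component.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

# Synthetic data (an assumption for illustration): a noisy line plus a few outliers.
data_rng = np.random.RandomState(0)
X = data_rng.uniform(-5, 5, size=(200, 1))
y = 3.0 * X.ravel() + 1.0 + data_rng.normal(scale=0.5, size=200)
y[:20] += 30.0  # shift the first 20 targets to act as outliers


def ransac_slope(seed):
    """Fit RANSAC on the same data and return the slope of the final estimator."""
    model = RANSACRegressor(random_state=seed)
    model.fit(X, y)
    return model.estimator_.coef_[0]


# Same data, same seed: the random subsets are drawn identically on each run.
slope_a = ransac_slope(42)
slope_b = ransac_slope(42)
print("RANSAC, seed 42 twice:", slope_a, slope_b)

# With random_state=None the subsets come from NumPy's global random state,
# so two fits may (but need not) give slightly different slopes.
print("RANSAC, unseeded:", ransac_slope(None), ransac_slope(None))

# LinearRegression solves ordinary least squares directly, with no sampling.
lin_a = LinearRegression().fit(X, y).coef_
lin_b = LinearRegression().fit(X, y).coef_
print("LinearRegression twice:", lin_a, lin_b)
```

So if you need reproducible results, passing a fixed integer as random_state should be enough; your data pipeline is probably not the cause.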