Issue
What is the random_state parameter in shuffle in sklearn.utils? Can anyone explain random_state with an example?
Solution
The shuffle function is used to shuffle your arrays or matrices randomly. Programmatically, random sequences are generated from a seed number, and you are guaranteed to get the same random sequence if you use the same seed. The random_state parameter lets you provide this seed to sklearn methods. This is useful because it makes the randomness reproducible for development and testing. So, in the shuffle function, if I use the same random_state with the same dataset, I am guaranteed to get the same shuffle every time. Consider the following example:
import numpy as np
from sklearn.utils import shuffle
X = np.array([[1., 0.], [2., 1.], [0., 0.]])
X = shuffle(X, random_state=20)
Suppose this gives me the following output:
array([[ 0., 0.],
       [ 2., 1.],
       [ 1., 0.]])
Then I am guaranteed that whenever I use random_state=20 on the same data, I will get exactly the same shuffling. This is particularly useful for unit tests, where you want reproducible results so that you can assert on the conditions being tested.
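As a quick check of the reproducibility claim, here is a minimal sketch that shuffles the same array twice with the same seed and compares the results. The variable names first and second are just for illustration, and the seed 20 is the one from the example above; any fixed integer should behave the same way.

```python
import numpy as np
from sklearn.utils import shuffle

X = np.array([[1., 0.], [2., 1.], [0., 0.]])

# shuffle returns a shuffled copy, so X itself is left untouched
# and both calls start from the same input.
first = shuffle(X, random_state=20)
second = shuffle(X, random_state=20)

# Same seed, same data: the row order is identical.
assert np.array_equal(first, second)

# Shuffling only reorders rows; the rows themselves are unchanged.
assert sorted(map(tuple, first)) == sorted(map(tuple, X))
```

In a unit test you would typically fix random_state like this so that assertions on the shuffled data do not fail intermittently.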
Hope that helps!
Answered By - Abhinav Arora