Issue
I want to replace n randomly selected column values with zeros in each of m randomly selected rows, in order to add noise to the dataset. That is, with n = 3 and m = 5, 3 randomly selected columns are set to zero in each of 5 randomly selected rows.
For example, with n = 3 (columns) and m = 5 (rows), given the input
array([[10, 6, 1, 4, 8, 11, 12],
[3, 2, 6, 7, 6, 2, 3],
[1, 3, 2, 1, 10, 4, 9],
[8, 1, 2, 4, 11, 12, 13],
[3, 9, 5, 3, 4, 14, 4]])
one possible output would be
array([[10, 6, **0**, **0**, **0**, 11, 12],
[**0**, 2, **0**, 7, **0**, 2, 3],
[1, 3, 2, **0**, 10, **0**, **0**],
[8, 1, 2, 4, **0**, **0**, **0**],
[3, 9, **0**, 3, **0**, 14, **0**]])
And with n = 1 (column) and m = 2 (rows), given the same input
array([[10, 6, 1, 4, 8, 11, 12],
[3, 2, 6, 7, 6, 2, 3],
[1, 3, 2, 1, 10, 4, 9],
[8, 1, 2, 4, 11, 12, 13],
[3, 9, 5, 3, 4, 14, 4]])
one possible output would be
array([[10, **0**, 1, 4, 8, 11, 12],
[3, 2, 6, 7, 6, 2, 3],
[1, 3, 2, 1, **0**, 4, 9],
[8, 1, 2, 4, 11, 12, 13],
[3, 9, 5, 3, 4, 14, 4]])
Thanks in advance to anyone who can help.
Solution
For a general answer about adding noise to your data, please refer to this SO answer: adding-noise-to-a-signal-in-python.
First, create a small example array (no seed is set, so the values will differ from run to run):
import numpy as np
n, m, high = 5, 7, 5
a = np.random.randint(low=0, high=high, size=n*m)
b = a.reshape(n, m).copy()
b
# array([[3, 0, 3, 1, 0, 3, 0],
# [2, 3, 3, 3, 2, 0, 3],
# [0, 2, 1, 4, 1, 4, 3],
# [0, 4, 2, 3, 0, 1, 4],
# [4, 4, 0, 2, 3, 4, 0]])
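Then, to overwrite a whole randomly chosen row and a whole randomly chosen column (here with -1, so the change is easy to spot), use:
n_rand = np.random.randint(n)  # random row index in [0, n)
m_rand = np.random.randint(m)  # random column index in [0, m)
b[n_rand,:] = -1  # overwrite the entire selected row
b[:,m_rand] = -1  # overwrite the entire selected column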
b
# array([[ 3, 0, 3, -1, 0, 3, 0],
# [ 2, 3, 3, -1, 2, 0, 3],
# [ 0, 2, 1, -1, 1, 4, 3],
# [ 0, 4, 2, -1, 0, 1, 4],
# [-1, -1, -1, -1, -1, -1, -1]])
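The snippet above overwrites one entire row and one entire column. For the case described in the question (n zeros in each of m rows), a minimal sketch could look like the following. It assumes each selected row gets its own independent set of n distinct columns, which is how the example outputs in the question read. The function name `zero_random_entries` and its parameters are made up for illustration.

```python
import numpy as np

def zero_random_entries(arr, n_cols, n_rows, rng=None):
    """Return a copy of `arr` with `n_cols` entries zeroed in each of `n_rows` random rows.

    Hypothetical helper: assumes every selected row draws its own set of
    distinct columns, independently of the other rows.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = arr.copy()
    # Pick the rows to corrupt, without repetition.
    rows = rng.choice(arr.shape[0], size=n_rows, replace=False)
    # Sorting a matrix of random numbers row by row yields one random
    # permutation of the column indices per row; keep the first n_cols.
    cols = rng.random((n_rows, arr.shape[1])).argsort(axis=1)[:, :n_cols]
    # rows[:, None] broadcasts against cols to address (row, column) pairs.
    out[rows[:, None], cols] = 0
    return out

data = np.array([[10, 6, 1, 4, 8, 11, 12],
                 [3, 2, 6, 7, 6, 2, 3],
                 [1, 3, 2, 1, 10, 4, 9],
                 [8, 1, 2, 4, 11, 12, 13],
                 [3, 9, 5, 3, 4, 14, 4]])

# n = 3 columns in m = 5 rows; the seed is arbitrary and only fixes the draw.
noisy = zero_random_entries(data, n_cols=3, n_rows=5, rng=np.random.default_rng(0))
print(noisy)
```

Calling it with `n_cols=1, n_rows=2` corresponds to the second example in the question. If the same n columns should be zeroed in every selected row instead, drawing the columns once with `rng.choice(arr.shape[1], size=n_cols, replace=False)` and indexing with `np.ix_(rows, cols)` would be one way to do it.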
More generally, to add noise to a signal, and assuming that rounding normally distributed values makes sense in your context, you could do:
noise = np.random.randn(n*m).round().reshape(n, m)
c = a.reshape(n, m)
print("noise :\n", noise)
print("\nstart matrix:\n", c ,"\n")
np.add(c, noise)
# noise :
# [[ 0. 1. 0. 0. -0. 0. 0.]
# [-1. 0. -0. 1. 1. 0. 0.]
# [-1. -2. -0. 0. 2. -1. 1.]
# [-0. -0. 0. 0. 0. 2. 1.]
# [-1. -1. -1. 0. -0. 0. -1.]]
# start matrix:
# [[3 0 3 1 0 3 0]
# [2 3 3 3 2 0 3]
# [0 2 1 4 1 4 3]
# [0 4 2 3 0 1 4]
# [4 4 0 2 3 4 0]]
# array([[ 3., 1., 3., 1., 0., 3., 0.],
# [ 1., 3., 3., 4., 3., 0., 3.],
# [-1., 0., 1., 4., 3., 3., 4.],
# [ 0., 4., 2., 3., 0., 3., 5.],
# [ 3., 3., -1., 2., 3., 4., -1.]])
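Note that the result above is a float array, because the rounded noise is still floating point. As a hedged variant, the sketch below uses NumPy's `Generator` API with a fixed seed so the run can be repeated, and casts the rounded noise back to the integer dtype of the data. The seed value, the `scale` of 1.0, and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, only there to make reruns identical

# Integer "signal" with the same shape and value range as the example above.
signal = rng.integers(low=0, high=5, size=(5, 7))

# Standard normal noise, rounded to whole numbers and cast to the signal's dtype
# so that the sum stays an integer array.
noise = rng.normal(loc=0.0, scale=1.0, size=signal.shape).round().astype(signal.dtype)

noisy_signal = signal + noise
print(noisy_signal)
```

Increasing `scale` widens the noise; whether rounded Gaussian noise is appropriate still depends on what the data represents.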
Answered By - cbo