Thursday, October 6, 2022

[FIXED] How to randomly replace n number of columns and m number of rows with zero value from a 2d numpy array using Python


Issue

I want to set n randomly selected columns to zero in each of m randomly selected rows, in order to add noise to a dataset. So with n = 3 and m = 5, five randomly selected rows would each have three randomly selected values replaced with zero.

For example, with n = 3 (columns) and m = 5 (rows), given

array([[10, 6, 1, 4, 8, 11, 12],
       [3, 2, 6, 7, 6, 2, 3],
       [1, 3, 2, 1, 10, 4, 9],
       [8, 1, 2, 4, 11, 12, 13],
       [3, 9, 5, 3, 4, 14, 4]])

one possible output would be

array([[10, 6, **0**, **0**, **0**, 11, 12],
       [**0**, 2, **0**, 7, **0**, 2, 3],
       [1, 3, 2, **0**, 10, **0**, **0**],
       [8, 1, 2, 4, **0**, **0**, **0**],
       [3, 9, **0**, 3, **0**, 14, **0**]])

And with n = 1 (columns) and m = 2 (rows), given

array([[10, 6, 1, 4, 8, 11, 12],
       [3, 2, 6, 7, 6, 2, 3],
       [1, 3, 2, 1, 10, 4, 9],
       [8, 1, 2, 4, 11, 12, 13],
       [3, 9, 5, 3, 4, 14, 4]])

one possible output would be

array([[10, **0**, 1, 4, 8, 11, 12],
       [3, 2, 6, 7, 6, 2, 3],
       [1, 3, 2, 1, **0**, 4, 9],
       [8, 1, 2, 4, 11, 12, 13],
       [3, 9, 5, 3, 4, 14, 4]])

Thanks in advance if anyone can help

Solution

For a general answer about adding noise to your data, please refer to this SO answer: adding-noise-to-a-signal-in-python.

First, create a small example array (here n and m denote the array's row and column counts, not the counts from the question):

import numpy as np

n, m, high = 5, 7, 5
a = np.random.randint(low=0, high=high, size=n*m)
b = a.reshape(n, m).copy()
b

# array([[3, 0, 3, 1, 0, 3, 0],
#        [2, 3, 3, 3, 2, 0, 3],
#        [0, 2, 1, 4, 1, 4, 3],
#        [0, 4, 2, 3, 0, 1, 4],
#        [4, 4, 0, 2, 3, 4, 0]])
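
The snippet above draws different values on every run, so the printed array will differ from yours. A minimal sketch of a seeded variant, assuming NumPy's Generator API is available (NumPy 1.17 or later); the seed value 42 and the names rng, n_rows and n_cols are illustrative choices, not part of the original answer:

```python
import numpy as np

# Seeded generator: the same seed gives the same array on every run.
# The seed value itself is arbitrary.
rng = np.random.default_rng(seed=42)

n_rows, n_cols, high = 5, 7, 5

# Integers in [0, high), drawn directly in the target shape
b = rng.integers(low=0, high=high, size=(n_rows, n_cols))
print(b)
```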

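Then, to overwrite one randomly chosen row and one randomly chosen column (with -1 here, so the change is easy to see), use:

# Pick one random row index and one random column index
n_rand = np.random.randint(n)
m_rand = np.random.randint(m)

# Overwrite the whole row and the whole column
b[n_rand, :] = -1
b[:, m_rand] = -1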
b

# array([[ 3,  0,  3, -1,  0,  3,  0],
#        [ 2,  3,  3, -1,  2,  0,  3],
#        [ 0,  2,  1, -1,  1,  4,  3],
#        [ 0,  4,  2, -1,  0,  1,  4],
#        [-1, -1, -1, -1, -1, -1, -1]])
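
The snippet above overwrites one whole row and one whole column. For the behaviour described in the question (n random values zeroed in each of m random rows), one possible sketch follows. The helper name zero_random_entries and its parameters are hypothetical, and it assumes that the m rows must be distinct and that the n columns are drawn independently for each selected row:

```python
import numpy as np

def zero_random_entries(arr, n_cols, m_rows, rng=None):
    """Return a copy of arr with n_cols random entries zeroed
    in each of m_rows randomly selected rows."""
    rng = np.random.default_rng() if rng is None else rng
    out = arr.copy()
    # Pick m_rows distinct row indices
    rows = rng.choice(out.shape[0], size=m_rows, replace=False)
    for r in rows:
        # Pick n_cols distinct column indices for this row
        cols = rng.choice(out.shape[1], size=n_cols, replace=False)
        out[r, cols] = 0
    return out

data = np.array([[10, 6, 1, 4, 8, 11, 12],
                 [3, 2, 6, 7, 6, 2, 3],
                 [1, 3, 2, 1, 10, 4, 9],
                 [8, 1, 2, 4, 11, 12, 13],
                 [3, 9, 5, 3, 4, 14, 4]])

# n = 3 columns in each of m = 5 rows; the seed is arbitrary
noisy = zero_random_entries(data, n_cols=3, m_rows=5,
                            rng=np.random.default_rng(0))
print(noisy)
```

Sampling with replace=False is what guarantees exactly n zeroed positions per selected row; the input array is left unchanged because the function works on a copy.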

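More generally, to add noise to a signal, and assuming that rounding normally distributed values makes sense in your context, you could do:

# Standard normal noise, rounded to whole numbers, with the same shape as the data
noise = np.random.randn(n*m).round().reshape(n, m)
c = a.reshape(n, m)
print("noise :\n", noise)
print("\nstart matrix:\n", c, "\n")
# The result is a float array because the noise is float
np.add(c, noise)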

# noise :
#  [[ 0.  1.  0.  0. -0.  0.  0.]
#  [-1.  0. -0.  1.  1.  0.  0.]
#  [-1. -2. -0.  0.  2. -1.  1.]
#  [-0. -0.  0.  0.  0.  2.  1.]
#  [-1. -1. -1.  0. -0.  0. -1.]]

# start matrix:
#  [[3 0 3 1 0 3 0]
#  [2 3 3 3 2 0 3]
#  [0 2 1 4 1 4 3]
#  [0 4 2 3 0 1 4]
#  [4 4 0 2 3 4 0]] 

# array([[ 3.,  1.,  3.,  1.,  0.,  3.,  0.],
#        [ 1.,  3.,  3.,  4.,  3.,  0.,  3.],
#        [-1.,  0.,  1.,  4.,  3.,  3.,  4.],
#        [ 0.,  4.,  2.,  3.,  0.,  3.,  5.],
#        [ 3.,  3., -1.,  2.,  3.,  4., -1.]])
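
Note that the result above is a float array, because the rounded noise is still float. If the noisy data should keep the integer dtype of the input, a sketch along these lines may work; the standard deviation of 1.0, the seed, and the variable names are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

# Example integer signal with the same shape as in the answer
signal = rng.integers(0, 5, size=(5, 7))

# Gaussian noise, rounded and cast to the signal's integer dtype
noise = rng.normal(loc=0.0, scale=1.0, size=signal.shape)
noise = noise.round().astype(signal.dtype)

noisy = signal + noise
print(noisy)
```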

Answered By - cbo

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
