Issue
I'm currently trying to optimize a function in Python.
Is there a clever way to use NumPy to get the result in less time?
I have data of shape (30000,18000) and a window_size of 2.
import time
import numpy as np

def denoise(data, window_size, axis=0):
    output = np.zeros((data.shape[axis], data.shape[axis+1] // window_size))
    i = 0
    for k in range(0, data.shape[axis + 1] - window_size, window_size):
        output[:,i] = np.sum(data[:,k:k+window_size],axis=axis+1)/window_size
        i = i+1
    return output

data = np.random.random((30000,18000))
start = time.time()
output = denoise(data,2)
print(f'Elapsed time {time.time()-start} s')
Solution
You can make a small but significant improvement to your current method.
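def denoise(data, window_size):
    # One strided slice per offset k picks the k-th element of every window.
    n = data.shape[1]
    output = sum(data[:, k:n:window_size] for k in range(window_size)) / window_size
    return output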
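That replaces the Python loop over every window with window_size strided slices, so NumPy does the additions on whole arrays at once. It assumes the number of columns is a multiple of window_size; otherwise the slices have different lengths and cannot be added. The time goes from 25s to less than 2s for me.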
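To see why the strided slices give the same result as averaging each window, here is a minimal sketch on a tiny array. The 2x6 array and window_size of 3 are made up for illustration and are not from the question.

```python
import numpy as np

window_size = 3
data = np.arange(12.0).reshape(2, 6)   # rows: [0..5] and [6..11]

# Slice k holds the k-th element of every window: columns k, k+3, ...
parts = [data[:, k::window_size] for k in range(window_size)]
strided_mean = sum(parts) / window_size

# Reference: average each window explicitly, one window at a time.
loop_mean = np.stack(
    [data[:, k:k + window_size].mean(axis=1)
     for k in range(0, data.shape[1], window_size)],
    axis=1,
)

# Both give 1 and 4 for the first row, 7 and 10 for the second.
print(np.allclose(strided_mean, loop_mean))
```

Each element of a window lands in exactly one of the slices, so adding the slices adds up every window in parallel.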
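Note the axis argument doesn't work in your original example: the slicing is hard-coded to columns, and for a 2-D array axis=1 raises an IndexError because data.shape[axis + 1] doesn't exist. The loop bound data.shape[axis + 1] - window_size also stops one window early, so the last output column is left at zero.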
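If a working axis argument is needed, one possible approach (a different technique from the slicing above, not part of the original answer) is to reshape the array so each window becomes its own trailing axis and then take the mean. The name denoise_along_axis is hypothetical, and this sketch assumes axis names the dimension being windowed and that trailing elements that do not fill a window are dropped.

```python
import numpy as np

def denoise_along_axis(data, window_size, axis=1):
    """Average non-overlapping windows of `window_size` along `axis`.

    Assumption: leftover elements that do not fill a whole window are dropped.
    """
    # Move the windowed axis to the end so the reshape is easy to express.
    moved = np.moveaxis(np.asarray(data), axis, -1)
    n_windows = moved.shape[-1] // window_size
    trimmed = moved[..., :n_windows * window_size]
    # Split the last axis into (n_windows, window_size) and average each window.
    windows = trimmed.reshape(*moved.shape[:-1], n_windows, window_size)
    out = windows.mean(axis=-1)
    # Put the reduced axis back where the original axis was.
    return np.moveaxis(out, -1, axis)

small = np.arange(12.0).reshape(2, 6)
print(denoise_along_axis(small, 3, axis=1).shape)  # (2, 2)
print(denoise_along_axis(small, 2, axis=0).shape)  # (1, 6)
```

Timings for this variant were not measured here, so it should be benchmarked against the slicing version before relying on it for the (30000,18000) case.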
Answered By - matt