Issue
I have a function that is used for smoothing a curve by taking a moving average over a factor of 2. However, in it's current form the function is slow because of the loops. I have added numba to increase the speed but still it's slow. Any suggestions on how I could improve the speed?
from numba import prange, jit
@jit(nopython=True, parallel=True)
def smoothing_function(x,y, window=2, pad = 1):
len_x = len(x)
max_x = np.max(x)
xoutmid = np.full(len_x, np.nan)
xoutmean = np.full(len_x, np.nan)
yout = np.full(len_x, np.nan)
for i in prange(len_x):
x0 = x[i]
xf = window*x[i]
if xf < max_x:
e = np.where(x == x.flat[np.abs(x - xf).argmin()])[0][0]
if e<len(x):
yout[i] = np.nanmean(y[i:e])
xoutmid[i] = x[i] + np.log10(0.5) * (x[i] - x[e])
xoutmean[i] = np.nanmean(x[i:e])
return xoutmid, xoutmean, yout
# Working example
f = lambda x: x**(-1.7)*2*np.random.rand(len(x))
x = np.logspace(np.log10(1e-5), np.log10(1), 1000)
xvals, yvals = x, f(x)
%timeit res =smoothing_function(xvals, yvals, window=2, pad = 1)
# plot results
plt.loglog(xvals, yvals)
plt.loglog(res[1], res[2])
Solution
Issue is that you are computing end index (e
) very inefficiently. If you use the fact that x
is in logspace, this can be done much faster since you know the distance between 2 consecutive points and just need to compute index which is log(window) far from initial point. Working example is as follows:
@jit(nopython=True, parallel=True)
def smoothing_function2(x,y, window=2, pad = 1):
len_x = len(x)
max_x = np.max(x)
xoutmid = np.full(len_x, np.nan)
xoutmean = np.full(len_x, np.nan)
yout = np.full(len_x, np.nan)
f_idx = int(len(x)*np.log10(window)/(np.log10(x[-1])-np.log10(x[0])))
for i in prange(len_x):
if window*x[i] < max_x:
e = min(i+f_idx, len_x-1)
yout[i] = np.nanmean(y[i:e])
xoutmid[i] = x[i] + np.log10(0.5) * (x[i] - x[e])
xoutmean[i] = np.nanmean(x[i:e])
return xoutmid, xoutmean, yout
f = lambda x: x**(-1.7)*2*np.random.rand(len(x))
x = np.logspace(np.log10(1e-5), np.log10(1), 1000)
xvals, yvals = x, f(x)
res1 = smoothing_function(xvals, yvals, window=2, pad = 1)
res2 = smoothing_function2(xvals, yvals, window=2, pad = 1)
print([np.nansum((r1 - r2)**2) for r1, r2 in zip(res1, res2)]) # verify that all the outputs are same
%timeit res = smoothing_function(xvals, yvals, window=2, pad = 1)
%timeit res = smoothing_function2(xvals, yvals, window=2, pad = 1)
Output is the following:
[0.0, 0.0, 0.0]
337 µs ± 59.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
49.1 µs ± 2.6 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Which verifies that both functions return same output but smoothing_function2
is ~6.8x faster. If x
is not restricted to be in logspace, you can still use the properties of whatever space you are using to get a similar improvement. There could be more ways to improve this further, it depends on what your target is. You could also try implementing this in C++ or Cython.
Answered By - Abhinav Goyal
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.