Tuesday, April 19, 2022

[FIXED] Calculate the average position of a value in an array fast method

Labels: arrays, numpy, python

Issue

I've got the following code to calculate the average position of the 1s in a 2D numpy array that contains only 1s and 0s. The issue is that it's very slow. Is there a faster method?

row_sum = 0
col_sum = 0
ones_count = 0

for row_count, row in enumerate(array):
    for col_count, col in enumerate(row):
        if col == 1:
            row_sum += row_count
            col_sum += col_count
            ones_count += 1

average_position_ones = (row_sum / ones_count, col_sum / ones_count)

Solution

Here are three faster ways to calculate row_sum, col_sum and ones_count.

Baseline

For testing I use this array:

import numpy as np
import numba as nb

np.random.seed(1)

n = 10**4
array = np.random.randint(0,2,(n,n))

Now your exact code takes 20.3 s ± 397 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) on my machine.

The Lazy One-Liner NumPy Version

Running %timeit np.stack(np.where(array)).sum(axis=1), array.sum() takes 1.13 s ± 12.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) on my machine.

Here np.stack(np.where(array)).sum(axis=1) gives what you call row_sum and col_sum, and array.sum() gives your ones_count.
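To get from those sums to the average position itself, something like the following sketch might help. The function name average_position_of_ones is hypothetical, and returning None when the array holds no 1s is my own assumption about how the empty case should be handled. It uses np.nonzero, which returns the same index arrays as np.where(array).

```python
import numpy as np

def average_position_of_ones(array):
    """Return (mean_row, mean_col) of the nonzero entries, or None if there are none."""
    # np.nonzero returns one index array per dimension: (row_indices, col_indices)
    rows, cols = np.nonzero(array)
    if rows.size == 0:
        # Assumption: with no 1s there is no meaningful average position
        return None
    return float(rows.mean()), float(cols.mean())

# Small hand-checkable example with ones at (0, 1), (2, 0) and (2, 1)
example = np.array([[0, 1, 0],
                    [0, 0, 0],
                    [1, 1, 0]])
print(average_position_of_ones(example))
```

Taking the mean of the index arrays is equivalent to dividing row_sum and col_sum by ones_count, as long as the array contains only 0s and 1s.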

Avoid Looping Through Twice

You can use your exact code with Numba's njit decorator:

@nb.njit
def test():
    row_sum = 0
    col_sum = 0
    ones_count = 0

    for row_count, row in enumerate(array):
        for col_count, col in enumerate(row):
            if col == 1:
                row_sum += row_count
                col_sum += col_count
                ones_count += 1

    return row_sum,col_sum,ones_count

%timeit test()

This is faster: it takes 50 ms ± 614 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) on my machine, compared with 1.13 s for the NumPy one-liner. But in my opinion it is definitely not worth the effort.

Multicore Version

A slight modification of your code can run multithreaded with Numba:

@nb.njit(parallel=True)
def test2():
    row_sum = 0
    col_sum = 0
    ones_count = 0
    
    for row_count in nb.prange(len(array)):
        row = array[row_count]
        for col_count, col in enumerate(row):
            if col == 1:
                row_sum += row_count
                col_sum += col_count
                ones_count += 1

    return row_sum,col_sum,ones_count

%timeit test2()

This gives a further speedup: it takes 13.3 ms ± 2.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) on my 10-core machine, compared with 1.13 s for the lazy NumPy version, even though it is not using all 10 cores.

Note that you have to be careful when modifying variables in parallel, because you could create a race condition. That does not happen here only because Numba takes countermeasures for this specific case: it recognises the += updates to scalars inside a prange loop as reductions.
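To illustrate why a reduction avoids the race, here is a rough sketch of the general idea in plain NumPy with a thread pool: each worker computes its own private partial sums over a block of rows, and the partial results are combined only once at the end. This is an illustration of the concept, not how Numba implements it internally. The names partial_sums and chunked_sums and the n_chunks parameter are made up for this example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def partial_sums(array, start, stop):
    """Private sums for rows start..stop-1; no shared variable is modified."""
    rows, cols = np.nonzero(array[start:stop] == 1)
    count = int(rows.size)
    # rows are relative to the slice, so shift them back by the block offset
    row_sum = int(rows.sum()) + int(start) * count
    col_sum = int(cols.sum())
    return row_sum, col_sum, count

def chunked_sums(array, n_chunks=4):
    """Split the rows into blocks, sum each block separately, then combine."""
    bounds = np.linspace(0, array.shape[0], n_chunks + 1, dtype=int)
    chunks = list(zip(bounds[:-1], bounds[1:]))
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda c: partial_sums(array, c[0], c[1]), chunks))
    # The only place where results meet is this single-threaded combination step
    row_sum = sum(p[0] for p in parts)
    col_sum = sum(p[1] for p in parts)
    ones_count = sum(p[2] for p in parts)
    return row_sum, col_sum, ones_count

demo = np.random.default_rng(1).integers(0, 2, size=(8, 5))
print(chunked_sums(demo))
```

Because no two workers ever write to the same variable, the order in which the blocks finish does not affect the result.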

Further Optimisations

As pointed out by Jérôme Richard in a comment, the last version can be optimised by using uint8 to store the array instead of int64, which is the default. Just call .astype(np.uint8) on the array. Then it takes 9.38 ms ± 935 µs per loop (mean ± std. dev. of 7 runs, 1 loop each) on my machine.
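A minimal sketch of that conversion is shown below. The array here is smaller than the benchmark array and is created with an explicit dtype=np.int64, so that the comparison does not depend on the platform default. The variable names array64 and array8 are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# 0/1 array stored with 8 bytes per element
array64 = rng.integers(0, 2, size=(1000, 1000), dtype=np.int64)

# Same values stored with 1 byte per element (astype returns a new array)
array8 = array64.astype(np.uint8)

print(array64.dtype, array64.nbytes)
print(array8.dtype, array8.nbytes)
```

The uint8 copy holds the same values in one eighth of the memory, which plausibly explains the speedup, since the loop has less data to read. That explanation is my assumption and was not measured here.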

Answered By - user2640045

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
