Saturday, June 4, 2022

[FIXED] What is the most efficient way to deal with a loop on NumPy arrays?

June 04, 2022 numpy, optimization, python No comments

Issue

The question is simple: here is my current algorithm. This is terribly slow because of the loops on the arrays. Is there a way to change it in order to avoid the loops and take advantage of the NumPy arrays types ?

import numpy as np

def loopingFunction(listOfVector1, listOfVector2):
    resultArray = []

    for vector1 in listOfVector1:
        result = 0

        for vector2 in listOfVector2:
            result += np.dot(vector1, vector2) * vector2[2]

        resultArray.append(result)

    return np.array(resultArray)

listOfVector1x = np.linspace(0,0.33,1000)
listOfVector1y = np.linspace(0.33,0.66,1000)
listOfVector1z = np.linspace(0.66,1,1000)

listOfVector1 = np.column_stack((listOfVector1x, listOfVector1y, listOfVector1z))

listOfVector2x = np.linspace(0.33,0.66,1000)
listOfVector2y = np.linspace(0.66,1,1000)
listOfVector2z = np.linspace(0, 0.33, 1000)

listOfVector2 = np.column_stack((listOfVector2x, listOfVector2y, listOfVector2z))

result = loopingFunction(listOfVector1, listOfVector2)

I am supposed to deal with really big arrays, that have way more than 1000 vectors in each. So if you have any advice, I'll take it.

Solution

You can at least remove the two forloop to save alot of time, use matrix computation directly

import time

import numpy as np

def loopingFunction(listOfVector1, listOfVector2):
    resultArray = []

    for vector1 in listOfVector1:
        result = 0

        for vector2 in listOfVector2:
            result += np.dot(vector1, vector2) * vector2[2]

        resultArray.append(result)

    return np.array(resultArray)

def loopingFunction2(listOfVector1, listOfVector2):
    resultArray = np.sum(np.dot(listOfVector1, listOfVector2.T) * listOfVector2[:,2], axis=1)

    return resultArray

listOfVector1x = np.linspace(0,0.33,1000)
listOfVector1y = np.linspace(0.33,0.66,1000)
listOfVector1z = np.linspace(0.66,1,1000)

listOfVector1 = np.column_stack((listOfVector1x, listOfVector1y, listOfVector1z))

listOfVector2x = np.linspace(0.33,0.66,1000)
listOfVector2y = np.linspace(0.66,1,1000)
listOfVector2z = np.linspace(0, 0.33, 1000)

listOfVector2 = np.column_stack((listOfVector2x, listOfVector2y, listOfVector2z))
import time
t0 = time.time()
result = loopingFunction(listOfVector1, listOfVector2)
print('time old version',time.time() - t0)
t0 = time.time()
result2 = loopingFunction2(listOfVector1, listOfVector2)
print('time matrix computation version',time.time() - t0)
print('Are results are the same',np.allclose(result,result2))

Which gives

time old version 1.174513578414917
time matrix computation version 0.011968612670898438
Are results are the same True

Basically, the less loop the better.

Answered By - ymmx

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Saturday, June 4, 2022

[FIXED] What is the most efficient way to deal with a loop on NumPy arrays?

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels