Issue
I work with well log data, where different values are associated with different depths. Very often I need to iterate over the depths, perform calculations, and store the results in a previously created array to build new well logs. It is equivalent to the example below.
import numpy as np

depth = np.linspace(5000, 6000, 2001)
y1 = np.random.random((len(depth), 3))
y2 = np.random.random(len(depth))

def fun_1(y1, y2):
    return y1 + y2

def fun_2(y1, y2):
    return sum(y1 * y2)

result_1 = np.zeros_like(y1)
result_2 = np.zeros_like(y2)

for i in range(len(depth)):
    prov_result_1 = fun_1(y1[i], y2[i])
    prov_result_2 = fun_2(y1[i], y2[i])
    result_1[i, :] = prov_result_1
    result_2[i] = prov_result_2
In the example, I go depth by depth, using the y1 and y2 values to calculate the provisional values prov_result_1 and prov_result_2, which I then store at the corresponding index of result_1 and result_2, so that the final result_1 and result_2 curves are complete when the loop ends. It's easy to see that, depending on the size of the depth array and the functions applied at each depth, things can get out of hand and this code can take several hours to complete. Keep in mind that the y1 and y2 arrays can be much larger and the functions I apply are much more complex than these.
I would like to know if there is a way to use Python's multiprocessing library to parallelize this for loop. Other answers I've found here on StackOverflow don't directly translate to this problem and always seem to be much more complicated than they should be.
One option I imagine would be to divide the depth by the number of processors and do this same for loop in parallel, something like:
num_pool = 2
depth_cut = (depth[-1] - depth[0]) / num_pool
depth_parallel = [depth[depth <= depth[0] + depth_cut],
                  depth[depth > depth[0] + depth_cut]]
y1_parallel = [y1[depth <= depth[0] + depth_cut],
               y1[depth > depth[0] + depth_cut]]
y2_parallel = [y2[depth <= depth[0] + depth_cut],
               y2[depth > depth[0] + depth_cut]]
Only more generic. I would then send the pieces of data to parallel processes, perform my calculations, and concatenate everything back together at the end.
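A more generic version of that split might look something like the sketch below (I'm guessing np.array_split is a reasonable way to do it, but I'm not sure this is the right direction):

# Rough sketch only: split each array into num_pool chunks of (nearly) equal size;
# np.array_split preserves order, so the pieces can be concatenated back afterwards.
num_pool = 2
depth_parallel = np.array_split(depth, num_pool)
y1_parallel = np.array_split(y1, num_pool)
y2_parallel = np.array_split(y2, num_pool)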
Solution
You could then:
- use the multiprocessing library,
- create two parallelized functions to deal with the chunks separately.
The general idea is to create a separate process for each function, and to deal with the chunks one by one inside each function-process.
To my mind, this is a simple way to keep control over the different processing speeds of the two functions.
Proposed code
import numpy as np
from multiprocessing import Pool

def fun_1(args):
    y1_i, y2_i = args
    return y1_i + y2_i

def fun_2(args):
    y1_i, y2_i = args
    return sum(y1_i * y2_i)

def separate_process(func, y1, y2, num_pool):
    """Separated process to deal with chunks, one function at a time"""
    # Create a pool of workers
    pool = Pool(num_pool)
    # Divide the data into chunks for treatment
    # (good thing to do with large datasets)
    ### Assuming len(y1) == len(y2)
    chunk_size = len(y1) // num_pool
    chunks_y1 = [y1[i:i + chunk_size] for i in range(0, len(y1), chunk_size)]
    chunks_y2 = [y2[i:i + chunk_size] for i in range(0, len(y2), chunk_size)]
    # Apply the function to each chunk in parallel
    results = pool.map(func, zip(chunks_y1, chunks_y2))
    # Avoid 'float' exception
    results = [result.tolist() if isinstance(result, np.ndarray) else [result] for result in results]
    # Close the pool
    pool.close()
    # Wait for the processes to finish to keep control
    pool.join()
    # Concatenation: the chunking strategy is not supposed to be visible at the end
    return results
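# Note for Windows/macOS users: these platforms spawn (rather than fork) new processes,
# so the top-level code below should be wrapped in an `if __name__ == '__main__':` guard.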
depth = np.linspace(5000, 6000, 2001)
y1 = np.random.random(len(depth))
y2 = np.random.random(len(depth))
num_pool = 2
fun_1_res = separate_process(fun_1, y1, y2, num_pool)
fun_2_res = separate_process(fun_2, y1, y2, num_pool)
print(len(fun_1_res))
### 3
print(len(fun_2_res))
### 3
Chunk results concatenation:
def concatenate_chunks(fun_x_res):
    return sum(fun_x_res, [])

fun_1_res = concatenate_chunks(fun_1_res)

# Final length (supposed to be 2001)
print(len(fun_1_res))
### 2001
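As a quick sanity check (assuming the arrays from the proposed code above are still in scope), the concatenated result should match the plain vectorized computation:

# The concatenated chunk results should equal y1 + y2 element by element
print(np.allclose(fun_1_res, y1 + y2))
### True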
Note:
We could widen the perspective and compute fun_1_res and fun_2_res in two separate processes too (see the sketch below). In that case you would have: 2 main processes + 2 x 2 sub-processes (in separate_process) = a total of 6 processes. A powerful computing environment is required for this.
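A rough sketch of that widened setup, reusing the definitions from the proposed code above (the run_in_process helper and the queue-based result passing are illustrative additions, not part of the original answer):

from multiprocessing import Process, Queue

def run_in_process(func, y1, y2, num_pool, queue, key):
    # Run separate_process in its own (non-daemonic) process, which is allowed
    # to create its own Pool, and send the result back through the queue.
    queue.put((key, separate_process(func, y1, y2, num_pool)))

queue = Queue()
p1 = Process(target=run_in_process, args=(fun_1, y1, y2, num_pool, queue, 'fun_1'))
p2 = Process(target=run_in_process, args=(fun_2, y1, y2, num_pool, queue, 'fun_2'))
p1.start()
p2.start()
# Drain the queue before joining to avoid blocking on large results
results = dict(queue.get() for _ in range(2))
p1.join()
p2.join()
fun_1_res = concatenate_chunks(results['fun_1'])
fun_2_res = concatenate_chunks(results['fun_2'])

This keeps the total at the 6 processes mentioned above: 2 outer processes plus 2 x 2 pool workers.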
Answered By - Laurent B.