Issue
I need to create many large numpy arrays of shape (4e6, 100), filled with random numbers from a standard normal distribution, and I'm trying to speed this up. I tried generating different parts of each array on multiple cores, but I'm not seeing the expected speed improvement. Is there something I'm doing wrong, or am I wrong to expect speed improvements in this way?
from numpy.random import default_rng
from multiprocessing import Pool
from time import time

def rng_mp(rng):
    return rng.standard_normal((250000, 100))

if __name__ == '__main__':
    n_proc = 4
    rngs = [default_rng(n) for n in range(n_proc)]
    rng_all = default_rng(1)

    start = time()
    result = rng_all.standard_normal((int(1e6), 100))
    print(f'Single process: {time() - start:.3f} seconds')

    start = time()
    with Pool(processes=n_proc) as p:
        result = p.map_async(rng_mp, rngs).get()
    print(f'MP: {time() - start:.3f} seconds')

# Single process: 1.114 seconds
# MP: 2.634 seconds
Solution
I suspected that the slowdown comes simply from having to move a lot of data from the address spaces of the subprocesses back to the main process: each (250000, 100) float64 chunk is roughly 200 MB that has to be pickled and copied back to the parent. I also suspected that the C-language code numpy uses for random number generation releases the Global Interpreter Lock, and that using multithreading instead of multiprocessing would therefore solve your performance problem:
from numpy.random import default_rng
from multiprocessing.pool import ThreadPool
from time import time

def rng_mp(rng):
    return rng.standard_normal((250000, 100))

if __name__ == '__main__':
    n_proc = 4
    rngs = [default_rng(n) for n in range(n_proc)]
    rng_all = default_rng(1)

    start = time()
    result = rng_all.standard_normal((int(1e6), 100))
    print(f'Single process: {time() - start:.3f} seconds')

    start = time()
    with ThreadPool(processes=n_proc) as p:
        result = p.map_async(rng_mp, rngs).get()
    print(f'MT: {time() - start:.3f} seconds')
Prints:
Single process: 1.210 seconds
MT: 0.413 seconds
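If you ultimately need one big array rather than a list of chunks, a further variation (not part of the original answer) is to preallocate the output and let each thread write its slice in place via the out= argument of standard_normal, which avoids holding separate chunk arrays and concatenating or copying them afterwards. A minimal sketch, assuming a recent NumPy where Generator.standard_normal accepts out=, and using SeedSequence.spawn for independent streams; the names and sizes below are illustrative:

import numpy as np
from numpy.random import default_rng, SeedSequence
from multiprocessing.pool import ThreadPool
from time import time

N_ROWS, N_COLS, N_THREADS = 1_000_000, 100, 4
CHUNK = N_ROWS // N_THREADS

# SeedSequence.spawn gives statistically independent child streams,
# which is safer than seeding the generators 0..3 by hand.
rngs = [default_rng(s) for s in SeedSequence(12345).spawn(N_THREADS)]
out = np.empty((N_ROWS, N_COLS))

def fill_chunk(args):
    i, rng = args
    # standard_normal(out=...) writes directly into the preallocated slice,
    # so no per-chunk arrays need to be returned or concatenated afterwards.
    rng.standard_normal(out=out[i * CHUNK:(i + 1) * CHUNK])

if __name__ == '__main__':
    start = time()
    with ThreadPool(N_THREADS) as pool:
        pool.map(fill_chunk, enumerate(rngs))
    print(f'MT into preallocated array: {time() - start:.3f} seconds')

Row slices of a C-contiguous float64 array are themselves contiguous, which is what out= requires, so each thread fills its own non-overlapping region of the shared array while the GIL is released.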
Answered By - Booboo