Parallel Processing Loop Using Multiprocessing Pool

I want to process a large for loop in parallel, and from what I have read the best way to do this is to use the multiprocessing library that comes standard with Python. I have a li

Solution 1:

Use a plain function, not a class, when possible. Use a class only when there is a clear advantage to doing so.
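As a rough sketch of the plain-function approach (the `square` function, the `run` helper, and the input list are hypothetical stand-ins, since the original per-item work is not shown), something like this should be enough:

```python
from multiprocessing import Pool


def square(val):
    # Hypothetical stand-in for the real per-item work.
    return val * val


def run(values, processes=2):
    # pool.map splits `values` across the worker processes and
    # returns the results in the same order as the inputs.
    with Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on
    # platforms that start workers by re-importing the main module.
    print(run([1, 2, 3, 4]))
```

The function is defined at module level so that it can be pickled and sent to the workers.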

If you really need to use a class, then given your setup, pass an instance of Parallel:

results = pool.map(Parallel(args), self.list_objects)

Since the instance has a __call__ method, the instance itself is callable, like a function.


By the way, __call__ needs to accept an additional argument:

def __call__(self, val):

since pool.map essentially runs the following loop, but in parallel:

p = Parallel(args)
result = []
for val in self.list_objects:
    result.append(p(val))
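Putting the pieces of this answer together, a minimal runnable sketch might look like the following. The `offset` constructor argument and the `run` helper are assumptions made for illustration; the question's actual `args` and `list_objects` are not shown.

```python
from multiprocessing import Pool


class Parallel:
    def __init__(self, offset):
        # Hypothetical constructor argument; the original args are not shown.
        self.offset = offset

    def __call__(self, val):
        # pool.map passes each element of the input list here as `val`.
        return val + self.offset


def run(values, offset=10, processes=2):
    # The instance is pickled and sent to the workers, so its
    # attributes must be picklable.
    with Pool(processes=processes) as pool:
        return pool.map(Parallel(offset), values)


if __name__ == "__main__":
    print(run([1, 2, 3]))
```

Note that each worker operates on its own copy of the instance, so changes made to `self` inside `__call__` are not visible in the parent process.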

Solution 2:

Pool.map simply applies a function (actually, any callable) to each item in parallel. It has no notion of objects or classes. Since you pass it the class itself, each call only constructs an instance, which runs __init__; __call__ is never executed. You need to either call it explicitly from __init__ or use pool.map(Parallel.__call__, preinitialized_objects), so that each already-constructed object is passed to __call__ as self.
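A small sketch of the second option, assuming Python 3 (where `Parallel.__call__` is an ordinary function that can be pickled by its qualified name). The `val` attribute, the doubling logic, and the `run` helper are hypothetical, chosen only to make the example self-contained:

```python
from multiprocessing import Pool


class Parallel:
    def __init__(self, val):
        # Hypothetical: each instance holds one item to process.
        self.val = val

    def __call__(self):
        # No extra argument here: the instance itself is the work item.
        return self.val * 2


def run(values, processes=2):
    # Build the instances up front, then map the unbound __call__
    # over them; each instance arrives in the worker as `self`.
    objects = [Parallel(v) for v in values]
    with Pool(processes=processes) as pool:
        return pool.map(Parallel.__call__, objects)


if __name__ == "__main__":
    print(run([1, 2, 3]))
```

By contrast, `pool.map(Parallel, values)` would only return a list of freshly constructed `Parallel` instances.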

