Pool
The class multiprocessing.Pool manages a pool of worker processes for running jobs in parallel.
The tasks given to a Pool object are split among its workers and run in separate processes.
Calling Pool(processes = n) creates n worker processes; the jobs are distributed across them in parallel and the results are gathered back after the jobs are done.
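As a minimal sketch of the split-and-gather behavior described above (the helper `square_with_pid`, the wrapper `run` and the pool size of 4 are illustrative assumptions, not part of the original), each input is handed to one of the worker processes and the results come back as a single list in input order:

```python
import os
from multiprocessing import Pool


def square_with_pid(x):
    # Hypothetical helper: return the square together with the worker's process ID
    return x * x, os.getpid()


def run(n_procs=4, n_items=8):
    # Pool(processes=n_procs) starts n_procs worker processes
    with Pool(processes=n_procs) as pool:
        # map splits the inputs among the workers and gathers the results in order
        return pool.map(square_with_pid, range(n_items))


if __name__ == "__main__":
    results = run()
    print([square for square, _ in results])
    print("worker PIDs used:", sorted({pid for _, pid in results}))
```

The process IDs differ from run to run; the point is only that they are not the ID of the parent process.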
map function
The following example is slightly modified from the official documentation:
import time
from multiprocessing import Pool

def task(x):
    # Simulate one second of work, then return the input unchanged
    time.sleep(1)
    return x

# The guard is needed on platforms that start workers with "spawn" (Windows, macOS)
if __name__ == '__main__':
    # Run the 10 tasks one after another
    start_time = time.time()
    for x in range(10):
        task(x)
    print('Single-process takes {} seconds'.format(time.time() - start_time))

    # Run the same 10 tasks on a pool of 8 worker processes
    start_time = time.time()
    with Pool(processes = 8) as pool:
        pool.map(task, range(10))
    print('Multi-process takes {} seconds'.format(time.time() - start_time))
and it gives
Single-process takes 10.010777711868286 seconds
Multi-process takes 2.13849139213562 seconds
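The timings above follow from simple arithmetic: 10 one-second tasks on 8 workers need two rounds, so about 2 seconds plus process start-up overhead. A rough sketch of that estimate, assuming equal-length tasks and ignoring overhead and chunking (the function name `ideal_wall_time` is made up for illustration):

```python
import math


def ideal_wall_time(n_tasks, n_procs, seconds_per_task=1.0):
    # Each round runs up to n_procs tasks at once; the last round may be partly idle
    rounds = math.ceil(n_tasks / n_procs)
    return rounds * seconds_per_task


print(ideal_wall_time(10, 1))  # sequential loop: 10.0
print(ideal_wall_time(10, 8))  # pool of 8 workers: 2.0
```

This also shows why adding workers beyond the number of tasks brings no further speed-up.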
Get the return value from the function
Pool.map returns a list of the function's return values, in the same order as the inputs.
start_time = time.time()
and it gives
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
Pool.map takes 8.063436031341553 seconds
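A result like the one above could be produced along these lines. This is a sketch under assumptions, not the original listing: the 64 inputs, the 8 worker processes and the one-second `task` are inferred from the printed output, and the wrapper `run_map` is a hypothetical name.

```python
import time
from multiprocessing import Pool


def task(x):
    # Pretend to work for one second, then hand the input back
    time.sleep(1)
    return x


def run_map(n_items=64, n_procs=8):
    # Assumed sizes: 64 inputs on 8 workers, inferred from the output above
    with Pool(processes=n_procs) as pool:
        # map blocks until every task is done and returns a list in input order
        return pool.map(task, range(n_items))


if __name__ == "__main__":
    start_time = time.time()
    results = run_map()
    print(results)
    print('Pool.map takes {} seconds'.format(time.time() - start_time))
```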
Alternatively, you can use pool.imap, which returns an iterator over the results. Wrapping it in tqdm lets you see the progress of the multi-processing job.
from tqdm import tqdm
and it gives
Multi-processing: 100%|██████████| 64/64 [00:08<00:00, 7.99it/s]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
Pool.imap takes 8.06504487991333 seconds
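A sketch of the `imap` variant that could give output like the above. The sizes are again inferred from the printed result rather than taken from the original listing, `run_imap` is a hypothetical wrapper name, and the third-party `tqdm` package is assumed to be installed (the fallback only keeps the sketch runnable without it).

```python
import time
from multiprocessing import Pool

try:
    from tqdm import tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm (no progress bar is shown)
    def tqdm(iterable, **kwargs):
        return iterable


def task(x):
    # Pretend to work for one second, then hand the input back
    time.sleep(1)
    return x


def run_imap(n_items=64, n_procs=8):
    results = []
    with Pool(processes=n_procs) as pool:
        # imap yields results one by one, in input order, as they become available;
        # total= is passed because the iterator has no length for tqdm to read
        for result in tqdm(pool.imap(task, range(n_items)),
                           total=n_items, desc='Multi-processing'):
            results.append(result)
    return results


if __name__ == "__main__":
    start_time = time.time()
    results = run_imap()
    print(results)
    print('Pool.imap takes {} seconds'.format(time.time() - start_time))
```

Because the loop body runs each time a result arrives, the progress bar advances while the workers are still busy, which a single blocking `map` call cannot show.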
More than one argument as input
You may notice that the function in the previous examples takes only one input, the number x.
However, sometimes you may want to pass more inputs to the function. In that case, fix the extra arguments in advance with the function partial from functools.
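Pool.map and Pool.imap pass exactly one item from the iterable to the function per call, so any extra arguments have to be bound beforehand. A minimal sketch of what `partial` does, using a made-up function `scale` purely for illustration:

```python
from functools import partial


def scale(x, factor):
    # Hypothetical two-argument function
    return x * factor


# Freeze factor=3; the resulting callable takes only x
triple = partial(scale, factor=3)

print(triple(5))                      # same as scale(5, factor=3)
print([triple(x) for x in range(4)])  # one-argument calls, as a Pool would make them
```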
Here is an example of using partial and pprint:
from functools import partial
from multiprocessing import Pool
from pprint import pprint
from tqdm import tqdm

def ChineseZodiac(x, zodiac_dict):
    # Convert the offset x to a year, starting from 1987
    x += 1987
    # (year - 4) % 12 is 0 for Rat years such as 1984
    return (x, zodiac_dict[(x-4)%len(zodiac_dict)])

animals = ['鼠', '牛', '虎', '兔', '龍', '蛇', '馬', '羊', '猴', '雞', '狗', '豬']
animals = {number:chinese for number, chinese in enumerate(animals)}
# Fix zodiac_dict so that the pool only has to supply x
partial_func = partial(ChineseZodiac, zodiac_dict = animals)

if __name__ == '__main__':
    results = []
    with Pool(processes = 8) as pool:
        for result in tqdm(pool.imap(partial_func, range(24)), total=24, desc = 'Multi-processing'):
            results.append(result)
    pprint(results)
Output:
Multi-processing: 100%|██████████| 24/24 [00:00<00:00, 23082.62it/s]
[(1987, '兔'),
(1988, '龍'),
(1989, '蛇'),
(1990, '馬'),
(1991, '羊'),
(1992, '猴'),
(1993, '雞'),
(1994, '狗'),
(1995, '豬'),
(1996, '鼠'),
(1997, '牛'),
(1998, '虎'),
(1999, '兔'),
(2000, '龍'),
(2001, '蛇'),
(2002, '馬'),
(2003, '羊'),
(2004, '猴'),
(2005, '雞'),
(2006, '狗'),
(2007, '豬'),
(2008, '鼠'),
(2009, '牛'),
(2010, '虎')]