Pool
The class multiprocessing.Pool manages a pool of worker processes for running jobs in parallel.
The given tasks are split up inside the Pool object and handed to the workers.
Calling Pool(processes=n) creates n worker processes; the jobs are divided among them, run in parallel, and their results are gathered back once the jobs are done.
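To make the splitting visible, here is a minimal sketch in which each job reports the process that ran it. The helper name tag_with_pid and the worker count of 4 are assumptions made only for this illustration; Pool.map itself is covered in the next section.

```python
import os
from multiprocessing import Pool


def tag_with_pid(x):
    """Return the input together with the PID of the process that handled it."""
    return x, os.getpid()


if __name__ == "__main__":
    n = 4  # hypothetical worker count, chosen only for this demo
    with Pool(processes=n) as pool:
        tagged = pool.map(tag_with_pid, range(20))

    # Results come back in input order, whichever worker produced them.
    print([x for x, _ in tagged])
    # The set of worker PIDs has at most n entries.
    print(sorted({pid for _, pid in tagged}))
```

Because these jobs are tiny, a single worker may end up handling most of them, so the number of distinct PIDs can be smaller than n.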
map function
The following example is slightly modified from the official documentation:

    import time
    from multiprocessing import Pool

    def task(x):
        # Simulate one second of work, then return the input unchanged.
        time.sleep(1)
        return x

    if __name__ == '__main__':  # needed where workers are started with "spawn"
        # Run the 10 tasks one after another.
        start_time = time.time()
        for x in range(10):
            task(x)
        print('Single-process takes {} seconds'.format(time.time() - start_time))

        # Run the same 10 tasks on a pool of 8 worker processes.
        start_time = time.time()
        with Pool(processes=8) as pool:
            pool.map(task, range(10))
        print('Multi-process takes {} seconds'.format(time.time() - start_time))
and it gives:

    Single-process takes 10.010777711868286 seconds
    Multi-process takes 2.13849139213562 seconds
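The multi-process run takes about 2 seconds rather than 10/8 = 1.25 seconds because 10 one-second tasks on 8 workers need two rounds. A rough estimate, assuming every task takes the same time and ignoring the cost of starting the processes (the function name estimated_wall_time is hypothetical):

```python
import math


def estimated_wall_time(n_tasks, n_workers, seconds_per_task):
    """Rough lower bound: tasks run in ceil(n_tasks / n_workers) rounds."""
    rounds = math.ceil(n_tasks / n_workers)
    return rounds * seconds_per_task


print(estimated_wall_time(10, 8, 1.0))  # 2.0
print(estimated_wall_time(10, 1, 1.0))  # 10.0
```

The extra 0.1 seconds or so in the measured time is plausibly the overhead of creating the workers and passing data between processes.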
Get the return value from the function
Pool.map returns the values returned by the function as a list, so you can capture them:

    start_time = time.time()
and it gives:

    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
    Pool.map takes 8.063436031341553 seconds
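A minimal sketch of a call that produces output of this shape, assuming the same task function as before, 64 inputs, and 8 workers (the 64 and the 8 are inferred from the output above, not stated in the text). The sleep is shortened to 0.1 seconds here so the sketch finishes quickly; with the 1-second sleep from the earlier example the same call would take about 8 seconds.

```python
import time
from multiprocessing import Pool

DELAY = 0.1  # assumption: shortened from the 1-second sleep used earlier


def task(x):
    """Simulate some work, then return the input unchanged."""
    time.sleep(DELAY)
    return x


if __name__ == "__main__":
    start_time = time.time()
    with Pool(processes=8) as pool:
        # map blocks until every task has finished and returns a list
        # whose order matches the input order.
        results = pool.map(task, range(64))
    print(results)
    print('Pool.map takes {} seconds'.format(time.time() - start_time))
```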
Alternatively, you can use pool.imap, which returns an iterator over the results instead of a list.
Wrapping that iterator in tqdm lets you see the progress of the multi-processing run:

    from tqdm import tqdm
and it gives:

    Multi-processing: 100%|██████████| 64/64 [00:08<00:00, 7.99it/s]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
    Pool.imap takes 8.06504487991333 seconds
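The progress reporting works because imap hands results back one at a time, in input order, while the pool is still running. Here is a standard-library-only sketch of that idea, in which a plain counter stands in for the tqdm bar. The helper collect is hypothetical, and the 0.1-second sleep, 64 inputs, and 8 workers are assumptions chosen to mirror the output above at a shorter run time.

```python
import time
from multiprocessing import Pool

DELAY = 0.1  # assumption: shortened so the demo finishes quickly


def task(x):
    """Simulate some work, then return the input unchanged."""
    time.sleep(DELAY)
    return x


def collect(result_iter, total):
    """Consume an iterator of results, printing a simple progress counter."""
    results = []
    for done, result in enumerate(result_iter, start=1):
        print('\r{}/{} done'.format(done, total), end='', flush=True)
        results.append(result)
    print()
    return results


if __name__ == "__main__":
    with Pool(processes=8) as pool:
        # imap returns immediately; results are pulled one at a time,
        # in input order, so the loop can report progress as they arrive.
        results = collect(pool.imap(task, range(64)), total=64)
    print(results)
```

Note that the iterator is consumed inside the with block: leaving the block terminates the pool, so results that have not been pulled by then would be lost.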
More than one argument as input
You may notice that the function in the previous examples takes only one input, the number x.
However, sometimes you may want to pass more inputs to the function. One way to do this is with the function partial from functools, which fixes the extra arguments in advance.
Here is an example of using partial and pprint:

    from functools import partial
    from multiprocessing import Pool
    from pprint import pprint

    from tqdm import tqdm

    def ChineseZodiac(x, zodiac_dict):
        # Shift the offset x to a year (0 -> 1987) and look up its animal.
        x += 1987
        return (x, zodiac_dict[(x - 4) % len(zodiac_dict)])

    # Rat, Ox, Tiger, Rabbit, Dragon, Snake, Horse, Goat, Monkey, Rooster, Dog, Pig
    animals = ['鼠', '牛', '虎', '兔', '龍', '蛇', '馬', '羊', '猴', '雞', '狗', '豬']
    animals = {number: chinese for number, chinese in enumerate(animals)}

    # Fix zodiac_dict in advance so each worker call needs only x.
    partial_func = partial(ChineseZodiac, zodiac_dict=animals)

    if __name__ == '__main__':  # needed where workers are started with "spawn"
        results = []
        with Pool(processes=8) as pool:
            for result in tqdm(pool.imap(partial_func, range(24)), total=24, desc='Multi-processing'):
                results.append(result)
        pprint(results)
Output:

    Multi-processing: 100%|██████████| 24/24 [00:00<00:00, 23082.62it/s]
    [(1987, '兔'),
     (1988, '龍'),
     (1989, '蛇'),
     (1990, '馬'),
     (1991, '羊'),
     (1992, '猴'),
     (1993, '雞'),
     (1994, '狗'),
     (1995, '豬'),
     (1996, '鼠'),
     (1997, '牛'),
     (1998, '虎'),
     (1999, '兔'),
     (2000, '龍'),
     (2001, '蛇'),
     (2002, '馬'),
     (2003, '羊'),
     (2004, '猴'),
     (2005, '雞'),
     (2006, '狗'),
     (2007, '豬'),
     (2008, '鼠'),
     (2009, '牛'),
     (2010, '虎')]
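As an alternative to partial (not used in the example above), Pool.starmap accepts an iterable of argument tuples and unpacks each tuple into the function call. A minimal sketch under that swap, with a hypothetical function zodiac_for_year that takes the year directly and English animal names in place of the Chinese characters:

```python
from multiprocessing import Pool

ANIMALS = ['Rat', 'Ox', 'Tiger', 'Rabbit', 'Dragon', 'Snake',
           'Horse', 'Goat', 'Monkey', 'Rooster', 'Dog', 'Pig']


def zodiac_for_year(year, animals):
    """Return (year, animal) using the same (year - 4) % 12 rule as above."""
    return year, animals[(year - 4) % len(animals)]


if __name__ == "__main__":
    # Each tuple is unpacked into the function's positional arguments.
    arguments = [(year, ANIMALS) for year in range(1987, 1991)]
    with Pool(processes=4) as pool:
        results = pool.starmap(zodiac_for_year, arguments)
    print(results)
    # [(1987, 'Rabbit'), (1988, 'Dragon'), (1989, 'Snake'), (1990, 'Horse')]
```

partial tends to read better when the extra arguments are the same for every call, as with the zodiac dictionary; starmap is handier when they differ from call to call.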