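Python's process pool, multiprocessing.Pool, has eight important methods:
apply, apply_async, map, map_async, imap, imap_unordered, starmap, starmap_async
Below is how they compare and differ:
1) apply and apply_async: apply blocks until the task it submitted has finished, so tasks submitted in a loop run one at a time. apply_async returns immediately without waiting for the result, so several tasks can run concurrently.
Let's look at this in code:
apply:
(each task finishes before the next one starts)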

import multiprocessing
import time

def func(msg):
    print("msg:", msg)
    time.sleep(2)
    print("end")
if __name__ == "__main__":
    pool = multiprocessing.Pool()
    for i in range(2):
        msg = "hello %d" % (i)
        pool.apply(func, (msg,))
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
    print("Sub-process(es) done.")

# Output:
# msg: hello 0
# end
# msg: hello 1
# end
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# Sub-process(es) done.

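apply_async:
(submits one task per call, but asynchronously: the call returns without waiting for the task to finish, so the next task can be submitted right away)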

import multiprocessing
import time
 
def func(msg):
    print("msg:", msg)
    time.sleep(1)
    print("end")
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(processes = 2)
    for i in range(2):
        msg = "hello %d" %(i)
        pool.apply_async(func, (msg, ))   # the pool keeps `processes` worker processes; queued tasks are picked up as workers become free
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()   # close() must be called before join(), otherwise join() raises an error. After close() no more tasks can be submitted; join() waits for the worker processes to exit
    print("Sub-process(es) done.")
 
# Output (typical; lines from different workers may interleave differently):
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# msg: hello 0
# msg: hello 1
# end
# end
# Sub-process(es) done.
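
The two examples above only print from inside the workers. As a minimal sketch of how return values come back, the following assumes a hypothetical worker named square and two hypothetical helpers, run_apply and run_apply_async. apply returns the function's result directly, while apply_async returns an AsyncResult whose get() blocks until the value is ready.

```python
import multiprocessing


def square(x):
    # Hypothetical worker; defined at module top level so it can be pickled.
    return x * x


def run_apply(values):
    # apply() blocks on every call and returns the worker's result directly.
    with multiprocessing.Pool(processes=2) as pool:
        return [pool.apply(square, (v,)) for v in values]


def run_apply_async(values):
    # apply_async() returns an AsyncResult at once; get() blocks until the
    # value is ready (the timeout is only a safety net for this sketch).
    with multiprocessing.Pool(processes=2) as pool:
        pending = [pool.apply_async(square, (v,)) for v in values]
        return [p.get(timeout=10) for p in pending]


if __name__ == "__main__":
    print(run_apply([1, 2, 3]))
    print(run_apply_async([1, 2, 3]))
```

All results are collected inside the with block because leaving it terminates the pool.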

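2) map and map_async differ from apply and apply_async in that a single call submits a whole iterable of tasks, which the pool distributes across its workers
Let's look at this in code:
map:
(blocks until every task in the list has finished, then continues past the map call)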

import multiprocessing
import time
 
def func(msg):
    print("msg:", msg)
    time.sleep(2)
    print("end")
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    pool.map(func, range(2))
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
    print("Sub-process(es) done.")
 
# Output (note where Mark~ appears):
# msg: 0
# msg: 1
# end
# end
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# Sub-process(es) done.

map_async:
(asynchronous: the call does not block while the tasks run)

import multiprocessing
import time
 
def func(msg):
    print("msg:", msg)
    time.sleep(2)
    print("end")
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    pool.map_async(func, range(2))
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
    print("Sub-process(es) done.")
 
# Output:
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# msg: 0
# msg: 1
# end
# end
# Sub-process(es) done.
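
As a rough sketch of what the two calls hand back, the following assumes a hypothetical worker double and hypothetical helpers run_map and run_map_async. map returns a list in input order, and map_async returns an AsyncResult whose get() produces the same list once every task is done. The optional chunksize argument controls how many items are sent to a worker at a time.

```python
import multiprocessing


def double(x):
    # Hypothetical worker used only for this illustration.
    return 2 * x


def run_map(values, chunksize=None):
    # map() blocks and returns the results as a list, in input order.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(double, values, chunksize)


def run_map_async(values):
    # map_async() returns immediately; other work could happen before get().
    with multiprocessing.Pool(processes=2) as pool:
        async_result = pool.map_async(double, values)
        return async_result.get(timeout=10)


if __name__ == "__main__":
    print(run_map(range(5)))
    print(run_map(range(5), chunksize=2))
    print(run_map_async(range(5)))
```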

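3) starmap and starmap_async differ from the map and map_async of section 2 in that the task function can take several arguments: each element of the iterable is a tuple that is unpacked into the function's arguments
starmap:
(blocking)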

import multiprocessing
import time
 
def func(msg1, msg2):
    print("msg1:", msg1, "msg2:", msg2)
    time.sleep(2)
    print("end")
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    msgs = [(1,1),(2,2)]
    pool.starmap(func, msgs)
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
    print("Sub-process(es) done.")
 
# Output:
# msg1: 1 msg2: 1
# msg1: 2 msg2: 2
# end
# end
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# Sub-process(es) done.

starmap_async:
(asynchronous)

import multiprocessing
import time
 
def func(msg1, msg2):
    print("msg1:", msg1, "msg2:", msg2)
    time.sleep(2)
    print("end")
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    msgs = [(1, 1), (2, 2)]
    pool.starmap_async(func, msgs)
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
    print("Sub-process(es) done.")
 
# Output:
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# msg1: 1 msg2: 1
# msg1: 2 msg2: 2
# end
# end
# Sub-process(es) done.
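
To make the argument unpacking concrete, here is a small sketch with a hypothetical two-argument worker power. It also shows one possible alternative when only one argument varies: fixing the other with functools.partial and using plain map. The helper names run_starmap and run_map_with_partial are made up for this illustration.

```python
import multiprocessing
from functools import partial


def power(base, exponent):
    # Hypothetical two-argument worker.
    return base ** exponent


def run_starmap(pairs):
    # Each tuple in `pairs` is unpacked: power(*pair).
    with multiprocessing.Pool(processes=2) as pool:
        return pool.starmap(power, pairs)


def run_map_with_partial(bases, exponent):
    # Alternative when one argument is constant: bind it with partial,
    # then plain map() is enough.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(partial(power, exponent=exponent), bases)


if __name__ == "__main__":
    print(run_starmap([(2, 3), (3, 2)]))
    print(run_map_with_partial([1, 2, 3], 2))
```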

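4) imap and imap_unordered return immediately, like map_async. The differences are:
map_async converts the input iterable to a list before splitting it into tasks, while imap and imap_unordered consume the iterable as they go. map_async is slightly more efficient, while imap and imap_unordered use noticeably less memory on long inputs.
As for results, imap and imap_unordered return an iterator that yields each result as soon as it is available, whereas map_async has to wait for every task to finish and then returns them all as a list (via get()).
imap and imap_unordered differ from each other in ordering: imap, like map_async, delivers results in input order, while imap_unordered does not. The iterator returned by imap_unordered yields whichever task finishes first. If this is unclear, see the group of examples at the end.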
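
As a minimal sketch of the streaming style described above, the following feeds a generator to imap and adds up the results one at a time, so the results are never collected into a single list. The names square, numbers and sum_of_squares are hypothetical, and the chunksize of 100 is an arbitrary choice for this example (imap sends one item per task by default, which can be slow on long inputs).

```python
import multiprocessing


def square(x):
    # Hypothetical worker.
    return x * x


def numbers(n):
    # A generator: the input is produced on demand, not stored as a list.
    for i in range(n):
        yield i


def sum_of_squares(n, chunksize=100):
    # Results are consumed one by one as imap yields them, in input order.
    with multiprocessing.Pool(processes=2) as pool:
        total = 0
        for value in pool.imap(square, numbers(n), chunksize=chunksize):
            total += value
        return total


if __name__ == "__main__":
    print(sum_of_squares(1000))
```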

list, ordered: map_async

import multiprocessing
import time
 
def func(msg):
    print("msg:", msg)
    time.sleep(4-msg)
    return msg
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(3)
    results = pool.map_async(func, range(3))
    for res in results.get():
        print(res)
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
 
    print("Sub-process(es) done.")
 
# Output:
# msg: 0
# msg: 1
# msg: 2
# 0
# 1
# 2
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# Sub-process(es) done.

iterator, ordered: imap

import multiprocessing
import time
 
def func(msg):
    print("msg:", msg)
    time.sleep(4-msg)
    return msg
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(3)
    results = pool.imap(func, range(3))
    for res in results:
        print("res:", res)
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
 
    print("Sub-process(es) done.")
 
# Output:
# msg: 0
# msg: 1
# msg: 2
# res: 0
# res: 1
# res: 2
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# Sub-process(es) done.

iterator, unordered: imap_unordered

import multiprocessing
import time
 
def func(msg):
    print("msg:", msg)
    time.sleep(4-msg)
    return msg
 
if __name__ == "__main__":
    pool = multiprocessing.Pool(3)
    results = pool.imap_unordered(func, range(3))
    for res in results:
        print("res:", res)
 
    print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
    pool.close()
    pool.join()
 
    print("Sub-process(es) done.")
 
# Output:
# msg: 0
# msg: 1
# msg: 2
# res: 2
# res: 1
# res: 0
# Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
# Sub-process(es) done.

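Summary:
Note: retrieving results from the pool blocks with all three of map_async, imap and imap_unordered: map_async blocks in get(), and the other two block while iterating.

map_async differs from imap and imap_unordered in that map_async has to wait until every task has finished and then returns a list, while imap and imap_unordered return an iterator that yields results as soon as they are available.

imap and imap_unordered differ in ordering: imap, like map_async, delivers results in input order, while imap_unordered does not. The iterator returned by imap_unordered yields whichever task finishes first.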
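
The summary can be checked side by side with a short sketch. It assumes a hypothetical worker slow_identity, which sleeps longer for smaller inputs, and a hypothetical helper collect that runs the same inputs through any of the three methods. Only the ordered methods are guaranteed to return results in input order, so for imap_unordered the sketch relies only on the set of results, not their order.

```python
import multiprocessing
import time


def slow_identity(x):
    # Hypothetical worker: smaller inputs sleep longer, so later inputs
    # tend to finish first.
    time.sleep(0.1 * (3 - x))
    return x


def collect(method_name, values):
    # Run `values` through the named Pool method and return a list.
    with multiprocessing.Pool(processes=len(values)) as pool:
        if method_name == "map_async":
            return pool.map_async(slow_identity, values).get(timeout=10)
        method = getattr(pool, method_name)  # imap or imap_unordered
        return list(method(slow_identity, values))


if __name__ == "__main__":
    values = [0, 1, 2]
    print("map_async:     ", collect("map_async", values))
    print("imap:          ", collect("imap", values))
    print("imap_unordered:", collect("imap_unordered", values))
```

With these sleep times imap_unordered will usually yield 2 first, but that order depends on scheduling and should not be relied upon.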