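A Producer-Consumer Framework
Purpose
Implement a producer-consumer framework for parallel, distributed processing, so that a distributed service can handle data in real time.
That is, the producer side generates data in real time and the consumer side processes it as soon as it arrives. The work runs in parallel, without the strict one-after-another ordering of a for loop.
The multiprocessing module provides a Process class that can be used to create and manage processes.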
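Limitations of Python Multithreading
The producer-consumer model is usually implemented with multiple threads. In Python (CPython), however, the GIL (global interpreter lock) allows only one thread to execute Python bytecode at a time, so the threads effectively take turns. For CPU-bound work, a multithreaded program therefore cannot exploit a multi-core CPU the way a C++ program can: one core carries the whole load while the other cores sit idle.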
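To see the effect of the GIL on CPU-bound work, the same job can be run once in a thread pool and once in a process pool. The following is only a rough sketch: `count_primes` and `timed_map` are hypothetical helper names, the workload size is arbitrary, and the actual timings depend on the machine and the Python build, so no numbers are claimed here.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, jobs, workers=4):
    """Run count_primes over `jobs` in a pool; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, jobs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [100_000] * 4  # four identical CPU-bound jobs (arbitrary size)
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = timed_map(cls, jobs)
        print("{}: {} in {:.2f}s".format(cls.__name__, results, seconds))
```

On a multi-core machine with a standard CPython build, the process pool would normally be expected to finish noticeably faster than the thread pool, since the threads have to share the GIL.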
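A Typical Python Multiprocessing Example
What really lets Python use multiple CPU cores is multi-process parallelism. Processes are usually created with multiprocessing.Process, and the producer and consumer processes exchange data through multiprocessing.Queue.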
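One detail worth keeping in mind: multiprocessing.Queue pickles each object on put() and unpickles it on get(), so the consumer receives a copy rather than the producer's object. The sketch below imitates that round trip inside a single process with the pickle module; `roundtrip` is a hypothetical helper used only for illustration.

```python
import pickle


def roundtrip(obj):
    """Serialize and restore obj, roughly what a Queue does between processes."""
    return pickle.loads(pickle.dumps(obj))


message = {"num": 3, "payload": [1, 2, 3]}
received = roundtrip(message)

received["payload"].append(4)   # mutate the receiver's copy only
print(received["payload"])      # [1, 2, 3, 4]
print(message["payload"])       # [1, 2, 3], the sender's data is unchanged
```

In practice this means that anything placed on the queue must be picklable, and that changes made by a consumer are not visible to the producer unless they are sent back explicitly.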
Code demo
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import sys
# Make the parent directory importable.
curPath = os.path.abspath(os.path.dirname(__file__))
rootPath = os.path.split(curPath)[0]
sys.path.append(rootPath)
import time
from multiprocessing import Queue as multiQueue
from multiprocessing import Process


class Msg(object):
    def __init__(self, num):
        self.num = num


def producer(msg_queue):
    # Each producer puts five messages on the queue.
    for i in range(5):
        m = Msg(i)
        msg_queue.put(m)
        print('==>producer is processing ==> {}'.format(str(i)))
        time.sleep(0.5)


def consumer1(q, send_q):
    # First stage: take one message, square its number, pass the result on.
    msg = q.get()

    def func1(n):
        return n ** 2

    msg_2 = func1(msg.num)
    print("{} is processed to {} by consumer1".format(msg.num, msg_2))
    send_q.put(msg_2)


def consumer2(send_q):
    # Second stage: take one result from the first stage and subtract 1.
    msg = send_q.get()

    def func2(n):
        return n - 1

    msg_2 = func2(msg)
    print("{} is processed to {} by consumer2".format(msg, msg_2))


if __name__ == '__main__':
    msg_queue = multiQueue()
    send_msg_queue = multiQueue()
    producer_process_list = []
    consumer1_process_list = []
    consumer2_process_list = []
    # 2 producers x 5 messages = 10 messages. Each consumer handles exactly
    # one message, so 10 consumers are created per stage.
    for i in range(2):
        producer_process_list.append(Process(target=producer, args=(msg_queue,)))
    for i in range(10):
        consumer1_process_list.append(Process(target=consumer1, args=(msg_queue, send_msg_queue)))
    for i in range(10):
        consumer2_process_list.append(Process(target=consumer2, args=(send_msg_queue,)))
    # Start all processes.
    for producer_process in producer_process_list:
        producer_process.start()
    for consumer1_process in consumer1_process_list:
        consumer1_process.start()
    for consumer2_process in consumer2_process_list:
        consumer2_process.start()
    # Wait for all processes to finish.
    for producer_process in producer_process_list:
        producer_process.join()
    for consumer1_process in consumer1_process_list:
        consumer1_process.join()
    for consumer2_process in consumer2_process_list:
        consumer2_process.join()
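In this example, the main process creates the producers and consumers, and data is passed between them through Queue. When the program runs, the producer and consumer processes are started and the operating system can schedule them on different cores, so multiple producer processes and multiple consumer processes run in parallel. However, the producer and consumer here are plain functions rather than the classes I had in mind. That is not an object-oriented design, and it only suits very simple tasks. When collecting and processing data from complex devices or objects, an object-oriented design can greatly simplify the program and enable code reuse. Building a producer class and a consumer class is also an important bridge to understanding and applying the producer-consumer mechanism.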
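A further point about the demo above: each consumer calls get() exactly once, so the number of consumer processes has to match the number of messages. A common alternative is a consumer that loops until it receives an end-of-stream marker. The sketch below assumes None as that marker and, to keep it short, runs the producer in the main process; `produce`, `consume` and `run_pipeline` are hypothetical names, not part of the original demo.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # assumed end-of-stream marker


def produce(out_q, count):
    """Put the integers 0..count-1 on out_q."""
    for i in range(count):
        out_q.put(i)


def consume(in_q, out_q):
    """Square items from in_q until the sentinel arrives."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        out_q.put(item ** 2)


def run_pipeline(n_items=20, n_consumers=3):
    """Start the consumers, feed them, and return the sorted results."""
    tasks, results = Queue(), Queue()
    workers = [Process(target=consume, args=(tasks, results))
               for _ in range(n_consumers)]
    for w in workers:
        w.start()
    produce(tasks, n_items)
    for _ in workers:
        tasks.put(SENTINEL)  # one sentinel per consumer
    # Read the results before joining, so no worker is left blocked
    # while flushing its output to the queue.
    collected = [results.get() for _ in range(n_items)]
    for w in workers:
        w.join()
    return sorted(collected)


if __name__ == "__main__":
    print(run_pipeline())
```

With this shape, the number of consumers can be chosen independently of the number of messages; the only requirement is one sentinel per consumer.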
A Producer-Consumer Model Based on Process Subclasses
from multiprocessing import Queue, Process


class producer(Process):  # could serve as a signal-detection or data-producing module
    def __init__(self, name, my_q):
        super(producer, self).__init__()
        self.name = name  # data members are initialized in the constructor
        # More complex objects can also be initialized here,
        # something the plain-function version does not provide.
        self.q = my_q  # queue used to pass data between processes

    def run(self):  # run() is invoked in the child process by Process.start()
        for i in range(1000000):  # produce 1,000,000 items
            info = self.name + "'s doll %s" % str(i)
            # Put the item on the queue
            self.q.put(info)


class consumer(Process):  # could serve as a data-analysis module
    def __init__(self, name, my_q):
        super(consumer, self).__init__()
        self.name = name
        self.q = my_q

    def run(self):
        while True:
            info = self.q.get()
            if info is None:  # None marks the end of the data
                break
            print("%s took %s" % (self.name, info))


if __name__ == '__main__':
    my_q = Queue(10)
    # Build the producer and consumer instances
    producer_process_list = []
    consumer2_process_list = []
    for i in range(2):
        producer_process_list.append(producer('producer' + str(i), my_q))
    for i in range(10):
        consumer2_process_list.append(consumer('consumer' + str(i), my_q))
    for producer_process in producer_process_list:
        producer_process.start()
    for consumer2_process in consumer2_process_list:
        consumer2_process.start()
    for producer_process in producer_process_list:
        producer_process.join()
    # All producers are done: send one None per consumer,
    # so that every consumer leaves its loop and exits.
    for _ in consumer2_process_list:
        my_q.put(None)
    for consumer2_process in consumer2_process_list:
        consumer2_process.join()
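As an illustration of the code reuse that a class-based design allows, the two-stage pipeline from the first demo (square, then subtract 1) could be built from a single reusable class. This is only a sketch under stated assumptions: `Stage`, `square` and `decrement` are hypothetical names, None is assumed as the end-of-stream marker, and each stage runs as a single process so that forwarding one None downstream is enough.

```python
from multiprocessing import Process, Queue


class Stage(Process):
    """Hypothetical reusable stage: apply func to items from in_q, forward to out_q."""

    def __init__(self, func, in_q, out_q):
        super(Stage, self).__init__()
        self.func = func
        self.in_q = in_q
        self.out_q = out_q

    def run(self):
        while True:
            item = self.in_q.get()
            if item is None:          # end marker: pass it downstream and stop
                self.out_q.put(None)
                break
            self.out_q.put(self.func(item))


def square(n):
    return n ** 2


def decrement(n):
    return n - 1


if __name__ == "__main__":
    q1, q2, q3 = Queue(), Queue(), Queue()
    stages = [Stage(square, q1, q2), Stage(decrement, q2, q3)]
    for s in stages:
        s.start()
    for i in range(5):
        q1.put(i)
    q1.put(None)                      # no more input
    for result in iter(q3.get, None): # read until the end marker arrives
        print(result)
    for s in stages:
        s.join()
```

The same class serves both stages; only the function passed to the constructor differs. Running several processes per stage would need one end marker per process, which this sketch leaves out.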