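Contents
- Preface
- Test
- 1. Master / Server
- 2. Slave / Client
- Run Log
- GitLab
- References
Preface
After studying multiprocessing.managers, I wrote a small socket-based distributed computing demo. In this demo, the master generates the integers 0 to 19 (range(0, 20)) and puts them on the task queue. Slaves on the cluster network take numbers from the task queue, sum them in small batches, and put each sum on the result queue. The master then prints every element it reads from the result queue.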
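The data flow described above can be sketched in a single process, without any networking. This is only an illustration under stated assumptions: `run_pipeline` and `batch_size` are hypothetical names, a single worker drains the queue, and the batch size of 3 mirrors the `queue.Queue(3)` buffer used in client.py.

```python
try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2.7, as used in this demo


def run_pipeline(numbers, batch_size=3):
    """Single-process sketch: master fills tasks, one worker sums them in batches."""
    task_queue = queue.Queue()
    result_queue = queue.Queue()

    # Master side: put every number on the task queue.
    for n in numbers:
        task_queue.put(n)

    # Worker side: collect up to batch_size numbers, then push their sum.
    batch = []
    while not task_queue.empty():
        batch.append(task_queue.get())
        if len(batch) == batch_size:
            result_queue.put(sum(batch))
            batch = []
    if batch:  # flush the last, possibly shorter, batch
        result_queue.put(sum(batch))

    # Master side: read back every partial sum.
    results = []
    while not result_queue.empty():
        results.append(result_queue.get())
    return results


partial_sums = run_pipeline(range(20))
print(partial_sums)
```

With several real clients the numbers are interleaved between workers, so the individual partial sums differ from this single-worker run.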
Test
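- The demo runs locally, using multiple processes to simulate a distributed environment. To run it across different machines, change the local loopback IP in client.py to the IP of the server/master machine.
- Run server.py first and then client.py, client_2.py, client_3.py …; starting client.py first and then server.py also works.
Every client runs the same code.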
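One way to avoid editing client.py for each machine is to read the master's address from the command line. The sketch below is an assumption, not part of the original demo: `parse_address`, the `--host` and `--port` flags, and the address 192.168.1.10 are all placeholders chosen for illustration.

```python
import argparse


def parse_address(argv=None):
    """Return (host, port) for the master; defaults match the demo's local setup."""
    parser = argparse.ArgumentParser(description="Connect a client to the master")
    parser.add_argument("--host", default="127.0.0.1",
                        help="IP of the server/master machine")
    parser.add_argument("--port", type=int, default=5000,
                        help="port the master listens on")
    args = parser.parse_args(argv)
    return args.host, args.port


# Local run: no arguments, so the loopback address is used.
local_address = parse_address([])
# Remote run: placeholder IP standing in for the real master machine.
remote_address = parse_address(["--host", "192.168.1.10"])
print(local_address, remote_address)
```

The returned pair could then be passed to `start_worker(host, port, authkey)` in place of the hard-coded values.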
1. Master / Server
# server.py
# -*- coding:utf-8 -*-
# Multi-process distributed demo
# Server side
# How the master works: the managers module exposes the Queues over the network,
# so processes on other machines can access them.
# The server process creates the Queues, registers them on the network,
# and then writes tasks into the task Queue. The code is as follows:
import time
import Queue as queue  # Python 2.7 module name
from multiprocessing.managers import BaseManager
from jc import utils  # private utility library; this import can be removed if mlog is replaced

# Initialize the custom logger
mlog = utils.my_logger("Server")

# Queue for sending tasks
task_queue = queue.Queue()
# Queue for receiving results
result_queue = queue.Queue()

# Use a regular function instead of a lambda, because pickle in Python 2.7 cannot serialize lambdas
def get_task_queue():
    global task_queue
    return task_queue

# Use a regular function instead of a lambda, because pickle in Python 2.7 cannot serialize lambdas
def get_result_queue():
    global result_queue
    return result_queue

def startManager(host, port, authkey):
    # Register both Queues on the network. The callable argument ties each name to its Queue;
    # note that the function is passed without parentheses.
    BaseManager.register('get_task_queue', callable=get_task_queue)
    BaseManager.register('get_result_queue', callable=get_result_queue)
    # Bind to host and port, and set the authentication key to authkey
    manager = BaseManager(address=(host, port), authkey=authkey)
    # Start the manager server
    manager.start()
    return manager

def put_queue(manager, objs):
    # Access the queue over the network
    task = manager.get_task_queue()
    for obj in objs:
        try:
            #print("Put obj:{}".format(obj))
            mlog.info("Put obj:{}".format(obj))
            task.put(obj)
            time.sleep(1)
        except queue.Full:
            mlog.info("put_queue task full.exit ")
            break

def get_result(worker):
    # Access the queue over the network
    result = worker.get_result_queue()
    while True:
        try:
            n = result.get(timeout=10)
            mlog.info("Result get {}".format(n))
            time.sleep(1)
        except queue.Empty:
            mlog.info("get_result result empty...retring")
            continue

if __name__ == "__main__":
    host = '127.0.0.1'
    port = 5000
    authkey = b'abc'
    # Start the manager server
    manager = startManager(host, port, authkey)
    # Data
    data = range(0, 20, 1)
    # Add the data to the task queue
    put_queue(manager, data)
    #get_queue(manager)
    get_result(manager)
    # Shut down the server. get_result loops forever, so this line is
    # only reached if that loop is changed to exit.
    manager.shutdown()
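To see the register/connect mechanism from server.py in isolation, the following sketch runs both sides in one process. It is not the author's setup: `DemoManager` is a hypothetical subclass, the server is run with `get_server().serve_forever()` in a background thread instead of `manager.start()`, and port 0 (any free port) replaces the demo's fixed port 5000.

```python
import threading
from multiprocessing.managers import BaseManager

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2.7, as in server.py


class DemoManager(BaseManager):
    """Hypothetical subclass so the registration stays off BaseManager itself."""


task_queue = queue.Queue()


def get_task_queue():
    # Regular function (not a lambda), following the demo's convention.
    return task_queue


DemoManager.register('get_task_queue', callable=get_task_queue)

# Server side: port 0 asks the OS for any free port (an assumption for this sketch).
server_manager = DemoManager(address=('127.0.0.1', 0), authkey=b'abc')
server = server_manager.get_server()
server_thread = threading.Thread(target=server.serve_forever)
server_thread.daemon = True  # do not keep the interpreter alive for the server
server_thread.start()

# Client side: same address and authkey, then use the queue through its proxy.
client_manager = DemoManager(address=server.address, authkey=b'abc')
client_manager.connect()
remote_tasks = client_manager.get_task_queue()
for n in range(3):
    remote_tasks.put(n)
received = [remote_tasks.get(timeout=5) for _ in range(3)]
print(received)
```

The client never touches `task_queue` directly; every `put` and `get` goes through the proxy returned by `get_task_queue()`, which is the same path the real clients use over the network.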
2. Slave / Client
# client.py
# -*- coding:utf-8 -*-
# In a distributed multi-process environment, tasks must not be added by operating on the
# original task_queue directly, because that bypasses the manager's wrapper. They must be
# added through the Queue interface returned by manager.get_task_queue().
import time
import Queue as queue  # Python 2.7 module name
from multiprocessing.managers import BaseManager
from jc import utils  # private utility library

# Initialize the custom logger
mlog = utils.my_logger("Client")
# Local buffer that holds up to 3 numbers before they are summed
cal_queue = queue.Queue(3)

def start_worker(host, port, authkey):
    # This BaseManager only fetches the queues over the network, so only the names are registered
    BaseManager.register('get_task_queue')
    BaseManager.register('get_result_queue')
    mlog.info('Connect to server %s' % host)
    # Note: port and authkey must match the manager server's settings exactly
    worker = BaseManager(address=(host, port), authkey=authkey)
    # Connect to the manager server
    try:
        worker.connect()
    except Exception as e:
        mlog.exception(e)
        mlog.info("Tring reconnection...")
        time.sleep(1)
        # Return the retried connection so the caller receives a connected worker
        return start_worker(host, port, authkey)
    else:
        mlog.info('Connecting server %s' % host)
    return worker

def get_queue(worker):
    if not worker:
        mlog.info("worker is None, exit")
        return
    task = worker.get_task_queue()
    result = worker.get_result_queue()
    # Take numbers from the task queue, sum them in batches,
    # and put each sum on the result queue
    tag = 0
    while True:
        tag = tag + 1
        time.sleep(1)
        # Flush when the buffer is full, or after more than 3 iterations with a non-empty buffer
        if cal_queue.full() or (tag > 3 and not cal_queue.empty()):
            cal_sum = 0
            while not cal_queue.empty():
                cal_sum += cal_queue.get()
            result.put(cal_sum)
            mlog.info('result put %d' % cal_sum)
            tag = 0
        try:
            n = task.get(timeout=10)
            mlog.info('worker get %d' % n)
            cal_queue.put(n)
        except queue.Empty:
            mlog.info("get_queue task empty...retring")
            continue
        except queue.Full:
            mlog.info("put_cal_queue task full...waiting")
            continue

if __name__ == "__main__":
    host = '127.0.0.1'
    port = 5000
    authkey = b'abc'
    # Start the worker
    worker = start_worker(host, port, authkey)
    # Process the queues
    get_queue(worker)
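The flush rule in `get_queue` (sum the buffer when it is full, or after more than 3 loop iterations with a non-empty buffer) can be replayed without sleeps or sockets. In this sketch `simulate_client` is a hypothetical helper, and `None` stands in for a `task.get` call that timed out. The arrival sequence is the one client 1 reports in the run log.

```python
def simulate_client(arrivals, capacity=3, max_ticks=3):
    """Replay the client's loop: returns (flushed sums, numbers still buffered)."""
    buffer, results, ticks = [], [], 0
    for item in arrivals:
        ticks += 1
        # Same condition as client.py: buffer full, or too many iterations
        # since the last flush while something is buffered.
        if len(buffer) == capacity or (ticks > max_ticks and buffer):
            results.append(sum(buffer))
            buffer = []
            ticks = 0
        if item is not None:  # None models a timed-out task.get()
            buffer.append(item)
    return results, buffer


# Numbers received by client 1 in the run log, followed by three timeouts.
client_1_arrivals = [0, 2, 4, 6, 7, 12, 13, 16, 17, 18, 19, None, None, None]
sums, leftover = simulate_client(client_1_arrivals)
print(sums, leftover)
```

The replay shows why the last two numbers, 18 and 19, are only flushed after the task queue has been empty for a while: the counter has to pass 3 before a partially filled buffer is summed.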
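Run Log
Master 1x + Slave 2x
The results the master gets are partial sums. The master does not aggregate them into a single total for 0 through 19; this is expected.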
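Although the master does not total the results, the partial sums can be checked by hand. The short sketch below copies the seven values from the "Result get" lines of the server log and compares their sum with the sum of the task data; the variable names are illustrative only.

```python
# Partial sums as printed by the server's "Result get" log lines.
server_results = [6, 9, 27, 25, 40, 46, 37]

total = sum(server_results)
expected = sum(range(0, 20, 1))  # the task data put on the queue by server.py

print(total, expected)
```

Both values agree, so no task was lost or counted twice across the two clients.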
- server:
/usr/bin/python2.7 /Users/gdlocal1/PycharmProjects/test/test.py
2019-08-27 18:36:23,439 - Server:put_queue - INFO - Put obj:0
2019-08-27 18:36:24,444 - Server:put_queue - INFO - Put obj:1
2019-08-27 18:36:25,448 - Server:put_queue - INFO - Put obj:2
2019-08-27 18:36:26,453 - Server:put_queue - INFO - Put obj:3
2019-08-27 18:36:27,457 - Server:put_queue - INFO - Put obj:4
2019-08-27 18:36:28,461 - Server:put_queue - INFO - Put obj:5
2019-08-27 18:36:29,466 - Server:put_queue - INFO - Put obj:6
2019-08-27 18:36:30,467 - Server:put_queue - INFO - Put obj:7
2019-08-27 18:36:31,471 - Server:put_queue - INFO - Put obj:8
2019-08-27 18:36:32,476 - Server:put_queue - INFO - Put obj:9
2019-08-27 18:36:33,479 - Server:put_queue - INFO - Put obj:10
2019-08-27 18:36:34,484 - Server:put_queue - INFO - Put obj:11
2019-08-27 18:36:35,488 - Server:put_queue - INFO - Put obj:12
2019-08-27 18:36:36,492 - Server:put_queue - INFO - Put obj:13
2019-08-27 18:36:37,497 - Server:put_queue - INFO - Put obj:14
2019-08-27 18:36:38,497 - Server:put_queue - INFO - Put obj:15
2019-08-27 18:36:39,502 - Server:put_queue - INFO - Put obj:16
2019-08-27 18:36:40,507 - Server:put_queue - INFO - Put obj:17
2019-08-27 18:36:41,509 - Server:put_queue - INFO - Put obj:18
2019-08-27 18:36:42,512 - Server:put_queue - INFO - Put obj:19
2019-08-27 18:36:43,518 - Server:get_result - INFO - Result get 6
2019-08-27 18:36:44,521 - Server:get_result - INFO - Result get 9
2019-08-27 18:36:45,525 - Server:get_result - INFO - Result get 27
2019-08-27 18:36:46,528 - Server:get_result - INFO - Result get 25
2019-08-27 18:36:47,533 - Server:get_result - INFO - Result get 40
2019-08-27 18:36:48,537 - Server:get_result - INFO - Result get 46
2019-08-27 18:36:59,545 - Server:get_result - INFO - get_result result empty...retring
2019-08-27 18:37:05,564 - Server:get_result - INFO - Result get 37
2019-08-27 18:37:16,569 - Server:get_result - INFO - get_result result empty...retring
- client 1:
2019-08-27 18:36:20,573 - Client_1:start_worker - INFO - Connect to server 127.0.0.1
2019-08-27 18:36:23,504 - Client_1:start_worker - INFO - Connecting server 127.0.0.1
2019-08-27 18:36:24,514 - Client_1:get_queue - INFO - worker get 0
2019-08-27 18:36:25,517 - Client_1:get_queue - INFO - worker get 2
2019-08-27 18:36:27,505 - Client_1:get_queue - INFO - worker get 4
2019-08-27 18:36:28,507 - Client_1:get_queue - INFO - result put 6
2019-08-27 18:36:29,485 - Client_1:get_queue - INFO - worker get 6
2019-08-27 18:36:30,489 - Client_1:get_queue - INFO - worker get 7
2019-08-27 18:36:35,501 - Client_1:get_queue - INFO - worker get 12
2019-08-27 18:36:36,506 - Client_1:get_queue - INFO - result put 25
2019-08-27 18:36:36,507 - Client_1:get_queue - INFO - worker get 13
2019-08-27 18:36:39,508 - Client_1:get_queue - INFO - worker get 16
2019-08-27 18:36:40,513 - Client_1:get_queue - INFO - worker get 17
2019-08-27 18:36:41,517 - Client_1:get_queue - INFO - result put 46
2019-08-27 18:36:41,518 - Client_1:get_queue - INFO - worker get 18
2019-08-27 18:36:42,519 - Client_1:get_queue - INFO - worker get 19
2019-08-27 18:36:53,524 - Client_1:get_queue - INFO - get_queue task empty...retring
2019-08-27 18:37:04,533 - Client_1:get_queue - INFO - get_queue task empty...retring
2019-08-27 18:37:05,538 - Client_1:get_queue - INFO - result put 37
2019-08-27 18:37:15,541 - Client_1:get_queue - INFO - get_queue task empty...retring
2019-08-27 18:37:26,547 - Client_1:get_queue - INFO - get_queue task empty...retring
- client 2:
2019-08-27 18:36:16,474 - Client_2:start_worker - INFO - Connect to server 127.0.0.1
2019-08-27 18:36:23,516 - Client_2:start_worker - INFO - Connecting server 127.0.0.1
2019-08-27 18:36:24,526 - Client_2:get_queue - INFO - worker get 1
2019-08-27 18:36:26,508 - Client_2:get_queue - INFO - worker get 3
2019-08-27 18:36:28,482 - Client_2:get_queue - INFO - worker get 5
2019-08-27 18:36:29,487 - Client_2:get_queue - INFO - result put 9
2019-08-27 18:36:31,483 - Client_2:get_queue - INFO - worker get 8
2019-08-27 18:36:32,484 - Client_2:get_queue - INFO - worker get 9
2019-08-27 18:36:33,485 - Client_2:get_queue - INFO - worker get 10
2019-08-27 18:36:34,486 - Client_2:get_queue - INFO - result put 27
2019-08-27 18:36:34,486 - Client_2:get_queue - INFO - worker get 11
2019-08-27 18:36:37,503 - Client_2:get_queue - INFO - worker get 14
2019-08-27 18:36:38,505 - Client_2:get_queue - INFO - worker get 15
2019-08-27 18:36:39,509 - Client_2:get_queue - INFO - result put 40
2019-08-27 18:36:49,513 - Client_2:get_queue - INFO - get_queue task empty...retring
GitLab