python 多线程多 CPU UBantu python多线程能并行吗

转载

西门吹雪 2024-04-10 22:09:51

文章标签 数据进程池 Code 文章分类 Python 后端开发

并发 & 并行

并发：是指系统具有处理多个任务的能力

并行：是指系统具有同时处理多个任务的能力

并行是并发的一个子集

同步 & 异步

同步：当进程执行到一个I/O（等待外部数据的时候）----------等：同步

异步：　　　　　　　　　　　　　　　　　　　　 ----------不等，直到接收到数据再回来执行

GIL：全局解释锁

　　因为有GIL锁，所以同一时刻，只有一个线程被CPU执行

任务：IO密集型

　　　计算密集型

对于IO密集型：Python 的多线程是有意义的

　　　　　　　可以采用多进程+协程

对于计算密集型：Python 的多线程就不推荐了，不适用了。

线程锁(互斥锁Mutex)

一个进程下可以启动多个线程，多个线程共享父进程的内存空间，也就意味着每个线程可以访问同一份数据，此时，如果2个线程同时要修改同一份数据，会出现什么状况？

import
time

import threading

def addNum():

global num #在每个线程中都获取这个全局变量

print('--get num:',num )

time.sleep(1)

lock.acquire() #修改数据前加锁

num -=1 #对此公共变量进行-1操作

lock.release() #修改后释放

num = 100 #设定一个共享变量

thread_list = []

lock = threading.Lock() #生成全局锁

for i in range(100):

t = threading.Thread(target=addNum)

t.start()

thread_list.append(t)

for t in thread_list: #等待所有线程执行完毕

t.join()

print('final num:', num )

RLock（递归锁）

说白了就是在一个大锁中还要再包含子锁

python 多线程多 CPU UBantu python多线程能并行吗_进程池

python 多线程多 CPU UBantu python多线程能并行吗_进程池_02

import threading,time
 
def run1():
    print("grab the first part data")
    lock.acquire()
    global num
    num +=1
    lock.release()
    return num
def run2():
    print("grab the second part data")
    lock.acquire()
    global  num2
    num2+=1
    lock.release()
    return num2
def run3():
    lock.acquire()
    res = run1()
    print('--------between run1 and run2-----')
    res2 = run2()
    lock.release()
    print(res,res2)
 
 
if __name__ == '__main__':
 
    num,num2 = 0,0
    lock = threading.RLock()
    for i in range(10):
        t = threading.Thread(target=run3)
        t.start()
 
while threading.active_count() != 1:
    print(threading.active_count())
else:
    print('----all threads done---')
    print(num,num2)

View Code

Semaphore(信号量)

互斥锁同时只允许一个线程更改数据，而Semaphore是同时允许一定数量的线程更改数据，比如厕所有3个坑，那最多只允许3个人上厕所，后面的人只能等里面有人出来了才能再进去。

python 多线程多 CPU UBantu python多线程能并行吗_进程池

python 多线程多 CPU UBantu python多线程能并行吗_进程池_02

import threading,time
 
def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s\n" %n)
    semaphore.release()
 
if __name__ == '__main__':
 
    num= 0
    semaphore  = threading.BoundedSemaphore(5) #最多允许5个线程同时运行
    for i in range(20):
        t = threading.Thread(target=run,args=(i,))
        t.start()
 
while threading.active_count() != 1:
    pass #print threading.active_count()
else:
    print('----all threads done---')
    print(num)

View Code

Event

通过Event来实现两个或多个线程间的交互，下面是一个红绿灯的例子，即起动一个线程做交通指挥灯，生成几个线程做车辆，车辆行驶按红灯停，绿灯行的规则。

python 多线程多 CPU UBantu python多线程能并行吗_进程池

python 多线程多 CPU UBantu python多线程能并行吗_进程池_02

import threading,time
import random
def light():
    if not event.isSet():
        event.set() #wait就不阻塞 #绿灯状态
    count = 0
    while True:
        if count < 10:
            print('\033[42;1m--green light on---\033[0m')
        elif count <13:
            print('\033[43;1m--yellow light on---\033[0m')
        elif count <20:
            if event.isSet():
                event.clear()
            print('\033[41;1m--red light on---\033[0m')
        else:
            count = 0
            event.set() #打开绿灯
        time.sleep(1)
        count +=1
def car(n):
    while 1:
        time.sleep(random.randrange(10))
        if  event.isSet(): #绿灯
            print("car [%s] is running.." % n)
        else:
            print("car [%s] is waiting for the red light.." %n)
if __name__ == '__main__':
    event = threading.Event()
    Light = threading.Thread(target=light)
    Light.start()
    for i in range(3):
        t = threading.Thread(target=car,args=(i,))
        t.start()

View Code

队列queue （重点）

1,queue.Queue() FIFO队列-先进先出

2,queue.LifoQueue() LIFO队列，先进后出

3，queue.PriorityQueue() 按照优先级，越低越先出

方法：

q.size 返回队列大小

q.empty() 如果队列为空返回true

q.full() 如果队列为满返回true

q.get () 里面可以设置block 的t,f

q.put()

q.join()实际是队列为空时再执行别的操作

生产者消费者模型

在并发编程中使用生产者和消费者模式能够解决绝大多数并发问题。该模式通过平衡生产线程和消费线程的工作能力来提高程序的整体处理数据的速度。

为什么要使用生产者和消费者模式

在线程世界里，生产者就是生产数据的线程，消费者就是消费数据的线程。在多线程开发当中，如果生产者处理速度很快，而消费者处理速度很慢，那么生产者就必须等待消费者处理完，才能继续生产数据。同样的道理，如果消费者的处理能力大于生产者，那么消费者就必须等待生产者。为了解决这个问题于是引入了生产者和消费者模式。

什么是生产者消费者模式

生产者消费者模式是通过一个容器来解决生产者和消费者的强耦合问题。生产者和消费者彼此之间不直接通讯，而通过阻塞队列来进行通讯，所以生产者生产完数据之后不用等待消费者处理，直接扔给阻塞队列，消费者不找生产者要数据，而是直接从阻塞队列里取，阻塞队列就相当于一个缓冲区，平衡了生产者和消费者的处理能力。

python 多线程多 CPU UBantu python多线程能并行吗_进程池

python 多线程多 CPU UBantu python多线程能并行吗_进程池_02

import threading
import queue
 
def producer():
    for i in range(10):
        q.put("骨头 %s" % i )
 
    print("开始等待所有的骨头被取走...")
    q.join()
    print("所有的骨头被取完了...")
 
 
def consumer(n):
 
    while q.qsize() >0:
 
        print("%s 取到" %n  , q.get())
        q.task_done() #告知这个任务执行完了
 
 
q = queue.Queue()
 
 
 
p = threading.Thread(target=producer,)
p.start()
 
c1 = consumer("李闯")

View Code

python 多线程多 CPU UBantu python多线程能并行吗_进程池

python 多线程多 CPU UBantu python多线程能并行吗_进程池_02

import time,random
import queue,threading
q = queue.Queue()
def Producer(name):
  count = 0
  while count <20:
    time.sleep(random.randrange(3))
    q.put(count)
    print('Producer %s has produced %s baozi..' %(name, count))
    count +=1
def Consumer(name):
  count = 0
  while count <20:
    time.sleep(random.randrange(4))
    if not q.empty():
        data = q.get()
        print(data)
        print('\033[32;1mConsumer %s has eat %s baozi...\033[0m' %(name, data))
    else:
        print("-----no baozi anymore----")
    count +=1
p1 = threading.Thread(target=Producer, args=('A',))
c1 = threading.Thread(target=Consumer, args=('B',))
p1.start()
c1.start()

View Code

多进程multiprocessing

python 多线程多 CPU UBantu python多线程能并行吗_进程池

python 多线程多 CPU UBantu python多线程能并行吗_进程池_02

from multiprocessing import Process
import os
 
def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print("\n\n")
 
def f(name):
    info('\033[31;1mfunction f\033[0m')
    print('hello', name)
 
if __name__ == '__main__':
    info('\033[32;1mmain process line\033[0m')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

View Code

进程间通讯　　

不同进程间内存是不共享的，要想实现两个进程间的数据交换，可以用以下方法：

Queues

使用方法跟threading里的queue差不多

from multiprocessing import Process, Queue
 
def f(q):
    q.put([42, None, 'hello'])
 
if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())    # prints "[42, None, 'hello']"
    p.join()

from multiprocessing import Process, Pipe
 
def f(conn):
    conn.send([42, None, 'hello'])
    conn.close()
 
if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())   # prints "[42, None, 'hello']"
    p.join()

Managers

Manager返回的Manager对象（）控制一个承载Python对象的服务器进程，并允许其他进程使用代理来操纵它们。

经理返回的经理（）将支持类型 list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier, Queue, Value and Array

from multiprocessing import Process, Manager
 
def f(d, l):
    d[1] = '1'
    d['2'] = 2
    d[0.25] = None
    l.append(1)
    print(l)
 
if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()
 
        l = manager.list(range(5))
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d, l))
            p.start()
            p_list.append(p)
        for res in p_list:
            res.join()
 
        print(d)
        print(l)

进程池　　

进程池内部维护一个进程序列，当使用时，则去进程池中获取一个进程，如果进程池序列中没有可供使用的进进程，那么程序就会等待，直到进程池中有可用进程为止。

进程池中有两个方法：

apply
apply_async

from  multiprocessing import Process,Pool
import time
 
def Foo(i):
    time.sleep(2)
    return i+100
 
def Bar(arg):
    print('-->exec done:',arg)
 
pool = Pool(5)
 
for i in range(10):
    pool.apply_async(func=Foo, args=(i,),callback=Bar)
    #pool.apply(func=Foo, args=(i,))
 
print('end')
pool.close()
pool.join()#进程池中进程执行完毕后再关闭，如果注释，那么程序直接关闭。

本文章为转载内容，我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题，欢迎原作者联系我们进行内容更正或删除文章。