
Background

There are a few pitfalls when measuring a model's inference time. One is torch.cuda.synchronize(): PyTorch executes CUDA operations asynchronously, so this call blocks the host until all pending work on the GPU has finished, ensuring that the code that follows (e.g., stopping the timer) only runs once the GPU is actually done [1].
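A minimal sketch of the pitfall, assuming a CUDA device is available (the Conv2d layer and input shape below are arbitrary placeholders, not from the original post): without synchronization the timer often stops before the GPU has finished, so the naive number can be far smaller than the real cost.

import time

import torch

net = torch.nn.Conv2d(3, 64, 3, padding=1).cuda()
x = torch.randn(8, 3, 224, 224, device='cuda')

# Naive timing: without synchronization this mostly measures the time to
# *launch* the CUDA kernels, not the time to execute them.
start = time.perf_counter()
with torch.no_grad():
    net(x)
naive = time.perf_counter() - start

# Correct timing: synchronize before and after, so the measured interval
# covers the actual GPU execution.
torch.cuda.synchronize()
start = time.perf_counter()
with torch.no_grad():
    net(x)
torch.cuda.synchronize()
synced = time.perf_counter() - start

print(f'naive: {naive * 1000:.2f} ms, synchronized: {synced * 1000:.2f} ms')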

Code

The function [2]:

import time

import torch


def measure_inference_speed(model, data, max_iter=200, log_interval=50):
    model.eval()

    # the first several iterations may be very slow, so skip them
    num_warmup = 5
    pure_inf_time = 0
    fps = 0

    # benchmark for max_iter iterations and take the average
    for i in range(max_iter):

        torch.cuda.synchronize()
        start_time = time.perf_counter()

        with torch.no_grad():
            model(*data)

        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start_time

        if i >= num_warmup:
            pure_inf_time += elapsed
            if (i + 1) % log_interval == 0:
                fps = (i + 1 - num_warmup) / pure_inf_time
                print(
                    f'Done image [{i + 1:<3}/ {max_iter}], '
                    f'fps: {fps:.1f} img / s, '
                    f'time per image: {1000 / fps:.1f} ms / img',
                    flush=True)

        if (i + 1) == max_iter:
            fps = (i + 1 - num_warmup) / pure_inf_time
            print(
                f'Overall fps: {fps:.1f} img / s, '
                f'time per image: {1000 / fps:.1f} ms / img',
                flush=True)
            break
    return fps
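An alternative way to time a single forward pass, not from reference [2] but a common technique, is torch.cuda.Event, which records timestamps on the CUDA stream itself instead of on the host clock. A sketch:

import torch

def time_one_forward(model, data):
    # CUDA events with timing enabled record timestamps on the GPU stream.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)

    start.record()
    with torch.no_grad():
        model(*data)
    end.record()

    # Wait for all queued work (including the `end` event) to complete
    # before reading the elapsed time.
    torch.cuda.synchronize()
    return start.elapsed_time(end)  # milliseconds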

Invocation [2]:

import torch

net = net.cuda()  # `net` is your model instance; measure_inference_speed is defined above
data = torch.randn((1, 6, 128, 128)).cuda()
measure_inference_speed(net, (data,))
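For a fully self-contained run, any concrete model can be substituted; the torchvision resnet18 and the (1, 3, 224, 224) input below are illustrative stand-ins, not from the original post:

import torch
import torchvision

net = torchvision.models.resnet18().cuda()
data = torch.randn((1, 3, 224, 224)).cuda()
measure_inference_speed(net, (data,))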