Python VSM entity disambiguation: Mojo vs. Python

Reprinted

编程思想者 2024-07-04 19:00:53


Contents

Preface
1. What is Euclidean distance
2. Test code
3. Test results
Summary

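Preface

There are countless programming languages; which one is your favorite? My pick is Mojo (as in a magic spell), because Mojo can turn Python (the snake) into a dragon. This chapter compares Mojo and Python to see which is faster. The Mojo version used is 0.4.0.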

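1. What is Euclidean distance

The figure shows two points, one blue and one red. What is the Euclidean distance between them? The answer is very simple.

[Figure: two points in the plane, one blue and one red]

Pretty simple, right?

[Figure: the distance calculation in two dimensions]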
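As a small worked example of the two-dimensional case, the sketch below computes the distance both from the definition and with the standard library. The coordinates are hypothetical values chosen for illustration; they are not read from the figure.

```python
import math

# Hypothetical coordinates chosen for illustration; they are not taken from the figure.
blue = (1.0, 2.0)
red = (4.0, 6.0)

# Distance from the definition: square root of the summed squared differences.
dx = blue[0] - red[0]
dy = blue[1] - red[1]
by_hand = math.sqrt(dx * dx + dy * dy)

# math.dist (Python 3.8+) computes the same quantity for points of any dimension.
builtin = math.dist(blue, red)

print(by_hand, builtin)
```

Both values come out as 5.0 here, because the coordinate differences form a 3-4-5 right triangle.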

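The calculation above covers only two-dimensional space. What about three dimensions, or a space with n dimensions? The method is exactly the same.

[Formula: Euclidean distance in n dimensions]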
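For reference, the standard definition of Euclidean distance can be written as follows. The symbols are assumptions chosen to match the variable names in the test code (two points a and b with n coordinates each), not notation taken from the original formula images.

```latex
% Two dimensions: points a = (a_1, a_2) and b = (b_1, b_2)
d(a, b) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}

% n dimensions: the same sum, extended over all n coordinates
d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}
```

The loop in each test below accumulates exactly this sum of squared differences and takes the square root at the end.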

2. Test code

Euclidean distance in pure Python.

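%%python
import numpy as np
from timeit import timeit

np.random.seed(100)  # fix the random seed so the random data is the same on every run

def naive_euclidean_distance(a, b):
    # Accumulate the squared differences element by element, then take the square root
    distance = 0
    for i in range(len(a)):
        distance += (a[i] - b[i])**2
    return np.sqrt(distance)

points = 1000000  # large enough to make the run time easy to measure
a = np.random.randn(points).tolist()  # generate an array, then convert it to a list
b = np.random.randn(points).tolist()  # generate an array, then convert it to a list
distance = naive_euclidean_distance(a, b)  # compute the Euclidean distance
secs = timeit(lambda: naive_euclidean_distance(a, b), number=10) / 10  # average time over 10 runs, in seconds
ms = secs * 1000  # milliseconds
print("The euclidean distance is {:.2f}, spent {:.2f} ms.".format(distance, ms))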

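%%python means that the cell is executed by the Python interpreter even though the Jupyter notebook is running the Mojo kernel, so the code behaves exactly as it would in Python. This way we do not need two separate environments for the test.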
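The timing in the cell above divides the total time of 10 calls by 10. As a hedged variation on that pattern, the sketch below uses `timeit.repeat` to collect several such batches and reports both the mean and the best per-call time. The input size and the batch count are assumptions chosen so the example runs quickly, and `** 0.5` replaces `np.sqrt` only to keep the sketch free of dependencies.

```python
from timeit import repeat

def naive_euclidean_distance(a, b):
    # Same loop as the test cell, using ** 0.5 instead of np.sqrt to stay dependency-free.
    distance = 0.0
    for i in range(len(a)):
        distance += (a[i] - b[i]) ** 2
    return distance ** 0.5

# Small hypothetical inputs so the sketch runs quickly; the article uses 1,000,000 points.
a = [float(i) for i in range(10_000)]
b = [float(i) + 1.0 for i in range(10_000)]

# Five batches of 10 calls each; every entry is the total time of one batch in seconds.
batches = repeat(lambda: naive_euclidean_distance(a, b), number=10, repeat=5)
per_call_ms = [t / 10 * 1000 for t in batches]

mean_ms = sum(per_call_ms) / len(per_call_ms)
best_ms = min(per_call_ms)
print("mean {:.3f} ms, best {:.3f} ms".format(mean_ms, best_ms))
```

The best time is usually the less noisy figure, since background activity on the machine can only make a run slower.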

Euclidean distance with numpy.

numpy provides a function that can be used to compute the Euclidean distance: linalg.norm().
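For a one-dimensional array and default arguments, `np.linalg.norm` returns the 2-norm, so applying it to the difference `a - b` gives the Euclidean distance. The following is a pure-Python sketch of that computation, not NumPy's implementation; `l2_norm` is a hypothetical helper name introduced here for illustration.

```python
import math

def l2_norm(v):
    # Square root of the sum of squares: the 2-norm of a 1-D vector.
    return math.sqrt(sum(x * x for x in v))

a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]

# Elementwise difference, which NumPy performs directly on arrays with a - b.
diff = [x - y for x, y in zip(a, b)]

print(l2_norm(diff))
```

NumPy runs the same arithmetic in compiled code over whole arrays, avoiding a Python-level loop over the elements.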

%%python
import numpy as np
from timeit import timeit

np.random.seed(100)  # fix the random seed so the random data is the same on every run

def numpy_euclidean_distance(a, b):
    # The 2-norm of the difference vector is the Euclidean distance
    return np.linalg.norm(a - b)

points = 1000000  # large enough to make the run time easy to measure
a = np.random.randn(points)
b = np.random.randn(points)
distance = numpy_euclidean_distance(a, b)  # compute the Euclidean distance
secs = timeit(lambda: numpy_euclidean_distance(a, b), number=10) / 10  # average time over 10 runs, in seconds
ms = secs * 1000  # milliseconds
print("The euclidean distance is {} and spent {} ms.".format(distance, ms))

Euclidean distance in Mojo.

from tensor import Tensor
from python import Python
from math import sqrt
from time import now
let np = Python.import_module("numpy")
np.random.seed(100)  # fix the random seed so the random data is the same on every run
def mojo_euclidean_distance(a: Tensor[DType.float64], b: Tensor[DType.float64]) -> Float64:
    var distance: Float64 = 0.0
    points = a.num_elements()
    for i in range(points):
        dist = (a[i] - b[i])
        distance += dist * dist
    return sqrt(distance)

let points: Int = 1000000
let a = np.random.randn(points)
let b = np.random.randn(points)

# Copy the numpy data into Mojo tensors before timing
var alist = Tensor[DType.float64](points)
var blist = Tensor[DType.float64](points)
for i in range(points):
    alist[i] = a[i].to_float64()
    blist[i] = b[i].to_float64()
let start = now()  # get the current time
let distance = mojo_euclidean_distance(alist, blist)
let end = now()
print("The euclidean distance is", distance, "and spent ", (end-start)/1e6, "ms")
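The Mojo cell times a single call with `now()`, while the Python cells average 10 runs with `timeit`. For a like-for-like measurement, the Python side can be timed in the same single-shot way. The sketch below assumes that `now()` returns nanoseconds (as the division by 1e6 implies) and uses `time.perf_counter_ns` as the Python counterpart. The input size is a hypothetical value chosen so the example runs quickly.

```python
import math
import time

def naive_euclidean_distance(a, b):
    # Same accumulation as the pure-Python test, with math.sqrt for the final step.
    distance = 0.0
    for i in range(len(a)):
        diff = a[i] - b[i]
        distance += diff * diff
    return math.sqrt(distance)

# Small hypothetical inputs; the article uses 1,000,000 random points.
a = [0.0] * 100_000
b = [1.0] * 100_000

start = time.perf_counter_ns()   # integer nanoseconds
distance = naive_euclidean_distance(a, b)
end = time.perf_counter_ns()

elapsed_ms = (end - start) / 1e6  # nanoseconds -> milliseconds
print("The euclidean distance is", distance, "and spent", elapsed_ms, "ms")
```

A single measurement is more sensitive to noise than an average, so it is worth running the cell several times before comparing numbers.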

Euclidean distance in optimized Mojo.

from tensor import Tensor
from python import Python
from math import sqrt
from time import now
let np = Python.import_module("numpy")
np.random.seed(100)  # fix the random seed so the random data is the same on every run
fn mojo_euclidean_distance(a: Tensor[DType.float64], b: Tensor[DType.float64]) -> Float64:  # changed here: def -> fn
    var distance: Float64 = 0.0
    let points = a.num_elements()  # changed here: declared with let
    for i in range(points):
        let dist = (a[i] - b[i])  # changed here: declared with let
        distance += dist * dist
    return sqrt(distance)

let points: Int = 1000000
let a = np.random.randn(points)
let b = np.random.randn(points)

# Copy the numpy data into Mojo tensors before timing
var alist = Tensor[DType.float64](points)
var blist = Tensor[DType.float64](points)
for i in range(points):
    alist[i] = a[i].to_float64()
    blist[i] = b[i].to_float64()
let start = now()  # get the current time
let distance = mojo_euclidean_distance(alist, blist)
let end = now()
print("The euclidean distance is", distance, "and spent ", (end-start)/1e6, "ms")