Converting dense matrices to sparse matrices in PyTorch: SciPy sparse matrices

Reposted

智能开发先锋 2023-11-23 22:32:56


1 Introduction to sparse matrices

In the networkx package, many operations (for example nx.laplacian_matrix) return a sparse matrix. This sparse format comes from scipy.sparse.

import networkx as nx
G = nx.Graph()
G.add_node(1)
G.add_nodes_from([2, 3])
G.add_edge(1, 2)
G.add_edges_from([(1, 3)])

print(G.edges([3,2]))
# [(3, 1), (2, 1)]

nx.laplacian_matrix(G)
'''
<3x3 sparse matrix of type '<class 'numpy.intc'>'
	with 7 stored elements in Compressed Sparse Row format>
'''

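A matrix is called sparse when its zero elements far outnumber its nonzero elements and the nonzero elements are distributed without any regular pattern.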
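As a rough illustration of this definition, the sketch below measures the fraction of zero entries in a matrix and counts how many elements its sparse form actually stores. The 4x4 matrix is made up for the example and does not come from the original article.

```python
import numpy as np
from scipy.sparse import csr_matrix

# A hypothetical 4x4 matrix in which most entries are zero.
dense = np.array([[0, 0, 3, 0],
                  [0, 0, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 2]])

# Fraction of entries that are zero.
sparsity = 1.0 - np.count_nonzero(dense) / dense.size

# The CSR form stores only the nonzero entries.
sparse = csr_matrix(dense)

print(sparsity)     # 0.8125
print(sparse.nnz)   # 3
```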

2 Examples of sparse matrix formats

2.1 BSR matrix

block sparse row matrix

bsr_matrix(arg1[, shape, dtype, copy, blocksize])

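A BSR matrix is described by three arrays: data, indices, and indptr.

Elements i and i+1 of indptr delimit the range that holds the block column indices and the data of block row i:
indices[indptr[i]:indptr[i+1]] (half-open interval) holds the block column indices of block row i;
data[indptr[i]:indptr[i+1]] (half-open interval) holds the data blocks of block row i.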

import numpy as np
from scipy.sparse import *
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6]).repeat(4).reshape(6, 2, 2)

data
'''
array([[[1, 1],
        [1, 1]],

       [[2, 2],
        [2, 2]],

       [[3, 3],
        [3, 3]],

       [[4, 4],
        [4, 4]],

       [[5, 5],
        [5, 5]],

       [[6, 6],
        [6, 6]]])
'''

bsr_matrix((data,indices,indptr), shape=(6, 6)).toarray()
'''
array([[1, 1, 0, 0, 2, 2],
       [1, 1, 0, 0, 2, 2],
       [0, 0, 0, 0, 3, 3],
       [0, 0, 0, 0, 3, 3],
       [4, 4, 5, 5, 6, 6],
       [4, 4, 5, 5, 6, 6]])
'''

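Let us go through it one block row at a time.

For block row 0, indptr[0]:indptr[1] -> 0:2, i.e. [0, 2). The block columns of block row 0 are therefore indices[0:2] = [0, 2],
and the data are data[0:2], the 2x2 blocks filled with 1 and 2,
which correspond to rows 0 and 1 of the final result.
For block row 1, indptr[1]:indptr[2] -> 2:3, i.e. [2, 3). The block columns of block row 1 are therefore indices[2:3] = [2],
and the data are data[2:3], the 2x2 block filled with 3,
which corresponds to rows 2 and 3 of the final result.
For block row 2, indptr[2]:indptr[3] -> 3:6, i.e. [3, 6). The block columns of block row 2 are therefore indices[3:6] = [0, 1, 2],
and the data are data[3:6], the 2x2 blocks filled with 4, 5 and 6,
which correspond to rows 4 and 5 of the final result.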
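The same walk-through can be written as a short loop. The sketch below rebuilds the dense matrix by hand from data, indices and indptr and compares it with what bsr_matrix produces. The names manual and reference are introduced here only for illustration.

```python
import numpy as np
from scipy.sparse import bsr_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6]).repeat(4).reshape(6, 2, 2)

R, C = data.shape[1:]              # block size, here 2x2
n_block_rows = len(indptr) - 1
manual = np.zeros((n_block_rows * R, 6), dtype=data.dtype)

for i in range(n_block_rows):
    # Half-open range of positions that belong to block row i.
    for k in range(indptr[i], indptr[i + 1]):
        j = indices[k]             # block column index
        manual[i * R:(i + 1) * R, j * C:(j + 1) * C] = data[k]

reference = bsr_matrix((data, indices, indptr), shape=(6, 6)).toarray()
print(np.array_equal(manual, reference))   # True
```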

2.2 COO matrix

coo_matrix(arg1[, shape, dtype, copy])

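A coo_matrix may contain several entries with the same row and column index. Their data values are added together when the matrix is converted, for example with toarray().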

import numpy as np
from scipy.sparse import *
row  = np.array([0, 0, 1, 3, 1, 0, 0])
col  = np.array([0, 2, 1, 3, 1, 0, 0])
data = np.array([1, 1, 1, 1, 1, 1, 2])
coo_matrix((data, (row, col)), shape=(4, 4)).toarray()
'''
array([[4, 0, 1, 0],
       [0, 2, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1]])
'''
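To make the accumulation visible, the following sketch counts the stored entries before and after merging duplicates with sum_duplicates(). It reuses the arrays from the example above. The variable names stored_before and stored_after are introduced here for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

row = np.array([0, 0, 1, 3, 1, 0, 0])
col = np.array([0, 2, 1, 3, 1, 0, 0])
data = np.array([1, 1, 1, 1, 1, 1, 2])
A = coo_matrix((data, (row, col)), shape=(4, 4))

# All seven triplets are stored as given, duplicates included.
stored_before = A.nnz

# sum_duplicates() merges entries with the same (row, col) in place.
A.sum_duplicates()
stored_after = A.nnz

print(stored_before, stored_after)   # 7 4
```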


2.3 CSC and CSR matrices

Compressed Sparse Column & Row matrix

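In this example the dense result is identical to the COO one, and I could not see a difference from it; corrections in the comments are welcome. The constructor call is the same, but the internal storage differs: CSR compresses the row indices into an indptr array (CSC does the same for the columns), and duplicate entries are summed when the matrix is built.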

import numpy as np
from scipy.sparse import *
row  = np.array([0, 0, 1, 3, 1, 0, 0])
col  = np.array([0, 2, 1, 3, 1, 0, 0])
data = np.array([1, 1, 1, 1, 1, 1, 2])
csc_matrix((data, (row, col)), shape=(4, 4)).toarray()
'''
array([[4, 0, 1, 0],
       [0, 2, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1]], dtype=int32)
'''

import numpy as np
from scipy.sparse import *
row  = np.array([0, 0, 1, 3, 1, 0, 0])
col  = np.array([0, 2, 1, 3, 1, 0, 0])
data = np.array([1, 1, 1, 1, 1, 1, 2])
csr_matrix((data, (row, col)), shape=(4, 4)).toarray()
'''
array([[4, 0, 1, 0],
       [0, 2, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1]], dtype=int32)
'''
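Although toarray() gives the same result for all three formats, the stored arrays are different. As a small sketch, the code below builds CSR and CSC matrices from the dense result above and prints their indptr, indices and data attributes. Building from a dense array is a choice made here to keep the example short.

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

dense = np.array([[4, 0, 1, 0],
                  [0, 2, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 1]])

csr = csr_matrix(dense)
csc = csc_matrix(dense)

# CSR: indptr delimits rows, indices holds column indices.
print(csr.indptr.tolist(), csr.indices.tolist(), csr.data.tolist())
# [0, 2, 3, 3, 4] [0, 2, 1, 3] [4, 1, 2, 1]

# CSC: indptr delimits columns, indices holds row indices.
print(csc.indptr.tolist(), csc.indices.tolist(), csc.data.tolist())
# [0, 1, 2, 3, 4] [0, 1, 0, 3] [4, 2, 1, 1]
```

Row 2 is empty, which shows up in CSR as two equal consecutive indptr values (3 and 3).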

2.4 DIA matrix

dia_matrix(arg1[, shape, dtype, copy])

import numpy as np
from scipy.sparse import *

data = np.array([[1, 2, 3, 4],[1,3,5,7],[2,4,5,7]])

offsets = np.array([0, -1, 2])

dia_matrix((data, offsets), shape=(4, 4)).toarray()
'''
array([[1, 0, 5, 0],
       [1, 2, 0, 7],
       [0, 3, 3, 0],
       [0, 0, 5, 4]])
'''

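data defines the elements of the diagonals.

offsets defines the offset of each diagonal: 0 is the main diagonal, a positive value shifts the diagonal upward, and a negative value shifts it downward.

Taking the example above: data[0] lies on the main diagonal, data[1] is shifted down by one, and data[2] is shifted up by two. Each element stays in its own column, so entries that would fall outside the matrix are dropped: the last element of data[1] (7) and the first two elements of data[2] (2 and 4) do not appear in the result.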
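The placement rule can be checked by hand. In the sketch below, element data[k, j] is written to column j and row j - offsets[k] whenever that row exists, and the result is compared with dia_matrix. The array manual is introduced here only for the comparison.

```python
import numpy as np
from scipy.sparse import dia_matrix

data = np.array([[1, 2, 3, 4], [1, 3, 5, 7], [2, 4, 5, 7]])
offsets = np.array([0, -1, 2])
A = dia_matrix((data, offsets), shape=(4, 4)).toarray()

manual = np.zeros((4, 4), dtype=data.dtype)
for k, off in enumerate(offsets):
    for j in range(4):
        i = j - off                 # row of the element in column j
        if 0 <= i < 4:
            manual[i, j] = data[k, j]

print(np.array_equal(manual, A))    # True
print(np.diag(A, 2).tolist())       # [5, 7]
print(np.diag(A, -1).tolist())      # [1, 3, 5]
```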

2.5 DOK matrix

Dictionary Of Keys based sparse matrix
dok_matrix allows a sparse matrix to be built up efficiently, one element at a time.

import numpy as np
from scipy.sparse import *
S = dok_matrix((5, 5), dtype=np.float32)
for i in range(5):
    for j in range(5):
        S[i, j] = i + j
S.toarray()
'''
array([[0., 1., 2., 3., 4.],
       [1., 2., 3., 4., 5.],
       [2., 3., 4., 5., 6.],
       [3., 4., 5., 6., 7.],
       [4., 5., 6., 7., 8.]], dtype=float32)
'''

This one is self-explanatory.
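The example above fills every position, so it does not show the sparse case. A sketch closer to typical use might assign only a few entries and convert to CSR once construction is finished. The positions and values below are made up for illustration.

```python
import numpy as np
from scipy.sparse import dok_matrix

S = dok_matrix((5, 5), dtype=np.float32)

# Assign only a few entries; everything else stays an implicit zero.
S[0, 3] = 1.5
S[2, 2] = 4.0
S[4, 0] = -2.0

print(S.nnz)        # 3

# Convert once construction is finished.
C = S.tocsr()
print(C[2, 2])      # 4.0
```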

3 Basic construction methods

3.1 sparse_matrix((data, (row, col)), shape=(4, 4)).toarray()

Of the formats above, all except dok and dia accept this form.

import numpy as np
from scipy.sparse import *
row  = np.array([0, 3, 1, 0])
col  = np.array([0, 3, 1, 2])
data = np.array([4, 5, 7, 9])
print(bsr_matrix((data, (row, col)), shape=(4, 4)).toarray())
print(coo_matrix((data, (row, col)), shape=(4, 4)).toarray())
print(csc_matrix((data, (row, col)), shape=(4, 4)).toarray())
print(csr_matrix((data, (row, col)), shape=(4, 4)).toarray())
# dia_matrix and dok_matrix do not accept this form:
#print(dia_matrix((data, (row, col)), shape=(4, 4)).toarray())
#print(dok_matrix((data, (row, col)), shape=(4, 4)).toarray())
'''
[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]
'''

3.2 sparse_matrix(array).toarray()

array may be a list or an np.array.

This form works for all of the matrix types above.

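import numpy as np
from scipy.sparse import *
# Dense input; a nested list with the same contents works as well.
array = np.array([[4, 0, 9, 0],
                  [0, 7, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 5]])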
print(bsr_matrix(array).toarray())
print(coo_matrix(array).toarray())  
print(csc_matrix(array).toarray())
print(csr_matrix(array).toarray())
print(dia_matrix(array).toarray())
print(dok_matrix(array).toarray())
'''
[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]

[[4 0 9 0]
 [0 7 0 0]
 [0 0 0 0]
 [0 0 0 5]]
'''

4 Type-checking functions

issparse(x), isspmatrix(x)	whether x is a sparse matrix type
isspmatrix_csc(x)	whether x is a csc_matrix
isspmatrix_csr(x)	whether x is a csr_matrix
isspmatrix_bsr(x)	whether x is a bsr_matrix
isspmatrix_lil(x)	whether x is a lil_matrix
isspmatrix_dok(x)	whether x is a dok_matrix
isspmatrix_coo(x)	whether x is a coo_matrix
isspmatrix_dia(x)	whether x is a dia_matrix
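A brief usage sketch for some of these checks follows. The small matrix is made up, and the format attribute is shown as an alternative way to inspect the storage type. How the isspmatrix_* helpers treat the newer sparse array classes may depend on the SciPy version.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse, isspmatrix_csr, isspmatrix_csc

dense = np.array([[0, 1], [2, 0]])
A = csr_matrix(dense)

print(issparse(A))          # True
print(issparse(dense))      # False
print(isspmatrix_csr(A))    # True
print(isspmatrix_csc(A))    # False
print(A.format)             # csr
```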

5 File operations

save_npz(file, matrix[, compressed])	save a sparse matrix in .npz format
load_npz(file)	load a sparse matrix from a .npz file
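A minimal round trip with these two functions might look like the sketch below. It writes to a temporary directory, and the file name matrix.npz is a placeholder chosen for the example.

```python
import os
import tempfile

import numpy as np
from scipy.sparse import csr_matrix, load_npz, save_npz

A = csr_matrix(np.array([[4, 0, 9, 0],
                         [0, 7, 0, 0],
                         [0, 0, 0, 0],
                         [0, 0, 0, 5]]))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matrix.npz")   # hypothetical file name
    save_npz(path, A)
    B = load_npz(path)

# The loaded matrix keeps both the format and the values.
print(B.format)                                   # csr
print(np.array_equal(A.toarray(), B.toarray()))   # True
```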

6 Conversion functions

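todense([order, out])	return the sparse matrix as an np.matrix
toarray([order, out])	return the sparse matrix as an np.array
tobsr([blocksize, copy])	return the sparse matrix as a bsr_matrix
tocoo([copy])	return the sparse matrix as a coo_matrix
tocsc([copy])	return the sparse matrix as a csc_matrix
tocsr([copy])	return the sparse matrix as a csr_matrix
todia([copy])	return the sparse matrix as a dia_matrix
todok([copy])	return the sparse matrix as a dok_matrix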
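As a quick check of these methods, the sketch below converts one COO matrix to each of the other formats and confirms that only the storage format changes, not the values. The input matrix is the one used in section 3.

```python
import numpy as np
from scipy.sparse import coo_matrix

A = coo_matrix(np.array([[4, 0, 9, 0],
                         [0, 7, 0, 0],
                         [0, 0, 0, 0],
                         [0, 0, 0, 5]]))

for convert in (A.tobsr, A.tocoo, A.tocsc, A.tocsr, A.todia, A.todok):
    B = convert()
    # Each conversion changes the storage format, not the values.
    assert np.array_equal(B.toarray(), A.toarray())
    print(B.format)

print(type(A.todense()).__name__)   # matrix
print(type(A.toarray()).__name__)   # ndarray
```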

7 Other functions (to be expanded)

find(A)	return the positions (row and column indices) and the values of the nonzero elements of the sparse matrix A
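For example, find can be used as sketched below. It returns three arrays: row indices, column indices and values. The order in which the entries come back is not relied on here.

```python
import numpy as np
from scipy.sparse import csr_matrix, find

A = csr_matrix(np.array([[4, 0, 9, 0],
                         [0, 7, 0, 0],
                         [0, 0, 0, 0],
                         [0, 0, 0, 5]]))

rows, cols, values = find(A)

# Print one (row, column, value) triplet per nonzero entry.
for i, j, v in zip(rows, cols, values):
    print(i, j, v)
```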
