SciPy Quick-Start Tutorial

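  • 1. Introduction
  • 1.1 SciPy organization
  • 2. Basic functions
  • 2.1 Interaction with NumPy
  • 1) Index tricks
  • 2) Shape manipulation
  • 3) Polynomials
  • 4) Vectorizing functions (vectorize)
  • 5) Type handling
  • 6) Other useful functions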


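1. Introduction

SciPy is a collection of mathematical algorithms and convenience functions built on NumPy, the array extension for Python.
It is a Python library widely used in mathematics, science, and engineering. It includes modules for statistics,
optimization, integration, linear algebra, Fourier transforms, signal and image processing, ordinary differential equation
solvers, and more, which is why it is widely used in machine learning projects.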

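1.1 SciPy organization

SciPy is organized into subpackages covering different areas of scientific computing. The following table summarizes them: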

Subpackage     Description
cluster        Clustering algorithms
constants      Physical and mathematical constants
fftpack        Fast Fourier Transform routines
integrate      Integration and ordinary differential equation solvers
interpolate    Interpolation and smoothing splines
io             Input and output
linalg         Linear algebra
ndimage        N-dimensional image processing
odr            Orthogonal distance regression
optimize       Optimization and root-finding routines
signal         Signal processing
sparse         Sparse matrices and associated routines
spatial        Spatial data structures and algorithms
special        Special functions
stats          Statistical distributions and functions

SciPy subpackages need to be imported separately, for example: from scipy import linalg, optimize
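As a minimal sketch of what importing subpackages separately looks like in practice, the snippet below calls linalg.solve and optimize.brentq. The matrix, right-hand side, and scalar function are made-up illustrations and do not come from this tutorial.

```python
import numpy as np
from scipy import linalg, optimize

# Hypothetical 2x2 linear system A @ x = b, chosen only for illustration.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Hypothetical scalar function with a sign change on [0, 2]; its root is sqrt(2).
root = optimize.brentq(lambda t: t**2 - 2.0, 0.0, 2.0)

print(x, root)
```

Each subpackage is used through its own namespace, which keeps it clear where a routine comes from.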

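2. Basic functions

2.1 Interaction with NumPy

SciPy is built on top of NumPy, so for all basic array handling needs you can use NumPy functions directly. The examples below assume NumPy has been imported as np.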

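1) Index tricks

Some class instances make special use of the slicing syntax to provide efficient ways of constructing arrays. This part discusses how to use numpy.mgrid, numpy.ogrid, numpy.r_, and numpy.c_ to build arrays quickly.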

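For example, rather than writing something like this with concatenate:

import numpy as np

# concatenate: join a sequence of arrays along an existing axis.
a = np.concatenate(([3], [0]*5, np.arange(-1, 1.002, 2/9.0)))

you can use r_:

a = np.r_[3, [0]*5, -1:1:10j]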

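This saves typing and makes the code more readable.
The "r" stands for row concatenation: if the objects between the commas are 2-D arrays, they are stacked by rows (and must therefore have matching numbers of columns). There is an equivalent command, c_, that stacks 2-D arrays by columns. For 1-D inputs, r_ joins them end to end into one 1-D array, whereas current NumPy's c_ places each 1-D input as a column of a 2-D result.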
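The following sketch checks the two points above: that the r_ expression builds the same array as the concatenate call, and what shapes r_ and c_ produce. The arrays A, B, u, and v are hypothetical examples. Note that in current NumPy, c_ places 1-D inputs as columns of a 2-D result.

```python
import numpy as np

# The two constructions from the text should give the same 16-element array.
a_concat = np.concatenate(([3], [0] * 5, np.arange(-1, 1.002, 2 / 9.0)))
a_r = np.r_[3, [0] * 5, -1:1:10j]
same = np.allclose(a_concat, a_r)

# Hypothetical 2-D inputs: r_ stacks by rows, c_ stacks by columns.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
rows = np.r_[A, B]   # shape (4, 2)
cols = np.c_[A, B]   # shape (2, 4)

# Hypothetical 1-D inputs: r_ joins them end to end, c_ makes them columns.
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
flat = np.r_[u, v]         # shape (6,)
as_columns = np.c_[u, v]   # shape (3, 2)

print(same, rows.shape, cols.shape, flat.shape, as_columns.shape)
```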

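Another very useful class instance that relies on extended slice notation is mgrid. In the simplest case, it can be used to construct 1-D ranges as a convenient substitute for arange. It also accepts a complex number as the step, which is read as the number of points to place between the (inclusive) endpoints. Its real purpose, however, is to produce N N-D arrays that provide coordinate arrays for an N-D volume. For example: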

>>> np.mgrid[0:5,0:5]
array([[[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3],
        [4, 4, 4, 4, 4]],
       [[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]])
>>> np.mgrid[0:5:4j,0:5:4j]
array([[[ 0.    ,  0.    ,  0.    ,  0.    ],
        [ 1.6667,  1.6667,  1.6667,  1.6667],
        [ 3.3333,  3.3333,  3.3333,  3.3333],
        [ 5.    ,  5.    ,  5.    ,  5.    ]],
       [[ 0.    ,  1.6667,  3.3333,  5.    ],
        [ 0.    ,  1.6667,  3.3333,  5.    ],
        [ 0.    ,  1.6667,  3.3333,  5.    ],
        [ 0.    ,  1.6667,  3.3333,  5.    ]]])

Having meshed arrays like this is sometimes very useful. However, it is not always needed just to evaluate some N-D function over a grid due to the array-broadcasting rules of NumPy and SciPy. If this is the only purpose for generating a meshgrid, you should instead use the function ogrid which generates an “open” grid using newaxis judiciously to create N, N-D arrays where only one dimension in each array has length greater than 1. This will save memory and create the same result if the only purpose for the meshgrid is to generate sample points for evaluation of an N-D function.
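A small sketch of the memory argument above: evaluating a function on the dense mgrid coordinates and on the open ogrid coordinates gives the same result, because broadcasting expands the open grid on the fly. The function f is a hypothetical example.

```python
import numpy as np

def f(x, y):
    # Hypothetical 2-D function, used only to compare the two grids.
    return x**2 + y

# Dense grid: two full (5, 5) coordinate arrays.
xd, yd = np.mgrid[0:5, 0:5]

# Open grid: one (5, 1) array and one (1, 5) array.
xo, yo = np.ogrid[0:5, 0:5]

dense_result = f(xd, yd)
open_result = f(xo, yo)   # broadcasting yields a (5, 5) array

print(xd.shape, yd.shape, xo.shape, yo.shape)
print(np.array_equal(dense_result, open_result))
```

The open grid stores 10 coordinate values here instead of 50, and the gap widens quickly as the grid grows or gains dimensions.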

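2) Shape manipulation

This category of functions includes routines for squeezing length-one dimensions out of N-D arrays, for ensuring that an array is at least 1-, 2-, or 3-D, and for stacking (concatenating) arrays by rows, columns, and "pages" (the third dimension). Routines for splitting arrays (roughly the opposite of stacking) are also available.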
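The text does not name specific routines, so the sketch below uses NumPy functions that appear to match each description (squeeze, atleast_2d, vstack, column_stack, dstack, split). The choice of functions and the sample arrays are assumptions made for illustration.

```python
import numpy as np

# Squeeze out length-one dimensions: (1, 3, 1) -> (3,)
a = np.zeros((1, 3, 1))
squeezed = np.squeeze(a)

# Ensure an array is at least 2-D: a 3-element list becomes shape (1, 3)
at_least_2d = np.atleast_2d([1, 2, 3])

# Stack two 1-D arrays by rows, by columns, and by "pages" (third dimension).
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
by_rows = np.vstack((x, y))           # shape (2, 3)
by_columns = np.column_stack((x, y))  # shape (3, 2)
by_pages = np.dstack((x, y))          # shape (1, 3, 2)

# Split an array into equal parts (roughly the opposite of stacking).
parts = np.split(np.arange(6), 3)     # three arrays of length 2

print(squeezed.shape, at_least_2d.shape)
print(by_rows.shape, by_columns.shape, by_pages.shape)
print([p.tolist() for p in parts])
```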

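3) Polynomials

There are two (interchangeable) ways to work with 1-D polynomials in SciPy. The first is the poly1d class from NumPy. This class accepts either coefficients or polynomial roots to initialize a polynomial. The polynomial object can then be manipulated in algebraic expressions, integrated, differentiated, and evaluated. It even prints like a polynomial: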

from numpy import poly1d

p = poly1d([3, 4, 5])   # 3x^2 + 4x + 5
p(0.5)                  # evaluate at x = 0.5: 3*0.25 + 4*0.5 + 5 = 7.75
p * p                   # 9x^4 + 24x^3 + 46x^2 + 40x + 25
p.deriv()               # derivative: 6x + 4
p.integ(k=6)            # indefinite integral with constant 6: x^3 + 2x^2 + 5x + 6
p([4, 5])               # array([ 69, 100]), e.g. 3*4^2 + 4*4 + 5 = 69

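The other way to handle polynomials is as an array of coefficients, with the first element of the array giving the coefficient of the highest power. There are explicit functions to add, subtract, multiply, divide, integrate, differentiate, and evaluate polynomials represented as sequences of coefficients.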
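A short sketch of the coefficient-array approach, reusing the same polynomial 3x^2 + 4x + 5. The functions used here (polyadd, polymul, polydiv, polyder, polyint, polyval) are NumPy's coefficient-based routines. The text does not list them by name, so treat the selection as an illustration.

```python
import numpy as np

c = [3, 4, 5]                        # 3x^2 + 4x + 5, highest power first

total = np.polyadd(c, [1, 0])        # add x: 3x^2 + 5x + 5
product = np.polymul(c, c)           # 9x^4 + 24x^3 + 46x^2 + 40x + 25
quotient, remainder = np.polydiv(product, c)  # divides back to c, remainder 0
derivative = np.polyder(c)           # 6x + 4
integral = np.polyint(c, k=6)        # x^3 + 2x^2 + 5x + 6
value = np.polyval(c, 0.5)           # 7.75

print(total, product, quotient, remainder)
print(derivative, integral, value)
```

The results match the poly1d example, which is what makes the two approaches interchangeable.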

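4) Vectorizing functions (vectorize)

One of the features NumPy provides is the class vectorize, which converts an ordinary Python function that accepts scalars and returns scalars into a "vectorized function" with the same broadcasting rules as other NumPy functions (that is, the universal functions, or ufuncs). For example, suppose you have a Python function named addsubtract defined as: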

def addsubtract(a, b):
    if a > b:
        return a - b
    else:
        return a + b

This defines a function of two scalar variables that returns a scalar result. The class vectorize can be used to "vectorize" this function:

vec_addsubtract = np.vectorize(addsubtract)

This returns a function that takes array arguments and returns an array result:

>>> vec_addsubtract([0,3,6,9],[1,3,5,7])
array([1, 6, 1, 2])

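This particular function could have been written in vector form without using vectorize. However, functions that rely on optimization or integration routines can likely only be vectorized with vectorize.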
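The sketch below illustrates both halves of that remark. First, a possible vector-form rewrite of addsubtract using np.where (one option among several). Second, a hypothetical function integral_to that calls scipy.integrate.quad, which integrates for one scalar upper limit per call, so np.vectorize is a convenient way to apply it to an array. The names addsubtract_where and integral_to are invented for this example.

```python
import numpy as np
from scipy import integrate

def addsubtract_where(a, b):
    # Vector form of addsubtract: no Python-level branching per element.
    a = np.asarray(a)
    b = np.asarray(b)
    return np.where(a > b, a - b, a + b)

def integral_to(upper):
    # Integrate t**2 from 0 to a single scalar upper limit.
    value, _abserr = integrate.quad(lambda t: t**2, 0.0, upper)
    return value

# Wrap the scalar-only routine so it accepts arrays of upper limits.
vec_integral_to = np.vectorize(integral_to)

direct = addsubtract_where([0, 3, 6, 9], [1, 3, 5, 7])
uppers = np.array([1.0, 2.0, 3.0])
areas = vec_integral_to(uppers)   # analytically upper**3 / 3

print(direct)
print(areas)
```

Keep in mind that np.vectorize still loops in Python internally. It is a convenience for broadcasting, not a performance optimization.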

5) Type handling

Note the difference between numpy.iscomplex/numpy.isreal and numpy.iscomplexobj/numpy.isrealobj. The former are array-based and return boolean arrays giving the result of the element-wise test. The latter are object-based and return a single boolean describing the result of the test on the entire object.

Often it is required to get just the real and/or imaginary part of a complex number. While complex numbers and arrays have attributes that return those values, if one is not sure whether or not the object will be complex-valued, it is better to use the functional forms numpy.real and numpy.imag. These functions succeed for anything that can be turned into a NumPy array. Consider also the function numpy.real_if_close, which transforms a complex-valued number with a tiny imaginary part into a real number.
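A brief sketch of the functions discussed in the two paragraphs above, using made-up sample values:

```python
import numpy as np

z = np.array([1 + 0j, 2 + 3j])

# Element-wise test: which entries have a nonzero imaginary part?
elementwise = np.iscomplex(z)

# Object-level test: is the array's dtype complex at all?
whole_object = np.iscomplexobj(z)

# Functional forms work even when the input is not complex.
real_part = np.real(3.0)
imag_part = np.imag(3.0)

# A tiny imaginary part (here 1e-15, hypothetical rounding noise) is dropped.
noisy = np.array([2.0 + 1e-15j, 5.0 + 0j])
cleaned = np.real_if_close(noisy)

print(elementwise, whole_object)
print(real_part, imag_part)
print(cleaned, cleaned.dtype)
```

Note that z[0] is stored as a complex number but has zero imaginary part, so the element-wise test reports it as not complex while the object-level test still reports the array as complex.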

Occasionally code needs to check whether or not a value is a scalar (Python int, Python float, Python complex, or a NumPy scalar type). This functionality is provided by the convenient function numpy.isscalar, which returns True or False. Note that in current NumPy a rank-0 array is not reported as a scalar by this function.
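The following sketch shows what numpy.isscalar reports for a few hypothetical inputs, and uses numpy.ndim as one possible way to also recognize rank-0 arrays:

```python
import numpy as np

checks = {
    "python int": np.isscalar(3),
    "python float": np.isscalar(3.0),
    "python complex": np.isscalar(1 + 2j),
    "numpy scalar": np.isscalar(np.float64(3.0)),
    "list": np.isscalar([3]),
    "rank-0 array": np.isscalar(np.array(3.0)),
}

# np.ndim returns 0 for plain scalars and for rank-0 arrays alike.
rank0_is_zero_dim = np.ndim(np.array(3.0)) == 0

for name, result in checks.items():
    print(name, result)
print(rank0_is_zero_dim)
```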

6) Other useful functions


There are also several other useful functions which should be mentioned. For doing phase processing, the functions angle and unwrap are useful. Also, the linspace and logspace functions return equally spaced samples in a linear or log scale. Finally, it is useful to be aware of the indexing capabilities of NumPy. Mention should be made of the function select, which extends the functionality of where to include multiple conditions and multiple choices. The calling convention is select(condlist, choicelist, default=0). numpy.select is a vectorized form of the multiple if-statement. It allows rapid construction of a function which returns an array of results based on a list of conditions. Each element of the return array is taken from the array in choicelist corresponding to the first condition in condlist that is true. For example:



>>> x = np.arange(10)
>>> condlist = [x<3, x>5]
>>> choicelist = [x, x**2]
>>> np.select(condlist, choicelist)
array([ 0,  1,  2,  0,  0,  0, 36, 49, 64, 81])
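The other functions mentioned in this part can be sketched the same way. The sample values below are hypothetical. The last line repeats the select call with a non-default fill value to show what the default argument does.

```python
import numpy as np

# Equally spaced samples on a linear and on a log scale.
lin = np.linspace(0.0, 1.0, 5)    # 0, 0.25, 0.5, 0.75, 1
log = np.logspace(0, 2, 3)        # 10**0, 10**1, 10**2

# Phase processing: angle wraps phases into (-pi, pi], unwrap undoes the jumps.
true_phase = np.linspace(0.0, 4 * np.pi, 50)
wrapped = np.angle(np.exp(1j * true_phase))
unwrapped = np.unwrap(wrapped)

# select with an explicit default for elements matching no condition.
x = np.arange(10)
selected = np.select([x < 3, x > 5], [x, x**2], default=-1)

print(lin, log)
print(np.allclose(unwrapped, true_phase))
print(selected)
```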



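Some additional useful functions can also be found in the module scipy.special. For example, the factorial and comb functions compute n! and n!/(k!(n-k)!), using either exact integer arithmetic or floating-point precision and the gamma function.

Older SciPy releases also provided two functions in scipy.misc for numerical differentiation using discrete differences; both have since been deprecated and removed from recent releases. The function central_diff_weights returned weighting coefficients for an equally spaced N-point approximation to the derivative of order o. These weights must be multiplied by the function values at the corresponding points and the results added to obtain the derivative approximation. This function is intended for use when only samples of the function are available. When the function is an object that can be handed to a routine and evaluated, the function derivative could be used to automatically evaluate the object at the correct points to obtain an N-point approximation to the o-th derivative at a given point.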
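A minimal sketch of the factorial and comb functions from scipy.special, showing the exact-integer and floating-point modes. The inputs 5 and 2 are arbitrary examples.

```python
from scipy import special

# exact=True uses integer arithmetic and returns Python ints.
exact_factorial = special.factorial(5, exact=True)   # 5! = 120
exact_comb = special.comb(5, 2, exact=True)          # 5! / (2! * 3!) = 10

# The default mode returns floating-point results.
float_factorial = special.factorial(5)
float_comb = special.comb(5, 2)

print(exact_factorial, exact_comb)
print(float_factorial, float_comb)
```

The exact mode is the safer choice when n is large enough that floating-point results would lose digits.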