python对栅格进行赋多值 python 栅格计算

转载

误会一场 2024-04-24 18:51:13

文章标签 python对栅格进行赋多值 python arcgis 开发语言栅格 文章分类 Python 后端开发

本文结合实例详细讲解了如何使用Python对栅格数据进行分区统计，关注公众号GeodataAnalysis，回复20230401获取示例数据和代码，包含这个工具的代码的写作思路。

分区统计操作是一种用于计算由另一个数据集定义的区域内的栅格（值栅格）的像元值的统计操作。首先根据矢量数据将栅格化分区多个区域，提取每个区域的像元值并分别进行统计计算，而后输出结果（一般直接输出的输入矢量的某个字段里）。

ArcGIS 有栅格的分区统计工具，但不够灵活，只能进行诸如最大值、最小值之类的统计计算。本文介绍基于Python开源库rasterio和geopandas实现的栅格数据分区统计工具，能够更加灵活的实现分区统计功能，满足多样化的需求。

先看一下怎么使用。整个工具被封装成了一个Python的类，使用的话需要先创建类实例，初始化参数是栅格和矢量的文件路径，如：

zs = ZonalStatistics(tif_path, shp_path)

执行分区统计只需要调用类实例的zonal_statistics方法即可，该方法有四个参数，具体含义如下：

zone_field：字符串，必选参数，指定输出到矢量数据的哪个字段，若没有该字段则会新建字段。
statistics_type：字符串，统计计算的算法，func为空时为必选参数，可选算法有min、max、mean、std、sum、range、median，分别表示最小值、最大值、平均值、标准差、总和、极差、中位数。
func：自定义函数，statistics_type为空时为必选参数，需注意该函数在运行过程中的输入为numpy的掩码数组。
all_touched：布尔值，可选参数，如果为True，矢量的几何要素接触的所有像素都将参与计算，如果为False，则只有中心位于多边形内的像素才会参与计算。

调用该函数的代码示例如下：

zs.zonal_statistics('min', 'min')
zs.zonal_statistics('max', 'max')
zs.zonal_statistics('range', func=lambda x: np.ma.max(x)-np.ma.min(x))

保存结果也很简单，因为这个类使用geopandas对矢量数据进行操作，其通过属性gdf就是一个GeoDataFrame，可以通过它直接保存结果，示例如下：

zs.gdf.to_file('./result/zonal_statistics.shp', encoding='utf-8')

计算结果如下，按最大值字段显示：

python对栅格进行赋多值 python 栅格计算_python

ZonalStatistics类的代码如下，供参考：

class ZonalStatistics(object):

    def __init__(self, raster_path, shp_path) -> None:
        self.raster_path = raster_path
        self.gdf = gpd.read_file(shp_path)
        self._init_params(self.raster_path)
        
    def _init_params(self, raster_path):
        src = rio.open(raster_path)
        self.transform = src.transform
        self.crs = src.crs
        self.shape = src.shape
    
    def _geometry_mask(self, src, geometries, all_touched=False):
        if isinstance(src, rio.DatasetReader):
            pass
        elif isinstance(src, str):
            src = rio.open(src)
        else:
            raise ValueError

        if not isinstance(geometries, (tuple, list)):
            raise ValueError

        geometry_mask = features.geometry_mask(
            geometries=geometries,
            out_shape=src.shape,
            transform=src.transform,
            all_touched=all_touched,
            invert=True)
        
        return geometry_mask
    
    def _valid_range(self, mask):
        mask_col = np.any(mask, axis=0)
        mask_row = np.any(mask, axis=1)
        col_index = np.array(np.where(mask_col)[0])
        min_col, max_col = min(col_index), max(col_index)
        row_index = np.array(np.where(mask_row)[0])
        min_row, max_row = min(row_index), max(row_index)

        return (min_row, max_row), (min_col, max_col)
    
    def _read_from_geometry(self, geometries, all_touched=False):
        src = rio.open(self.raster_path)
        mask = self._geometry_mask(src, geometries, all_touched)
        (min_row, max_row), (min_col, max_col) = self._valid_range(mask)
        window = Window.from_slices(rows=(min_row, max_row+1), 
                                    cols=(min_col, max_col+1))

        geom_array = src.read(1, window=window)
        geom_mask = ~mask[min_row:max_row+1, min_col:max_col+1]
        nodata_mask = (geom_array == src.nodata)
        nan_mask = np.isnan(geom_array)
        geom_array = np.ma.masked_array(geom_array, 
                                        geom_mask | nodata_mask | nan_mask)

        return geom_array
    
    def _statistics(self, geom_array, statistics_type):
        if statistics_type == 'min':
            return np.ma.min(geom_array)
        elif statistics_type == 'max':
            return np.ma.max(geom_array)
        elif statistics_type == 'median':
            return np.ma.median(geom_array)
        elif statistics_type == 'sum':
            return np.ma.sum(geom_array)
        elif statistics_type == 'std':
            return np.ma.std(geom_array)
        elif statistics_type == 'range':
            return np.ma.max(geom_array)-np.ma.min(geom_array)
        elif statistics_type == 'mean':
            return np.ma.mean(geom_array)
        else:
            raise ValueError
    
    def zonal_statistics(self, zone_field, statistics_type=None, 
                         func=None, all_touched=False):
        for i, geom in enumerate(self.gdf.geometry.to_list()):
            geom_array = self._read_from_geometry([geom], all_touched)

            if (isinstance(func, type(None)) and 
                isinstance(statistics_type, type(None))):
                raise ValueError
            
            if isinstance(func, type(None)):
                value = self._statistics(geom_array, statistics_type)
            else:
                value = func(geom_array)
            
            if isinstance(value, type(np.ma.masked)):
                continue
            else:
                self.gdf.loc[i, zone_field] = value

本文章为转载内容，我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题，欢迎原作者联系我们进行内容更正或删除文章。