Contents

  • Minimum Spanning Tree
  • Kruskal
  • Prim


Minimum Spanning Tree

Given a weighted, connected, undirected graph G = (V, E) and a weight function w: E → R that maps each edge to a real-valued weight, the minimum spanning tree (MST) problem asks for an acyclic subset T ⊆ E that connects all the vertices and has the minimum total weight.
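To make the definition concrete, the following minimal sketch computes the total weight of a candidate edge set T and checks that T connects every vertex without forming a cycle. The helper names `total_weight` and `is_spanning_tree` and the small triangle graph are illustrative assumptions, not part of the original article.

```python
def total_weight(tree_edges, weights):
    """Sum of the weights of the edges in tree_edges."""
    return sum(weights[edge] for edge in tree_edges)


def is_spanning_tree(vertices, tree_edges):
    """Return True if tree_edges is acyclic and connects all vertices."""
    vertices = list(vertices)
    # A spanning tree on |V| vertices has exactly |V| - 1 edges.
    if len(tree_edges) != len(vertices) - 1:
        return False
    adjacency = {v: [] for v in vertices}
    for u, v in tree_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Depth-first search from an arbitrary vertex; with |V| - 1 edges,
    # reaching every vertex implies the edge set is also acyclic.
    seen = set()
    stack = [vertices[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adjacency[node])
    return len(seen) == len(vertices)


# Hypothetical triangle graph with three vertices and three weighted edges.
example_weights = {("a", "b"): 1, ("b", "c"): 2, ("a", "c"): 3}
candidate = [("a", "b"), ("b", "c")]
print(is_spanning_tree("abc", candidate))        # True
print(total_weight(candidate, example_weights))  # 3
```

Any other spanning tree of this triangle must use the edge of weight 3, so the candidate above is also the minimum one.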

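Two classic algorithms solve the MST problem: Kruskal's algorithm and Prim's algorithm. Both are greedy. A greedy algorithm typically faces several possible choices at each step and picks the one that looks best at that moment. In general this strategy does not guarantee a globally optimal solution. For the MST problem, however, it can be proved that the greedy strategies used by Kruskal's and Prim's algorithms do find a spanning tree of minimum weight.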

Kruskal

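For a weighted, connected, undirected graph G = (V, E), Kruskal's algorithm starts by treating every vertex as a tree of its own, so the vertices together form a forest. It then considers the edges in nondecreasing order of weight: if an edge connects two different trees, it is added to the forest, which merges those two trees into one.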

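The implementation of Kruskal's algorithm uses a disjoint-set (union-find) data structure, which maintains a collection of disjoint sets of elements. Here each set represents one tree of the current forest.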
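As a standalone illustration of that data structure, here is a minimal sketch of a disjoint-set forest with path compression and union by rank. The class name `DisjointSet` and its method names are assumptions chosen for this example; the article's own implementation inlines a simpler variant without ranks.

```python
class DisjointSet:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, items):
        self.parent = {item: item for item in items}  # every item starts as its own root
        self.rank = {item: 0 for item in items}       # upper bound on each tree's height

    def find(self, item):
        """Return the representative (root) of the set containing item."""
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[item] != root:
            next_item = self.parent[item]
            self.parent[item] = root
            item = next_item
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[root_a] < self.rank[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        if self.rank[root_a] == self.rank[root_b]:
            self.rank[root_a] += 1
        return True


forest = DisjointSet("abcd")
forest.union("a", "b")
forest.union("c", "d")
print(forest.find("a") == forest.find("b"))  # True
print(forest.find("a") == forest.find("c"))  # False
print(forest.union("b", "a"))                # False
```

In Kruskal's algorithm, a `union` call that returns `True` corresponds to accepting an edge, and one that returns `False` corresponds to rejecting an edge that would close a cycle.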

For a weighted, connected, undirected graph represented with adjacency lists, Kruskal's algorithm can be implemented as follows:

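def mst_kruskal(graph, weights):
    # weights stores every undirected edge in both directions; keep one copy of each.
    edges = []
    for edge, weight in weights.items():
        if edge[0] < edge[1]:
            edges.append((edge, weight))
    edges.sort(key=lambda x: x[1])  # consider edges in nondecreasing order of weight
    parents = {node: node for node in graph}  # disjoint-set forest: each node starts as its own parent

    def find_parent(node):
        # Find the root of node's tree, compressing the path along the way.
        if node != parents[node]:
            parents[node] = find_parent(parents[node])
        return parents[node]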

    minimum_cost = 0
    minimum_spanning_tree = []

    for edge in edges:
        parent_from_node = find_parent(edge[0][0])
        parent_to_node = find_parent(edge[0][1])
        if parent_from_node != parent_to_node:
            minimum_cost += edge[1]
            minimum_spanning_tree.append(edge)
            parents[parent_from_node] = parent_to_node

    return minimum_spanning_tree, minimum_cost


if __name__ == "__main__":
    # Figure 23-4 in Introduction to Algorithms (CLRS)
    graph = {
        "a": ["b", "h"],
        "b": ["a", "c", "h"],
        "c": ["b", "d", "f", "i"],
        "d": ["c", "e", "f"],
        "e": ["d", "f"],
        "f": ["c", "d", "e", "g"],
        "g": ["f", "h", "i"],
        "h": ["a", "b", "g", "i"],
        "i": ["c", "g", "h"],
    }
    weights = {
        ("a", "b"): 4, ("a", "h"): 8,
        ("b", "a"): 4, ("b", "c"): 8, ("b", "h"): 11,
        ("c", "b"): 8, ("c", "d"): 7, ("c", "f"): 4, ("c", "i"): 2,
        ("d", "c"): 7, ("d", "e"): 9, ("d", "f"): 14,
        ("e", "d"): 9, ("e", "f"): 10,
        ("f", "c"): 4, ("f", "d"): 14, ("f", "e"): 10, ("f", "g"): 2,
        ("g", "f"): 2, ("g", "h"): 1, ("g", "i"): 6,
        ("h", "a"): 8, ("h", "b"): 11, ("h", "g"): 1, ("h", "i"): 7,
        ("i", "c"): 2, ("i", "g"): 6, ("i", "h"): 7,
    }
    minimum_spanning_tree, minimum_cost = mst_kruskal(graph, weights)
    print(minimum_spanning_tree)
    print(minimum_cost)
    # [(('g', 'h'), 1), (('c', 'i'), 2), (('f', 'g'), 2), (('a', 'b'), 4), (('c', 'f'), 4), (('c', 'd'), 7), (('a', 'h'), 8), (('d', 'e'), 9)]
    # 37

The running time of Kruskal's algorithm depends on how the disjoint-set data structure is implemented. With a disjoint-set forest (union-find), the total running time of Kruskal's algorithm is O(E lg V).

Prim

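For a weighted, connected, undirected graph G = (V, E), Prim's algorithm starts building the minimum spanning tree from an arbitrary vertex r, and the tree keeps growing until it covers every vertex in V. Unlike Kruskal's algorithm, it maintains a single tree at all times: at each step it picks the minimum-weight edge adjacent to the current tree (that is, the vertex closest to the tree) and adds it to the tree. When the algorithm terminates, the chosen edges form a minimum spanning tree. This strategy is also greedy, because each step adds the edge that increases the tree's total weight by the smallest possible amount.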

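The implementation of Prim's algorithm needs a min-priority queue to quickly select the next edge to add to the tree formed by the edges chosen so far. To support this, every vertex that is not yet in the tree records, during execution, the minimum weight among all edges connecting it to a vertex in the tree.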
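The bookkeeping described above can be illustrated on its own. The sketch below recomputes, by brute force, the key of every vertex outside a given partial tree; the function name `keys_for_tree` and the four-vertex graph are assumptions made for illustration. An actual implementation updates these keys incrementally rather than recomputing them.

```python
def keys_for_tree(graph, weights, in_tree):
    """For each vertex outside in_tree, return the weight of its lightest
    edge into the tree, or infinity if it has no such edge yet."""
    keys = {}
    for node in graph:
        if node in in_tree:
            continue
        candidates = [weights[(node, adj)] for adj in graph[node] if adj in in_tree]
        keys[node] = min(candidates, default=float("inf"))
    return keys


# Hypothetical graph, stored like the article's examples: adjacency lists
# plus a weight entry for each direction of every undirected edge.
graph = {"a": ["b", "h"], "b": ["a", "c", "h"], "c": ["b"], "h": ["a", "b"]}
weights = {
    ("a", "b"): 4, ("b", "a"): 4,
    ("a", "h"): 8, ("h", "a"): 8,
    ("b", "c"): 8, ("c", "b"): 8,
    ("b", "h"): 11, ("h", "b"): 11,
}
print(keys_for_tree(graph, weights, {"a"}))       # {'b': 4, 'c': inf, 'h': 8}
print(keys_for_tree(graph, weights, {"a", "b"}))  # {'c': 8, 'h': 8}
```

The vertex with the smallest key is the one the priority queue hands back next, which is why `b` joins the tree right after `a` here.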

For a weighted, connected, undirected graph represented with adjacency lists, Prim's algorithm can be implemented as follows:

class MinHeap:
    def __init__(self, nodes, keys):
        """
        :param nodes: list of node items stored in the heap
        :param keys: dict mapping each node to its key value
        item_pos: maps each node to its index in the heap array
        """
        self.heap = nodes
        self.size = len(nodes)
        self.keys = keys
        self.item_pos = {item: i for i, item in enumerate(self.heap)}
        self._heapify()

    def __len__(self):
        return self.size

    def _siftup(self, pos):
        """Sift the item at pos up toward the root."""
        old_item = self.heap[pos]
        while pos > 0:
            parent_pos = (pos - 1) >> 1
            parent_item = self.heap[parent_pos]
            if self.keys[old_item] < self.keys[parent_item]:
                self.heap[pos] = parent_item
                self.item_pos[parent_item] = pos
                pos = parent_pos
            else:
                break
        self.heap[pos] = old_item
        self.item_pos[old_item] = pos

    def _siftdown(self, pos):
        """Sift the item at pos down toward the leaves."""
        old_item = self.heap[pos]
        child_pos = 2 * pos + 1  # left child position
        while child_pos < self.size:
            child_item = self.heap[child_pos]
            right_child_pos = child_pos + 1
            # Only look at the right child if it lies inside the live heap;
            # indexing it unconditionally can raise IndexError.
            if right_child_pos < self.size and \
                    self.keys[self.heap[right_child_pos]] < self.keys[child_item]:
                child_pos = right_child_pos
                child_item = self.heap[child_pos]
            if self.keys[old_item] > self.keys[child_item]:
                self.heap[pos] = child_item
                self.item_pos[child_item] = pos
                pos = child_pos
                child_pos = 2 * pos + 1  # update the loop condition
            else:
                break
        self.heap[pos] = old_item
        self.item_pos[old_item] = pos

    def _heapify(self):
        for i in reversed(range(self.size // 2)):
            self._siftdown(i)

    def extract_min(self):
        old_item = self.heap[0]
        self.heap[0] = self.heap[self.size - 1]
        self.item_pos[self.heap[0]] = 0
        self.heap[self.size - 1] = old_item
        self.item_pos[old_item] = self.size - 1
        self.size -= 1
        self._siftdown(0)
        return old_item

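    def decrease_key(self, item):
        # The caller lowers self.keys[item] first; this only restores heap order.
        pos = self.item_pos[item]
        self._siftup(pos)

    def exist(self, item):
        # Extracted items are parked at indices >= self.size.
        return self.item_pos[item] < self.size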


def mst_prim(graph, weights, start):
    keys = {}  # key of each node: weight of its lightest edge into the tree
    predecessors = {}  # parent of each node in the minimum spanning tree
    for node in graph.keys():
        keys[node] = float("inf")
        predecessors[node] = None
    keys[start] = 0

    priority_queue = MinHeap(list(graph.keys()), keys)
    minimum_spanning_tree = []
    minimum_cost = 0

    while len(priority_queue) > 0:
        node = priority_queue.extract_min()
        minimum_spanning_tree.append((node, predecessors[node]))
        edge = (node, predecessors[node])
        if edge in weights:
            minimum_cost += weights[edge]
        for adj_node in graph[node]:
            if priority_queue.exist(adj_node) and weights[(node, adj_node)] < keys[adj_node]:
                predecessors[adj_node] = node
                keys[adj_node] = weights[(node, adj_node)]
                priority_queue.decrease_key(adj_node)

    return minimum_spanning_tree, minimum_cost


if __name__ == "__main__":
    # Figure 23-5 in Introduction to Algorithms (CLRS)
    graph = {
        "a": ["b", "h"],
        "b": ["a", "c", "h"],
        "c": ["b", "d", "f", "i"],
        "d": ["c", "e", "f"],
        "e": ["d", "f"],
        "f": ["c", "d", "e", "g"],
        "g": ["f", "h", "i"],
        "h": ["a", "b", "g", "i"],
        "i": ["c", "g", "h"],
    }
    weights = {
        ("a", "b"): 4, ("a", "h"): 8,
        ("b", "a"): 4, ("b", "c"): 8, ("b", "h"): 11,
        ("c", "b"): 8, ("c", "d"): 7, ("c", "f"): 4, ("c", "i"): 2,
        ("d", "c"): 7, ("d", "e"): 9, ("d", "f"): 14,
        ("e", "d"): 9, ("e", "f"): 10,
        ("f", "c"): 4, ("f", "d"): 14, ("f", "e"): 10, ("f", "g"): 2,
        ("g", "f"): 2, ("g", "h"): 1, ("g", "i"): 6,
        ("h", "a"): 8, ("h", "b"): 11, ("h", "g"): 1, ("h", "i"): 7,
        ("i", "c"): 2, ("i", "g"): 6, ("i", "h"): 7,
    }
    minimum_spanning_tree, minimum_cost = mst_prim(graph, weights, "a")
    print(minimum_spanning_tree)
    print(minimum_cost)
    # [('a', None), ('b', 'a'), ('h', 'a'), ('g', 'h'), ('f', 'g'), ('c', 'f'), ('i', 'c'), ('d', 'c'), ('e', 'd')]
    # 37

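The running time of Prim's algorithm depends on how the min-priority queue is implemented. With a binary min-heap, the algorithm runs in O(E lg V) time, which is asymptotically the same as Kruskal's algorithm. With a Fibonacci heap, the running time of Prim's algorithm improves to O(E + V lg V).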
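For comparison, Prim's algorithm can also be sketched with Python's standard `heapq` module. Since `heapq` has no decrease-key operation, this variant pushes a new entry whenever a lighter edge is found and skips stale entries when they are popped (often called lazy deletion). The function name `mst_prim_heapq` and the small test graph are assumptions for illustration. The heap may hold up to O(E) entries, so the running time is O(E lg E), which is still O(E lg V).

```python
import heapq


def mst_prim_heapq(graph, weights, start):
    """Prim's algorithm using heapq with lazy deletion instead of decrease-key."""
    visited = set()
    tree_edges = []
    total_cost = 0
    # Heap entries are (weight, node, predecessor); the start node has no predecessor.
    heap = [(0, start, None)]
    while heap and len(visited) < len(graph):
        weight, node, predecessor = heapq.heappop(heap)
        if node in visited:
            continue  # stale entry: node already joined the tree via a lighter edge
        visited.add(node)
        if predecessor is not None:
            tree_edges.append((predecessor, node))
            total_cost += weight
        for adj_node in graph[node]:
            if adj_node not in visited:
                heapq.heappush(heap, (weights[(node, adj_node)], adj_node, node))
    return tree_edges, total_cost


# Hypothetical four-vertex graph in the same format as the examples above.
small_graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
small_weights = {
    ("a", "b"): 1, ("b", "a"): 1,
    ("b", "c"): 2, ("c", "b"): 2,
    ("a", "c"): 3, ("c", "a"): 3,
    ("c", "d"): 4, ("d", "c"): 4,
}
print(mst_prim_heapq(small_graph, small_weights, "a"))
# ([('a', 'b'), ('b', 'c'), ('c', 'd')], 7)
```

The trade-off is extra heap entries in exchange for not having to track each item's position in the heap, which is what the hand-written `MinHeap` class does through `item_pos`.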