Cleaning up hash-structured data with a Lua script

Reposted

charlesc 2024-11-29 07:21:12


When I need a Map, most of the time I use HashMap. The JDK also has another Map implementation, IdentityHashMap, which plays an important role in certain scenarios.

[Figure: class hierarchy of IdentityHashMap]

As the class hierarchy shows, IdentityHashMap extends AbstractMap and implements the Map interface.

The key difference between IdentityHashMap and HashMap is stated in the very first sentence of the IdentityHashMap documentation.

This class implements the Map interface with a hash table, using reference-equality in place of object-equality when comparing keys (and values).

IdentityHashMap uses == instead of the equals method to decide whether two keys are the same. In this respect it is completely different from HashMap.

Example program

IdentityHashMap<String, String> map = new IdentityHashMap<>();
    map.put(String.valueOf(1), "Who am I");
    map.put(String.valueOf(1), "Where am I");
    System.out.println(map);

Output (the order of the two entries may vary):

{1=Who am I, 1=Where am I}

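String.valueOf(1) returns a new String object on each call (in current JDK implementations), so the two put calls use different objects as keys. That is why the map holds two entries whose keys look "the same".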
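For contrast, here is a minimal sketch that runs the same two puts against a HashMap and an IdentityHashMap. The class name IdentityVsEquals and the helper sizeAfterTwoPuts are made up for this illustration; new String("1") is used so that the two keys are guaranteed to be distinct objects.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityVsEquals {
    // Puts two equal-but-distinct String keys into the given map and returns its size.
    static int sizeAfterTwoPuts(Map<String, String> map) {
        String k1 = new String("1"); // new String(...) always creates a distinct object
        String k2 = new String("1");
        map.put(k1, "first");
        map.put(k2, "second");
        return map.size();
    }

    public static void main(String[] args) {
        // HashMap compares keys with equals, so the second put overwrites the first.
        System.out.println(sizeAfterTwoPuts(new HashMap<>()));         // 1
        // IdentityHashMap compares keys with ==, so both entries are kept.
        System.out.println(sizeAfterTwoPuts(new IdentityHashMap<>())); // 2
    }
}
```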

The figure below shows that the two "1" keys are different objects.

[Figure: debugger view of the map's table, with the two "1" keys as distinct objects]

Source code analysis

Data structure

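As the figure above shows, IdentityHashMap stores its key-value pairs in a table array, but its layout differs from both HashMap and ThreadLocal. Each key is stored immediately next to its value, and on a collision the pair goes into the next free key-value slot. Because of this layout, the table length must always be even (in the JDK it is always a power of two).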

0	1	2	3	4	5	6	7
						k	v
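The layout described above can be sketched with a tiny hand-written table. This is only an illustration under simplifying assumptions, not the JDK code: the class InterleavedTable and its home helper are hypothetical, the table has a fixed length of 8, and it never resizes, so it must not be filled completely.

```java
class InterleavedTable {
    // Keys sit at even indices, each value at the odd index right after its key.
    final Object[] table = new Object[8]; // room for 4 key-value pairs

    // Hypothetical helper: the even "home" index of a key, based on its identity hash.
    int home(Object key) {
        return (System.identityHashCode(key) << 1) & (table.length - 1);
    }

    // Assumes the table never fills up (the real class resizes before that happens).
    void put(Object key, Object value) {
        int i = home(key);
        while (table[i] != null && table[i] != key) {
            i = (i + 2) % table.length; // collision: try the next pair
        }
        table[i] = key;
        table[i + 1] = value;
    }

    Object get(Object key) {
        for (int i = home(key); table[i] != null; i = (i + 2) % table.length) {
            if (table[i] == key) return table[i + 1];
        }
        return null; // reached an empty key slot: the key is not present
    }
}
```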

The put method

public V put(K key, V value) {
        // A null key is replaced by the NULL_KEY sentinel object
        final Object k = maskNull(key);

        retryAfterResize: for (;;) {
            final Object[] tab = table;
            final int len = tab.length;
            int i = hash(k, len);
            // While slot i is occupied, move on to the next key slot and check again.
            for (Object item; (item = tab[i]) != null;
                 i = nextKeyIndex(i, len)) {
                // Same key (by ==): replace the value and return the old one
                if (item == k) {
                    @SuppressWarnings("unchecked")
                        V oldValue = (V) tab[i + 1];
                    tab[i + 1] = value;
                    return oldValue;
                }
            }

            final int s = size + 1;
            // Use optimized form of 3 * s.
            // Next capacity is len, 2 * current capacity.
            if (s + (s << 1) > len && resize(len))
                continue retryAfterResize;

            // Slot i is empty, so store the key and value there
            modCount++;
            tab[i] = k;
            tab[i + 1] = value;
            size = s;
            return null;
        }
    }

    private static int hash(Object x, int length) {
        // This does not call hashCode; it calls the native method System.identityHashCode.
        // Two references are only guaranteed to get the same value when they are ==.
        int h = System.identityHashCode(x);
        // Multiply by -127, and left-shift to use least bit as part of hash
        return ((h << 1) - (h << 8)) & (length - 1);
    }
    // Returns the index of the next key slot, wrapping around to 0 at the end of the table.
    private static int nextKeyIndex(int i, int len) {
        return (i + 2 < len ? i + 2 : 0);
    }
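A small sketch can make the index arithmetic in hash more concrete. The class IdentityHashIndex and its index method are hypothetical names for this illustration; the identity hash is passed in as a plain int so the arithmetic can be tried on chosen values. Both shifted terms are even, so the result is always an even index inside the table, and it equals -254 * h masked to the table length (the table length is assumed to be a power of two).

```java
class IdentityHashIndex {
    // Same arithmetic as the hash method above, with the identity hash passed in directly.
    static int index(int h, int length) {
        return ((h << 1) - (h << 8)) & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // a power of two, so (length - 1) works as a bit mask
        for (int h = 1; h <= 5; h++) {
            System.out.printf("h=%d -> index %d%n", h, index(h, length));
        }
    }
}
```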

The nextKeyIndex method

It returns the index of the next key slot. Because keys and values are stored in pairs, the next index must always be even, as the output below confirms.

@Test
  void nextKeyIndexTest() {
    int len = 16;
    int idx = 0;
    for (int i = 0; i < 20; i++) {
      idx = nextKeyIndex(idx, len);
      System.out.printf("%d, ", idx);
    }
  }

  private static int nextKeyIndex(int i, int len) {
    return (i + 2 < len ? i + 2 : 0);
  }

Output: after reaching 14, the index wraps around to 0 and the next cycle begins.

2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8,

The get method

public V get(Object key) {
        Object k = maskNull(key);
        Object[] tab = table;
        int len = tab.length;
        // As in put, compute the index for the key
        int i = hash(k, len);
        while (true) {
            // item is the key currently stored in the table; it is compared with the argument
            Object item = tab[i];
            // Same key (by ==): return its value
            if (item == k)
                return (V) tab[i + 1];
            // A null slot ends the probe sequence: no key matched, so return null.
            if (item == null)
                return null;
            // Move to the next key index and compare item with the key using == again.
            i = nextKeyIndex(i, len);
        }
    }

The remove method

public V remove(Object key) {
        Object k = maskNull(key);
        Object[] tab = table;
        int len = tab.length;
        int i = hash(k, len);

        while (true) {
            Object item = tab[i];
            // The key matched
            if (item == k) {
                modCount++;
                size--;
                @SuppressWarnings("unchecked")
                    V oldValue = (V) tab[i + 1];
                // Clear the matched key-value pair
                tab[i + 1] = null;
                tab[i] = null;
                // The key step: move colliding entries forward
                closeDeletion(i);
                return oldValue;
            }
            // Same logic as get: nothing matched, so return null
            if (item == null)
                return null;
            // The key did not match and item is not null: move to the next key index and keep looking.
            i = nextKeyIndex(i, len);
        }
    }

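The closeDeletion method moves forward those entries after the deleted index that would otherwise no longer be reachable from the position their hash points to.
This is needed because IdentityHashMap looks up a key by computing an index from its hash and then comparing with ==. If the keys are identical it returns the value; otherwise it keeps probing forward until it reaches a null slot.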

If the entries behind the deleted one that may have collided with it were not moved forward, later lookups would fail to find them.

In the table below, k and k1 both hash directly to index 4. Because of the collision, k1 was stored in the next key slot after 4, which is index 6.

0	1	2	3	4	5	6	7
				k	v	k1	v1

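If k is removed and nothing else is done, the table looks like the one below. When get is now called with k1, the hash first leads to index 4; since the slot at that index is null, get returns null and no entry is matched. That is not what we expect, because k1 is still in the map.
If the colliding entry (k1) is not moved forward when k is deleted, it can never be retrieved again.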

0	1	2	3	4	5	6	7
						k1	v1

So the colliding entry has to be moved forward, giving the layout below. Now get(k1) returns v1.

0	1	2	3	4	5	6	7
				k1	v1
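The three tables above can be replayed on a plain array. This is a hand-built sketch, not the JDK code: the class DeletionGap and its lookup helper are hypothetical, and the home index 4 is simply assumed for both keys instead of being computed from a hash.

```java
class DeletionGap {
    // Lookup as in get(): start at the home index, stop at the first null key slot.
    static Object lookup(Object[] table, Object key, int home) {
        for (int i = home; table[i] != null; i = (i + 2) % table.length) {
            if (table[i] == key) return table[i + 1];
        }
        return null;
    }

    public static void main(String[] args) {
        Object k = new Object();
        Object k1 = new Object();
        int home = 4; // assumption for illustration: both keys hash to index 4

        Object[] table = new Object[8];
        table[4] = k;  table[5] = "v";
        table[6] = k1; table[7] = "v1";

        // Remove k by clearing its pair only: the probe for k1 stops at the null in slot 4.
        table[4] = null; table[5] = null;
        System.out.println(lookup(table, k1, home)); // null

        // Move the colliding pair forward into the vacated slot: k1 is reachable again.
        table[4] = table[6]; table[5] = table[7];
        table[6] = null;     table[7] = null;
        System.out.println(lookup(table, k1, home)); // v1
    }
}
```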

Moving colliding entries forward is exactly what the closeDeletion method does.

private void closeDeletion(int d) {
        // Adapted from Knuth Section 6.4 Algorithm R
        Object[] tab = table;
        int len = tab.length;

        // Look for items to swap into newly vacated slot
        // starting at index immediately following deletion,
        // and continuing until a null slot is seen, indicating
        // the end of a run of possibly-colliding keys.
        Object item;
        for (int i = nextKeyIndex(d, len); (item = tab[i]) != null;
             i = nextKeyIndex(i, len) ) {
            // The following test triggers if the item at slot i (which
            // hashes to be at slot r) should take the spot vacated by d.
            // If so, we swap it in, and then continue with d now at the
            // newly vacated i.  This process will terminate when we hit
            // the null slot at the end of this run.
            // The test is messy because we are using a circular table.
            // 
            int r = hash(item, len);
            if ((i < r && (r <= d || d <= i)) || (r <= d && d <= i)) {
                tab[d] = item;
                tab[d + 1] = tab[i + 1];
                tab[i] = null;
                tab[i + 1] = null;
                d = i;
            }
        }
    }

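The if condition here is a bit convoluted. Because the table is circular, there are two scenarios to consider. Let me illustrate with examples.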
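As a quick sanity check, the condition can be restated in terms of probe distance: the item at slot i (home slot r) should move into the vacated slot d when, walking forward from r with wrap-around, d is reached no later than i. The sketch below compares that restatement with the original condition over every combination of even indices in a table of length 16. The class CloseDeletionCondition and the method onProbePath are hypothetical names, and the restatement is my own reading of the condition, not something taken from the JDK source.

```java
class CloseDeletionCondition {
    // The condition from closeDeletion, copied as-is.
    static boolean original(int i, int r, int d) {
        return (i < r && (r <= d || d <= i)) || (r <= d && d <= i);
    }

    // Hypothetical restatement: walking forward (with wrap-around) from the home slot r,
    // the vacated slot d is reached no later than the item's current slot i.
    // Assumes len is a power of two, so (len - 1) works as a wrap-around mask.
    static boolean onProbePath(int i, int r, int d, int len) {
        int stepsToD = (d - r) & (len - 1);
        int stepsToI = (i - r) & (len - 1);
        return stepsToD <= stepsToI;
    }

    public static void main(String[] args) {
        int len = 16;
        boolean same = true;
        for (int i = 0; i < len; i += 2)
            for (int r = 0; r < len; r += 2)
                for (int d = 0; d < len; d += 2)
                    same &= original(i, r, d) == onProbePath(i, r, d, len);
        System.out.println(same); // true
    }
}
```

With r = 4, d = 4, i = 6 (the k/k1 case above, no wrap-around) the condition holds through its second clause; with r = 14, d = 14, i = 0 (the probe wrapped past the end of the table) it holds through its first clause.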