undefinedfix
Sign in

Where does weakhashmap link hash conflict nodes to linked lists?

ICR edited in Thu, 18 Aug 2022

Question: when reading the source code of weakhashmap in java8, I found a problem: when the put operation has hash conflict, it only has replacement operation, but there is no operation to add a new linked list node. When did the linked list node link up?

The put source code in java8 is as follows:

public V put(K key, V value) {

    Object k = maskNull(key);
    int h = hash(k);
    Entry<K,V>[] tab = getTable();
    int i = indexFor(h, tab.length);
     //这里只是在链表上找到key存在的节点做替换,但是对于key不存在的情况,
     //并没有创建新的节点链到原来的链表后面。
    for (Entry<K,V> e = tab[i]; e != null; e = e.next) {
        if (h == e.hash && eq(k, e.get())) {
            V oldValue = e.value;
            if (value != oldValue)
                e.value = value;
            return oldValue;
        }
        //?????当hash 相同但是key不同不是应该创建新节点链到链表上吗?
    }

    modCount++;
    Entry<K,V> e = tab[i];
    tab[i] = new Entry<>(k, value, queue, h, e);
    if (++size >= threshold)
        resize(tab.length * 2);
    return null;
}
2 Replies
luoiliembac
commented on Fri, 19 Aug 2022

First of all, correct a little bit that the expression of put operation only replace operation is incorrect when hash conflict occurs.


Hash conflict is due to different K hit the same block address, while replacement operation is obviously due to the same K, the same hash hit should be his place, which should be normal addressing. Normal addressing naturally determines whether there is an update / replacement operation, so the condition of if is determined by hash and key, and only the update / replacement operation is processed


The real hash conflict is bound to add new nodes. Weakhashmap doesn't deal with hash conflicts. It just adds to the header. Only update / replace operation is done in advance to ensure that the data saved later must be new data (non update / non replace).


An entry is a single linked list, the new entry in your code (...) Called

Entry(Object key, V value,ReferenceQueue<Object> queue,int hash, Entry<K,V> next) {
    super(key, queue);
    this.value = value;
    this.hash  = hash;
    // 这行代码就是插入表头,新建节点的后继节点为传入的原链表next
    this.next  = next;
}

The for loop in your pasted code is just to detect whether the current put operation is a replace / repeat add operation. If it is not a repeat add or update, but because of adding different data, the code after for will be executed.

I added some notes to your put method

public V put(K key, V value) {
    // 屏蔽null用一个new Object()代替
    Object k = maskNull(key);
    // hash计算
    int h = hash(k);
    // 获取当前最新hash表(删除了过期数据)
    Entry<K,V>[] tab = getTable();
    // 根据hash值寻址
    int i = indexFor(h, tab.length);
    for (Entry<K,V> e = tab[i]; e != null; e = e.next) {
        // 如果hash和key都相同则可能会替换,在if逻辑里面判断具体处理逻辑
        if (h == e.hash && eq(k, e.get())) {
            V oldValue = e.value;
            if (value != oldValue)
                e.value = value;
            return oldValue;
        }
    }
    modCount++;
    Entry<K,V> e = tab[i];
    // 这里面将会把新节点添加到表头
    //上面代码已经处理了重复添加和更新value的操作,能执行到这里的代码都是新节点,
    //此时不需要管原来的节点里面保存的是什么可能是null也可能是一个链表,反正不管是啥
    //,直接放到后继节点就行,反正新加的节点一定是头节点,至于后继有没有,
    //根本不需要处理了,null也好还是什么也好无所谓,但是这里面对null值进行了处理,不可能真的是null)
    // 执行到这里不一定就一定有hash冲突,可能有,可能没有,`e = null` 就没有,`e !=null` 就有hash冲突
    // 但是我不管你冲突不冲突,反正插在表头。数据不可能重复,重复的上面的for遍历处理过了。
    tab[i] = new Entry<>(k, value, queue, h, e);
    if (++size >= threshold)
        resize(tab.length * 2);
    return null;
}
blwqh
commented on Fri, 19 Aug 2022

First of all, the function of the for loop is to find the key, = = addressing and EQ method correctly through if (H = = e.hash & & EQ (k, e.get()). When the key is found, the new value is replaced and the old value is returned for easy operation. How many times do you want to create nodes with the same hash but different keys in the for loop? Therefore, when the for loop finds no result, the creation logic should be implemented after the loop ends. If I guess correctly, the mystery lies in this line of code tab [i] = new entry < > (k, value, queue, h, e); for specific insertion logic, you should see the construction method of entry < > (k, value, queue, h, e). After the reset is also in line with the new value to determine whether the size is in line with the range!

lock This question has been locked and the reply function has been disabled.