computeIfAbsent optimized for missing entries

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

computeIfAbsent optimized for missing entries

Amir Hadadi

I read the source code for ConcurrentHashMap's computeIfAbsent, and I noticed it enters a synchronized block whether or not the key is present.

I would expect this method to first check if the key is present, and only then resort to locking. This would make computeIfAbsent slower when the key is missing, but will speed it up and reduce contention when the key is present.

In the common use case of using ConcurrentHashMap + computeIfAbsent to implement a registry that lazily initializes objects, and the same resource is accessed through the map frequently, contention will ensue.

Why this method was optimized for the missing entry case?


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: computeIfAbsent optimized for missing entries

Doug Lea
On 02/08/2017 01:52 AM, Amir Hadadi wrote:
> I read the source code for ConcurrentHashMap's computeIfAbsent, and I
> noticed it enters a synchronized block whether or not the key is present.

computeIfAbsent must be more conservative than putIfAbsent because
we guarantee exclusion during the "compute" part.
Even if an element is apparently present, we need to guarantee
that it is not in the process of being removed (while holding lock).

There was a fair amount of pre-jdk8 discussion about whether to
guarantee locking/exclusion in CHM (it is not guaranteed for
ConcurrentMaps in general). The consensus was that it was
worth doing despite the extra overhead. Especially since it
is avoidable in cases where you know that a key will never be
removed once added, first calling get(), and then only
call computeIfAbsent if it returns null.


-Doug


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: computeIfAbsent optimized for missing entries

Benjamin Manes
In JDK9, it looks like you added a small prescreening to check at the first node in the bin and eagerly return. That seems like a smart compromise, given the performance analysis you shared on this forum previously that dissuaded you from adopting a full pre-screening.

On Wed, Feb 8, 2017 at 8:55 AM, Doug Lea <[hidden email]> wrote:
On 02/08/2017 01:52 AM, Amir Hadadi wrote:
I read the source code for ConcurrentHashMap's computeIfAbsent, and I
noticed it enters a synchronized block whether or not the key is present.

computeIfAbsent must be more conservative than putIfAbsent because
we guarantee exclusion during the "compute" part.
Even if an element is apparently present, we need to guarantee
that it is not in the process of being removed (while holding lock).

There was a fair amount of pre-jdk8 discussion about whether to
guarantee locking/exclusion in CHM (it is not guaranteed for
ConcurrentMaps in general). The consensus was that it was
worth doing despite the extra overhead. Especially since it
is avoidable in cases where you know that a key will never be
removed once added, first calling get(), and then only
call computeIfAbsent if it returns null.


-Doug


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: computeIfAbsent optimized for missing entries

Amir Hadadi
In reply to this post by Doug Lea

> Even if an element is apparently present, we need to guarantee
> that it is not in the process of being removed (while holding lock).


Let's assume computeIfAbsent is implemented by first checking if the key is present, returning the associated value if it is, and otherwise goes on to lock. So the mapping function is still performed under lock.

Let's say we have a CHM which initially maps a key K -> V.

Now thread T1 calls remove(K) concurrently with thread T2 calling computeIfAbsent(K, mappingFunction).

What sequence of events could T1, T2 or another thread T3 observe that is precluded by the existing implementation?


From: Concurrency-interest <[hidden email]> on behalf of Doug Lea <[hidden email]>
Sent: Wednesday, February 8, 2017 6:55:02 PM
To: [hidden email]
Subject: Re: [concurrency-interest] computeIfAbsent optimized for missing entries
 
On 02/08/2017 01:52 AM, Amir Hadadi wrote:
> I read the source code for ConcurrentHashMap's computeIfAbsent, and I
> noticed it enters a synchronized block whether or not the key is present.

computeIfAbsent must be more conservative than putIfAbsent because
we guarantee exclusion during the "compute" part.
Even if an element is apparently present, we need to guarantee
that it is not in the process of being removed (while holding lock).

There was a fair amount of pre-jdk8 discussion about whether to
guarantee locking/exclusion in CHM (it is not guaranteed for
ConcurrentMaps in general). The consensus was that it was
worth doing despite the extra overhead. Especially since it
is avoidable in cases where you know that a key will never be
removed once added, first calling get(), and then only
call computeIfAbsent if it returns null.


-Doug


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: computeIfAbsent optimized for missing entries

Doug Lea
In reply to this post by Benjamin Manes
On 02/08/2017 12:22 PM, Benjamin Manes wrote:
> In JDK9, it looks like you added a small prescreening
> <http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ConcurrentHashMap.java?r1=1.295&r2=1.296>
> to check at the first node in the bin and eagerly return. That seems
> like a smart compromise, given the performance analysis you shared on
> this forum previously that dissuaded you from adopting a full
> pre-screening.

Thanks! I forgot that this made it into jdk9 (vs jdk8).
Amir: for more details see the discussions on this in the
archives: http://cs.oswego.edu/pipermail/concurrency-interest/

-Doug

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: computeIfAbsent optimized for missing entries

Amir Hadadi

I went over the discussions, and found the following:
http://cs.oswego.edu/pipermail/concurrency-interest/2014-December/013360.html

And these are the current benchmarks from caffeine (the links in the post above are dead):
https://github.com/ben-manes/caffeine/wiki/Benchmarks


> As an intermediate
> test, I measured the impact of adding a single-node prescreen
> ("1cif") before locking inside CHM.computeIfAbsent, that is similar
> to what was done in some pre-release versions:
>
> Same key
>
> cif:        1402559
> get+cif: 3775886700
> 1cif:    1760916148
>
> Zipf-distributed keys
>
> cif:     1414945003
> get+cif:  882477874
> 1cif:     618668961


You refer to 2 benchmarks in this post, ComputingTest and ComputeBenchmark. I couldn't find ComputingTest, but I found ComputeBenchmark here:
https://github.com/ben-manes/caffeine/blob/master/caffeine/src/jmh/java/com/github/benmanes/caffeine/cache/ComputeBenchmark.java

I tested only the Zipf-distributed keys, running the cif vs get + cif versions using 8 threads, OSX 10.11.6, 2.8 GHz Intel Quad Core i7.
My results were:

cif:
Result "org.sample.ComputeBenchmark.compute_spread":
  83559118.933 ±(99.9%) 18403807.660 ops/s [Average]
  (min, avg, max) = (80998142.816, 83559118.933, 86575174.221), stdev = 2848009.607
  CI (99.9%): [65155311.273, 101962926.593] (assumes normal distribution)


get + cif:
  208177607.279 ±(99.9%) 12987916.190 ops/s [Average]
  (min, avg, max) = (206351511.981, 208177607.279, 210850461.612), stdev = 2009894.407
  CI (99.9%): [195189691.090, 221165523.469] (assumes normal distribution)

So the get + cif version is faster by a factor of 2.5.
I would be very surprised to find that get + cif is slower than cif when there are no absent entries.

Do these results seem reasonable? Did you run other benchmarks that convinced you that cif is better than the alternatives?


From: Concurrency-interest <[hidden email]> on behalf of Doug Lea <[hidden email]>
Sent: Thursday, February 9, 2017 1:51:01 PM
To: Benjamin Manes
Cc: [hidden email]
Subject: Re: [concurrency-interest] computeIfAbsent optimized for missing entries
 
On 02/08/2017 12:22 PM, Benjamin Manes wrote:
> In JDK9, it looks like you added a small prescreening
> <http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ConcurrentHashMap.java?r1=1.295&r2=1.296>
> to check at the first node in the bin and eagerly return. That seems
> like a smart compromise, given the performance analysis you shared on
> this forum previously that dissuaded you from adopting a full
> pre-screening.

Thanks! I forgot that this made it into jdk9 (vs jdk8).
Amir: for more details see the discussions on this in the
archives: http://cs.oswego.edu/pipermail/concurrency-interest/

-Doug

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: computeIfAbsent optimized for missing entries

Benjamin Manes
In that thread, Doug explains why:

However, the exact benefit depends on access patterns. For example, I reran your benchmark cases (urls below) on a 32way x86, and got throughputs (ops/sec) that are dramatically better with pre-screen for the case of a single key, but worse with your Zipf-distributed keys. 
 
One might think (I did until running similar experiments) that the "1cif" version would be the best compromise. But currently it isn't. This is in part due to interactions with biased locking, that in some cases basically provide "free" prescreens, but in other cases add safepoint/GC pressure in addition to lock contention. This is all up for re-consideration though.

At the time, I spun up a 32-core machine and reproduced his findings. The JDK9 partial pre-screening seems to be a compromise that hopefully avoids lock contention in the common case, but I have not benchmarked it myself to advise. Caffeine always performs its own pre-screening since a cache has a high likelihood of the entry being present.

I believe ComputingTest became AsMapTest. This has the recursive computation test cases discussed and fixed in that thread.

On Tue, Feb 14, 2017 at 5:46 AM, Amir Hadadi <[hidden email]> wrote:

I went over the discussions, and found the following:

And these are the current benchmarks from caffeine (the links in the post above are dead):


> As an intermediate
> test, I measured the impact of adding a single-node prescreen
> ("1cif") before locking inside CHM.computeIfAbsent, that is similar
> to what was done in some pre-release versions:
>
> Same key
>
> cif:        1402559
> get+cif: 3775886700
> 1cif:    1760916148
>
> Zipf-distributed keys
>
> cif:     1414945003
> get+cif:  882477874
> 1cif:     618668961


You refer to 2 benchmarks in this post, ComputingTest and ComputeBenchmark. I couldn't find ComputingTest, but I found ComputeBenchmark here:

I tested only the Zipf-distributed keys, running the cif vs get + cif versions using 8 threads, OSX 10.11.6, 2.8 GHz Intel Quad Core i7.
My results were:

cif:
Result "org.sample.ComputeBenchmark.compute_spread":
  83559118.933 ±(99.9%) 18403807.660 ops/s [Average]
  (min, avg, max) = (80998142.816, 83559118.933, 86575174.221), stdev = 2848009.607
  CI (99.9%): [65155311.273, 101962926.593] (assumes normal distribution)


get + cif:
  208177607.279 ±(99.9%) 12987916.190 ops/s [Average]
  (min, avg, max) = (206351511.981, 208177607.279, 210850461.612), stdev = 2009894.407
  CI (99.9%): [195189691.090, 221165523.469] (assumes normal distribution)

So the get + cif version is faster by a factor of 2.5.
I would be very surprised to find that get + cif is slower than cif when there are no absent entries.

Do these results seem reasonable? Did you run other benchmarks that convinced you that cif is better than the alternatives?


From: Concurrency-interest <[hidden email]> on behalf of Doug Lea <[hidden email]>
Sent: Thursday, February 9, 2017 1:51:01 PM
To: Benjamin Manes
Cc: [hidden email]
Subject: Re: [concurrency-interest] computeIfAbsent optimized for missing entries
 
On 02/08/2017 12:22 PM, Benjamin Manes wrote:
> In JDK9, it looks like you added a small prescreening
> <http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ConcurrentHashMap.java?r1=1.295&r2=1.296>
> to check at the first node in the bin and eagerly return. That seems
> like a smart compromise, given the performance analysis you shared on
> this forum previously that dissuaded you from adopting a full
> pre-screening.

Thanks! I forgot that this made it into jdk9 (vs jdk8).
Amir: for more details see the discussions on this in the
archives: http://cs.oswego.edu/pipermail/concurrency-interest/

-Doug

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest



_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest