Missed submissions in latest versions of Java ForkJoinPool

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

Missed submissions in latest versions of Java ForkJoinPool

Andriy Plokhotnyuk
Code to reproduce:

        ForkJoinPool e = new ForkJoinPool(1);
        AtomicBoolean b = new AtomicBoolean();
        for (int i = 0; i < 100000; i++) {
            b.set(true);
            e.execute(new Runnable() {
                @Override
                public void run() {
                    b.set(false);
                }
            });
            while (b.get());
        }

It completes successfully on 8u31 or older versions, but hangs up on 8u40 or above.

I have found that first version of FJ pool which fails this test was committed to jsr166 repository at 07.07.2017 with following comment: "Overhaul throttling; other internal refactorings".

Best regards,
Andriy


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Martin Buchholz-3
I can reproduce this with latest jsr166 CVS.  SSCCE:

import java.util.concurrent.*;
import java.util.concurrent.atomic.*;

public class ForkJoinPoolThrottling {
    public static void main(String[] args) throws Throwable {
        final ForkJoinPool e = new ForkJoinPool(1);
        final AtomicBoolean b = new AtomicBoolean();
        final Runnable setFalse = new Runnable() { public void run() {
            b.set(false);
        }};
        for (int i = 0; i < 100000; i++) {
            b.set(true);
            e.execute(setFalse);
            do {} while (b.get()); // spins forever here
        }
    }
}

It spins with no sign of any pool thread.  I agree this looks like a serious bug.

I can confirm it was introduced by the (scary) change

date: 2014/07/07 18:29:07;  author: dl;  state: Exp;  lines: +1328 -1333
Overhaul throttling; other internal refactorings

Diffs can be obtained using:

TZ=GMT cvs diff -D '2014-07-07 18:25:00' -D '2014-07-07 18:35:00'


On Sun, Apr 19, 2015 at 8:45 AM, Andriy Plokhotnyuk <[hidden email]> wrote:
Code to reproduce:

        ForkJoinPool e = new ForkJoinPool(1);
        AtomicBoolean b = new AtomicBoolean();
        for (int i = 0; i < 100000; i++) {
            b.set(true);
            e.execute(new Runnable() {
                @Override
                public void run() {
                    b.set(false);
                }
            });
            while (b.get());
        }

It completes successfully on 8u31 or older versions, but hangs up on 8u40 or above.

I have found that first version of FJ pool which fails this test was committed to jsr166 repository at 07.07.2017 with following comment: "Overhaul throttling; other internal refactorings".

Best regards,
Andriy


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest



_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Doug Lea
In reply to this post by Andriy Plokhotnyuk

Thanks. We are trying to figure out why hotspot is generating some
surprising code for the putOrderedInt intrinsic. In the mean time,
a version with the volatile orderings used previously is committed
and can be run with jsr166.jar at
http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166.jar

-Doug


On 04/19/2015 11:45 AM, Andriy Plokhotnyuk wrote:

> Code to reproduce:
>
>          ForkJoinPool e = new ForkJoinPool(1);
>          AtomicBoolean b = new AtomicBoolean();
>          for (int i = 0; i < 100000; i++) {
>              b.set(true);
>              e.execute(new Runnable() {
>                  @Override
>                  public void run() {
>                      b.set(false);
>                  }
>              });
>              while (b.get());
>          }
>
> It completes successfully on 8u31 or older versions, but hangs up on 8u40 or above.
>
> I have found that first version of FJ pool which fails this test was committed
> to jsr166 repository at 07.07.2017 with following comment: "Overhaul throttling;
> other internal refactorings".
>
> Best regards,
> Andriy
>
>
>
> _______________________________________________
> Concurrency-interest mailing list
> [hidden email]
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Andrew Haley
On 04/20/2015 11:54 AM, Doug Lea wrote:
>
> Thanks. We are trying to figure out why hotspot is generating some
> surprising code for the putOrderedInt intrinsic.

Have you got an OpenJDK bugzilla # for this?

Thanks,
Andrew.


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Doug Lea
On 04/22/2015 06:06 AM, Andrew Haley wrote:
> On 04/20/2015 11:54 AM, Doug Lea wrote:
>>
>> Thanks. We are trying to figure out why hotspot is generating some
>> surprising code for the putOrderedInt intrinsic.
>
> Have you got an OpenJDK bugzilla # for this?
>

Not yet, pending diagnosis of whether "surprising" indicates
any further problems beyond the need for the re-strengthened
write in current patch.

-Doug


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Aleksey Shipilev-2
In reply to this post by Andrew Haley
On 04/22/2015 01:06 PM, Andrew Haley wrote:
> On 04/20/2015 11:54 AM, Doug Lea wrote:
>>
>> Thanks. We are trying to figure out why hotspot is generating some
>> surprising code for the putOrderedInt intrinsic.
>
> Have you got an OpenJDK bugzilla # for this?

Not yet, trying to isolate. Not yet sure if that's a HotSpot bug. I'll
post once something is clear.

Thanks,
-Aleksey.


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Doug Lea
In reply to this post by Andriy Plokhotnyuk

Thanks again for reporting this. Now fixed by re-strengthening a write.
See https://bugs.openjdk.java.net/browse/JDK-8078490
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8078490-fj-missed-submissions/webrev/

Thanks to Paul Sandoz, Aleksey Shipilev, and Martin Buchholz.

For the curious, the problem was a volatile write that was
incorrectly weakened to a releasing write (putOrderedInt),
emphasizing the delicacy of such optimizations. The weakening
was correct considered in isolation, but left a path in the
case of unjoined tasks with a size-1 FJP such that the single
worker need not see an available task and becomes idle. (As in
this test case.)

-Doug


On 04/19/2015 11:45 AM, Andriy Plokhotnyuk wrote:

> Code to reproduce:
>
>          ForkJoinPool e = new ForkJoinPool(1);
>          AtomicBoolean b = new AtomicBoolean();
>          for (int i = 0; i < 100000; i++) {
>              b.set(true);
>              e.execute(new Runnable() {
>                  @Override
>                  public void run() {
>                      b.set(false);
>                  }
>              });
>              while (b.get());
>          }
>
> It completes successfully on 8u31 or older versions, but hangs up on 8u40 or above.
>
> I have found that first version of FJ pool which fails this test was committed
> to jsr166 repository at 07.07.2017 with following comment: "Overhaul throttling;
> other internal refactorings".
>
> Best regards,
> Andriy
>
>
>
> _______________________________________________
> Concurrency-interest mailing list
> [hidden email]
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Dávid Karnok
Hi. I'm using such putOrdered constructs quite often and I'd like to understand why is it wrong particularly there? U.putOrderedInt(w, QLOCK, 0); appears many times elsewhere. My guess is that the ordered release puts that 0 into a write buffer and the subsequent worker unpark still sees 1 in memory somehow. Alternatively, I find top being non-volatile odd as it is written with putOrdered and either read before or after base (depending on whether it is an owner-called method) yet in externalPush the expression in the condition (n = (s = q.top) - q.base) reads it in different order but I think externalPush called from execute() is not an owner call.

2015-04-24 1:07 GMT+02:00 Doug Lea <[hidden email]>:

Thanks again for reporting this. Now fixed by re-strengthening a write.
See https://bugs.openjdk.java.net/browse/JDK-8078490
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8078490-fj-missed-submissions/webrev/

Thanks to Paul Sandoz, Aleksey Shipilev, and Martin Buchholz.

For the curious, the problem was a volatile write that was
incorrectly weakened to a releasing write (putOrderedInt),
emphasizing the delicacy of such optimizations. The weakening
was correct considered in isolation, but left a path in the
case of unjoined tasks with a size-1 FJP such that the single
worker need not see an available task and becomes idle. (As in
this test case.)

-Doug


On 04/19/2015 11:45 AM, Andriy Plokhotnyuk wrote:
Code to reproduce:

         ForkJoinPool e = new ForkJoinPool(1);
         AtomicBoolean b = new AtomicBoolean();
         for (int i = 0; i < 100000; i++) {
             b.set(true);
             e.execute(new Runnable() {
                 @Override
                 public void run() {
                     b.set(false);
                 }
             });
             while (b.get());
         }

It completes successfully on 8u31 or older versions, but hangs up on 8u40 or above.

I have found that first version of FJ pool which fails this test was committed
to jsr166 repository at 07.07.2017 with following comment: "Overhaul throttling;
other internal refactorings".

Best regards,
Andriy



_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest



--
Best regards,
David Karnok

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Andrew Haley
On 24/04/15 08:30, Dávid Karnok wrote:
> Hi. I'm using such putOrdered constructs quite often and I'd like to
> understand why is it wrong particularly there? U.putOrderedInt(w, QLOCK,
> 0); appears many times elsewhere.

Indeed, there's a very similar construct here:

        final CountedCompleter<?> popCC(CountedCompleter<?> task, int mode) {
...
                            if (mode < 0) { // must lock
                                if (U.compareAndSwapInt(this, QLOCK, 0, 1)) {
                                    if (top == s && array == a &&
                                        U.compareAndSwapObject(a, j, t, null)) {
                                        U.putOrderedInt(this, QTOP, s - 1);
                                        U.putOrderedInt(this, QLOCK, 0);
                                        return t;
                                    }
                                    U.compareAndSwapInt(this, QLOCK, 1, 0);
                                }

It certainly looks to me like this method returns without making
the release of this.qlock visible to all observers.

Andrew.

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Dávid Karnok
Another thing I don't understand is why not use U.putOrderedInt(this, QLOCK, 0); instead of U.compareAndSwapInt(this, QLOCK, 1, 0); because only the single thread who successfully set QLOCK to 1 can set it back to 0? Is it that by using a full barrier operation, other observers can pick up the unlocked state more quickly?

2015-04-24 11:22 GMT+02:00 Andrew Haley <[hidden email]>:
On 24/04/15 08:30, Dávid Karnok wrote:
> Hi. I'm using such putOrdered constructs quite often and I'd like to
> understand why is it wrong particularly there? U.putOrderedInt(w, QLOCK,
> 0); appears many times elsewhere.

Indeed, there's a very similar construct here:

        final CountedCompleter<?> popCC(CountedCompleter<?> task, int mode) {
...
                            if (mode < 0) { // must lock
                                if (U.compareAndSwapInt(this, QLOCK, 0, 1)) {
                                    if (top == s && array == a &&
                                        U.compareAndSwapObject(a, j, t, null)) {
                                        U.putOrderedInt(this, QTOP, s - 1);
                                        U.putOrderedInt(this, QLOCK, 0);
                                        return t;
                                    }
                                    U.compareAndSwapInt(this, QLOCK, 1, 0);
                                }

It certainly looks to me like this method returns without making
the release of this.qlock visible to all observers.

Andrew.




--
Best regards,
David Karnok

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Doug Lea
In reply to this post by Dávid Karnok
On 04/24/2015 03:30 AM, Dávid Karnok wrote:
> Hi. I'm using such putOrdered constructs quite often and I'd like to understand
> why is it wrong particularly there? U.putOrderedInt(w, QLOCK, 0); appears many
> times elsewhere.

The use of putorderedInt is correct considered only with respect
to its role as a spinlock. However, the use in externalPush
before a call to signalWork also serves a Dekker-like role in
ensuring that some scanning worker sees the submitted task
before giving up and idling. (In every case other than a
single worker and an unjoined task, there is some other path
ensuring this.) So the full-volatile write is logically
associated with the update to top, but is implemented for
the qlock write that is guaranteed to follow it.

-Doug



_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Doug Lea
In reply to this post by Dávid Karnok
On 04/24/2015 06:13 AM, Dávid Karnok wrote:
> Another thing I don't understand is why not use U.putOrderedInt(this, QLOCK, 0);
> instead of U.compareAndSwapInt(this, QLOCK, 1, 0); because only the single
> thread who successfully set QLOCK to 1 can set it back to 0?

Except that during shutdown, the thread initiating termination can
poison some locks (with -1), but not others. So a few need to use
CAS to ensure that termination is noticed.

-Doug




_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Andrew Haley
In reply to this post by Doug Lea
On 04/24/2015 11:46 AM, Doug Lea wrote:

> On 04/24/2015 03:30 AM, Dávid Karnok wrote:
>> Hi. I'm using such putOrdered constructs quite often and I'd like to understand
>> why is it wrong particularly there? U.putOrderedInt(w, QLOCK, 0); appears many
>> times elsewhere.
>
> The use of putorderedInt is correct considered only with respect
> to its role as a spinlock. However, the use in externalPush
> before a call to signalWork also serves a Dekker-like role in
> ensuring that some scanning worker sees the submitted task
> before giving up and idling. (In every case other than a
> single worker and an unjoined task, there is some other path
> ensuring this.) So the full-volatile write is logically
> associated with the update to top, but is implemented for
> the qlock write that is guaranteed to follow it.

Got it.  But Doug, don't you think that this rather frightening
concurrent-programming idiom can be blamed for this bug?  That is
to say, we've created code that's so clever we can't understand it?
Is it worth it?

(For some values of "we", of course.  :-)

Andrew.

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Doug Lea
On 04/24/2015 07:15 AM, Andrew Haley wrote:

> Got it.  But Doug, don't you think that this rather frightening
> concurrent-programming idiom can be blamed for this bug?

Modulo the usual issue that some concurrent constructions are
not very intuitive, the main problem has been expressivity.
In particular, without enhanced-volatiles (or C++/C11 modes),
there's a gap between intent and code that makes things
harder to understand and easier to get wrong. We're hoping
that adding these via VarHandles to jdk9 will improve the
current situation in which only a few people are brave enough
to use the best known algorithms in error-prone ways. (I have
some parts of some tentative jdk9 j.u.c  updates rewritten using
these, and they seem clearer.)

-Doug


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Missed submissions in latest versions of Java ForkJoinPool

Andrew Haley
On 24/04/15 12:29, Doug Lea wrote:

> On 04/24/2015 07:15 AM, Andrew Haley wrote:
>
>> Got it.  But Doug, don't you think that this rather frightening
>> concurrent-programming idiom can be blamed for this bug?
>
> Modulo the usual issue that some concurrent constructions are
> not very intuitive, the main problem has been expressivity.
> In particular, without enhanced-volatiles (or C++/C11 modes),
> there's a gap between intent and code that makes things
> harder to understand and easier to get wrong.

Sure, I get that, but isn't this a good opportunity to insert a
comment or two?  This:

  //  The use of putorderedInt is correct considered only with respect
  //  to its role as a spinlock. However, the use in externalPush
  //  before a call to signalWork also serves a Dekker-like role in
  //  ensuring that some scanning worker sees the submitted task
  //  before giving up and idling. (In every case other than a
  //  single worker and an unjoined task, there is some other path
  //  ensuring this.) So the full-volatile write is logically
  //  associated with the update to top, but is implemented for
  //  the qlock write that is guaranteed to follow it.

would do nicely.

Andrew.
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest