Varieties of CAS semantics (another doc fix request)


Varieties of CAS semantics (another doc fix request)

Justin Sampson
Howdy,

The recent discussion of alternative CAS implementations inspired me
to read through the existing docs & code in j.u.c.atomic, whereupon
I noticed some discrepancies in semantics.

For this discussion I'll focus on AtomicReference and
AtomicStampedReference; the other single-value atomics follow the
semantics of AtomicReference while AtomicMarkableReference follows
the semantics of AtomicStampedReference.

The package documentation indicates that compareAndSet is strong in
two ways: it cannot fail spuriously _and_ it provides the memory
effects of _both_ a volatile read and a volatile write. In contrast,
weakCompareAndSet is documented to be weak in both of those
respects, allowing spurious failure and not providing any memory
consistency effects outside of the atomic variable itself.

AtomicReference et al. implement compareAndSet by calling into the
intrinsic CAS operation on Unsafe, so I'll assume they behave as
documented.

AtomicStampedReference, on the other hand, actually implements
something in-between strong and weak CAS semantics, without that
fact being documented. Due to the indirection through a private
Pair object, this class's compareAndSet implementation has to do
some of the work itself before calling into the intrinsic CAS. As a
result, the actual semantics are:

1. It always provides the memory effects of a volatile read.

2. It is only guaranteed to provide the memory effects of a volatile
   write _if_ the stamp or reference is successfully _altered_ to a
   different value than it had before.

3. It may fail spuriously.

The attemptStamp method has exactly the same semantics. The spurious
failure is at least documented for attemptStamp, but not for
compareAndSet, and neither one documents the lack of volatile write
effects on failure or no-op success.
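
To make the shape of this concrete, here is a minimal, hypothetical
stand-in for AtomicStampedReference (not the JDK source -- it uses an
AtomicReference instead of the Unsafe-based pair CAS, and all names
are mine) showing where the volatile read, the no-op "success" short
circuit, and the intrinsic CAS sit:

import java.util.concurrent.atomic.AtomicReference;

class StampedRef<V> {

    private static final class Pair<T> {
        final T reference;
        final int stamp;
        Pair(T reference, int stamp) {
            this.reference = reference;
            this.stamp = stamp;
        }
    }

    private final AtomicReference<Pair<V>> pair;

    StampedRef(V initialRef, int initialStamp) {
        pair = new AtomicReference<>(new Pair<>(initialRef, initialStamp));
    }

    boolean compareAndSet(V expectedRef, V newRef,
                          int expectedStamp, int newStamp) {
        Pair<V> current = pair.get();        // always a volatile read
        return expectedRef == current.reference
            && expectedStamp == current.stamp
            // No-op "success": nothing is written, so no volatile write.
            && ((newRef == current.reference && newStamp == current.stamp)
                // Only this branch performs the intrinsic CAS (and hence a
                // volatile write); it can also fail "spuriously" if another
                // thread swapped in an equal-valued Pair in the meantime.
                || pair.compareAndSet(current,
                                      new Pair<>(newRef, newStamp)));
    }
}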

Now, I think these semantics are entirely fine. They seem like the
most intuitive and implementable semantics for CAS in general. If
the intrinsic CAS weren't already spec'd to be strong for many years
I'd advocate for it to have these semantics as well. :) There's no
practical need for volatile write effects unless you've actually
altered the value, since otherwise no other threads are going to
notice you've done anything anyway. And the usual idiom is to call
CAS in a loop, so spurious failure is perfectly acceptable.
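
For reference, the idiom I mean is something like this sketch
(illustrative only, not tied to any particular class in the package):

import java.util.concurrent.atomic.AtomicInteger;

class CasLoopExample {

    static final AtomicInteger counter = new AtomicInteger();

    // The usual retry loop: a failed CAS, whether due to contention or to
    // spurious failure, simply causes another iteration.
    static int incrementViaCas() {
        for (;;) {
            int current = counter.get();
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }
}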

Therefore I certainly don't want to advocate for changing the
implementation in AtomicStampedReference, but some clarification of
the docs seems in order.

What would y'all think of changing the _package_ docs to describe
this looser kind of semantics, as a _minimum_ strength for _all_
compareAndSet implementations in the package? I believe it's strong
enough for any practical usage, so for most purposes a programmer
need look no further. In those cases, such as AtomicReference, where
compareAndSet is implemented with stronger semantics, the stronger
behavior can be documented on that class.

Cheers,
Justin


Re: Varieties of CAS semantics (another doc fix request)

Hans Boehm
I don't think it makes sense in the memory model to say something has volatile write semantics if nothing is actually written.  The spec could perhaps be clearer that it's not saying that.  But if it does, that's arguably a vacuous statement anyway.

It does seem to be (barely) observable whether a write of the value that was already there has volatile semantics or not:

(v volatile, everything initially zero)
Thread 1:
x = 1;
v = 0;

Thread 2:
v = 1;

Thread 3:
r1 = v;
r2 = x;

After all threads are joined:
r3 = v;

If r3 = 1, we know that v = 0 preceded v = 1 in the synchronization order.  Hence r1 = r3 = 1 must imply that v = 0 and v = 1 both synchronize with r1 = v and hence r2 = 1.  I think that would not be true if the v = 0 assignment were not volatile.

Thus it does seem to matter whether we atomically replace a value by itself or don't write at all.
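
Here is a rough, runnable rendering of that litmus test (class and
field names are mine; a single run of course demonstrates nothing --
it only shows the shape of the experiment):

class VolatileWriteLitmus {

    static volatile int v = 0;
    static int x = 0;            // deliberately non-volatile
    static int r1, r2, r3;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> { x = 1; v = 0; });  // stores the value v already holds
        Thread t2 = new Thread(() -> { v = 1; });
        Thread t3 = new Thread(() -> { r1 = v; r2 = x; });
        t1.start(); t2.start(); t3.start();
        t1.join(); t2.join(); t3.join();
        r3 = v;
        // Per the argument above: if r3 == 1 and r1 == 1, then r2 must be 1,
        // provided the v = 0 store really is a volatile write.
        System.out.println("r1=" + r1 + " r2=" + r2 + " r3=" + r3);
    }
}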

Hans


Re: Varieties of CAS semantics (another doc fix request)

Justin Sampson
Hans Boehm wrote:

> I don't think it makes sense in the memory model to say something
> has volatile write semantics if nothing is actually written. The
> spec could perhaps be clearer that it's not saying that. But if it
> does, that's arguably a vacuous statement anyway.

From java/util/concurrent/atomic/package-summary.html:

"compareAndSet and all other read-and-update operations such as
getAndIncrement have the memory effects of both reading and writing
volatile variables."

I don't know if this is actually true for the intrinsic CAS -- if
you think it's not, or shouldn't be, then it definitely needs to be
rewritten. :) But it's _definitely_ not true for the compareAndSet
on AtomicStampedReference, which only has the memory effects of
writing a volatile variable if it actually writes a different value
than was there before (and it can fail spuriously).

> It does seem to be (barely) observable whether a write of the
> value that was already there has volatile semantics or not:
>
> [...]
>
> Thus it does seem to matter whether we atomically replace a value
> by itself or don't write at all.

Right, the synchronizes-with edges come out differently. I don't
think anyone should be relying on the synchronizes-with edges coming
from a failed or no-op CAS for any practical purpose, but that's
currently how it's spec'd.

Out of curiosity, is a weakCompareAndSet at least considered a
synchronization action, part of the overall synchronization order,
such that it can't be reordered with any nearby volatile reads or
writes? Or is it truly free-for-all?

Thanks!
Justin


Re: Varieties of CAS semantics (another doc fix request)

oleksandr otenko
re: synchronizes-with in failing CAS.

The edge is there, provided the CAS doesn't fail before reading the
value. I don't know whether any hardware may at times fail a CAS
without reading the value. Note also that if the CAS fails, you won't
have anything that depends on the CAS result to synchronize with.

Alex


Re: Varieties of CAS semantics (another doc fix request)

Justin Sampson
Oleksandr Otenko wrote:

> re: synchronizes-with in failing CAS.
>
> The edge is there, if CAS doesn't fail before reading the value. I
> don't know if any hardware may at times fail CAS without reading
> the value.

That would be a spurious failure, wouldn't it? Doesn't the intrinsic
CAS implementation promise that it never fails spuriously?

The edge is definitely _not_ there in the case of failure or no-op
success in AtomicStampedReference's compareAndSet method, which is
the doc discrepancy I'm proposing to fix. (The edge actually _is_
there in the case of spurious failure, though, as long as it's there
for a non-spurious intrinsic CAS failure, due to the way the stamped
pair object is handled. But I wouldn't want to spec that.)

> Note also that if CAS fails, then you won't have anything that
> depends on CAS result to synchronize-with.

Thanks, I think that confirms my intuition. It doesn't seem like any
well-behaved code can actually benefit from such a synchronizes-with
edge.

Cheers,
Justin


Re: Varieties of CAS semantics (another doc fix request)

Hans Boehm
In reply to this post by Justin Sampson

On Thu, Jan 15, 2015 at 5:38 PM, Justin Sampson <[hidden email]> wrote:

>
> Hans Boehm wrote:
>
> > I don't think it makes sense in the memory model to say something
> > has volatile write semantics if nothing is actually written. The
> > spec could perhaps be clearer that it's not saying that. But if it
> > does, that's arguably a vacuous statement anyway.
>
> From java/util/concurrent/atomic/package-summary.html:
>
> "compareAndSet and all other read-and-update operations such as
> getAndIncrement have the memory effects of both reading and writing
> volatile variables."
>
> I don't know if this is actually true for the intrinsic CAS -- if
> you think it's not, or shouldn't be, then it definitely needs to be
> rewritten. :) But it's _definitely_ not true for the compareAndSet
> on AtomicStampedReference, which only has the memory effects of
> writing a volatile variable if it actually writes a different value
> than was there before (and it can fail spuriously).
It's not well-written.  But I'm not that concerned if it says it has "the memory effects of both reading and writing volatile variables", and the description then states that in some cases it doesn't write the underlying object at all, as I think the definition of CAS does.  Clearly the nonexistent write can't be volatile.  If it were to make a volatile write to some other unspecified location, that wouldn't be observable.  This seems like a minor editorial issue to me.

The fact that StampedReference seems to elide a "redundant" volatile write seems much more dubious, and I don't see why that's likely to be a successful optimization anyway.  It only seems to help if a CAS is used to conditionally replace a value by itself.  Does anyone understand why it's a good idea to "optimize" that case?

...

>
> Out of curiosity, is a weakCompareAndSet at least considered a
> synchronization action, part of the overall synchronization order,
> such that it can't be reordered with any nearby volatile reads or
> writes? Or is it truly free-for-all?

The current Java memory model doesn't cover either weakCompareAndSet() or lazySet().  They weren't around when the memory model work was done.  My assumption is that weakCompareAndSet is similar to compare_exchange_weak(..., memory_order_relaxed) in C++, but that leaves open some questions about cache coherence, I think.  My assumption is that they do not participate in a single total synchronization order.  (lazySet operations already can't.)

Hans



Re: Varieties of CAS semantics (another doc fix request)

Justin Sampson
Hans Boehm wrote:

> It's not well-written. But I'm not that concerned if it says it
> has "the memory effects of both reading and writing volatile
> variables", and the description then states that in some cases it
> doesn't write the underlying object at all, as I think the
> definition of CAS does. Clearly the nonexistent write can't be
> volatile. If it were to make a volatile write to some other
> unspecified location, that wouldn't be observable. This seems like
> a minor editorial issue to me.

Doesn't a hardware CAS typically involve pretty strong fences on
both sides? Even if the value isn't updated, it's still possible to
ensure that all writes prior to the CAS in program order will be
visible following all subsequent reads of the same atomic variable
by other threads, which is what I figured "the memory effects of
writing volatile variables" was referring to. I don't think it's
important that CAS behave that way, but at least it's a consistent
reading of the spec.

> The fact that StampedReference seems to elide a "redundant"
> volatile write seems much more dubious, and I don't see why that's
> likely to be a successful optimization anyway. It only seems to
> help if a CAS is used to conditionally replace a value by itself.
> Does anyone understand why it's a good idea to "optimize" that
> case?

For what it's worth, it does avoid an object allocation...

I may try my hand at a patch for the docs at some point. You and
Alex have at least confirmed my intuition about how things are
expected to work.

Thanks!
Justin


Re: Varieties of CAS semantics (another doc fix request)

Peter Levart
In reply to this post by Hans Boehm
On 01/17/2015 01:59 AM, Hans Boehm wrote:
> It's not well-written.  But I'm not that concerned if it says it has
> "the memory effects of both reading and writing volatile variables",
> and the description then states that in some cases it doesn't write
> the underlying object at all, as I think the definition of CAS does.
> Clearly the nonexistent write can't be volatile.  If it were to make
> a volatile write to some other unspecified location, that wouldn't be
> observable.  This seems like a minor editorial issue to me.


Perhaps the intention is to specify that the compiler must treat a CAS as a volatile read+write with respect to reordering of surrounding reads and writes, regardless of whether the CAS is successful or not (which the compiler can't know).

It's interesting that an unsuccessful CAS (at least on Java and an Intel i7 Sandy Bridge) takes exactly the same time to execute as a successful one. For example, the following JMH benchmark:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class AtomicBench {

    static final AtomicInteger i = new AtomicInteger();

    @Benchmark
    public int get() {
        return i.get();
    }

    @Benchmark
    public void set() {
        i.set(123);
    }

    @Benchmark
    public int setGet() {
        i.set(123);
        return i.get();
    }

    @Benchmark
    public boolean casSuccess() {
        // i stays 0 in this benchmark's fork, so the CAS always succeeds
        // (writing the value that is already there).
        return i.compareAndSet(0, 0);
    }

    @Benchmark
    public boolean casFail() {
        // The expected value never matches, so the CAS always fails.
        return i.compareAndSet(42, 43);
    }
}


Prints the following:

Benchmark                     Mode  Samples  Score   Error  Units
j.t.AtomicBench.casFail       avgt       10  7.246 ± 0.255  ns/op
j.t.AtomicBench.casSuccess    avgt       10  7.168 ± 0.283  ns/op
j.t.AtomicBench.get           avgt       10  2.128 ± 0.055  ns/op
j.t.AtomicBench.set           avgt       10  5.506 ± 0.203  ns/op
j.t.AtomicBench.setGet        avgt       10  9.370 ± 0.072  ns/op



So if an unsuccessful CAS is not a volatile write, why does it seem to take the same time as a successful one?

If the benchmark tells the truth, would pre-screening a CAS with a volatile read from the same location help if we anticipate failure frequently?
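
Concretely, the pre-screening would look roughly like this sketch
(illustrative only; the method name is made up):

import java.util.concurrent.atomic.AtomicInteger;

class PrescreenedCas {

    static final AtomicInteger i = new AtomicInteger();

    // "Test-and-test-and-set" style: do a cheap volatile read first and
    // only attempt the full CAS when it looks like it can succeed.
    static boolean prescreenedCompareAndSet(int expected, int update) {
        return i.get() == expected && i.compareAndSet(expected, update);
    }
}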


Regards, Peter



Re: Varieties of CAS semantics (another doc fix request)

David Holmes-6
In hotspot all the Atomic read-modify-write operations (which includes the cmpxchg underpinning these CAS operations) are required to have the semantics of:
 
fence(); <op>; membar storeload|storestore
 
so that the overall operation acts as a full bi-directional barrier regardless of the success or failure of the CAS. This comes from the fact that on x86 and SPARC the atomic primitives already imply full bi-directional barriers (so no explicit fence/membar is issued on those platforms). But we want to be able to reason about the code the same way on all platforms, so if the architecture provides weaker atomics, they must be supplemented to get the required barrier semantics.
 
David

Re: Varieties of CAS semantics (another doc fix request)

Hans Boehm

On Mon, Jan 19, 2015 at 2:37 PM, David Holmes <[hidden email]> wrote:
>
> In hotspot all the Atomic read-modify-write operations (which includes the cmpxchg underpinning these CAS operations) are required to have the semantics of:
>  
> fence(); <op>; membar storeload|storestore
>  
> so that the overall operation acts as a full bi-directional barrier regardless of the success or failure of the CAS. This comes from the fact that on x86 and SPARC the atomic primitives already imply full bi-directional barriers (so there are no explicit fence/membar issued on those platforms) but otherwise we want to be able to reason about the code the same on all platforms, so if the architecture provides weaker atomics then they must be supplemented to get the required barrier semantics.
>  
> David

That seems to me to be at odds with some of the original Java memory model goals, notably the idea that you should be able to remove any fences or atomicity overhead from synchronization objects accessed by only a local thread.  It also seems to significantly disadvantage ARMv8 for very minimal benefit by insisting that CAS operations can be abused as a full fence purely to order racing operations before and after the fence.  It adds overhead purely to support incorrectly synchronized/racy programs.  This is also a property that's currently quite difficult to define in the memory model.  In short, I think this is a mistake.

On ARMv8, the natural implementation should be a load-exclusive-acquire, followed by a store-exclusive-release, which does not order prior and subsequent non-volatile operations.  It should however suffice for a C++ sequentially consistent compare_exchange, since only racy programs or programs with relaxed atomics can tell the difference.

Hans


Re: Varieties of CAS semantics (another doc fix request)

David Holmes-6

Hi Hans,
 
I don't see how this is at odds with any JMM goals: if you remove the atomic op, you would also remove any additional barriers.
 
These implementations are for internal hotspot use as well as being the back end of the j.u.c.atomic operations. If this is a serious impediment on some platforms then a separate j.u.c.atomic back end could be defined - but then it would have to add back in the barriers related to the volatile read/write accesses implied - which I think would get you very close to where we are now. Your description for ARMv8 below is not providing the semantics of the Java-level volatile read+write AFAICS.
 
David

Re: Varieties of CAS semantics (another doc fix request)

Hans Boehm
I think the core question here is whether I can implement Dekker's example with something like the following in each thread, where x and y are not volatile:

Thread 1:
x = 1;
local1.CAS(...)
r1 = y;

Thread 2:
y = 1;
local2.CAS(....)
r1 = x;

i.e. if I can use CAS on a local as a fence replacement.

If this is intended to work, as implied by the posted description, then the implementation is not allowed to remove the fence(s) associated with e.g. local1.CAS, even if local1 is only used by thread1.  That would be quite unfortunate.
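
Spelled out as a complete (hypothetical, illustrative) Java program,
the question is whether the following can ever finish with r1 == 0 and
r2 == 0:

import java.util.concurrent.atomic.AtomicInteger;

class DekkerViaLocalCas {

    static int x, y;             // deliberately non-volatile
    static int r1, r2;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> {
            AtomicInteger local1 = new AtomicInteger();  // never escapes this thread
            x = 1;
            local1.compareAndSet(0, 1);   // used only for its (hoped-for) fencing effect
            r1 = y;
        });
        Thread t2 = new Thread(() -> {
            AtomicInteger local2 = new AtomicInteger();
            y = 1;
            local2.compareAndSet(0, 1);
            r2 = x;
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        // If the CASes act as full fences, r1 == 0 && r2 == 0 is impossible;
        // if the fences may be elided for thread-confined atomics, it isn't.
        System.out.println("r1=" + r1 + " r2=" + r2);
    }
}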

The canonical ARMv8 CAS implementations should guarantee the volatile load semantics for the CAS, and volatile store semantics for a successful CAS (which I think is the only reasonable way to read the current j.u.c. spec).  But I believe it is not usable as a fence, as in this example. 

Hans


Re: Varieties of CAS semantics (another doc fix request)

David Holmes-6

Sorry Hans, I've lost the context here - what "posted description" are you referring to? What I was describing were internal hotspot details, not anything that can be relied upon at the Java level.
 
David

Re: Varieties of CAS semantics (another doc fix request)

Peter Levart
In reply to this post by Hans Boehm
On 01/20/2015 06:41 AM, Hans Boehm wrote:
> I think the core question here is whether I can implement Dekker's
> example with something like the following in each thread, where x
> and y are not volatile:
>
> Thread 1:
> x = 1;
> local1.CAS(...)
> r1 = y;
>
> Thread 2:
> y = 1;
> local2.CAS(....)
> r1 = x;
>
> i.e. if I can use CAS on a local as a fence replacement.
>
> If this is intended to work, as implied by the posted description,
> then the implementation is not allowed to remove the fence(s)
> associated with e.g. local1.CAS, even if local1 is only used by
> thread1.  That would be quite unfortunate.

Right. So we can view Java CAS as equivalent to:

synchronized(lock) {
    if (value == expected) {
        value = newValue;
        return true;
    } else {
        return false;
    }
}

...where 'lock' is an Object uniquely associated with the memory location being CASed, meaning that synchronized(lock) {} can be elided where appropriate (for example, if the compiler can prove that the memory location can only be accessed by a single thread).
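
A self-contained rendering of that equivalence, for reference (class
and field names are mine):

class LockBasedCasInt {

    private final Object lock = new Object();  // uniquely associated with 'value'
    private int value;

    boolean compareAndSet(int expected, int newValue) {
        synchronized (lock) {
            if (value == expected) {
                value = newValue;
                return true;
            }
            return false;
        }
    }
}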

Isn't this actually how 64-bit CAS is (or was?) implemented on some architectures?

Peter



Re: Varieties of CAS semantics (another doc fix request)

David Holmes-6
There is provision for 64-bit CAS to be implemented using locks on 32-bit platforms that don't support a 64-bit CAS, e.g. PPC32.
 
David

Re: Varieties of CAS semantics (another doc fix request)

Andrew Haley
In reply to this post by David Holmes-6
On 20/01/15 05:21, David Holmes wrote:

> Your description for ARMv8 below is not providing the semantics of
> the Java-level volatile read+write AFAICS.

Yes it does: ARMv8 load-exclusive-acquire and store-exclusive-release
are designed to provide the semantics of the Java-level volatile
read+write.  At the present time we have to use barriers that are
stronger than necessary because of the way that HotSpot works
internally, but I do intend to try to fix that.

Andrew.

Re: Varieties of CAS semantics (another doc fix request)

Vitaly Davidovich

My suspicion is that most people using CAS expect that other non-volatile read/writes aren't reordered with the CAS in the same way that a volatile read+write wouldn't permit that.  Hans was saying that on ARMv8 you'd want to use instructions that don't preclude other non-volatile accesses from being reordered, if I'm not mistaken.  I'm not seeing how that's equivalent?


Re: Varieties of CAS semantics (another doc fix request)

Andrew Haley
On 01/20/2015 02:24 PM, Vitaly Davidovich wrote:
> My suspicion is that most people using CAS expect that other non-volatile
> read/writes aren't reordered with the CAS in the same way that a volatile
> read+write wouldn't permit that.

I agree.

> Hans was saying that on ARMv8 you'd want to use instructions that
> don't preclude other non-volatile accesses from being reordered, if
> I'm not mistaken.  I'm not seeing how that's equivalent?

David said:

> [Hans's] description for ARMv8 below is not providing the semantics
> of the Java-level volatile read+write AFAICS.

I think that's wrong.  The ARMv8 code would provide the semantics of
the Java-level volatile read+write.

There is no doubt in my mind that ARMv8 load-exclusive-acquire and
store-exclusive-release implement semantics which are compatible with
Java's volatile.  However, the CAS is not atomic because neither
load-exclusive-acquire nor store-exclusive-release prevents

  store r2 -> x
  load-exclusive-acquire(r2) -> r3
  r4 -> store-exclusive-release(r2)
  load y -> r5

from being turned into

  load-exclusive-acquire(r2) -> r3
  load y -> r5
  store r2 -> x
  r4 -> store-exclusive-release(r2)

Andrew.

Re: Varieties of CAS semantics (another doc fix request)

Vitaly Davidovich
On Tue, Jan 20, 2015 at 10:06 AM, Andrew Haley <[hidden email]> wrote:
> I think that's wrong.  The ARMv8 code would provide the semantics of
> the Java-level volatile read+write.
>
> There is no doubt in my mind that ARMv8 load-exclusive-acquire and
> store-exclusive-release implement semantics which are compatible with
> Java's volatile.  However, the CAS is not atomic because neither
> load-exclusive-acquire nor store-exclusive-release prevents
>
>   store r2 -> x
>   load-exclusive-acquire(r2) -> r3
>   r4 -> store-exclusive-release(r2)
>   load y -> r5
>
> from being turned into
>
>   load-exclusive-acquire(r2) -> r3
>   load y -> r5
>   store r2 -> x
>   r4 -> store-exclusive-release(r2)

Ok, thanks for clarifying -- yeah, agree on that.  I guess my question is: when the docs talk about CAS having the effects of a volatile read+write, does that also imply atomicity or no? i.e. is it like a volatile read + write as two back to back instructions (as if written in source like that), or is it 1 instruction that has those reordering effects with surrounding code?

On Tue, Jan 20, 2015 at 10:06 AM, Andrew Haley <[hidden email]> wrote:
On 01/20/2015 02:24 PM, Vitaly Davidovich wrote:
> My suspicion is that most people using CAS expect that other non-volatile
> read/writes aren't reordered with the CAS in the same way that a volatile
> read+write wouldn't permit that.

I agree.

> Hans was saying that on ARMv8 you'd want to use instructions that
> don't preclude other non-volatile accesses from being reordered, if
> I'm not mistaken.  I'm not seeing how that's equivalent?

David said:

> [Hans's] description for ARMv8 below is not providing the semantics
> of the Java-level volatile read+write AFAICS.

I think that's wrong.  the ARMv8 code would provide the semantics of
the Java-level volatile read+write.

There is no doubt in my mind that ARMv8 load-exclusive-acquire and
store-exclusive-release implement semantics which are compatible with
Java's volatile.  However, the CAS is not atomic because neither
load-exclusive-acquire and store-exclusive-release prevent

  store r2 -> x
  load-exclusive-acquire(r2) -> r3
  r4 -> store-exclusive-release(r2)
  load y -> r5

from being turned into

  load-exclusive-acquire(r2) -> r3
  load y -> r5
  store r2 -> x
  r4 -> store-exclusive-release(r2)

Andrew.


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Varieties of CAS semantics (another doc fix request)

Andrew Haley
On 01/20/2015 03:26 PM, Vitaly Davidovich wrote:
> I guess my question is:
> when the docs talk about CAS having the effects of a volatile
> read+write, does that also imply atomicity or no? i.e. is it like a
> volatile read + write as two back to back instructions (as if
> written in source like that), or is it 1 instruction that has those
> reordering effects with surrounding code.

I would have thought that when the docs talk about CAS having the
effects of a volatile read+write, that's what the docs mean: if they'd
wanted to imply the effect of an atomic instruction with a full fence
it would have been easy enough to say so.

Andrew.