LoadStore and StoreStore: are both required before lazySet and volatile write?

vikas

I was reading an article by Hans Boehm in which he argues that a LoadStore barrier is also needed before a lazySet or a final field write (and, I assume, before a volatile write as well).

He demonstrates a particular race condition that I couldn't understand (link below).

http://www.hboehm.info/c++mm/no_write_fences.html (see the section "Recipient writes to object").

It's very counterintuitive how the store of x.a = 43 can happen past the StoreStore barrier in thread 1. Thread 1 initializes x with:

x.a = 0; x.a++;
x_init.store_write_release(true);

and the code that uses x in thread 2 updates it, with e.g.

if (x_init.load_acquire())
    x.a = 42;



There is a similar argument here: http://shipilev.net/blog/2014/all-fields-are-final/

Quoting Shipilev:

"JSR 133 Cookbook only requires StoreStore, but might also require LoadStore barriers. This covers for a corner case when the final field is getting initialized off some other field which experiences a racy update. This corner case can be enabled by runtime optimization which figures out the final store is not needed, puts the value in the local variable, and hence breaks out of ordering guarantees of StoreStore alone."

How can the runtime figure out that the final store is not needed? Also, if the load is reordered past the StoreStore barrier, then the store to the local variable is reordered past the StoreStore barrier as well. This is the part I don't quite understand: why can a store to a local variable be reordered with a StoreStore barrier?

It would be very helpful if anybody could explain in more detail, with a simple example, the race condition they are both describing.
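As a rough aid to the question above, the four barrier kinds named in the JSR 133 Cookbook can be read as a table of which (earlier access, later access) pair each one keeps in order. The sketch below is only a toy model written in Python, not a description of how any JVM or CPU behaves, and the names `ORDERED_PAIR` and `may_reorder` are made up for illustration. It shows that StoreStore alone says nothing about an earlier load versus a later store, which is the pair that LoadStore covers.

```python
# Toy model (an assumption for illustration): each barrier kind orders exactly
# one (earlier access, later access) pair across the barrier.
ORDERED_PAIR = {
    "LoadLoad": ("load", "load"),
    "LoadStore": ("load", "store"),
    "StoreStore": ("store", "store"),
    "StoreLoad": ("store", "load"),
}


def may_reorder(barriers, earlier, later):
    """Return True if none of the given barriers orders `earlier` before `later`."""
    return all(ORDERED_PAIR[b] != (earlier, later) for b in barriers)


if __name__ == "__main__":
    # An earlier store versus the releasing store: pinned by StoreStore.
    print(may_reorder({"StoreStore"}, "store", "store"))
    # An earlier load versus the releasing store: not covered by StoreStore.
    print(may_reorder({"StoreStore"}, "load", "store"))
    # Adding LoadStore pins the load as well.
    print(may_reorder({"StoreStore", "LoadStore"}, "load", "store"))
```

In this reading, the load that is part of x.a++ is the "earlier load" and the write to x_init is the "later store", so a StoreStore-only release leaves that pair unconstrained.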
Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

Vitaly Davidovich
For Hans' example, he's basically saying this:
x.a = 0;
tmp = x.a; << read of x.a itself
tmp = tmp + 1;
x.a = tmp;
x_init.store_write_release(true);

The argument is that if only stores are ordered, and loads can be reordered with stores, then the read of x.a could be reordered with the releasing store, i.e.:
x.a = 0;
x_init.store_write_release(true);
tmp = x.a; << this load is reordered with the store and may now see x.a = 42 if thread 2 ran
tmp = tmp + 1;
x.a = tmp; << results in x.a = 43

As Hans mentions, this is somewhat esoteric and strange.  It would require something (e.g. the CPU or the compiler) to reorder dependent memory operations; among CPUs, the only one I know of that allows this is the Alpha.  But he has a separate argument about not relying on memory dependence to enforce any ordering ...
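To make the two listings above concrete, here is a small Python sketch that simply replays them against thread 2's `if (x_init.load_acquire()) x.a = 42;` under every possible interleaving and collects the final values of x.a. It is a sequential replay of the orderings as written in this message, not a model of any real hardware or compiler, and every function and variable name in it is hypothetical.

```python
from itertools import combinations

# Thread 1 steps, operating on a shared state dictionary.
def store_zero(s):        s["x.a"] = 0
def load_tmp(s):          s["tmp"] = s["x.a"]
def store_incremented(s): s["x.a"] = s["tmp"] + 1
def release_flag(s):      s["x_init"] = True

# Thread 2 steps: read the flag once, then store 42 only if it was set.
def t2_load_flag(s):      s["seen"] = s["x_init"]
def t2_store_if_seen(s):
    if s["seen"]:
        s["x.a"] = 42

PROGRAM_ORDER = [store_zero, load_tmp, store_incremented, release_flag]
REORDERED = [store_zero, release_flag, load_tmp, store_incremented]
THREAD2 = [t2_load_flag, t2_store_if_seen]


def interleavings(n1, n2):
    """Yield every merge of n1 thread-1 steps with n2 thread-2 steps."""
    for slots in combinations(range(n1 + n2), n1):
        chosen = set(slots)
        yield [1 if i in chosen else 2 for i in range(n1 + n2)]


def outcomes(thread1):
    """Return the set of final x.a values over all interleavings."""
    results = set()
    for schedule in interleavings(len(thread1), len(THREAD2)):
        state = {"x.a": None, "x_init": False, "tmp": None, "seen": False}
        steps = {1: iter(thread1), 2: iter(THREAD2)}
        for who in schedule:
            next(steps[who])(state)
        results.add(state["x.a"])
    return results


if __name__ == "__main__":
    print(sorted(outcomes(PROGRAM_ORDER)))
    print(sorted(outcomes(REORDERED)))
```

Only the second ordering, where the flag is published before the load of x.a, lets thread 2's 42 land between the publish and the load, which is the only way 43 can appear.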






On Mon, Sep 22, 2014 at 3:14 PM, vikas <[hidden email]> wrote:

--
View this message in context: http://jsr166-concurrency.10961.n7.nabble.com/LoadStore-and-StoreStore-are-both-are-required-before-lazySet-and-volatile-write-tp11291.html
Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest


Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

Vitaly Davidovich
In reply to this post by vikas

As for Aleksey's point, I don't quite follow why initializing a final field off some racy field *requires* a LoadStore.  Eliminating the StoreStore may happen, I guess, if something like escape analysis removes the allocation of an object with a final field and scalar-replaces it.  But again, I'm unsure why another racy memory location requires a LoadStore (unless this is talking about the same thing as Hans in different terms).

Sent from my phone

On Sep 22, 2014 3:46 PM, "vikas" <[hidden email]> wrote:


Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

oleksandr otenko
In reply to this post by Vitaly Davidovich
Is this sufficient?

Without LoadStore in the second thread, Thread 2 is allowed to observe x after writing x.a=42. I am not sure what that means to Thread 1.

Alex

On 22/09/2014 21:31, Vitaly Davidovich wrote:

Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

Vitaly Davidovich
That would be somewhat hard to believe, as it would mean that something (the compiler or the CPU) speculated that x_init.load_acquire() returns true but wrote to memory before confirming it. A CPU can speculate like this, I suppose, but it wouldn't be able to actually retire the store of 42 until the speculation is proven correct, at which point the store is committed (but then this doesn't change observed behavior).  Or did you mean something else?
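The point about retiring the store only after the speculation is confirmed can be sketched with a toy model. The Python below is an illustration under simplifying assumptions (one pending-store list, one branch, no other threads observing mid-flight), and `run_thread2` and its parameters are hypothetical names, not any real CPU interface. It checks that the visible result is the same as if no speculation had happened.

```python
def run_thread2(flag_value, predicted):
    """Toy model of `if (x_init.load_acquire()) x.a = 42;` with branch speculation.

    flag_value: what the load of x_init actually returns.
    predicted:  what the core guessed the branch outcome would be.
    Returns the value of x.a visible in memory afterwards.
    """
    memory = {"x.a": 0}
    pending = []  # speculative stores: executed, but not yet visible to others

    if predicted:
        pending.append(("x.a", 42))  # store executed speculatively

    if flag_value == predicted:
        # Speculation confirmed: pending stores retire and become visible.
        for address, value in pending:
            memory[address] = value
    else:
        # Misprediction: squash speculative work, then run the correct path.
        pending.clear()
        if flag_value:
            memory["x.a"] = 42

    return memory["x.a"]


if __name__ == "__main__":
    for flag_value in (False, True):
        for predicted in (False, True):
            print(flag_value, predicted, run_thread2(flag_value, predicted))
```

In this model the store of 42 never reaches memory unless the flag really was true, which matches the claim that such speculation does not change observed behavior.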

On Tue, Sep 23, 2014 at 11:54 AM, Oleksandr Otenko <[hidden email]> wrote:

Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

oleksandr otenko
When reordering of dependencies was discussed, it was made clear that Thread 2 may observe x==null and the store of x.a=42 may still succeed. Depending on the magic involved, Thread 1 may observe x.a==42. So my question is mostly about what magic is feasible with modern-day technology.


Alex


On 23/09/2014 17:04, Vitaly Davidovich wrote:

Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

Vitaly Davidovich

As I understood it, Thread 1 can see 42 before the increment only because of reordering of its own operations, allowing x_init to be set earlier and giving Thread 2 a chance to run before the increment in Thread 1.  I'm not sure what x == null has to do with it.

Sent from my phone

On Sep 23, 2014 12:13 PM, "Oleksandr Otenko" <[hidden email]> wrote:

Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

Andrew Haley
In reply to this post by Vitaly Davidovich
The key phrase is "it doesn't make sense to enforce ordering based on
dependencies, since they cannot be reliably defined *at this level*."

In other words, once you start translating the example into machine
instructions you can nail it down in ways that aren't imposed by the
memory model.  As I understand it, there is nothing in a memory model
to prevent stores in Thread 2 after the load_acquire from floating up
past the x_init.store_write_release in Thread 1.

Andrew.


On 09/23/2014 05:04 PM, Vitaly Davidovich wrote:


Re: LoadStore and StoreStore: are both required before lazySet and volatile write?

oleksandr otenko
In reply to this post by Vitaly Davidovich
x==null is evidence that Thread 2 observed the operations in the opposite order (opposite to "program order"). Then the question is whether Thread 1 can see them in the opposite order too.

Alex

On 23/09/2014 17:35, Vitaly Davidovich wrote:

As I understood it, Thread 1 can see 42 before the increment only because of reordering of its own operations, allowing x_init to be set earlier and giving Thread 2 a chance to run before the increment in Thread 1.  I'm not sure what x == null has to do with it.

Sent from my phone

On Sep 23, 2014 12:13 PM, "Oleksandr Otenko" <[hidden email]> wrote:
When discussing reordering of the dependencies, it was made clear that Thread 2 may observe x==null, and still the store of x.a=42 may succeed. Depending on the magic involved, Thread 1 may observe x.a==42. So my question is mostly about what magic is feasible in modern day technology.


Alex


On 23/09/2014 17:04, Vitaly Davidovich wrote:
That would be somewhat hard to believe as this means something (compiler or cpu) speculated about x_init.load_acquire() returning true but wrote to memory beforehand? A cpu can speculate like this, I suppose, but it wouldn't be able to actually retire the store of 42 until the speculation is proven correct, at which point the store is committed (but then this doesn't change observed behavior).  Or did you mean something else?

On Tue, Sep 23, 2014 at 11:54 AM, Oleksandr Otenko <[hidden email]> wrote:
Is this sufficient?

Without LoadStore in the second thread, Thread 2 is allowed to observe x after writing x.a=42. I am not sure what that means to Thread 1.

Alex


On 22/09/2014 21:31, Vitaly Davidovich wrote:
For Hans' example, he's basically saying this:
x.a = 0;
tmp = x.a; // the read of x.a inside x.a++
tmp = tmp + 1;
x.a = tmp;
x_init.store_write_release(true);

The argument is that if only stores are ordered, and loads can be reordered with stores, then the read of x.a could be reordered with the releasing store, i.e.:
x.a = 0;
x_init.store_write_release(true);
tmp = x.a; // this load was reordered with the store and may now see x.a == 42 if thread 2 ran
tmp = tmp + 1;
x.a = tmp; // results in x.a == 43

As Hans mentions, this is somewhat esoteric and strange.  It would require something (e.g. the CPU or the compiler) to reorder dependent memory operations; among CPUs, the only one I know of that allows this is the Alpha.  But he has a separate argument about not relying on memory dependence to enforce any ordering ...
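
The two listings above can be replayed deterministically. The following is only a sketch: a single-threaded replay in Java in which the class `ReorderReplay`, the `Op` names and the `replay`/`outcomes` helpers are all made up for illustration. It does not model barriers or any real hardware, and it makes no claim about which CPU or compiler could produce the second order; it only shows which final values of x.a each listing yields, depending on where thread 2 happens to run.

```java
import java.util.Set;
import java.util.TreeSet;

final class ReorderReplay {
    // Thread 1's four memory operations, named so they can be listed in any order.
    enum Op { STORE_ZERO, LOAD_A, STORE_INC, RELEASE_FLAG }

    // The order as written in the source ("program order").
    static final Op[] PROGRAM_ORDER =
        { Op.STORE_ZERO, Op.LOAD_A, Op.STORE_INC, Op.RELEASE_FLAG };

    // The hypothetical order from the second listing: the flag store has moved
    // ahead of the load and of the store that depends on it.
    static final Op[] REORDERED =
        { Op.STORE_ZERO, Op.RELEASE_FLAG, Op.LOAD_A, Op.STORE_INC };

    /**
     * Runs thread 1's operations in the given order and lets all of thread 2
     * run immediately after the operation at index t2After.
     * Returns the final value of x.a.
     */
    static int replay(Op[] order, int t2After) {
        int a = -1;            // x.a (placeholder value before initialization)
        boolean init = false;  // x_init
        int tmp = 0;           // thread 1's temporary
        for (int i = 0; i < order.length; i++) {
            switch (order[i]) {
                case STORE_ZERO:   a = 0;        break;
                case LOAD_A:       tmp = a;      break;
                case STORE_INC:    a = tmp + 1;  break;
                case RELEASE_FLAG: init = true;  break;
            }
            if (i == t2After && init) {
                a = 42;  // thread 2: if (x_init.load_acquire()) x.a = 42;
            }
        }
        return a;
    }

    // Collects the final values of x.a over every point where thread 2 can run.
    static Set<Integer> outcomes(Op[] order) {
        Set<Integer> seen = new TreeSet<>();
        for (int i = 0; i < order.length; i++) {
            seen.add(replay(order, i));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("program order: " + outcomes(PROGRAM_ORDER));
        System.out.println("reordered:     " + outcomes(REORDERED));
    }
}
```

In program order the only final values are 1 (thread 2 ran too early to see the flag) and 42. In the reordered listing, running thread 2 right after the flag store adds 43, which is the surprising result the article describes.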






--
View this message in context: http://jsr166-concurrency.10961.n7.nabble.com/LoadStore-and-StoreStore-are-both-are-required-before-lazySet-and-volatile-write-tp11291.html
Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest




Re: LoadStore and StoreStore , are both are required before lazySet and volatile write

Vitaly Davidovich
In reply to this post by Andrew Haley
> As I understand it, there is nothing in a memory model
> to prevent stores in Thread 2 after the load_acquire from floating up
> past the x_init.store_write_release in Thread 1.

Maybe nothing in the memory model, but wouldn't this violate the control flow dependency? In order for thread 2 to execute the store, it has to observe x_init being set to true; there's an explicit control dependency there.  If thread 1 wrote that with a releasing store, then its preceding stores are also visible at this point, so I don't see how thread 2's store can float above thread 1's releasing store. Thread 1's releasing store can, however, get reordered within its own operations (like the x.a++ example), causing thread 2 to perform the write at an "earlier" point in thread 1's program order (is that what you meant?).

On Tue, Sep 23, 2014 at 1:21 PM, Andrew Haley <[hidden email]> wrote:
The key phrase is "it doesn't make sense to enforce ordering based on
dependencies, since they cannot be reliably defined *at this level*."

In other words, once you start translating the example into machine
instructions you can nail it down in ways that aren't imposed by the
memory model.  As I understand it, there is nothing in a memory model
to prevent stores in Thread 2 after the load_acquire from floating up
past the x_init.store_write_release in Thread 1.

Andrew.


On 09/23/2014 05:04 PM, Vitaly Davidovich wrote:
> That would be somewhat hard to believe as this means something (compiler or
> cpu) speculated about x_init.load_acquire() returning true but wrote to
> memory beforehand? A cpu can speculate like this, I suppose, but it
> wouldn't be able to actually retire the store of 42 until the speculation
> is proven correct, at which point the store is committed (but then this
> doesn't change observed behavior).  Or did you mean something else?
>
> On Tue, Sep 23, 2014 at 11:54 AM, Oleksandr Otenko <
> [hidden email]> wrote:
>
>>  Is this sufficient?
>>
>> Without LoadStore in the second thread, Thread 2 is allowed to observe x
>> after writing x.a=42. I am not sure what that means to Thread 1.
>>
>> Alex
>>
>>
>> On 22/09/2014 21:31, Vitaly Davidovich wrote:
>>
>> For Hans' example, he's basically saying this:
>> x.a = 0;
>> tmp = x.a; << read of x itself
>> tmp = tmp + 1;
>> x.a = tmp;
>> x.init_store_write_release(true);
>>
>>  The argument is that if only stores are ordered and loads can be
>> reordered with the stores, then the read of x.a could be reordered with the
>> releasing store, i.e.:
>>  x.a = 0;
>> x.init_store_write_release(true);
>>  tmp = x.a; << this load re-ordered with the store and may now see x.a =
>> 42 if thread 2 ran
>> tmp = tmp + 1;
>> x.a = tmp; << results in x.a = 43
>>
>>  As Hans mentions, this is somewhat esoteric and strange.  It would
>> require something (e.g. cpu, compiler) determining/allowing reordering
>> dependent memory operations; in terms of cpu, only one I know that allows
>> this is the Alpha.  But he has a separate argument about not relying on
>> memory dependence to enforce any ordering ...
>>
>>
>>
>>
>>
>>
>> On Mon, Sep 22, 2014 at 3:14 PM, vikas <[hidden email]> wrote:
>>
>>>
>>> I was reading a article from Hans and he argues that LoadStore is also
>>> needed before lazySet or a final variable write (i assume then also before
>>> volatile write).
>>>
>>> He demonstrated a particular race condition which i couldn't understand
>>> (link below).
>>>
>>> http://www.hboehm.info/c++mm/no_write_fences.html see *Recipient writes
>>> to
>>> object*.
>>>
>>> Its very counter intuitive how  * store of x.a = 43* can be done past the
>>> StoreStore barrier in thread 1.
>>> *
>>> x.a = 0; x.a++;
>>> x_init.store_write_release(true);
>>>
>>> and the code that uses x in thread 2 updates it, with e.g.
>>>
>>> if (x_init.load_acquire())
>>>     x.a = 42;*
>>>
>>>
>>> Similar argument here http://shipilev.net/blog/2014/all-fields-are-final/
>>>
>>> Copying Shiplev here :
>>>
>>> /JSR 133 Cookbook only requires StoreStore, but might also require
>>> LoadStore
>>> barriers. This covers for a corner case when the final field is getting
>>> initialized off some other field which experiences a racy update. *This
>>> corner case can be enabled by runtime optimization which figures out the
>>> final store is not needed, puts the value in the local variable, and hence
>>> breaks out of ordering guarantees of StoreStore alone*/
>>>
>>> How runtime can figure out that final Store is not needed, also if load is
>>> getting passed/reorder the StoreStore Barrier then Store to local Variable
>>> also is getting passed/reorder with the storeStore barrier, *This is the
>>> part i don't quite understand , why store to local variable can reorder
>>> with
>>> StoreStore Barrier. *
>>>
>>> It would be very help full if anybody can explain in more details what is
>>> the race condition they both are mentioning by some simple example.
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://jsr166-concurrency.10961.n7.nabble.com/LoadStore-and-StoreStore-are-both-are-required-before-lazySet-and-volatile-write-tp11291.html
>>> Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> [hidden email]
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>
>>
>>
>> _______________________________________________
>> Concurrency-interest mailing [hidden email]://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>>
>>
>
>
>
> _______________________________________________
> Concurrency-interest mailing list
> [hidden email]
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>



_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: LoadStore and StoreStore , are both are required before lazySet and volatile write

Andrew Haley
On 09/23/2014 06:38 PM, Vitaly Davidovich wrote:

>>
>> As I understand it, there is nothing in a memory model
>> to prevent stores in Thread 2 after the load_acquire from floating up
>> past the x_init.store_write_release in Thread 1.
>
>
> Maybe nothing in memory model, but this would violate control flow
> dependency though, wouldn't it? In order for thread 2 to execute the store,
> it has to observe x_init being set to true -- there's explicit control
> dependency there.

That's right.  However, he goes on to say

"it doesn't make sense to enforce ordering based on dependencies"
http://www.hboehm.info/c++mm/dependencies.html

I think he's saying that while it is easy enough to define what
"dependence" means at the hardware level, it's much harder to define
in the context of a memory model for a high-level language.

But, to be honest, I am not aware of any compiler transformation that
would make the effect he's describing appear on any hardware I know
of.

> If thread 1 wrote that with a releasing store, then its
> preceding stores are also visible at this point, so I don't see how thread
> 2's store can float above thread 1's releasing store; thread 1's releasing
> store can get reordered within its own operations (like the x.a++ example),
> causing thread 2 to perform the write at an "earlier" point in thread 1's
> program order (is that what you meant?).

I am sorry, I was careless.  AIUI, it's the load in x.a++ which moves
after the assignment to x_init.

Thread 1:

x.a = 0; x.a++;
x_init.store_write_release(true);

Thread 2:

if (x_init.load_acquire())
    x.a = 42;
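
For readers who want to try the two-thread example on a JVM, here is a minimal sketch. Everything in it is an assumption made for illustration: the class `LazySetPublication`, the holder class `X`, and the choice of `AtomicBoolean.lazySet` for the releasing store with a volatile `get()` for the acquiring load. Since Java 9, `lazySet` is documented as having the memory effects of `VarHandle.setRelease`, so under that specification the final value of x.a should be either 1 (thread 2 did not see the flag) or 42.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class LazySetPublication {
    // Stand-in for the object x from the example; `a` is a plain field.
    static final class X {
        int a;
    }

    /** Runs both threads once and returns the final value of x.a. */
    static int runOnce() {
        final X x = new X();
        final AtomicBoolean xInit = new AtomicBoolean(false);

        Thread t1 = new Thread(() -> {
            x.a = 0;
            x.a++;
            xInit.lazySet(true);   // plays the role of x_init.store_write_release(true)
        });
        Thread t2 = new Thread(() -> {
            if (xInit.get()) {     // volatile read, standing in for load_acquire()
                x.a = 42;
            }
        });

        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return x.a;  // join() makes both threads' writes visible here
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            int result = runOnce();
            if (result != 1 && result != 42) {
                System.out.println("unexpected x.a = " + result);
            }
        }
        System.out.println("done");
    }
}
```

Never seeing 43 in such a run proves nothing about what a StoreStore-only barrier would allow; the sketch only pins down the behavior the release/acquire pairing is meant to give. For comparison, `VarHandle.releaseFence()` is documented as ordering earlier loads and stores before later stores, whereas `VarHandle.storeStoreFence()` orders only earlier stores before later stores.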

Andrew.

Re: LoadStore and StoreStore , are both are required before lazySet and volatile write

Vitaly Davidovich

Ah yes, he does mention control dependency as well (I thought he was talking about data only).  This whole thing is nasty ...

Yeah, OK, I got the part about the load in x.a++.

Sent from my phone


Re: LoadStore and StoreStore , are both are required before lazySet and volatile write

Aleksey Shipilev-2
In reply to this post by Vitaly Davidovich
On 09/23/2014 12:50 AM, Vitaly Davidovich wrote:
> As for Aleksey's point, I don't quite follow why initializing a final
> field off some racy field *requires* a LoadStore.  Eliminating
> StoreStore may happen, I guess, if something like escape analysis
> removes the allocation of an object with final field and scalar replaces
> it.  But again, unsure why another racy memory location requires
> LoadStore (unless this is talking about same thing as Hans in different
> terms).

Ah yes, I do indeed reference Hans' example there: allowing loads to be
satisfied after the publication store may let us "observe" the racy
update before the object is published. As Hans warns, trying to
interpret this at the transformation level is confusing, and I should
instead have described the high-level behavior. Fixed in my text now.

Also, Hans moved from HP to Google, and that means all my HP links are
broken, argh. Fixed that as well.

-Aleksey.

