About putOrdered and its meaning

Re: About putOrdered and its meaning

Gregg Wonderly-3
It's really interesting to see that the early work on Dataflow architectures, some 30 years ago, is starting to appear in conversations like this. Relating the computational paths of individual values is convenient for our brains, but convoluted for hardware, which has become focused on parallelization.

I am still firmly committed to totally ordered execution for languages which depict such relationships with code structure.

But when I was working with some of the AT&T EMSP investigators in college in the '80s, I was hoping that we would eventually get to nothing but Dataflow-based programming, where complex objects were considered single values with hardware support for them, instead of hardware support for only words or pages.

Right now the hardware and the software systems are struggling for a unifying concept which reaches beyond the current "memory" concepts expressed in each. We struggle with very different viewpoints on what exactly a value is and exactly how to manage the view that the software needs vs the view that the hardware provides.

Instruction-level management of details which are actually storage attributes (volatile, shared, etc.) grates pretty hard against the software concepts.



On May 4, 2016, at 4:52 PM, Hans Boehm <[hidden email]> wrote:

The answers are pretty consistent across Java, C, and C++ (and one or two others, notably OpenCL). An acquire load guarantees that all memory effects preceding the corresponding release store are visible (and none of the memory effects following the acquire load are visible before the release store). That's essentially all it guarantees. In my opinion, it's usually best not to think in terms of fences, though fence-based thinking sometimes exposes some useful rough intuitions.

There may be subtle differences/uncertainties as to what happens when an acquire load actually sees the results of a later (in coherence order) store that is not itself ordered. The C++ rules (see "release sequence") predate a modern hardware understanding, and the Java memory model description probably isn't as general as it now needs to be. But these are relatively esoteric issues that typically don't matter.

On Wed, May 4, 2016 at 2:27 PM, Peter Levart <[hidden email]> wrote:

On 05/04/2016 11:20 PM, Peter Levart wrote:

On 05/04/2016 07:35 PM, Hans Boehm wrote:

On Wed, May 4, 2016 at 9:02 AM, Andrew Haley <[hidden email]> wrote:
On 05/04/2016 03:40 PM, thurstonn wrote:
> I realize that I'm assuming that the barriers are emitted *after* the
> respective memory actions, so above code becomes:
>  A global;
> -----------------------------------------------------------------------
>     A a = <alloc>;                  |  A a = global;
>     a.x = 1                         |  LoadLoad()
>     StoreStore()                    |  r1 = a.x;
>     global = a;                     |
> Maybe that assumption is wrong?

StoreRelease is (LoadStore|StoreStore ; store)
LoadAcquire is (load ; LoadStore|LoadLoad)

But only when that abstraction works :-)

x = 1;
y =release 1;
z = 1;

does not order the stores to x and z.  (Neither in theory nor in practice.)
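
As a rough illustration of the fence decomposition quoted above, here is a minimal Java sketch. It assumes Java 9+ and the static fences VarHandle.releaseFence() and VarHandle.acquireFence(); the class FenceSketch and its writer/reader methods are hypothetical names, not anything from this thread. As noted above, the fence view is only an intuition that works in simple cases, not the definition of release/acquire.

```java
import java.lang.invoke.VarHandle;

// Hypothetical holder class; the fields mirror x, y, z from the example above.
final class FenceSketch {
    static int x, y, z;

    // "y =release 1" spelled as (LoadStore|StoreStore ; store).
    static void writer() {
        x = 1;                    // plain store, kept before the store to y
        VarHandle.releaseFence(); // LoadStore | StoreStore
        y = 1;                    // plays the role of the release store
        z = 1;                    // plain store: not ordered against x or y
    }

    // Load-acquire spelled as (load ; LoadLoad|LoadStore).
    static int reader() {
        int r = y;                // plays the role of the acquire load
        VarHandle.acquireFence(); // LoadLoad | LoadStore
        return r == 1 ? x : -1;   // x is read only if y == 1 was observed
    }

    public static void main(String[] args) {
        writer();
        System.out.println(reader());
    }
}
```

Nothing in writer() sits between the stores to y and z, which is the fence-level reading of why the release store does not order x against z.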

In the C++ model at least,

Thread 1: y =release 2; x =release 1;

Thread 2: x =release 2; y =release 1;

allows a final state of x = y = 2.  In C++, memory_order_release doesn't mean anything in the absence of a corresponding acquire or consume load.  (Hardware implementations are unlikely to allow that; compiler optimizations might.) Acquire/release make the "message passing" idiom work, not much more than that.

OK, but in the presence of store-release / load-acquire pairs, does a single such pair guarantee that other relaxed loads/stores that are in program order before the store-release are ordered strictly before loads/stores that are in program order after the corresponding load-acquire? For example:

Thread1: construct an object graph with relaxed load/stores then publish the reference to data structure via store-release to 'global'

Thread2: load a reference from 'global' via load-acquire then use relaxed load/stores to read/modify the data structure navigated through the reference

Does this guarantee that:
- Thread2 sees all stores performed on data structure by Thread1 before publication
- Thread1 sees no modifications of data structure performed by Thread2 after loading the reference to it
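
The publication scenario above can be sketched in Java roughly as follows. This assumes Java 9+, where AtomicReference offers setRelease and getAcquire; the PublishSketch class, the Node type and the method names are hypothetical and chosen only for illustration. The sketch exercises the first of the two guarantees (the reader sees the stores made before publication); it does not try to demonstrate the second.

```java
import java.util.concurrent.atomic.AtomicReference;

final class PublishSketch {
    // Hypothetical payload: plain (non-final, non-volatile) fields on purpose.
    static final class Node {
        int value;
        Node next;
    }

    static final AtomicReference<Node> GLOBAL = new AtomicReference<>();

    // Thread 1: build the graph with plain stores, then publish with a release store.
    static void publish() {
        Node tail = new Node();
        tail.value = 2;
        Node head = new Node();
        head.value = 1;
        head.next = tail;
        GLOBAL.setRelease(head);
    }

    // Thread 2: acquire-load the reference, then navigate with plain loads.
    static int consume() {
        Node head;
        while ((head = GLOBAL.getAcquire()) == null) {
            Thread.onSpinWait();
        }
        return head.value + head.next.value;
    }

    // Runs one reader and one writer thread and returns what the reader computed.
    static int runOnce() {
        GLOBAL.set(null);
        int[] result = new int[1];
        Thread reader = new Thread(() -> result[0] = consume());
        Thread writer = new Thread(PublishSketch::publish);
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 3
    }
}
```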

Or, very similar, but not quite the same:

Thread1: process some shared state with relaxed load/stores then store-release a true value into a 'global' flag (that was initially false)

Thread2: after observing 'global' flag read via load-acquire to be true, perform relaxed load/stores of the shared state

Does this guarantee that:
- Thread2 sees all stores performed on shared state by Thread1 before storing true to 'global'
- Thread1 sees no modifications of shared state performed by Thread2 after loading true from 'global'
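
The flag variant can be sketched with a VarHandle in the same hedged spirit. This again assumes Java 9+; FlagSketch, its fields and its methods are hypothetical names. Only the first guarantee is exercised: once the consumer observes the flag as true through getAcquire, it reads the value stored before the setRelease.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class FlagSketch {
    int sharedState;   // plain field, written before the flag is raised
    boolean ready;     // initially false; accessed only through READY

    static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(FlagSketch.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Thread 1: process the shared state, then store-release true into the flag.
    void producer() {
        sharedState = 42;
        READY.setRelease(this, true);
    }

    // Thread 2: spin on a load-acquire of the flag, then read the shared state.
    int consumer() {
        while (!(boolean) READY.getAcquire(this)) {
            Thread.onSpinWait();
        }
        return sharedState;
    }

    // Runs one producer and one consumer thread and returns what the consumer read.
    static int runOnce() {
        FlagSketch s = new FlagSketch();
        int[] result = new int[1];
        Thread consumer = new Thread(() -> result[0] = s.consumer());
        Thread producer = new Thread(s::producer);
        consumer.start();
        producer.start();
        try {
            consumer.join();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```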


Regards, Peter

Ok, I see this already answered by Aleksey. Is this true for Java VarHandles only or for C++ too?

Regards, Peter

Concurrency-interest mailing list
[hidden email]
