Quantcast

Re: RFR: 8166842: String.hashCode() has a non-benign data race

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR: 8166842: String.hashCode() has a non-benign data race

David Holmes-2
On 28/09/2016 11:24 PM, Peter Levart wrote:

>
> On 09/28/2016 03:05 PM, David Holmes wrote:
>> On 28/09/2016 10:44 PM, Peter Levart wrote:
>>> Hi,
>>>
>>> According to discussion here:
>>>
>>> http://cs.oswego.edu/pipermail/concurrency-interest/2016-September/015414.html
>>>
>>> it seems compact strings introduced (at least theoretical) non-benign
>>> data race into String.hasCode() method.
>>>
>>> Here is a proposed patch:
>>>
>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/8166842_String.hashCode/webrev.01/
>>>
>>
>> I'm far from convinced that the bug exists - theoretical or otherwise
>> - but the "fix" is harmless.
>>
>> When we were working on JSR-133 one of the conceptual models is that
>> every write to a variable goes into the set of values that a read may
>> potentially return (so no out-of-thin-air for example). happens-before
>> establishes constraints on which value can legally be returned - the
>> most recent. An additional property was that once a value was
>> returned, a later read could not return an earlier value - in essence
>> once a read returns a given value, all earlier written values are
>> removed from the set of potential values that can be read.
>
> That would be a nice property, yes.
>
>>
>> Your bug requires that the code act as-if written:
>>
>> int temp = hash;
>> if (temp == 0) {
>>    hash = ...
>> }
>> return temp;
>>
>> and I do not believe that is allowed.
>>
>> David
>
> Well, I can't find anything like that in JMM description. Could you
> point me to it? Above example only reads once from hash. The code in
> question is this:
>
> if (hash == 0) { // 1st read
>     hash = ...
> }
> return hash; // 2nd read
>
> And the bug requires the code to act like:
>
> int temp2 = hash; // 2nd read
> int temp1 = hash; // 1st read
> if (temp1 == 0) {
>     return (hash = ...);
> }
> return temp2;

Given hash is not a volatile field there is nothing that requires the
compiler to emit code that reads it twice - of course javac will issue
getfields as per the source code.

So the compiler is of course allowed to read it twice, but you then want
it to reorder those reads such that the second use uses the value of the
first read, and the first use uses the value of the second read.

I recall discussions of such perverse compiler behaviour in the past. :(

> Is this allowed?

Sadly yes. What we ended up with in JLS Ch17 permits this - it is a
variation of the aliasing example in 17.4.

There are places where the formalization of the JMM "threw the baby out
with the bath water" IMHO. Some of the less formal descriptions made far
more sense - like the notion of there being a set of values a read may
return, and how happens-before constrains that, and how once a value is
returned you cant then "read" an earlier value. <sigh>

Thanks,
David

> Regards, Peter
>
>>
>>>
>>> For the bug:
>>>
>>>     https://bugs.openjdk.java.net/browse/JDK-8166842
>>>
>>>
>>>
>>> JDK 8 did not have this problem, so no back-porting necessary.
>>>
>>> Regards, Peter
>>>
>
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR: 8166842: String.hashCode() has a non-benign data race

Vitaly Davidovich


On Wednesday, September 28, 2016, David Holmes <[hidden email]> wrote:
On 28/09/2016 11:24 PM, Peter Levart wrote:

On 09/28/2016 03:05 PM, David Holmes wrote:
On 28/09/2016 10:44 PM, Peter Levart wrote:
Hi,

According to discussion here:

http://cs.oswego.edu/pipermail/concurrency-interest/2016-September/015414.html

it seems compact strings introduced (at least theoretical) non-benign
data race into String.hasCode() method.

Here is a proposed patch:

http://cr.openjdk.java.net/~plevart/jdk9-dev/8166842_String.hashCode/webrev.01/


I'm far from convinced that the bug exists - theoretical or otherwise
- but the "fix" is harmless.

When we were working on JSR-133 one of the conceptual models is that
every write to a variable goes into the set of values that a read may
potentially return (so no out-of-thin-air for example). happens-before
establishes constraints on which value can legally be returned - the
most recent. An additional property was that once a value was
returned, a later read could not return an earlier value - in essence
once a read returns a given value, all earlier written values are
removed from the set of potential values that can be read.

That would be a nice property, yes.


Your bug requires that the code act as-if written:

int temp = hash;
if (temp == 0) {
   hash = ...
}
return temp;

and I do not believe that is allowed.

David

Well, I can't find anything like that in JMM description. Could you
point me to it? Above example only reads once from hash. The code in
question is this:

if (hash == 0) { // 1st read
    hash = ...
}
return hash; // 2nd read

And the bug requires the code to act like:

int temp2 = hash; // 2nd read
int temp1 = hash; // 1st read
if (temp1 == 0) {
    return (hash = ...);
}
return temp2;

Given hash is not a volatile field there is nothing that requires the compiler to emit code that reads it twice - of course javac will issue getfields as per the source code.
I don't think there's anything preventing it from reading it any number of times so long as intra-thread semantics are preserved.

So the compiler is of course allowed to read it twice, but you then want it to reorder those reads such that the second use uses the value of the first read, and the first use uses the value of the second read.
I gave a pseudo example upthread.  It would be weird codegen but allowed.

I recall discussions of such perverse compiler behaviour in the past. :(
Be thankful that at least an explicit read into a local is preserved - C++ compilers are allowed to reload in that case too unless the local is marked volatile. 

None of this is an issue when not using data races.  If you choose to engage in a data race, gotta play by the rules :).

Is this allowed?

Sadly yes. What we ended up with in JLS Ch17 permits this - it is a variation of the aliasing example in 17.4.

There are places where the formalization of the JMM "threw the baby out with the bath water" IMHO. Some of the less formal descriptions made far more sense - like the notion of there being a set of values a read may return, and how happens-before constrains that, and how once a value is returned you cant then "read" an earlier value. <sigh>

Thanks,
David

Regards, Peter



For the bug:

    https://bugs.openjdk.java.net/browse/JDK-8166842



JDK 8 did not have this problem, so no back-porting necessary.

Regards, Peter




--
Sent from my phone

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR: 8166842: String.hashCode() has a non-benign data race

Peter Levart
In reply to this post by David Holmes-2
Hi,

Thank you all for reviews and comments, especially instructive was Hans Boehm's explanation about why JMM must allow reordering of plain reads from the same location.

The fix is now pushed.

Regards, Peter


_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Loading...