Fences and card tables

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Fences and card tables

Andrew Haley
Just to reassure me: a card table write in a generational collector
only needs a StoreStore fence, not a release.  Is that right?

Thanks,
Andrew.
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Doug Lea
On 08/20/2015 01:50 PM, Andrew Haley wrote:
> Just to reassure me: a card table write in a generational collector
> only needs a StoreStore fence, not a release.  Is that right?
>

Definitive answers might be collector-specific.
So you might try asking on hotspot-gc-dev?
   http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-dev

-Doug

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Andrew Haley
On 08/21/2015 02:16 PM, Doug Lea wrote:
> On 08/20/2015 01:50 PM, Andrew Haley wrote:
>> Just to reassure me: a card table write in a generational collector
>> only needs a StoreStore fence, not a release.  Is that right?
>>
>
> Definitive answers might be collector-specific.
> So you might try asking on hotspot-gc-dev?
>    http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-dev

Sure, I understand that.  My question was more general, following
Hans's sometimes surprising observations of the failures of StoreStore
fences.  A card table store is merely a constant written into a table,
and that constant is fixed.  All that is required is that the
preceding reference store and the card table store are observable to
all threads in the order they were written.  Hence a StoreStore fence.

Andrew.
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Gil Tene
Andrew,

For generational collection the ordering required by a collector's reference store barrier depends on the collector mechanisms. The purpose of the reference store barrier in generational collectors is to maintain remembered sets from older to younger generations. A card table is a good example of such a remembered set.

When the collector performs younger generation collections in monolithic stop-the-world pauses, the only ordering requirement between a card mark store and the associated reference store is that they cannot be separated by a safepoint. They can occur in any order as long as that requirement is maintained. This describes the situation for all current young generation collectors in HotSpot (AFAIK Zing's C4 is still the only non-STW young generation collector in a shipping JVM).

When the young generation collection can run concurrently with mutator execution the typical ordering requirement would be for the recording of generational remembered set information (e.g. a card table store) to be visible to GC threads before the reference store itself. Hence a logical StoreStore fence exists between the two, and must be enforced by all code generation and executing CPUs. Note that the requirement for the two stores to not be separated by a safepoint is usually still there as well (and that requirement is not satisfied by a StoreStore barrier). Also note that I use "typically" and "usually" above, because correct concurrent card scanning algorithms that do not carry these specific requirements certainly exist. As noted, this is not relevant for HotSpot's generational behavior because the younggens are all STW. But in the specific example of Zing/C4 we do use a card table, we do scan it concurrently, and we do require both the StoreStore order and the not-s!
 eparated-but-safepoint requirement.

Beyond generational collection, It is VERY important to remember that some collectors use reference store barriers for purposes other than generational remembered set tracking.

E.g. to my understanding CMS uses the same card table store to *also* track mutations during concurrent marking of the oldgen. So while the generational part of the collector has no concurrency or ordering issues, the oldgen marking functionality does require the StoreStore ordering in addition to the not-separated-by-safepoint requirement.

Similarly, G1 uses a reference store barrier (not just a card table store) to enforce its mostly concurrent marker's snapshot-at-the-beginning (SATB) invariants (in addition to doing reference set tracking). As such, it too *might* need the StoreStore order. I'm less sure about that one, because I think SATB might do fine without the orderinggucen the information captured by the barrier.

HTH.

Sent from Gil's iPhone

> On Aug 23, 2015, at 2:23 AM, Andrew Haley <[hidden email]> wrote:
>
>> On 08/21/2015 02:16 PM, Doug Lea wrote:
>>> On 08/20/2015 01:50 PM, Andrew Haley wrote:
>>> Just to reassure me: a card table write in a generational collector
>>> only needs a StoreStore fence, not a release.  Is that right?
>>
>> Definitive answers might be collector-specific.
>> So you might try asking on hotspot-gc-dev?
>>   http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-dev
>
> Sure, I understand that.  My question was more general, following
> Hans's sometimes surprising observations of the failures of StoreStore
> fences.  A card table store is merely a constant written into a table,
> and that constant is fixed.  All that is required is that the
> preceding reference store and the card table store are observable to
> all threads in the order they were written.  Hence a StoreStore fence.
>
> Andrew.
> _______________________________________________
> Concurrency-interest mailing list
> [hidden email]
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Gil Tene
To add/amend the below a bit:

Depending on the collector implementation, a StoreStore fence between a card table store and the related reference store may be an alternative to the requirement for the two stores to not be separated by a safepoint: If the StoreStore order is maintained and the reference value being stored remains live in registers, and the collector mechanism scans the stacks at a safepoint before completing the card scanning, then having the two stores separated by a safepoint is acceptable.

In checking on the HotSpot younggen collectors, I believe this means that a StoreStore fence is sufficient in itself. I.e. that you need either a StoreStore fence *or* no-safepoint-between-the-stores, but not both.

— Gil.

> On Aug 23, 2015, at 3:34 AM, Gil Tene <[hidden email]> wrote:
>
> Andrew,
>
> For generational collection the ordering required by a collector's reference store barrier depends on the collector mechanisms. The purpose of the reference store barrier in generational collectors is to maintain remembered sets from older to younger generations. A card table is a good example of such a remembered set.
>
> When the collector performs younger generation collections in monolithic stop-the-world pauses, the only ordering requirement between a card mark store and the associated reference store is that they cannot be separated by a safepoint. They can occur in any order as long as that requirement is maintained. This describes the situation for all current young generation collectors in HotSpot (AFAIK Zing's C4 is still the only non-STW young generation collector in a shipping JVM).
>
> When the young generation collection can run concurrently with mutator execution the typical ordering requirement would be for the recording of generational remembered set information (e.g. a card table store) to be visible to GC threads before the reference store itself. Hence a logical StoreStore fence exists between the two, and must be enforced by all code generation and executing CPUs. Note that the requirement for the two stores to not be separated by a safepoint is usually still there as well (and that requirement is not satisfied by a StoreStore barrier). Also note that I use "typically" and "usually" above, because correct concurrent card scanning algorithms that do not carry these specific requirements certainly exist. As noted, this is not relevant for HotSpot's generational behavior because the younggens are all STW. But in the specific example of Zing/C4 we do use a card table, we do scan it concurrently, and we do require both the StoreStore order and the not-s!
> eparated-but-safepoint requirement.
>
> Beyond generational collection, It is VERY important to remember that some collectors use reference store barriers for purposes other than generational remembered set tracking.
>
> E.g. to my understanding CMS uses the same card table store to *also* track mutations during concurrent marking of the oldgen. So while the generational part of the collector has no concurrency or ordering issues, the oldgen marking functionality does require the StoreStore ordering in addition to the not-separated-by-safepoint requirement.
>
> Similarly, G1 uses a reference store barrier (not just a card table store) to enforce its mostly concurrent marker's snapshot-at-the-beginning (SATB) invariants (in addition to doing reference set tracking). As such, it too *might* need the StoreStore order. I'm less sure about that one, because I think SATB might do fine without the orderinggucen the information captured by the barrier.
>
> HTH.
>
> Sent from Gil's iPhone
>
>> On Aug 23, 2015, at 2:23 AM, Andrew Haley <[hidden email]> wrote:
>>
>>> On 08/21/2015 02:16 PM, Doug Lea wrote:
>>>> On 08/20/2015 01:50 PM, Andrew Haley wrote:
>>>> Just to reassure me: a card table write in a generational collector
>>>> only needs a StoreStore fence, not a release.  Is that right?
>>>
>>> Definitive answers might be collector-specific.
>>> So you might try asking on hotspot-gc-dev?
>>>  http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-dev
>>
>> Sure, I understand that.  My question was more general, following
>> Hans's sometimes surprising observations of the failures of StoreStore
>> fences.  A card table store is merely a constant written into a table,
>> and that constant is fixed.  All that is required is that the
>> preceding reference store and the card table store are observable to
>> all threads in the order they were written.  Hence a StoreStore fence.
>>
>> Andrew.
>> _______________________________________________
>> Concurrency-interest mailing list
>> [hidden email]
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
> _______________________________________________
> Concurrency-interest mailing list
> [hidden email]
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

signature.asc (859 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Andrew Haley
In reply to this post by Gil Tene
OK, thanks for a very insightful and detailed answer.

I will raise this one on the gc-dev lists.  My best guess ATM is that
it's only CMS which requires a StoreStore (G1 has entirely separate
code) but I'll reserve judgement until I've had that discussion.

Thanks again,

Andrew.
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Gil Tene
In reply to this post by Gil Tene
To re-ammend (de-ammend?) my below ammendment: After some more checking and conversations, I'm going back to my original assertions: A reference store and it's associated card table store must not be separated by a safepoint. Several "bad things" can happen if they do, and a StoreStore fence alone is not enough to avoid those "bad things".

For two specific "bad thing" examples:

1) Imagine the sequence was: ref_store; StoreStore_fence; card_store;

The ref_store is free to float back across a previous safepoint polling opportunity. And the card_store is free to float forward past a following safepoint polling opportunity. Either of these can result in a possible safepoint occurring with a newgen root in oldgen that is not identified in the card table (boom).

2) Imagine the sequence was: card_store; StoreStore_fence; ref_store;

The card_store is free to float back across a previous safepoint polling opportunity. And the ref_store is free to float forward past a following safepoint polling opportunity. Either of these can result in a possible safepoint occurring with an "already dirtied card" that will have an associated reference stored into only after the safepoint is completed. If the safepoint cleans the card table (which can commonly happen), the reference store that follows the safepoint will not be tracked in the card table examined by future safepoints (boom).

This makes the StoreStore_fence useless for generational card table marking safety. It may not be needed at all as a result (depending on what else the card mark and the collector algorithms are doing), but the safepoint-crossing scheduling rules are absolutely required.

— Gil.

> On Aug 23, 2015, at 9:12 AM, Gil Tene <[hidden email]> wrote:
>
> To add/amend the below a bit:
>
> Depending on the collector implementation, a StoreStore fence between a card table store and the related reference store may be an alternative to the requirement for the two stores to not be separated by a safepoint: If the StoreStore order is maintained and the reference value being stored remains live in registers, and the collector mechanism scans the stacks at a safepoint before completing the card scanning, then having the two stores separated by a safepoint is acceptable.
>
> In checking on the HotSpot younggen collectors, I believe this means that a StoreStore fence is sufficient in itself. I.e. that you need either a StoreStore fence *or* no-safepoint-between-the-stores, but not both.
>
> — Gil.
>
>> On Aug 23, 2015, at 3:34 AM, Gil Tene <[hidden email]> wrote:
>>
>> Andrew,
>>
>> For generational collection the ordering required by a collector's reference store barrier depends on the collector mechanisms. The purpose of the reference store barrier in generational collectors is to maintain remembered sets from older to younger generations. A card table is a good example of such a remembered set.
>>
>> When the collector performs younger generation collections in monolithic stop-the-world pauses, the only ordering requirement between a card mark store and the associated reference store is that they cannot be separated by a safepoint. They can occur in any order as long as that requirement is maintained. This describes the situation for all current young generation collectors in HotSpot (AFAIK Zing's C4 is still the only non-STW young generation collector in a shipping JVM).
>>
>> When the young generation collection can run concurrently with mutator execution the typical ordering requirement would be for the recording of generational remembered set information (e.g. a card table store) to be visible to GC threads before the reference store itself. Hence a logical StoreStore fence exists between the two, and must be enforced by all code generation and executing CPUs. Note that the requirement for the two stores to not be separated by a safepoint is usually still there as well (and that requirement is not satisfied by a StoreStore barrier). Also note that I use "typically" and "usually" above, because correct concurrent card scanning algorithms that do not carry these specific requirements certainly exist. As noted, this is not relevant for HotSpot's generational behavior because the younggens are all STW. But in the specific example of Zing/C4 we do use a card table, we do scan it concurrently, and we do require both the StoreStore order and the not-s!
>> eparated-but-safepoint requirement.
>>
>> Beyond generational collection, It is VERY important to remember that some collectors use reference store barriers for purposes other than generational remembered set tracking.
>>
>> E.g. to my understanding CMS uses the same card table store to *also* track mutations during concurrent marking of the oldgen. So while the generational part of the collector has no concurrency or ordering issues, the oldgen marking functionality does require the StoreStore ordering in addition to the not-separated-by-safepoint requirement.
>>
>> Similarly, G1 uses a reference store barrier (not just a card table store) to enforce its mostly concurrent marker's snapshot-at-the-beginning (SATB) invariants (in addition to doing reference set tracking). As such, it too *might* need the StoreStore order. I'm less sure about that one, because I think SATB might do fine without the orderinggucen the information captured by the barrier.
>>
>> HTH.
>>
>> Sent from Gil's iPhone
>>
>>> On Aug 23, 2015, at 2:23 AM, Andrew Haley <[hidden email]> wrote:
>>>
>>>> On 08/21/2015 02:16 PM, Doug Lea wrote:
>>>>> On 08/20/2015 01:50 PM, Andrew Haley wrote:
>>>>> Just to reassure me: a card table write in a generational collector
>>>>> only needs a StoreStore fence, not a release.  Is that right?
>>>>
>>>> Definitive answers might be collector-specific.
>>>> So you might try asking on hotspot-gc-dev?
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-dev
>>>
>>> Sure, I understand that.  My question was more general, following
>>> Hans's sometimes surprising observations of the failures of StoreStore
>>> fences.  A card table store is merely a constant written into a table,
>>> and that constant is fixed.  All that is required is that the
>>> preceding reference store and the card table store are observable to
>>> all threads in the order they were written.  Hence a StoreStore fence.
>>>
>>> Andrew.
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> [hidden email]
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> [hidden email]
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

signature.asc (859 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Andrew Haley
On 08/28/2015 01:27 AM, Gil Tene wrote:
> For two specific "bad thing" examples:
>
> 1) Imagine the sequence was: ref_store; StoreStore_fence; card_store;
>
> The ref_store is free to float back across a previous safepoint polling opportunity.

I don't see how.  Any safepoint poll which actually traps is a full barrier; if
it doesn't trap it doesn't matter.  I must be missing something...

Andrew.

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Fences and card tables

Gil Tene
The compiler doesn't treat safepoint polling opportunities as a fence at all (at least in the normal store & load ordering sense). Unless otherwise prevented, loads and stores are free to float across safepoint polling opportunities even when they are preceded or followed by the various fences (acquire/release/StoreStore/etc.).

While an actually taken safepoint poll does apply a full fence from the CPU's point of view the ref_store in the example below could have floated back across that safepoint and show up (in the CPUs instruction stream) before the safepoint poll was made (if all that was ordering it was the StoreStore in the example).

To safely avoid separating a ref store and its related card table store with a safepoint, additional "safepoint related fencing" capabilities are needed in the JIT. Note that these never apply to program semantics, and only to the JVM's internal implementation of bytecode sub-parts. So only Runtime and JIT implementers need to be aware of them.

The simplest way to handle this is/was for JITs to consider the entire sequence (ref store and card mark store) to be inseparable from scheduling point of view. Often by using a single IR node to represent the combo. But as scheduling benefits for the two parts are sought (and there are some) and the sequence is split into separate IR nodes, scheduling rules relating to those nodes' ability to cross safepoints apply.

One way to achieve this ordering restriction is for the JIT to treat all safepoint polling opportunities as if they both consume and potentially modify all references stored to memory, as well as all cards stored to memory. This then prevents them from being reordered with either regardless of fences. Unfortunately, this strong approach prevents some useful optimizations. As you weaken the restriction (allowing e.g. hoisting repeated reference loads out of loops that have safepoints in their back edge), stating the actual ordering needed for things like card marks becomes more specific.

Sent from Gil's iPhone

> On Aug 28, 2015, at 1:13 AM, Andrew Haley <[hidden email]> wrote:
>
>> On 08/28/2015 01:27 AM, Gil Tene wrote:
>> For two specific "bad thing" examples:
>>
>> 1) Imagine the sequence was: ref_store; StoreStore_fence; card_store;
>>
>> The ref_store is free to float back across a previous safepoint polling opportunity.
>
> I don't see how.  Any safepoint poll which actually traps is a full barrier; if
> it doesn't trap it doesn't matter.  I must be missing something...
>
> Andrew.
>

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest