Timestamps-based ConcurrentMap

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Timestamps-based ConcurrentMap

Jean Morissette-2
Hi,

In ConcurrentHashMap, iterators are designed to be used by only one
thread at a time.  However, I would like to have a ConcurrentMap
implementation where iterators could be used by many threads at time,
in a consistent way.  These iterators should reflect the state of the
map at the creation of the iterator, independently of the map update
operations.

Is-there a class somewhere that achieve this behaviour?  IMO, it would
be a nice addition to j.u.concurrent package.  Or, is there any plan
to create it in the future?

Otherwise, I will try to develop it myself, maybe by assigning a
timestamps to each entry and iterator. Or, do you have others
suggestions/advices?

Regards,
Jean

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

Bob Lee-8
new HashMap(concurrentHashMap).iterator()?

Bob

On 3/9/06, Jean Morissette <[hidden email]> wrote:

> Hi,
>
> In ConcurrentHashMap, iterators are designed to be used by only one
> thread at a time.  However, I would like to have a ConcurrentMap
> implementation where iterators could be used by many threads at time,
> in a consistent way.  These iterators should reflect the state of the
> map at the creation of the iterator, independently of the map update
> operations.
>
> Is-there a class somewhere that achieve this behaviour?  IMO, it would
> be a nice addition to j.u.concurrent package.  Or, is there any plan
> to create it in the future?
>
> Otherwise, I will try to develop it myself, maybe by assigning a
> timestamps to each entry and iterator. Or, do you have others
> suggestions/advices?
>
> Regards,
> Jean
>
> _______________________________________________
> Concurrency-interest mailing list
> [hidden email]
> http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
>

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

RE: Timestamps-based ConcurrentMap

David Holmes-3
In reply to this post by Jean Morissette-2
It sounds like you want to make an atomic snapshot of the map and iterate
over that. But atomic access to ConcurrentHashMap can't be done - there is
no global lock to grab, nor do you want one.

The timestamp idea seems only a partial solution depending on what your
constraints actually are: you can't stop entries being removed, you can only
skip entries that have been added since, if they appear.

Why not just use an iterator to make your own immutable snapshot that can
then be accessed by other threads. Any thread using the snapshot would have
to realize that entries may have since been removed from the map (assuming
application logic permits that). The snapshot would reflect the state of the
map between the iterator being constructed and the iteration being
completed.

Cheers,
David Holmes

> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]]On Behalf Of Jean
> Morissette
> Sent: Friday, 10 March 2006 9:55 AM
> To: [hidden email]
> Subject: [concurrency-interest] Timestamps-based ConcurrentMap
>
>
> Hi,
>
> In ConcurrentHashMap, iterators are designed to be used by only one
> thread at a time.  However, I would like to have a ConcurrentMap
> implementation where iterators could be used by many threads at time,
> in a consistent way.  These iterators should reflect the state of the
> map at the creation of the iterator, independently of the map update
> operations.
>
> Is-there a class somewhere that achieve this behaviour?  IMO, it would
> be a nice addition to j.u.concurrent package.  Or, is there any plan
> to create it in the future?
>
> Otherwise, I will try to develop it myself, maybe by assigning a
> timestamps to each entry and iterator. Or, do you have others
> suggestions/advices?
>
> Regards,
> Jean
>
> _______________________________________________
> Concurrency-interest mailing list
> [hidden email]
> http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

Chris Purcell
> But atomic access to ConcurrentHashMap can't be done - there is no
> global lock to grab, nor do you want one.

As I understood it, though, updates *are* done under mutual exclusion.
Why can't a "global lock" operation just grab all the fine-grained
locks and hold them simultaneously?

Chris

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

Jean Morissette-2
In reply to this post by David Holmes-3
2006/3/9, David Holmes <[hidden email]>:
> It sounds like you want to make an atomic snapshot of the map and iterate
> over that.

Right.

> The timestamp idea seems only a partial solution depending on what your
> constraints actually are: you can't stop entries being removed, you can only
> skip entries that have been added since, if they appear.

Each entry could have two timestamps, one for its creation and one set
at its removal. Removed entries would remain in the map until there is
no iterator that can see them, after which they would be (lazily?)
removed.

> Why not just use an iterator to make your own immutable snapshot that can
> then be accessed by other threads.

In my scenario, there is as mush map update (put/remove) than querying
(iterator) operations. Also, the size of these map can be really big.
So, using a copy strategy to create a snapshot seems costly: it would
involve a lot of copying. What do you think?

Thanks,
Jean

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

Bob Lee-8
On 3/10/06, Jean Morissette <[hidden email]> wrote:
> What do you think?

If you absolutely can't bare any modifications during the creation of
the iterator, I think I'd just use a synchronized HashMap and optimize
the hashCode()/equals() methods on my key classes.

Bob

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

RE: Timestamps-based ConcurrentMap

David Holmes-3
In reply to this post by Chris Purcell
Chris Purcell writes:
> > But atomic access to ConcurrentHashMap can't be done - there is no
> > global lock to grab, nor do you want one.
>
> As I understood it, though, updates *are* done under mutual exclusion.
> Why can't a "global lock" operation just grab all the fine-grained
> locks and hold them simultaneously?

You can if you modify the internals of CHM, otherwise those locks are not
accessible.

Cheers,
David Holmes

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

RE: Timestamps-based ConcurrentMap

David Holmes-3
In reply to this post by Jean Morissette-2
Jean Morissette writes:
> Each entry could have two timestamps, one for its creation and one set
> at its removal. Removed entries would remain in the map until there is
> no iterator that can see them, after which they would be (lazily?)
> removed.

They could, but now you are adding additional housekeeping and coordination
between the iterators and the map.

> In my scenario, there is as mush map update (put/remove) than querying
> (iterator) operations. Also, the size of these map can be really big.
> So, using a copy strategy to create a snapshot seems costly: it would
> involve a lot of copying. What do you think?

At this stage I'd ask exactly why you need an iterator that can be shared
amongst threads. :)

The desired semantics seem to be leading you into a no-win situation.
Ultimately you will have to pay for it somewhere.

Cheers,
David Holmes

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

Jean Morissette-2
2006/3/10, David Holmes <[hidden email]>:
> At this stage I'd ask exactly why you need an iterator that can be shared
> amongst threads. :)
>
> The desired semantics seem to be leading you into a no-win situation.
> Ultimately you will have to pay for it somewhere.

Sorry, I made a mistake in my first post. In fact, iterators *don't*
need to be shared amongst threads. However, they need to reflect the
state of the map at the creation of the iterator, independently of the
map update operations.

> The desired semantics seem to be leading you into a no-win situation.
> Ultimately you will have to pay for it somewhere.

Knowing that all is a question of compromises, I'm yet searching the
cheapest one :)

Thanks,
Jean

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

RE: Timestamps-based ConcurrentMap

David Holmes-3
Jean Morissette writes:
> Sorry, I made a mistake in my first post. In fact, iterators *don't*
> need to be shared amongst threads. However, they need to reflect the
> state of the map at the creation of the iterator, independently of the
> map update operations.

That is still an interesting requirement if the map is in use when the
iterator is created, as there are potential races adding or removing
elements.

As there is no way to make an atomic snapshot then you will have to have
some way of telling when each entry was added and removed from the map. Your
timestamps seem the only option - albeit a complicated one. It would be a
fair bit simpler if you didn't need to worry about elements that are removed
after the iterator is created. :)

Good luck.

David Holmes

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

Bob Lee-8
On 3/11/06, David Holmes <[hidden email]> wrote:
> That is still an interesting requirement if the map is in use when the
> iterator is created, as there are potential races adding or removing
> elements.
>
> As there is no way to make an atomic snapshot then you will have to have
> some way of telling when each entry was added and removed from the map. Your
> timestamps seem the only option - albeit a complicated one.

Actually, first, is it OK for additions/removals during the actual
creation (but not after)? If so, my original solution will work.

If not, in lieu of timestamps, you can decorate your map and record
undo objects. At the point where you want your "snapshot," you ask the
map for an undo version (this will be an index into the list of undo
objects, the map will start recording them if it isn't already). Next,
create your copy by iterating over the map and take another undo
version (this also tells the map you're done which means it may be
able to stop recording). Now execute the undo objects against your map
copy and you should have a snapshot of the map at the point your took
the first undo version.

As this is a concurrent map, entries can be added/removed at the time
you take the actual snapshot (which means the snapshot would include
the change). Is this acceptable?

The neat thing about this is you can implement it as a decorator and
it will work with any concurrent map (i.e. no hacking the internals of
CHM).

Bob

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

Bob Lee-8
In reply to this post by Jean Morissette-2
On 3/10/06, Jean Morissette <[hidden email]> wrote:
> However, they need to reflect the
> state of the map at the creation of the iterator, independently of the
> map update operations.

Can you give a little more detail on your requirements? I've
implemented most of what you've said so far, but I'm not sure why
something like this would be useful. Not tested:

/**
 * Does not support null values.
 */
public class SnapshotMap<K, V> implements ConcurrentMap<K, V> {

  ConcurrentMap<K, V> delegate;
  volatile Undos undos;

  public SnapshotMap(ConcurrentMap<K, V> delegate) {
    this.delegate = delegate;
  }

  public V putIfAbsent(final K key, final V value) {
    V v = delegate.putIfAbsent(key, value);
    // assume absence if old value is null.
    if (v == null && undos != null)
      undos.add(new Undo<K, V>() {
        public void undo(Map<K, V> map) {
          map.remove(key);
        }
      });
    return v;
  }

  public boolean remove(final Object key, final Object value) {
    boolean removed = delegate.remove(key, value);
    if (removed && undos != null)
      undos.add(new Undo<K, V>() {
        public void undo(Map<K, V> map) {
          map.put((K) key, (V) value);
        }
      });
    return removed;
  }

  public boolean replace(final K key, final V oldValue, V newValue) {
    boolean replaced = delegate.replace(key, oldValue, newValue);
    if (replaced && undos != null)
      undos.add(new Undo<K, V>() {
        public void undo(Map<K, V> map) {
          map.put(key, oldValue);
        }
      });
    return replaced;
  }

  public V replace(final K key, V value) {
    final V oldValue = delegate.replace(key, value);
    if (oldValue != null && undos != null)
      undos.add(new Undo<K, V>() {
        public void undo(Map<K, V> map) {
          map.put(key, oldValue);
        }
      });
    return oldValue;
  }

  public int size() {
    return delegate.size();
  }

  public boolean isEmpty() {
    return delegate.isEmpty();
  }

  public boolean containsKey(Object key) {
    return delegate.containsKey(key);
  }

  public boolean containsValue(Object value) {
    return delegate.containsValue(value);
  }

  public V get(Object key) {
    return delegate.get(key);
  }

  public V put(final K key, V value) {
    final V oldValue = delegate.put(key, value);
    if (oldValue != null && undos != null)
      undos.add(new Undo<K, V>() {
        public void undo(Map<K, V> map) {
          map.put(key, oldValue);
        }
      });
    return oldValue;
  }

  public V remove(final Object key) {
    final V oldValue = delegate.remove(key);
    if (oldValue != null && undos != null)
      undos.add(new Undo<K, V>() {
        public void undo(Map<K, V> map) {
          map.put((K) key, oldValue);
        }
      });
    return oldValue;
  }

  public void putAll(Map<? extends K, ? extends V> t) {
    // not sure what to do here. synchronize?
    delegate.putAll(t);
  }

  public void clear() {
    // ditto.
    delegate.clear();
  }

  public Set<K> keySet() {
    return snapshot().keySet();
  }

  public Collection<V> values() {
    return snapshot().values();
  }

  public Set<Entry<K, V>> entrySet() {
    return snapshot().entrySet();
  }

  Map<K, V> snapshot() {
    Undos undos;
    Map<K, V> snapshot;
    synchronized (this) {
      this.undos = new Undos();
      snapshot = new HashMap<K, V>(delegate);
      this.undos.disable();
      undos = this.undos;
      this.undos = null;
    }
    for (Undo<K, V> undo : undos)
      undo.undo(snapshot);
    return snapshot;
  }

  public boolean equals(Object o) {
    return delegate.equals(o);
  }

  public int hashCode() {
    return delegate.hashCode();
  }

  class Undos extends LinkedList<Undo<K, V>> {

    boolean enabled = true;

    public synchronized boolean add(Undo<K, V> undo) {
      return enabled && super.add(undo);
    }

    synchronized void disable() {
      enabled = false;
    }
  }

  interface Undo<K, V> {
    void undo(Map<K, V> map);
  }
}

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: Timestamps-based ConcurrentMap

tpeierls
On 3/10/06, Jean Morissette <[hidden email]> wrote:
> However, they need to reflect the
> state of the map at the creation of the iterator, independently of the
> map update operations.


Maybe what you're looking for is an implementation of ConcurrentMap based on purely functional data structures. The current state of the map could be maintained as an AtomicReference to such a data structure. Mutative operations could optimistically create a new state and CAS the AtomicReference with it. Non-mutative operations, including iterator creation, would hold a reference to a consistent but possibly out-of-date immutable data structure. This approach is only a win if you expect low contention by mutative operations, i.e., if the CAS operations will usually succeed.

But this doesn't use CHM at all.

--tim

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest