More control on 'full' with blockingqueue.

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

More control on 'full' with blockingqueue.

Peter Veentjer - Anchor Men
For a searchengine platform I`m currently developing (based on Lucene) I need better control on BlockingQueues when they are full. If I only check on the number of items in the queue it could happen that I would have 10 documents of 100 mb in a Queue (and this is a bad thing). But if I restrict the size of the Queue too much, processing 100 documents of 1kb would take too long. Therefore I need better control if it is allowed to add an item to the Queue and that is why I have written the ConditionalBlockingQueue and it uses a strategy: IsFull (the names have to be improved) and this allows better control. The behaviour of BlockingQueues that block on the capacity (the number of items in the Queue) could be made with the IsFull strategy.
 
My question is what you think of this solution and if there are others that had the same problem.
 
At the moment (I haven`t added the new ConditionalBlockingQueue implementation) I have solved the problem by creating 2 LinkedBlockingQueues: small document queue (with a big capacity) and a big document queue (with a small capacity) where I have some control on the total amount of documents being processed but in some cases it could bring the system down to its knees (if there are a lot of big documents).
 
Another thing I was wondering about, is why final class fields are places in local variables? Functionally they are equivalent so the only reason I can think of is that it would be faster. But isn`t this an optimalisation the compiler could do (if it is faster)?
 
example from the LinkedBlockingQueue:
 
 public E take() throws InterruptedException {
        E x;
        int c = -1;
        final AtomicInteger count = this.count;//Why??
        final ReentrantLock takeLock = this.takeLock;//Why??
        takeLock.lockInterruptibly();
        try {
            try {
                while (count.get() == 0)
                    notEmpty.await();
            } catch (InterruptedException ie) {
                notEmpty.signal(); // propagate to a non-interrupted thread
                throw ie;
            }
            x = extract();
            c = count.getAndDecrement();
            if (c > 1)
                notEmpty.signal();
        } finally {
            takeLock.unlock();
        }
        if (c == capacity)
            signalNotFull();
        return x;
    }

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: More control on 'full' with blockingqueue.

Bill Scherer
On Sun, 7 Aug 2005, Peter Veentjer - Anchor Men wrote:

> For a searchengine platform I`m currently developing (based on Lucene) I
> need better control on BlockingQueues when they are full. If I only
> check on the number of items in the queue it could happen that I would
> have 10 documents of 100 mb in a Queue (and this is a bad thing). But if
> I restrict the size of the Queue too much, processing 100 documents of
> 1kb would take too long. Therefore I need better control if it is
> allowed to add an item to the Queue and that is why I have written the
> ConditionalBlockingQueue and it uses a strategy: IsFull (the names have
> to be improved) and this allows better control. The behaviour of
> BlockingQueues that block on the capacity (the number of items in the
> Queue) could be made with the IsFull strategy.
>
> My question is what you think of this solution and if there are others
> that had the same problem.

I've not had the same problem, but if I'm understanding you correctly, it
seems that a nice solution might be something along the lines of the
following:

Modify the various offer() and take() (and related) calls to accept a
weight parameter (which would also need to be added into the Node class
(see the insert() method). In your case, you'd supply this as a function
of the document length. Then, instead of using getAndIncrement() and
getAndDecrement() calls inside offer() and take(), use getAndAdd() (which
can take a negative operand) to add the weight of the current document.

With all this done, you have a straightforward way to bound the total size
of documents being processed at any one time. In particular, your queue
size plus the size of the largest document you ever process is your
worst-case size. (It can go over if you remove one document, getting just
under the queue size, then add something really big. You could also addd
conditional logic to forbid adding big doucments until the queue is small
enough to hold them, but I doubt that this would gain you anything in
practice.

If you need help getting this blocking queue variant working, let me know.

Cheers,
  -- Bill
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

Re: More control on 'full' with blockingqueue.

tpeierls
In reply to this post by Peter Veentjer - Anchor Men
Peter Veentjer - Anchor Men wrote:
> Another thing I was wondering about, is why final class fields are places
> in local variables? Functionally they are equivalent so the only reason I
> can think of is that it would be faster. But isn`t this an optimalisation
> the compiler could do (if it is faster)?

Doug Lea once explained it like this: "... it is a hack to work around a
hotspot [limitation].  It currently doesn't understand that a final field can
be cached across a code block."

--tim

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

RE: More control on 'full' with blockingqueue.

Peter Veentjer - Anchor Men
In reply to this post by Peter Veentjer - Anchor Men
 

I've not had the same problem, but if I'm understanding you correctly, it
seems that a nice solution might be something along the lines of the
following:

Modify the various offer() and take() (and related) calls to accept a
weight parameter (which would also need to be added into the Node class
(see the insert() method). In your case, you'd supply this as a function
of the document length. Then, instead of using getAndIncrement() and
getAndDecrement() calls inside offer() and take(), use getAndAdd() (which
can take a negative operand) to add the weight of the current document.
------------------
This could be a solution and this would give an indication for the remaining capacity.
 
The advantage of totally externalizing the 'isfull' functionality is
- I can share the instance between multiple queues and this make communication between them possible.
- It is less restrictive. It doesn`t impose a numeric weight to each item. It seperates the what from the how better I think.

If you need help getting this blocking queue variant working, let me know.
-------------------
Thanks for the offer, but I already have implemented a version based on the LinkedBlockingQueue. Maybe you want to have a look at it?

And do you know if there are extensions available for the concurrency library? For the Collection Framework there are numerous extensions available, but I haven`t found any for the concurrency library.



_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Reply | Threaded
Open this post in threaded view
|

RE: More control on 'full' with blockingqueue.

Bill Scherer
On Sun, 7 Aug 2005, Peter Veentjer - Anchor Men wrote:

>
>
> I've not had the same problem, but if I'm understanding you correctly, it
> seems that a nice solution might be something along the lines of the
> following:
>
> Modify the various offer() and take() (and related) calls to accept a
> weight parameter (which would also need to be added into the Node class
> (see the insert() method). In your case, you'd supply this as a function
> of the document length. Then, instead of using getAndIncrement() and
> getAndDecrement() calls inside offer() and take(), use getAndAdd() (which
> can take a negative operand) to add the weight of the current document.
> ------------------
> This could be a solution and this would give an indication for the
> remaining capacity.
>
> The advantage of totally externalizing the 'isfull' functionality is
> - I can share the instance between multiple queues and this make
>   communication between them possible.
> - It is less restrictive. It doesn`t impose a numeric weight to each
>   item. It seperates the what from the how better I think.
>
> If you need help getting this blocking queue variant working, let me know.
> -------------------
> Thanks for the offer, but I already have implemented a version based on
> the LinkedBlockingQueue. Maybe you want to have a look at it?

I would be happy to take a look at it.

> And do you know if there are extensions available for the concurrency
> library? For the Collection Framework there are numerous extensions
> available, but I haven`t found any for the concurrency library.

I'm not aware of any, but I haven't gone looking for them either (so don't
read too much into this).

Cheers,
  -- Bill
_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest