Faking time


Faking time

Josh Humphries
A number of times, I've found myself battling unit tests for code that uses multiple threads, schedules tasks, observes the system clock, etc. These tests can be difficult to make deterministic, so they end up being flaky... or very slow... or both.

I've built some fairly complicated machinery that handles many cases in a way that "just works". We have a FakeClockScheduledExecutorService that can be injected into code that needs to schedule things. The FakeClock on which it is based has a fairly complicated internal implementation, due to the explicit intent that it be thread-safe.

But there are still a lot of cases where it doesn't really work, and cases where it is excessively difficult to make it work: the code under test needs very intrusive changes, or the tests themselves end up too "white box", having to know many trivial intermediate steps (points in time) and assert them along the way.

I've toyed with the idea of using a class loader and/or agent that could transform classes so that accesses to Thread.sleep, LockSupport.park, etc could be re-written to use "fake time" mechanisms. But the more I think about it, the more really nasty sharp corners I see on an approach like this (and all other approaches I've considered, really).


So... does anyone here know of any prior art for mocking the notion of time in tests? Is there anything that does it really well? Is it even feasible that something that does it well could exist? :)


----
Josh Humphries
Manager, Shared Systems  |  Platform Engineering
Atlanta, GA  |  678-400-4867

_______________________________________________
Concurrency-interest mailing list
[hidden email]
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

Re: Faking time

David Holmes-6

Hi Josh,
 
I don't understand what you mean by "faking time" here.
 
David
-----Original Message-----
From: [hidden email] [mailto:[hidden email]]On Behalf Of Josh Humphries
Sent: Wednesday, 1 April 2015 2:34 PM
To: [hidden email]
Subject: [concurrency-interest] Faking time


Re: Faking time

Joe Bowbeer
I haven't taken this as far as I'd like to, but after some effort trying to test asynchronous components of real-world (Android) applications, I decided that Solution 3 worked the best: Test in One Thread.


This technique relies on a Timer interface, which is implemented by a deterministic timer in the unit tests:

deterministicTimer.elapseSeconds(3);

I suspect that the ability to elapse-seconds, as above, is what you mean by "faking time" -- except that you're trying to fake some of the System time methods so that everything else "just works", without a lot of refactoring.
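To make the single-threaded idea concrete, here is a minimal sketch of what such a deterministic timer could look like. It is not the code from the referenced solution: the Timer interface, the DeterministicTimer class and the Poller component are hypothetical names invented for this illustration, and only elapseSeconds comes from the snippet above.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical timer abstraction that production code depends on. */
interface Timer {
    void schedule(Runnable task, long delaySeconds);
}

/** Test double: nothing runs until the test elapses time, and then only on the test's own thread. */
class DeterministicTimer implements Timer {
    private static final class Entry {
        final long dueSeconds;
        final long sequence; // keeps scheduling order among tasks due at the same time
        final Runnable task;

        Entry(long dueSeconds, long sequence, Runnable task) {
            this.dueSeconds = dueSeconds;
            this.sequence = sequence;
            this.task = task;
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>(
            Comparator.<Entry>comparingLong(e -> e.dueSeconds).thenComparingLong(e -> e.sequence));
    private long nowSeconds;
    private long nextSequence;

    @Override
    public void schedule(Runnable task, long delaySeconds) {
        queue.add(new Entry(nowSeconds + delaySeconds, nextSequence++, task));
    }

    /** Moves fake time forward, running every task that becomes due along the way. */
    void elapseSeconds(long seconds) {
        long target = nowSeconds + seconds;
        while (!queue.isEmpty() && queue.peek().dueSeconds <= target) {
            Entry next = queue.poll();
            nowSeconds = next.dueSeconds; // the task observes its own due time
            next.task.run();              // may schedule follow-up tasks
        }
        nowSeconds = target;
    }
}

/** Example component under test: polls once per second by rescheduling itself. */
class Poller {
    private final Timer timer;
    int polls;

    Poller(Timer timer) {
        this.timer = timer;
    }

    void start() {
        timer.schedule(this::poll, 1);
    }

    private void poll() {
        polls++;
        timer.schedule(this::poll, 1);
    }
}
```

A test would create a DeterministicTimer, hand it to the Poller, call start(), elapse three seconds and then assert on polls, all without any real waiting or extra threads.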


Re: Faking time

Josh Humphries
In reply to this post by David Holmes-6
Sorry if I didn't explain it well.

For code that cares about time, I want to be able to use mocks or some other mechanism to make it deterministic.

What we have so far is a Clock interface implemented on top of System methods. Instead of reading System for current time (whether it be currentTimeMillis() or nanoTime()), we try to inject a clock and use that. It also provides factory methods for other means by which code can examine the current time, like for Joda DateTimes and Instants and even j.u.Date. To make tests faster, we also provide a Clock.sleep. The system version delegates to Thread.sleep.
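As a rough illustration of the injected-clock idea (this is not the actual code being described), a minimal sketch might look like the following. The method names on Clock, the SystemClock class and the Deadline example are all assumptions made for the illustration; the Joda and j.u.Date factory methods are left out.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical injectable clock; method names are illustrative only. */
interface Clock {
    long currentTimeMillis();

    long nanoTime();

    void sleep(long duration, TimeUnit unit) throws InterruptedException;
}

/** Production implementation: delegates straight to the JDK. */
class SystemClock implements Clock {
    @Override
    public long currentTimeMillis() {
        return System.currentTimeMillis();
    }

    @Override
    public long nanoTime() {
        return System.nanoTime();
    }

    @Override
    public void sleep(long duration, TimeUnit unit) throws InterruptedException {
        unit.sleep(duration); // TimeUnit.sleep delegates to Thread.sleep
    }
}

/** Example code under test: it only ever reads time through the injected clock. */
class Deadline {
    private final Clock clock;
    private final long deadlineNanos;

    Deadline(Clock clock, long timeout, TimeUnit unit) {
        this.clock = clock;
        this.deadlineNanos = clock.nanoTime() + unit.toNanos(timeout);
    }

    boolean hasExpired() {
        // Subtract before comparing so the check stays correct if nanoTime wraps around.
        return clock.nanoTime() - deadlineNanos >= 0;
    }
}
```

Because Deadline never touches System directly, a test can pass in a different Clock implementation and control exactly when the deadline expires.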

We also have a FakeClock implementation. It provides a monotonic nano-time counter and conversions for querying a millisecond-resolution (potentially non-monotonic) counter. FakeClock.sleep just advances the clock without blocking.
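A heavily simplified sketch of such a fake clock is shown below. It is an assumption-laden illustration, not the implementation described here: the constructor, setWallClockMillis and the use of two AtomicLong fields are invented for the example, and the real thing is said to be considerably more involved.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical fake clock: time only moves when the test (or a fake sleep) moves it. */
class FakeClock {
    // Monotonic counter, analogous to System.nanoTime().
    private final AtomicLong nanos = new AtomicLong();
    // Offset added to the monotonic counter to produce wall-clock millis.
    // Changing it makes currentTimeMillis() jump, so that view may be non-monotonic.
    private final AtomicLong wallOffsetMillis;

    FakeClock(long initialWallClockMillis) {
        this.wallOffsetMillis = new AtomicLong(initialWallClockMillis);
    }

    long nanoTime() {
        return nanos.get();
    }

    long currentTimeMillis() {
        return wallOffsetMillis.get() + TimeUnit.NANOSECONDS.toMillis(nanos.get());
    }

    /** "Sleeping" advances fake time and returns immediately, without blocking. */
    void sleep(long duration, TimeUnit unit) {
        if (duration < 0) {
            throw new IllegalArgumentException("duration must not be negative");
        }
        nanos.addAndGet(unit.toNanos(duration));
    }

    /** Simulates the wall clock being reset (for example by NTP); nanoTime() is unaffected. */
    void setWallClockMillis(long millis) {
        wallOffsetMillis.set(millis - TimeUnit.NANOSECONDS.toMillis(nanos.get()));
    }
}
```

With this shape, code that sleeps for five seconds through the fake clock completes instantly while still observing that five seconds have passed.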

FakeClock also has a somewhat complicated mechanism for scheduling. It basically tracks a queue of scheduled events. When you advance the FakeClock, it runs each task that was scheduled for a given point in time as that point in time is passed. The interface for scheduling is a FakeClockScheduledExecutorService, so basically the ScheduledExecutorService API.
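The queue-and-advance mechanism can be sketched roughly as follows. This is only a guess at the general shape, with a far smaller API than ScheduledExecutorService: the FakeScheduler class, its three methods and the simple lock-plus-list design are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical fake scheduler: tasks run on the thread that advances the clock. */
class FakeScheduler {
    private static final class Scheduled {
        final long dueNanos;
        final Runnable task;

        Scheduled(long dueNanos, Runnable task) {
            this.dueNanos = dueNanos;
            this.task = task;
        }
    }

    private final Object lock = new Object();
    // Kept in insertion order, so tasks due at the same instant run FIFO.
    private final List<Scheduled> pending = new ArrayList<>();
    private long nowNanos;

    long nanoTime() {
        synchronized (lock) {
            return nowNanos;
        }
    }

    /** May be called from any thread, including from inside a running task. */
    void schedule(Runnable task, long delay, TimeUnit unit) {
        synchronized (lock) {
            pending.add(new Scheduled(nowNanos + unit.toNanos(delay), task));
        }
    }

    /** Advances fake time, running each due task as its point in time is passed. */
    void advance(long duration, TimeUnit unit) {
        long target;
        synchronized (lock) {
            target = nowNanos + unit.toNanos(duration);
        }
        while (true) {
            Scheduled next = null;
            synchronized (lock) {
                // Pick the earliest due task; the strict '<' keeps FIFO order for ties.
                for (Scheduled s : pending) {
                    if (s.dueNanos <= target && (next == null || s.dueNanos < next.dueNanos)) {
                        next = s;
                    }
                }
                if (next == null) {
                    nowNanos = target;
                    return;
                }
                pending.remove(next);
                nowNanos = Math.max(nowNanos, next.dueNanos);
            }
            next.task.run(); // run outside the lock so the task can call schedule()
        }
    }
}
```

Note that advance only sees tasks that are already in the queue when it looks; anything another thread schedules a moment later is missed, which is the weakness discussed later in the thread.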

But there are several patterns of async / multi-threaded execution where this is not sufficient (at least not without very intrusive changes, and the resulting tests still end up brittle and often require knowledge of non-obvious implementation details, which makes for poor tests).




Re: Faking time

Jeremy Whiting
In reply to this post by Josh Humphries
http://tempusfugitlibrary.org

Take a look at the docs section "Time Sensitive Code" for test examples using the library. It might be what you are looking for.

Regards,
Jeremy
-- 
Jeremy Whiting
Senior Software Engineer, JBoss Performance Team
Red Hat

Re: Faking time

Josh Humphries
Hey, Jeremy,
Thanks for the link. But that looks like a strict subset of what we already have. Maybe the stuff we've built is already "state of the art" for this sort of purpose. But I occasionally get stymied when trying to fix flaky time-sensitive tests: tools of this sort just aren't always enough.



Re: Faking time

Shevek
In reply to this post by Josh Humphries
Guava contains a Ticker class which is Google's approach for faking
time. I haven't traced how far down into the various schedulers and so
forth they have pushed it, though. However, it sounds the same as your
FakeClock, so may be worth using.

S.


Re: Faking time

Josh Humphries
Indeed they do. But it's not used widely, just in Stopwatch and internally (package-private, for their own unit tests) in RateLimiter, IIRC.

As a matter of fact, we use Stopwatch in our code for measuring time, and our Clock interface has a factory method for a Ticker so that Stopwatches also respect our notion of fake time.

Given the kind of answers I've gotten so far, I'm inclined to believe that we may already be on the "bleeding edge" for this sort of work. I've been wanting to open-source it for a while, so this gives me greater reason to do so since I now have reason to believe it's novel.

I was mainly asking in this forum in case there was something more ambitious that could move us in this direction even more. This seemed like a good audience for this kind of subject.



Re: Faking time

Dávid Karnok
In reply to this post by Josh Humphries
Hi Josh,

We faced a similar problem in RxJava and we implemented a TestScheduler whose Scheduler API is similar to a ScheduledExecutorService and which also offers methods to advance a virtual time. Operators that require a Scheduler can then simply use this TestScheduler instead of a "real" one. However, since RxJava is non-blocking and the scheduling dimension is orthogonal, we don't deal with sleeps, parks or waits.



--
Best regards,
David Karnok



Re: Faking time

sanne.grinovero
Hi Josh,
we had similar needs in Infinispan. Sometimes I would use Byteman [1],
but generally Infinispan's code was refactored to never invoke the
time methods from the JDK directly (nor helpers such as Thread#sleep)
but rather rely on its own TimeService interface.
At runtime there will be only one implementation loaded, which mostly
performs straightforward delegation to the JDK methods.
Several tests use the capability to replace the implementation so
you get to mock the clock, be it the wall clock or other notions; this
practice indeed proved very useful for us, not only to make it easier
to simulate specific race conditions but also to generally speed up the
test suite, as you can get rid of most cases in which some thread needs
to wait for "some time".

1 - http://byteman.jboss.org/

Regards,
Sanne



Re: Faking time

Josh Humphries
Hi, Sanne,
Thanks for the reply! That's basically the strategy we've employed as well.

Problems arise when testing multi-threaded stuff with complex scheduling going on. The main issue is when you schedule a task for time X, that task then starts up work on some other thread, and then that other thread schedules another task for time Y.

We have some cases of this in (fairly sophisticated) code that deals with failures and retries. So here's a particularly gnarly case that has come up. It is in thread-safe, concurrent code (see the statement above, where the task at time X hops to another thread before scheduling a task at time Y).

I hope I can explain it. Bear with me:

Our "fake clock" scheduler allows you to advance from time X to Y, no problem. And, if there are multiple tasks scheduled along the way, they are executed sequentially in FIFO order (based on the order in which they were scheduled, which isn't always but often is deterministic).

But let's say we want to move from time W to Y. Ideally, our task would run at time X (as described above) and then we could assert that the subsequent scheduled task ran at time Y. The issue is the non-deterministic nature of the task running at time X. Since it schedules the task for time Y from another thread, our scheduler doesn't know about it yet. When the main task completes, the scheduler keeps moving the clock forward, racing with the thread that is trying to schedule something for time Y.

So there's the rub. The test is now flaky. We do have a solution. We have synchronization mechanisms that can be used where we advance the clock to time X and then must wait until we see the task scheduled at time Y before actually proceeding to time Y. But that requires very "whitebox" tests, where each step is advancing the clock a little at a time, based on intimate knowledge of underlying implementation of how various bits are getting scheduled. Ick.
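To illustrate the workaround just described, here is a small sketch under stated assumptions. AwaitableFakeScheduler, awaitPending and HandOffDemo are hypothetical names invented for this example and are not the actual synchronization mechanisms mentioned above; the times X = 100 ms and Y = 150 ms are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical fake scheduler that lets a test wait until tasks have been queued. */
class AwaitableFakeScheduler {
    private static final class Scheduled {
        final long dueMillis;
        final Runnable task;

        Scheduled(long dueMillis, Runnable task) {
            this.dueMillis = dueMillis;
            this.task = task;
        }
    }

    private final List<Scheduled> pending = new ArrayList<>();
    private long nowMillis;

    synchronized void schedule(Runnable task, long delayMillis) {
        pending.add(new Scheduled(nowMillis + delayMillis, task));
        notifyAll(); // wake a test thread blocked in awaitPending
    }

    /** Blocks in real time until at least 'count' tasks are queued, or fails after the timeout. */
    synchronized void awaitPending(int count, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (pending.size() < count) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    throw new IllegalStateException("timed out waiting for scheduled tasks");
                }
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Runs due tasks on the calling thread, earliest first, FIFO among ties. */
    void advanceMillis(long millis) {
        long target;
        synchronized (this) {
            target = nowMillis + millis;
        }
        while (true) {
            Scheduled next = null;
            synchronized (this) {
                for (Scheduled s : pending) {
                    if (s.dueMillis <= target && (next == null || s.dueMillis < next.dueMillis)) {
                        next = s;
                    }
                }
                if (next == null) {
                    nowMillis = target;
                    return;
                }
                pending.remove(next);
                nowMillis = next.dueMillis;
            }
            next.task.run(); // outside the lock, so tasks may schedule more work
        }
    }
}

class HandOffDemo {
    /** Returns true if the follow-up task ran once the clock reached time Y. */
    static boolean followUpRunsAtY() {
        AwaitableFakeScheduler scheduler = new AwaitableFakeScheduler();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        AtomicBoolean followUpRan = new AtomicBoolean();
        try {
            // The task at X hands off to another thread, which schedules a task 50 ms later (Y).
            scheduler.schedule(
                    () -> worker.execute(() -> scheduler.schedule(() -> followUpRan.set(true), 50)),
                    100);
            scheduler.advanceMillis(100);     // run the task at X
            scheduler.awaitPending(1, 5_000); // white-box step: wait for the hand-off to land
            scheduler.advanceMillis(50);      // only now is it safe to move on to Y
            return followUpRan.get();
        } finally {
            worker.shutdownNow();
        }
    }
}
```

Calling advanceMillis(150) in one step instead would race with the worker thread, so the follow-up task might or might not have been queued by the time the clock passed Y. The awaitPending call removes that race, but only because the test knows that exactly one follow-up task will appear.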

Anyhow, the code in question is not easy to re-write to make it more deterministic (in terms of the hand-off to another thread prior to scheduling the next task). So I think the answer for me is "deal with the whitebox test, or do something at a higher level that is resilient to time-sensitive flakiness".



----
Josh Humphries
Manager, Shared Systems  |  Platform Engineering
Atlanta, GA  |  678-400-4867

On Fri, Apr 3, 2015 at 6:32 PM, Sanne Grinovero <[hidden email]> wrote:
Hi Josh,
we had similar needs in Infinispan. Sometimes I would use Byteman [1],
but generally Infinispan's code was refactored to never invoke the
time methods from the JDK directly (nor helpers such as Thread#sleep)
but rather rely on its own TimeService interface.
At runtime there will be only one implementation loaded, and mostly
perform straight forward delegation to the JDK methods.
Several tests will use the capability to replace implementation so
you'll get to mock the clock, be it wall clock or other notions; this
practice proved indeed very useful for us, not only to make it easier
to simulate specific race conditions but also to generally speedup the
testsuite as you can get rid of most cases in which some thread needs
to wait for "some time".

1 - http://byteman.jboss.org/

Regards,
Sanne


On 3 April 2015 at 17:33, Dávid Karnok <[hidden email]> wrote:
> Hi Josh,
>
> We faced a similar problem in RxJava and we implemented a TestScheduler
> whose Scheduler API is similar to a ScheduledExecutorService plus offers
> methods to advance a virtual time. Operators that require a Scheduler then
> can be simply use this TestScheduler instead of a "real" one. However, since
> RxJava is non-blocking and the scheduling dimension is orthogonal, we don't
> deal with sleeps, parks or waits.
>
>
> --
> Best regards,
> David Karnok
>

Re: Faking time

Josh Humphries
On Fri, Apr 3, 2015 at 6:54 PM, Josh Humphries <[hidden email]> wrote:
Hi, Sanne,
Thanks for the reply! That's basically the strategy we've employed as well.

Problems arise when testing multi-threaded stuff with complex scheduling going on. The main issue arises when you schedule a task for time X, that task then starts up work on some other thread, and then that other thread schedules another task for time Y.

We have some cases of this in (fairly sophisticated) code that deals with failures and retries. So here's a particularly gnarly case that has come up. It is in thread-safe, concurrent code (see the statement above, where the task at time X hops to another thread before scheduling a task at time Y).

I hope I can explain it. Bear with me:

Our "fake clock" scheduler allows you to advance from time X to Y, no problem. And, if there are multiple tasks scheduled along the way, they are executed sequentially in FIFO order (based on the order in which they were scheduled, which is often, but not always, deterministic).
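
A minimal single-threaded sketch of the kind of scheduler described above. It is not the actual FakeClockScheduledExecutorService; the class FakeScheduler and its methods are hypothetical, and it assumes tasks are ordered by due time, with FIFO order among tasks due at the same time.

```java
import java.util.PriorityQueue;

// Hypothetical, simplified fake-clock scheduler (deliberately not thread-safe).
final class FakeScheduler {
    private static final class Entry implements Comparable<Entry> {
        final long time; final long seq; final Runnable task;
        Entry(long time, long seq, Runnable task) { this.time = time; this.seq = seq; this.task = task; }
        // Order by due time, then by scheduling order (FIFO among equal times).
        @Override public int compareTo(Entry o) {
            int c = Long.compare(time, o.time);
            return c != 0 ? c : Long.compare(seq, o.seq);
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>();
    private long now;
    private long nextSeq;

    long now() { return now; }

    void scheduleAt(long time, Runnable task) {
        queue.add(new Entry(time, nextSeq++, task));
    }

    // Advances the clock to 'target', running every task that falls due on the way.
    // Follow-up tasks are picked up only if they are scheduled on this thread
    // before the loop moves on.
    void advanceTo(long target) {
        while (!queue.isEmpty() && queue.peek().time <= target) {
            Entry e = queue.poll();
            now = Math.max(now, e.time);
            e.task.run();
        }
        now = Math.max(now, target);
    }
}
```

As long as every task schedules its follow-up on the calling thread, advanceTo sees it in time. The sketch has no way to notice a follow-up that is scheduled later from a different thread.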

But let's say we want to move from time W to Y. Ideally, our task would run at time X (as described above) and then we could assert that the subsequent scheduled task ran at time Y. The issue is the non-deterministic nature of the task running at time X. Since it schedules the task for time Y from another thread, our scheduler doesn't know about it yet. When the main task completes, the scheduler keeps moving the clock forward, racing with the thread that is trying to schedule something for time Y.

So there's the rub. The test is now flaky. We do have a solution. We have synchronization mechanisms that can be used where we advance the clock to time X and then must wait until we see the task scheduled at time Y before actually proceeding to time Y. But that requires very "whitebox" tests, where each step advances the clock a little at a time, based on intimate knowledge of the underlying implementation and of how various bits are getting scheduled. Ick.
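
The "advance to X, wait for the hand-off, then advance to Y" pattern can be sketched as below. This is an illustration under assumptions, not the actual mechanism: SyncFakeScheduler and awaitPending are hypothetical names, and the wait is expressed as "block until at least N tasks are pending".

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical thread-safe fake scheduler: other threads may schedule tasks
// while the test thread advances the clock.
final class SyncFakeScheduler {
    private static final class Entry {
        final long time; final long seq; final Runnable task;
        Entry(long time, long seq, Runnable task) { this.time = time; this.seq = seq; this.task = task; }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>(
            Comparator.comparingLong((Entry e) -> e.time).thenComparingLong(e -> e.seq));
    private long now;
    private long nextSeq;

    synchronized long now() { return now; }

    synchronized void scheduleAt(long time, Runnable task) {
        queue.add(new Entry(time, nextSeq++, task));
        notifyAll(); // wake a test thread blocked in awaitPending
    }

    // Blocks until at least 'count' tasks are pending; returns false on timeout or interrupt.
    synchronized boolean awaitPending(int count, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        try {
            while (queue.size() < count) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) return false;
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void advanceTo(long target) {
        while (true) {
            Runnable task;
            synchronized (this) {
                Entry e = queue.peek();
                if (e == null || e.time > target) { now = Math.max(now, target); return; }
                queue.poll();
                now = Math.max(now, e.time);
                task = e.task;
            }
            task.run(); // run outside the lock so tasks can schedule more work
        }
    }
}
```

A test using this has to call advanceTo(X), then awaitPending(1, ...), then advanceTo(Y). That is exactly the whitebox knowledge complained about above: the test must know that the task at X produces exactly one follow-up from another thread.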

Anyhow, the code in question is not easy to re-write to make it more deterministic (in terms of the hand-off to another thread prior to scheduling the next task). So I think the answer for me is "deal with the whitebox test, or do something at a higher level that is resilient to time-sensitive flakiness".



----
Josh Humphries
Manager, Shared Systems  |  Platform Engineering
Atlanta, GA  |  678-400-4867


Re: Faking time

William Louth
I use the same discrete event simulation that powers our JVM machine simulation technology (in both Simz and Stenos) to accomplish this but in a very different manner.

To construct the simulation I first use the Probes Open API (http://autoletics.github.io/probes-api/) to create/begin/end mocked named activities across threads, with a "clock.time" meter that is set within the metering scope of the activities (a probe in the API). The clock.time meter acts like a thread-specific counter. The code I want to test is not actually present during the run; instead, a recording is made of the run. I am in fact creating a scaffolding (or software episodic memory) that will be augmented at a later point, during playback of the software memory. Then, in the playback, I hook in interceptors that map the named probes to the code that I would like executed. Time itself within the metering engine is based on the meter readings, that is, on the controlled progress of execution across threads using a timesync extension. I use this same approach to turn a microbenchmark into something far more representative and meaningful. For a number of years I have kept saying I would write an article explaining this in greater detail, but somehow I keep getting distracted (or just bored with it).

http://www.autoletics.com/posts/a-time-lord-for-the-java-and-jvm-universe

William Louth
Founder, Autoletics
http://www.autoletics.com


Re: Faking time

Jeff Hain
In reply to this post by Josh Humphries


Hi.

>For code that cares about time, I want to be able to use mocks
>or some other mechanism to make it deterministic.

For determinism you also need to take care of:
- the effect of hashCode() (whose default implementation is non-deterministic)
  on logs (if it appears in them) and on iteration order for hash-based
  collections (typical workarounds: override hashCode(), or sort before iterating).
- optimized Math or floating-point operations, which can give different
  results (typical workarounds: use StrictMath, and strictfp or -Xint).
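
As a small illustration of the "sort before iterating" workaround, here is a sketch in Java. The Session class, its id field and the helper name are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Hypothetical class that does not override hashCode(), so its position
// in a HashSet can differ from one JVM run to the next.
final class Session {
    final int id;
    Session(int id) { this.id = id; }
}

final class DeterministicIteration {
    // Copies the set and sorts it by a stable key before iterating,
    // so the visiting order no longer depends on identity hash codes.
    static List<Integer> idsInStableOrder(Set<Session> sessions) {
        List<Session> sorted = new ArrayList<>(sessions);
        sorted.sort(Comparator.comparingInt((Session s) -> s.id));
        List<Integer> ids = new ArrayList<>();
        for (Session s : sorted) {
            ids.add(s.id);
        }
        return ids;
    }
}
```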



>A number of times, I've found myself battling unit tests for code that uses multiple threads,
>schedules tasks, observes the system clock, etc. These tests can be difficult to make deterministic,
>so they end up being flaky... or very slow... or both.

Indeed, that kind of code is a pain to test and work with.



>I've built some fairly complicated machinery that handles many cases in a way that "just works".
>We have a FakeClockScheduledExecutorService that can be injected into code that needs to schedule
>things. The FakeClock on which it is based has a fairly complicated internal implementation, due
>to the explicit intent that it be thread-safe.
>But there are still a lot of cases where it doesn't really work.
>[...]
>So... does anyone here know of any prior art for mocking the notion of time in tests? Is there
>anything that does it really well? Is it even feasible that something that does it well could
>exist? :)

What you can do is not abstract time away only in your tests,
but both time and threading, and in your whole application,
as part of its architecture
(as I now do all the time - "determinism is like crack"
(I think Martin Thompson coined that)).

It has a lot of strong benefits. Off the top of my head:
- Fast unit tests.
- Reproduce problems, quickly.
- Run long-duration sturdiness tests, quickly,
  or short-duration ones "instantly".
- Discriminate between domain code bugs
  and threading issues.
- Run with full logs, without trouble
  (other than the log file growing fast).

For example, I schedule treatments only through
an interface with these 3 methods:
- void execute(Runnable) // i.e. can extend Executor (but not
                         // ScheduledExecutorService, too complicated
                         // to implement).
- void executeAt(Runnable,long) // Time is in nanoseconds.
- void executeAfter(Runnable,long) // Delay is in nanoseconds.
and having a clock interface with these 3 methods:
- double getTimeS() // Time in seconds, handy for physical-coordinates
                    // computations, no risk of integer wrapping.
- long getTimeNS() // No floating-point approximation, typically
                   // used for (low-level) scheduling.
- double getTimeSpeed() // Relative to wall clock time,
                        // or master clock time if any.
                        // Schedulers will use it to figure out how
                        // long they actually need to wait, if you want
                        // to allow for accelerated time.
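
Written out as Java, the two interfaces listed above might look like the sketch below. Only the method names and units come from the list; the interface names TaskScheduler and TimeClock, and the SystemTimeClock implementation, are assumptions made for illustration.

```java
import java.util.concurrent.Executor;

// Scheduling interface with the three methods listed above.
interface TaskScheduler extends Executor {
    @Override void execute(Runnable runnable);           // as soon as possible
    void executeAt(Runnable runnable, long timeNS);      // absolute time, in nanoseconds
    void executeAfter(Runnable runnable, long delayNS);  // relative delay, in nanoseconds
}

// Clock interface with the three methods listed above.
interface TimeClock {
    double getTimeS();      // time in seconds
    long getTimeNS();       // time in nanoseconds
    double getTimeSpeed();  // relative to wall clock time, or to a master clock
}

// Hypothetical wall-clock implementation delegating to System.currentTimeMillis().
final class SystemTimeClock implements TimeClock {
    @Override public double getTimeS() { return System.currentTimeMillis() / 1000.0; }
    @Override public long getTimeNS() { return System.currentTimeMillis() * 1_000_000L; }
    @Override public double getTimeSpeed() { return 1.0; }
}
```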

The 3 execute(...) methods are simple, but on top of them
you can still build periodic (or even aperiodic, why be so rigid?)
treatments, cancellable tasks, etc.
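
For instance, a cancellable periodic task can be layered on executeAfter alone. The sketch below is one possible way to do it, not necessarily how it is done in the code described here; DelayedExecutor and PeriodicTask are hypothetical names, and the period is measured from the end of one run to the start of the next (fixed delay rather than fixed rate).

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal scheduling interface assumed for this sketch.
interface DelayedExecutor {
    void executeAfter(Runnable runnable, long delayNS);
}

// A cancellable periodic task built only on executeAfter():
// each run re-schedules itself unless it has been cancelled.
final class PeriodicTask implements Runnable {
    private final DelayedExecutor scheduler;
    private final Runnable body;
    private final long periodNS;
    private final AtomicBoolean cancelled = new AtomicBoolean();

    PeriodicTask(DelayedExecutor scheduler, Runnable body, long periodNS) {
        this.scheduler = scheduler;
        this.body = body;
        this.periodNS = periodNS;
    }

    void start() { scheduler.executeAfter(this, periodNS); }

    void cancel() { cancelled.set(true); }

    @Override public void run() {
        if (cancelled.get()) return;
        body.run();
        // Re-check: the body itself may have cancelled the task.
        if (!cancelled.get()) scheduler.executeAfter(this, periodNS);
    }
}
```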

For "real" usage, you use a clock delegating to System.currentTimeMillis(),
and a scheduler delegating to some JDK ScheduledExecutorService.
You can also use a clock whose time is "timeSpeed * systemTime + offset",
with (timeSpeed,offset) dynamically modifiable, and make a scheduler that
can handle that; it's not a trivial task, but it is possible, and if done
right it can even be faster than STPE (ScheduledThreadPoolExecutor), which
has additional constraints.
===> I call that "hard" scheduling, because time changes due to hardware
     reasons (chip ticks).
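
A sketch of a clock following "timeSpeed * systemTime + offset" is shown below. ScaledClock is a hypothetical name, the system time is injected as a LongSupplier purely so the sketch can be tested, and recomputing the offset on a speed change (so the clock does not jump) is a design choice of this sketch rather than something stated above. The matching scheduler, which is the hard part, is not shown.

```java
import java.util.function.LongSupplier;

// Sketch of a clock whose time is timeSpeed * systemTime + offset.
final class ScaledClock {
    private final LongSupplier systemTimeNS; // e.g. System::nanoTime in real use
    private double timeSpeed;
    private long offsetNS;

    ScaledClock(LongSupplier systemTimeNS, double timeSpeed, long offsetNS) {
        this.systemTimeNS = systemTimeNS;
        this.timeSpeed = timeSpeed;
        this.offsetNS = offsetNS;
    }

    synchronized long getTimeNS() {
        return Math.round(timeSpeed * systemTimeNS.getAsLong()) + offsetNS;
    }

    synchronized double getTimeSpeed() { return timeSpeed; }

    // Changes the speed without making the clock jump: the offset is
    // recomputed so that getTimeNS() is continuous at the moment of the change.
    synchronized void setTimeSpeed(double newSpeed) {
        long sys = systemTimeNS.getAsLong();
        long current = Math.round(timeSpeed * sys) + offsetNS;
        timeSpeed = newSpeed;
        offsetNS = current - Math.round(newSpeed * sys);
    }
}
```

Because the product is computed in double precision, a real implementation would probably measure system time relative to a recent reference point rather than from the epoch.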

For deterministic unit-testing you can use a clock whose time does not
advance with wall clock time, but only due to calls to setTimeNS(...)
or such, and a unique scheduler which runs everything in one thread
and modifies the time accordingly (runs all schedules for t1, then
sets time to t2, then runs all schedules for t2, etc.).
===> I call that "soft" scheduling, because time changes due to
     software reasons (setter calls).
===> In that mode, when you call execute(Runnable), I think it's better
     for the execution to be postponed after all executions already
     planned for current time, but there might be reasons not to,
     depending on what you're doing.
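
The soft scheduling described above can be sketched as a single-threaded scheduler that owns the time. SoftScheduler and runUntil are hypothetical names; the sketch assumes that execute(Runnable) queues behind everything already planned for the current time, and that requests for a time in the past are clamped to the current time.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch of a single-threaded "soft" scheduler: time changes only
// because the scheduler sets it, never because wall clock time passes.
final class SoftScheduler {
    private static final class Entry {
        final long timeNS; final long seq; final Runnable runnable;
        Entry(long timeNS, long seq, Runnable runnable) {
            this.timeNS = timeNS; this.seq = seq; this.runnable = runnable;
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>(
            Comparator.comparingLong((Entry e) -> e.timeNS).thenComparingLong(e -> e.seq));
    private long timeNS;
    private long nextSeq;

    long getTimeNS() { return timeNS; }

    // Queued behind everything already planned for the current time.
    void execute(Runnable runnable) { executeAt(runnable, timeNS); }

    void executeAt(Runnable runnable, long atNS) {
        queue.add(new Entry(Math.max(atNS, timeNS), nextSeq++, runnable));
    }

    void executeAfter(Runnable runnable, long delayNS) { executeAt(runnable, timeNS + delayNS); }

    // Runs all schedules for t1, sets time to t2, runs all schedules for t2, etc.,
    // as fast as possible, up to and including endNS.
    void runUntil(long endNS) {
        while (!queue.isEmpty() && queue.peek().timeNS <= endNS) {
            Entry e = queue.poll();
            timeNS = e.timeNS; // the scheduler, not the tasks, moves the clock
            e.runnable.run();
        }
        timeNS = Math.max(timeNS, endNS);
    }
}
```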

Then, you can use a mixed mode, which is soft scheduling but with
a soft scheduler that tries to follow some hard clock, i.e. before
setting a new time and executing corresponding runnables it will wait
accordingly.
===> You end up with three main ways of scheduling your application :
- HARD (possibly multi-threaded,
        time = timeSpeed * systemTime + offset)
  ===> The "as usual" mode.
       Can be used in production, or to try to reproduce a bug
       not encountered in soft scheduling, i.e. most likely
       a threading issue.
- SOFT_HARD_BASED (single-threaded (other than possible local parallelism),
                   time = whatever time current runnable has been scheduled for,
                   timeSpeed = (I think master clock's time speed, but I have
                   no access to my code today))
  ===> Can be used in production if single-threaded is OK
       and you want determinism (to be able to reproduce a possible bug),
       or when debugging with slowed or accelerated time, etc.
- SOFT_AFAP (single-threaded (other than possible local parallelism),
             time = whatever time current runnable has been scheduled for,
             timeSpeed = +Infinity)
  ===> Can be used to run unit tests deterministically and very fast,
       or to do "hours or days-long" sturdiness tests in a short time,
       or to do Monte-Carlo, or to do benchmarks (see what actual time
       speed you can reach), etc.

All that is only a subset of what you can do with standards like HLA
(http://www.cc.gatech.edu/computing/pads/PAPERS/Time_mgmt_High_Level_Arch.pdf)
but it already allows for a lot of flexibility.



>To make tests faster, we also provide a Clock.sleep. The system version delegates to Thread.sleep.
>FakeClock.sleep actually just advances the clock without blocking.

In the scheduling style I've been talking about, there must be no
such thing as a "sleep" in the code, or at least its only possible
implementation in soft scheduling would be a no-op.
Indeed it makes no sense to "wait" for something when everything
is run in a single thread, i.e. when nothing else can occur
concurrently.
Also, the "master" of the time must be the scheduler, not whatever
code is playing with the clock. A Clock.sleep() that advanced
time would conflict with the scheduler's work.
What you can do, though, is something like:
awaitWhileFalse(booleanSupplier,timeout)
===> In hard scheduling, it would wait while false or until timeout,
     and in soft scheduling, it would just throw if the boolean
     is not immediately true.
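
The two behaviours of awaitWhileFalse can be sketched as follows. The class and method names are hypothetical, the timeout is assumed to be in nanoseconds, and returning false on timeout (rather than throwing) in the hard variant is an assumption of this sketch.

```java
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

final class Awaits {
    // Hard scheduling: wait while the condition is false, or until the timeout.
    // Returns true if the condition became true, false on timeout.
    static boolean awaitWhileFalseHard(BooleanSupplier condition, long timeoutNS) {
        long deadline = System.nanoTime() + timeoutNS;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) return false;
            LockSupport.parkNanos(100_000L); // brief pause between polls
        }
        return true;
    }

    // Soft scheduling: everything runs in one thread, so nothing else can make
    // the condition true while we wait; fail immediately instead of waiting.
    static void awaitWhileFalseSoft(BooleanSupplier condition) {
        if (!condition.getAsBoolean()) {
            throw new IllegalStateException("condition is false and cannot change in soft scheduling");
        }
    }
}
```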



Side notes :
- In soft scheduling, you can't interfere from other threads
  with what is (deterministically) happening, but your scheduler
  can still allow for executing something in the "soft thread",
  as long as it only reads things (for example for display
  purpose).
- To compute the absolute time speed of an enslaved clock, you need
  to multiply its time speed by that of the backing master clock(s),
  but if, for example, you have "0 * +Infinity" you need to consider
  that it is 0, not NaN.
- You can use Locks in your code, but in soft mode you can use
  an implementation which does mostly nothing, since there is no
  need to lock.
  It can also have methods like checkHeldByCurrentThread(),
  checkNotHeldByCurrentThread(), etc., all returning true (does
  not matter if held since single-threaded), which works as long
  as you don't check them "negatively".
  ===> I call that a "passive lock".
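
The "0 * +Infinity" side note above matters because plain floating-point multiplication yields NaN in that case. A small sketch of the special-casing, with hypothetical names:

```java
final class TimeSpeeds {
    // Multiplies a clock's relative speed by its master's absolute speed,
    // treating 0 * Infinity as 0: a stopped clock stays stopped however
    // fast its master runs.
    static double absoluteTimeSpeed(double relativeSpeed, double masterAbsoluteSpeed) {
        if (relativeSpeed == 0.0 || masterAbsoluteSpeed == 0.0) {
            return 0.0;
        }
        return relativeSpeed * masterAbsoluteSpeed;
    }
}
```

For a chain of master clocks, the same function can be applied repeatedly from the top of the chain down.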



-Jeff

