On Jun 19, 2013, at 12:44 PM, Bojan Janisch wrote:
> Sometimes ECJ runs with all threads waiting. I've checked if the problem comes from Drools again, but this time it's after/before
> Drools run. I've checked which threads are running with VisualVM, getting the result that only Attach Listener, Signal Dispatcher
> and some RMI TCP threads are running, while all my threads 0-14 are waiting.
>
> So what I'd like to know is: what could cause all threads to wait? Doesn't the merger handle the threads after they're finished?
> By the way it seems that this bug occurs after Drools has finished and the rule has been evaluated, but I'm not sure because I don't
> know how many times it needs to run. (Depends on Generations and Population it needs)
It sounds like at least one of the threads isn't exiting, so ECJ sits there until it's done (which is never). But as to why it's not exiting I couldn't say.
It used to be that ECJ's multithreaded evaluator was very simple. ECJ spawned off N threads, assigned each of them a chunk of the population to deal with, then waited until they were done (using join()).
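A rough sketch of that older scheme, to make the failure mode concrete. This is not ECJ's actual code: the class name, the `Fitness` interface, and the `evaluate` signature are all made up for illustration. The point is only that the final join() loop cannot return until every worker has returned.

```java
// Hypothetical sketch of a "one fixed chunk per thread" evaluator.
// Not ECJ's real evaluator; all names here are invented for illustration.
final class StaticChunkEvaluator {
    /** Hypothetical stand-in for evaluating one individual by index. */
    interface Fitness { double evaluate(int individual); }

    static double[] evaluate(int populationSize, int numThreads, Fitness f) {
        double[] results = new double[populationSize];
        Thread[] workers = new Thread[numThreads];
        // Ceiling division so the last chunk picks up any remainder.
        int chunk = (populationSize + numThreads - 1) / numThreads;
        for (int t = 0; t < numThreads; t++) {
            final int from = Math.min(populationSize, t * chunk);
            final int to = Math.min(populationSize, from + chunk);
            workers[t] = new Thread(() -> {
                for (int i = from; i < to; i++) results[i] = f.evaluate(i);
            }, "eval-" + t);
            workers[t].start();
        }
        try {
            // If any worker never returns from f.evaluate, this blocks forever.
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return results;
    }
}
```

For example, `StaticChunkEvaluator.evaluate(10, 3, i -> i * 2.0)` splits ten individuals into chunks of 4, 4, and 2.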
ECJ now has a new multithreaded evaluator. Now, ECJ spawns off N threads, and the threads iteratively enter a lock, grab a small piece of the population to work on, and process it. As before, ECJ waits until they're all done (using join()).
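The same idea in sketch form, again not ECJ's actual code: the class name, the `Fitness` interface, and the `pieceSize` parameter are assumptions made for illustration. Each worker holds the lock only long enough to claim the next piece, then evaluates that piece outside the lock.

```java
// Hypothetical sketch of a "grab a small piece under a lock" evaluator.
// Not ECJ's real evaluator; all names here are invented for illustration.
final class WorkGrabbingEvaluator {
    /** Hypothetical stand-in for evaluating one individual by index. */
    interface Fitness { double evaluate(int individual); }

    static double[] evaluate(int populationSize, int numThreads, int pieceSize, Fitness f) {
        if (pieceSize <= 0) throw new IllegalArgumentException("pieceSize must be positive");
        double[] results = new double[populationSize];
        Object lock = new Object();
        int[] next = {0}; // index of the first unclaimed individual, guarded by lock
        Thread[] workers = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            workers[t] = new Thread(() -> {
                while (true) {
                    int from, to;
                    synchronized (lock) {
                        if (next[0] >= populationSize) return; // nothing left: thread exits
                        from = next[0];
                        to = Math.min(populationSize, from + pieceSize);
                        next[0] = to;
                    }
                    // Evaluation happens outside the lock so workers run in parallel.
                    for (int i = from; i < to; i++) results[i] = f.evaluate(i);
                }
            }, "eval-" + t);
            workers[t].start();
        }
        try {
            // Still blocks forever if any worker hangs inside f.evaluate.
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return results;
    }
}
```

In this sketch, a worker that blocks inside `f.evaluate` only stalls itself. The other workers keep claiming pieces and finish, and then sit in WAITING or exit while the main thread stays stuck in join().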
It's possible that this new evaluator has a bug in it, though we've been banging on it pretty hard. I suspect you'll find that the reason a thread isn't exiting has nothing to do with ECJ. But if you find a bug, drop me a note.
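One way to narrow it down is to look at where each waiting thread is actually parked. A thread dump from VisualVM or `jstack` shows this. The helper below is a small sketch (the class and method names are made up) that does the same from inside the JVM using the standard `Thread.getAllStackTraces()` call. The top few frames of each waiting thread should show whether it is sitting in your code, in Drools, or in the evaluator.

```java
import java.util.Map;

// Hypothetical diagnostic helper; the class and method names are invented.
final class ThreadStateDump {
    /** Returns the name, state, and stack trace of every live thread. */
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append(t.getName()).append(" [").append(t.getState()).append("]\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Calling `System.err.println(ThreadStateDump.dump())` from a watchdog thread after a suspiciously long generation would be one way to use it.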
Sean