Sorry, it seems I missed the exception in the log. After roughly 2000 connects and disconnects I receive:
Exception in thread "SlaveMonitor:: " java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
In the test case I started the slaves and shut them down almost immediately (after 2 s). However, I see the same exception in the logs of the cluster nodes, where there are usually hours between starting and stopping the slaves.
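To check whether threads really pile up across connects and disconnects, one could dump a per-name count of live threads from inside the JVM. The following is only a minimal sketch and not part of ECJ: the class ThreadCensus and the helpers census() and startParked() are hypothetical names for illustration, and the demo threads simply borrow the "SlaveMonitor:: " name seen in the trace above.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic helper, not part of ECJ.
public class ThreadCensus {

    // Count live threads in this JVM, grouped by the part of the thread
    // name before the first ':' (so "SlaveMonitor:: " becomes "SlaveMonitor").
    static Map<String, Integer> census() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String name = t.getName();
            int cut = name.indexOf(':');
            String key = (cut >= 0) ? name.substring(0, cut) : name;
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Demo helper: start a daemon thread that just sleeps, standing in
    // for a thread that was never shut down.
    static Thread startParked(String name) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // interrupted: let the thread end
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        // Simulate three leftover threads named like the one in the trace.
        for (int i = 0; i < 3; i++) {
            startParked("SlaveMonitor:: ");
        }
        // Print the per-name counts; a count that keeps growing after
        // each connect/disconnect cycle would point to a thread leak.
        System.out.println(census());
    }
}
```

Calling census() periodically in the master process and comparing the counts before and after a batch of slave connects should show whether one group of threads keeps growing.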
P.S. Just for completeness,
the other exceptions (regarding the sockets) are:
java.net.SocketException: Connection reset
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
On Aug 26, 2013, at 17:20, Sean Luke wrote:
> On Aug 26, 2013, at 10:27 AM, Ralf Buschermöhle wrote:
>> These are just the running nodes. Previously there have been a few thousand connections from a cluster (handled successfully).
> Some more. Try executing the following command on your BSD box to see how many sockets and files (combined) you can have open at one time:
> sysctl kern.maxfilesperproc
> On my Mac (a BSD box) I get around 10K.
> I wonder if ECJ isn't properly closing the sockets, and so you're hitting a socket limit by repeatedly adding and removing clients. It looks correct to me though.