On 26.08.2013, at 17:08, Sean Luke <[log in to unmask]> wrote:

> On Aug 26, 2013, at 10:27 AM, Ralf Buschermöhle wrote:
> 
>>> This sounds a *lot* like your process is running out of ports; that is, it's probably not an ECJ error or even a Java problem but rather an OS problem.  But at just 129 slaves?  
>> 
>> These are just the running nodes. Previously there have been a few thousand connections from a cluster (handled successfully).
> 
> Fair enough.  Still: 129 is an unusual number, don't you think?

Actually, I think it's just random: some of the 129 nodes are not from the cluster (which uses a job scheduler) and are therefore constantly computing.

> I also find it strange that the socket was successfully created (hence ECJ went on to fire up the read and write threads) but the error is likely in reading from or writing to the socket.  
> 
> This sounds like an OS bug.  I don't think it's on the client side.  So have you tried using Debian on the server side just for grins?

Interesting. I will check. :)