Sunday, June 19, 2011

Fixing 'Too many open files' issue in Load Balancer

When load balancing a production system such as Stratos under a heavy request load, or while load testing it, you may run into a 'Too many open files' error like the one below.

[2011-06-09 20:48:31,852]  WARN - HttpCoreNIOListener System may be unstable: IOReactor encountered a checked exception : Too many open files
java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:152)
    at org.apache.http.impl.nio.reactor.DefaultListeningIOReactor.processEvent(DefaultListeningIOReactor.java:129)
    at org.apache.http.impl.nio.reactor.DefaultListeningIOReactor.processEvents(DefaultListeningIOReactor.java:113)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:315)
    at org.apache.synapse.transport.nhttp.HttpCoreNIOListener$2.run(HttpCoreNIOListener.java:253)
    at java.lang.Thread.run(Thread.java:662)


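This exception is common in a load balancer when the maximum number of open files allowed (ulimit) is set too low. Every accepted connection uses a file descriptor, so a busy load balancer reaches a low limit quickly.
You can check the current soft limit with ulimit -Sn. By default it can be as low as 1024 on a desktop, a server, or an EC2 instance. You can fix the above exception by raising the limit to a higher value (655350 here).
ulimit -n 655350
This will fix the above issue. The new limit applies to the current shell and the processes started from it, so run the command in the shell that launches the load balancer.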
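
If you are not sure whether the value you ask for is allowed, it may help to compare it with the hard limit first. The following is only a minimal sketch, assuming bash; the function name raise_nofile_limit and the target of 65535 are made up for illustration and are not part of any WSO2 or Synapse script.

```shell
#!/usr/bin/env bash
# Raise the soft open-file limit toward a target without exceeding the hard limit.
# The function name and the default target are illustrative assumptions.
raise_nofile_limit() {
  local target=${1:-65535}
  local soft hard
  soft=$(ulimit -Sn)
  hard=$(ulimit -Hn)

  # Nothing to do if the soft limit is already at or above the target.
  if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$target" ]; then
    return 0
  fi

  # An unprivileged user cannot raise the soft limit past the hard limit,
  # so cap the target at the hard limit.
  if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
    target=$hard
  fi

  # Applies to this shell and to every process started from it.
  ulimit -Sn "$target"
}

echo "before: soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
raise_nofile_limit 65535
echo "after:  soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
```

Capping at the hard limit avoids the error that ulimit reports when a non-root user asks for more than the hard limit allows. Start the load balancer from the same shell afterwards so that it inherits the raised limit.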

2 comments:

  1. Setting ulimit -n to a higher value, however, can affect the overall performance of the EC2 instance. Did you do any research on how a higher ulimit affects overall performance?

  2. Hi Abhimanyu,
    Setting the limit very high, or to unlimited, carries the risk of a user exhausting the available system resources, which might even lead to a system failure as you suspect, since these limits are typically configured per user. The limit should be tuned for the particular daemon being served.

    It should be chosen wisely: not so low that this error is thrown, and not unlimited, which could affect overall stability. In my case, I was working on a single-user system, so the chances of this limit being hit are pretty low.

    HTH.
    Regards,
    Pradeeban.
