Server Scalability & Asynchronous IO

It is my understanding that using select() with a large FD_SET is not very scalable. As a result, all the network servers I've written have used asynchronous IO and completion ports (on Windows obviously).

What mechanism does ICE use to allow scalability to thousands of simultaneous network connections? I can't imagine it uses a thread-per-connection model, and using grep I can't find any evidence of completion ports.

Thanks,


Ken Carpenter

Comments

  • Re: Server Scalability & Asynchronous IO
    Originally posted by Ken Carpenter
    It is my understanding that using select() with a large FD_SET is not very scalable. As a result, all the network servers I've written have used asynchronous IO and completion ports (on Windows obviously).

    That's right, and that's why we have a special optimization for WIN32. Have a look at ThreadPool.cpp. Search for the following comment:

    //
    // Optimization for WIN32 specific version of fd_set. Looping with a
    // FD_ISSET test like for Unix is very unefficient for WIN32.
    //
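
    In rough terms, the difference looks like this (a hedged sketch, not the actual ThreadPool.cpp code; handleReadable() is a made-up dispatch function):

    #include <winsock2.h>
    #include <vector>

    void handleReadable(SOCKET s);   // hypothetical dispatch function

    // WIN32 path: after select() returns, readFds is simply a counted array
    // of the sockets that are actually ready, so we walk only those.
    void dispatchReadyWin32(const fd_set& readFds)
    {
        for(u_int i = 0; i < readFds.fd_count; ++i)
        {
            handleReadable(readFds.fd_array[i]);
        }
    }

    // Portable path: probe every known descriptor with FD_ISSET. With 2000
    // connections this touches all 2000 sockets even if only one is ready.
    void dispatchReadyPortable(fd_set& readFds, const std::vector<SOCKET>& allSockets)
    {
        for(SOCKET s : allSockets)
        {
            if(FD_ISSET(s, &readFds))
            {
                handleReadable(s);
            }
        }
    }
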
    Originally posted by Ken Carpenter
    What mechanism does ICE use to allow scalability to thousands of simultaneous network connections? I can't imagine it uses a thread-per-connection model, and using grep I can't find any evidence of completion ports.

    We use a thread pool model, using the leader-follower pattern. This means that the number of threads being used doesn't increase with the number of connections. Again, if you are interested in the details, have a look at ThreadPool.cpp.
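
    In outline, the pattern works like this (a minimal sketch in modern C++ with made-up names, not the actual Ice code): all pool threads are interchangeable, exactly one of them - the leader - waits on the event source, and when an event arrives it first promotes a follower to become the new leader and only then processes the event itself:

    #include <chrono>
    #include <condition_variable>
    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <vector>

    std::mutex m;
    std::condition_variable cv;
    bool leaderPresent = false;
    int nextEvent = 0;

    int waitForEvent()                  // stand-in for blocking in select()
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        return nextEvent++;             // only ever called by the current leader
    }

    void poolThread(int id)
    {
        for(int i = 0; i < 3; ++i)
        {
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, []{ return !leaderPresent; });   // wait to become leader
                leaderPresent = true;
            }

            int ev = waitForEvent();    // only the leader waits on the event source

            {
                std::lock_guard<std::mutex> lock(m);
                leaderPresent = false;  // hand leadership to a follower...
            }
            cv.notify_one();

            std::printf("thread %d handles event %d\n", id, ev);  // ...and dispatch
        }
    }

    int main()
    {
        std::vector<std::thread> pool;
        for(int i = 0; i < 4; ++i)
        {
            pool.emplace_back(poolThread, i);
        }
        for(auto& t : pool)
        {
            t.join();
        }
        return 0;
    }

    Note how the number of threads is fixed by the pool size, not by the number of connections or events.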

    Furthermore, Ice uses "Active Connection Management" (ACM): Connections which have been idle for a certain time are automatically closed (gracefully, so that no messages get lost). When the connection is needed again, it is reestablished. (ACM is optional and can be switched on or off using configuration parameters.)
  • I just whipped out my copy of POSA2 to review the Leader/Followers pattern.

    Do you have any plans to change ICE to use WaitForMultipleObjects() on Windows?

    Thanks,


    Ken Carpenter
  • Originally posted by Ken Carpenter
    I just whipped out my copy of POSA2 to review the Leader/Followers pattern.

    Do you have any plans to change ICE to use WaitForMultipleObjects() on Windows?

    Thanks,

    Ken Carpenter

    I don't see any benefit in using WaitForMultipleObjects(). I usually try to avoid non-standard (i.e., non-POSIX) calls unless they provide some significant benefit.

    Note that a change to WaitForMultipleObjects() would have far-reaching consequences. For example, all the transport plugins are currently select()-able; I guess we would then have to make them compatible with WaitForMultipleObjects(). That also raises the question of whether third-party libraries, such as OpenSSL, can be used without modifications.
  • What would you say is the limit for the maximum number of connections per server, which still leaves enough processor time to actually do work?

    Obviously this varies with the machine in question and with the nature of the request, but a ballpark figure would be helpful, or if you have statistics for a particular server or request type (e.g., a simple database query/response).

    Thanks,


    Ken Carpenter
  • Originally posted by Ken Carpenter
    What would you say is the limit for the maximum number of connections per server, which still leaves enough processor time to actually do work?

    Obviously this varies with the machine in question and with the nature of the request, but a ballpark figure would be helpful, or if you have statistics for a particular server or request type (e.g., a simple database query/response).

    That's really difficult to say. The connections alone are probably not the problem. (Although you would have to configure your system so that it allows enough connections. For example, the Windows default for WaitForMultipleObjects() is just 64.)

    If all these connections are busy all the time, then it really depends on how much processing needs to be done for the requests arriving over these connections. Giving an estimate of the number of connections is impossible in this case without knowing more about the request processing.

    If only a few connections are busy at the same time, 10,000 connections shouldn't be a problem. Of course, to save resources, I would recommend Active Connection Management, so that idle connections are closed and re-established on demand.

    In general, I would avoid designs which require huge numbers of simultaneous connections, by using a multi-tier architecture.
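
    For reference, in the Ice releases of this era ACM was controlled with configuration properties along these lines (timeouts in seconds; the property names are from memory and have changed in later releases, so check the manual for your version):

    # Close idle outgoing connections after 60 seconds.
    Ice.ACM.Client=60

    # Close idle incoming connections after 60 seconds; 0 disables
    # server-side ACM.
    Ice.ACM.Server=60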
  • matthew (NL, Canada)
    For example, the Windows default for WaitForMultipleObjects() is just 64

    Unless this recently changed, I don't think 64 handles is a default - I think 64 is a hard limit!

    Regards, Matthew
  • That's right Matthew. You can wait on at most 64 event sources (e.g., sockets) per thread with WaitForMultipleObjects. So if you need more than 64, you need multiple threads.

    There is a default limit on FD_SETs of 64, but this can be changed by a #define before including the winsock header.
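
    Concretely, the redefinition has to appear before the first winsock include in every file that uses fd_set, roughly like this:

    // Raise the fd_set capacity before pulling in the winsock headers.
    // Note that this only changes how many sockets an fd_set can hold;
    // the 64-handle limit of WaitForMultipleObjects() is unaffected.
    #define FD_SETSIZE 4096
    #include <winsock2.h>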

    One thing I still don't understand is how Ice avoids the overhead of checking which of, say, 2000 sockets is readable. It looks to me like you loop over the whole array of handles in the ThreadPool.cpp code.

    Can you tell me where I'm going wrong in the following scenario:

    - server hosts an Ice object and is waiting in select()
    - 2000 clients are connected to the server, therefore there are 2000 socket handles (accept() created one for each connection)
    - one client calls a method in the Ice object
    - server receives data and so select() returns
    - ThreadPool.cpp iterates over, on average, 2000/2 handles to determine which one is ready :confused:

    With WaitForMultipleObjects(), the index of the handle/socket that satisfied the wait condition is returned (or the one in the array with the lowest index if more than one handle satisfied the wait condition). There is, therefore, no need to iterate over the handles to check for readiness.
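
    For illustration, a sketch of that pattern (not a proposal for Ice; the function name is made up, and the handles are assumed to have been tied to sockets with WSAEventSelect()):

    #include <winsock2.h>
    #include <windows.h>

    // Waits on up to MAXIMUM_WAIT_OBJECTS (64) event handles, each associated
    // with one socket via WSAEventSelect(socket, handle, FD_READ | FD_CLOSE).
    // Returns the index of the lowest signaled handle, or -1 on failure.
    int waitForReadable(const HANDLE* handles, DWORD count)
    {
        DWORD rc = WaitForMultipleObjects(count, handles, FALSE, INFINITE);
        if(rc == WAIT_FAILED || rc >= WAIT_OBJECT_0 + count)
        {
            return -1;
        }
        return static_cast<int>(rc - WAIT_OBJECT_0);
    }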

    I suspect I have missed something somewhere, since you guys seem to know what you're doing. Can you set me straight? :)


    Ken Carpenter
  • Have a look at the optimization for WIN32 in ThreadPool.cpp. We don't loop over 2000 connections for WIN32, but only over the connections which are marked as readable. So there is no 2000-loop for WIN32.

    In this case, WIN32 is faster than our Linux implementation, because we can make use of the known format of the WIN32 struct fd_set. However, even under Linux, if you have 2000 connections open simultaneously, a tight 2000-loop should be the least of your worries.

    Any server that needs so many connections should use ACM (Active Connection Management), so that idle connections are closed, to save resources.

    If all 2000 connections are busy all the time, so that no idle connections can be closed, then a tight 2000-loop would be even less relevant, because the processing time is what counts. You would need a very fast machine for such a case.

    Finally, you can lower the number of handles to loop over on Linux by simply using multiple thread pools. In Ice, you can give each object adapter a separate thread pool (optional), so if you have 2,000 connections and 20 adapters with separate pools, you only need to loop over 100 handles in each of them - just like with multiple threads calling WaitForMultipleObjects().
  • Hi Marc,

    My attention was caught by the following phrase of yours:

    > In Ice, you can give each object adapter a separate thread pool
    I could not remember the relevant part of the documentation about this feature. Did I just overlook it and should I review the documentation more carefully, or is it a kind of undocumented feature? If the latter, could you please point me to some examples (if any) and tell me whether it is possible to set the priority for each thread pool? Maybe it is also possible to create priority lanes ;)?

    Thank you,
    Andrey.
  • It is documented, but not very prominently:
    C.3 Ice Object Adapter Properties

    [...]

    name.ThreadPool.Size

    Synopsis

    name.ThreadPool.Size=num

    Description

    If num is set to a value larger than zero, the object adapter creates its own, private thread pool with num threads for dispatching requests. This is useful to ensure that a minimum number of threads is available for dispatching requests on certain Ice objects, in order to avoid deadlocks because of thread starvation.
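
    For example (the adapter name is made up):

    # Give the adapter named "QueryAdapter" its own pool of four threads for
    # dispatching requests, instead of sharing the communicator's pool.
    QueryAdapter.ThreadPool.Size=4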

    We still must write a chapter that covers the object adapter and all its configuration parameters in detail.

    Regarding the priority lanes: Yes, I think we could fairly easily add such a feature. This would just be an additional thread pool configuration parameter, I guess.
  • Originally posted by marc
    Have a look at the optimization for WIN32 in ThreadPool.cpp. We don't loop over 2000 connections for WIN32, but only over the connections which are marked as readable. So there is no 2000-loop for WIN32.
    Ahhh. Now I see why I was confused. I didn't realize the array being iterated there was only a list of ready handles! Doh!

    Thanks for the clarification.


    Ken Carpenter
    I have seen some comments about it in the past, but I would like to bring this issue up again, as for me it is still unresolved.

    Currently, Ice only accepts FD_SETSIZE simultaneous connections on Windows. The default is 64 in the Windows headers.

    You need to recompile Ice with FD_SETSIZE=4096 (for example) if you want to handle more connections. This has implications for the performance of select(), as I believe select() just loops over an array to find the corresponding socket. Going from 64 to 4096 may make select() up to 64 times slower.

    You might say "but we have a thread pool and we handle that case gracefully, you don't need to increase FD_SETSIZE."

    Except that this has a very negative impact on performance, as throwing an exception and retrying the connection is very expensive. In addition, it may never succeed if you are overwhelmed with connection requests, resulting in a failure on the client side.

    I submit that using overlapped I/O (WSAAsyncSelect et al.) would result in much better resilience and performance under Windows.

    In other words, it would be nice if the Windows-specific implementation used the best of what the OS offers. ;)
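
    For what it's worth, the completion-port flavour of asynchronous I/O being suggested looks roughly like this (a heavily trimmed sketch; IoContext, postReceive, and workerLoop are illustrative names, and error handling, the accept path, and buffer lifetime management are omitted):

    #include <winsock2.h>
    #include <windows.h>

    // Per-operation context. The overlapped member is first so that the
    // LPOVERLAPPED returned by the completion port can be cast back to it.
    struct IoContext
    {
        WSAOVERLAPPED overlapped;
        WSABUF        buf;
        char          data[4096];
        SOCKET        socket;
    };

    void postReceive(IoContext* ctx)
    {
        ZeroMemory(&ctx->overlapped, sizeof(ctx->overlapped));
        ctx->buf.buf = ctx->data;
        ctx->buf.len = sizeof(ctx->data);
        DWORD flags = 0;
        // Asynchronous receive; completion is reported through the port.
        WSARecv(ctx->socket, &ctx->buf, 1, 0, &flags, &ctx->overlapped, 0);
    }

    void workerLoop(HANDLE iocp)
    {
        for(;;)
        {
            DWORD bytes = 0;
            ULONG_PTR key = 0;
            LPOVERLAPPED ov = 0;
            if(!GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE) || ov == 0)
            {
                continue;   // real code would distinguish shutdown and errors here
            }
            IoContext* ctx = reinterpret_cast<IoContext*>(ov);
            // ... process 'bytes' bytes from ctx->data, then re-arm the socket:
            postReceive(ctx);
        }
    }

    // Setup (after WSAStartup, once per accepted socket):
    //   HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, 0, 0, 0);
    //   CreateIoCompletionPort(reinterpret_cast<HANDLE>(sock), iocp, 0, 0);
    //   postReceive(/* a new IoContext for sock */);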
  • matthew (NL, Canada)
    Ice 3.4 will use asynchronous I/O under Windows.
  • matthew wrote: »
    Ice 3.4 will use asynchronous I/O under Windows.

    That's great! :)