Archived

This forum has been archived. Please start a new discussion on GitHub.

Suggestion for future version

"Opening up" the IceInternal networking code (i.e., de-internalizing it).

A lot of my Ice servants end up masking backend TCP networking code that talks to other legacy programs and devices. To communicate with them I end up needing some TCP code in my C++ programs, which I usually bring in as another set of libraries.

There's a lot of good stuff in the IceInternal networking code: robust sending and receiving of messages over TCP and UDP, and nice epoll code, that could be useful as part of IceUtil or another new networking library available via a public API.

Anyway, food for thought for Ice 4 perhaps.

Thanks,
Caleb

Comments

  • More support for Java

    I still think that Ice for Java is hard to use, or hard to integrate with other containers.
    How do you use a POJO as an Ice servant?
    How do you integrate Ice with Spring?
    How do you deploy an IceGrid servant as an OSGi bundle?
  • ctennis wrote: »
    "Opening up" the IceInternal networking code (i.e., de-internalizing it).

    The main issue here for us is that Ice isn't meant to provide a network abstraction API. If we were to do this, we'd end up with yet another API that becomes public and, therefore, cannot be changed to suit our internal needs. Also, as it stands, the functionality provided by the code in IceInternal is far from complete if you think of it as a general purpose network API. Instead, the code does only what we need for Ice internals, and we have the freedom to change this API as Ice evolves.

    So, I wouldn't hold my breath for such an API from us. It simply isn't core to our business.

    Also, there is nothing to stop you from using whatever favorite network API you like. (I believe quite a few people are using ACE and Ice together, for example.)

    Cheers,

    Michi.
  • An idea

    I recently encountered a similar situation, in which I was unable to compile Ice on a very old platform because of incompatibilities in an old threading library. I was, however, able to compile Boost::Asio, and built a bridge from my Ice application to the target legacy system. Not pretty, but it solved my problem. If you are able to use asio or boost::asio in your situation, you might give that a try; it worked for me and is a very nice library.

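A minimal sketch of such a bridge, assuming Boost.Asio is available (the host, port, and function names here are invented for illustration; an Ice servant would call something like this to talk to the legacy system):

```cpp
#include <boost/asio.hpp>
#include <string>

// Hypothetical bridge helper: forward a payload received by an Ice servant
// to a legacy TCP service and return its (short) reply.
std::string forwardToLegacy(const std::string& host, const std::string& port,
                            const std::string& payload)
{
    boost::asio::io_service io;  // io_context in newer Asio versions
    boost::asio::ip::tcp::resolver resolver(io);
    boost::asio::ip::tcp::socket socket(io);

    // Resolve and connect synchronously; Asio also offers async variants.
    boost::asio::connect(socket, resolver.resolve(
        boost::asio::ip::tcp::resolver::query(host, port)));

    boost::asio::write(socket, boost::asio::buffer(payload));

    char reply[512];
    std::size_t n = socket.read_some(boost::asio::buffer(reply));
    return std::string(reply, n);
}
```

The blocking calls keep the example short; in a real servant you would likely use Asio's asynchronous operations, or AMD on the Ice side, to avoid tying up dispatch threads.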
  • IOCP for Windows dispatch, lightweight callbacks, efficient built-in streams, etc.

    Hi, I also have some suggestions.
    1. IOCP.
    We are using Ice under very heavy load, so it is critical for us to serve as many requests as possible. On Windows, Ice uses a simple select loop, which is not very efficient, so the kernel-mode CPU time is relatively large...
    2. Lightweight callbacks.
    We also use AMI callbacks often, many of them, but a callback is currently a heavyweight object (a monitor plus shared state), so it is really inefficient to create a new callback for each response. But we often have to, for example because we can't make an AMI call with the current object from within an AMI callback. Of course we can create a pool, but we would need such a pool for every callback type, which is again inefficient. I wonder why you use such a fat callback; surely it could be much thinner.
    3. Built-in stream type.
    It is currently hard to develop and design an efficient stream between client and server. Perhaps some help from Ice could solve this problem, e.g. the way .NET does it.
    4. Intrusive collections are more efficient in some cases.
    I briefly looked through the Ice code: you often use STL collections. In time-critical tasks this can be inefficient. For example, see the timer code: you store a pointer to a task object in a collection instead of using an intrusive collection with zero additional cost... You often use maps for lookup by string keys (in call dispatch, for example); surely that is not efficient. It is common to generate a perfect hash for static content and use hash tables for dynamic content.
    5. Built-in timeout session (DGC) support.
    No comment needed; just as .NET does it.
    6. Asynchronous proxy retrieval.
    Currently we have only the synchronous stringToProxy to solve this task...
    7. Message ordering.
    Currently we can't predict the order of messages on the server side. Using the Serialize property doesn't solve this; it just serializes all messages on a connection, which is very inefficient. Surely it would be useful to have a special attribute on the method or interface...

    Thanks!
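Point 4 can be illustrated with a self-contained sketch (all type names here are invented for illustration; boost::intrusive provides a production-ready version of the same idea):

```cpp
#include <cassert>

// A minimal intrusive doubly-linked list: the links live inside the
// element itself, so an element can be unlinked in O(1) with no extra
// allocation and no lookup.
struct Task {
    Task* prev = nullptr;
    Task* next = nullptr;
    int id;
    explicit Task(int i) : id(i) {}
};

struct TaskList {
    Task* head = nullptr;

    void pushFront(Task* t) {
        t->prev = nullptr;
        t->next = head;
        if (head) head->prev = t;
        head = t;
    }

    // O(1): the element carries its own links, so no traversal is needed.
    // With std::list, finding the element to erase first costs O(n)
    // unless an iterator to it was stored separately.
    void erase(Task* t) {
        if (t->prev) t->prev->next = t->next; else head = t->next;
        if (t->next) t->next->prev = t->prev;
        t->prev = t->next = nullptr;
    }

    int size() const {
        int n = 0;
        for (Task* t = head; t; t = t->next) ++n;
        return n;
    }
};
```

The trade-off is that each `Task` can live in only one such list at a time and the list does not own its elements, which is exactly the pattern OS kernels use for their run queues.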
  • matthew
    matthew NL, Canada
    Andrew S wrote: »
    Hi, I also have some suggestions.
    1. IOCP.
    We are using Ice under very heavy load, so it is critical for us to serve as many requests as possible. On Windows, Ice uses a simple select loop, which is not very efficient, so the kernel-mode CPU time is relatively large...
    2. Lightweight callbacks.

    This is near the top of our priority list. Most likely Ice 3.4 will support IOCP under Windows. Note that on Linux and Mac Ice already uses epoll/kevent and so it is very efficient and highly scalable on these platforms.
    4. Intrusive collections are more efficient in some cases.
    I briefly looked through the Ice code: you often use STL collections. In time-critical tasks this can be inefficient. For example, see the timer code: you store a pointer to a task object in a collection instead of using an intrusive collection with zero additional cost... You often use maps for lookup by string keys (in call dispatch, for example); surely that is not efficient. It is common to generate a perfect hash for static content and use hash tables for dynamic content.

    I strongly suspect that any type of optimization along these lines will make virtually no visible difference in any real application.
    6. Asynchronous proxy retrieval.
    Currently we have only the synchronous stringToProxy to solve this task...

    Sorry, I don't understand what you mean here.
  • kwaclaw
    kwaclaw Oshawa, Canada
    matthew wrote: »
    This is near the top of our priority list. Most likely Ice 3.4 will support IOCP under Windows. Note that on Linux and Mac Ice already uses epoll/kevent and so it is very efficient and highly scalable on these platforms.

    Will you also enhance the async IO in .NET?

    Currently the C# code uses the old-style async pattern, but a higher-performance API is now available:
    SocketAsyncEventArgs Class (System.Net.Sockets)

    Basically, it means replacing BeginXXX operations with XXXAsync operations.

    Karl
  • Matthew, thanks for answers.
    matthew wrote: »
    1. IOCP.
    We are using Ice under very heavy load, so it is critical for us to serve as many requests as possible. On Windows, Ice uses a simple select loop, which is not very efficient, so the kernel-mode CPU time is relatively large...

    This is near the top of our priority list. Most likely Ice 3.4 will support IOCP under Windows. Note that on Linux and Mac Ice already uses epoll/kevent and so it is very efficient and highly scalable on these platforms.

    Yes, I know, but our target platform is Windows... BTW, what about the AMI callback objects? They really can take an appreciable amount of kernel time in heavily loaded applications...
    matthew wrote: »
    4. Intrusive collections are more efficient in some cases.
    I briefly looked through the Ice code: you often use STL collections. In time-critical tasks this can be inefficient. For example, see the timer code: you store a pointer to a task object in a collection instead of using an intrusive collection with zero additional cost... You often use maps for lookup by string keys (in call dispatch, for example); surely that is not efficient. It is common to generate a perfect hash for static content and use hash tables for dynamic content.

    I strongly suspect that any type of optimization along these lines will make virtually no visible difference in any real application.

    Maybe you are right. But it is really simple to implement these optimizations, for example with boost::intrusive. Intrusive containers are often more robust in dispatch; OS kernels always use this pattern for their queues... And it can solve some problems with object lifetime and container management. Remember, there is one important drawback of std::list-like containers: obtaining an iterator from the object has linear time complexity... But with intrusive containers such tasks always have constant complexity.
    matthew wrote: »
    6. Asynchronous proxy retrieval.
    Currently we have only the synchronous stringToProxy to solve this task...

    Sorry, I don't understand what you mean here.

    Sorry, my bad. I mean stringToProxy plus the checkedCast operation on the object proxy: the full cycle of obtaining a typed proxy from a string or identity.

    Thanks!
  • Andrew S wrote: »
    Maybe you are right. But it is really simple to implement these optimizations, for example with boost::intrusive. Intrusive containers are often more robust in dispatch; OS kernels always use this pattern for their queues... And it can solve some problems with object lifetime and container management. Remember, there is one important drawback of std::list-like containers: obtaining an iterator from the object has linear time complexity... But with intrusive containers such tasks always have constant complexity.

    The question is whether using this pattern would actually improve Ice performance overall. I strongly suspect that it will not--the invocation path is already so short that the network latency completely dominates the time it takes to complete a remote call (if that call transmits little data). For larger requests that contain lots of data, performance is limited by network bandwidth. So, improving the way Ice uses internal data structures is not likely to have a noticeable effect because the performance difference of using an intrusive container would drown in the overall delay.
    You often use maps for search by string keys (in call dispatch, for example) - surely, it is not efficient. It is common to generate perfect hash for static content, and use hash tables for dynamic content.

    Again, generating a perfect hash of operation names for the dispatch code is unlikely to improve performance. For one, for this to make any difference, an interface would have to contain hundreds of operations. But, even then, this is not going to improve latency because the network delay is an eternity compared to the time it takes to look up the operation name. Whether we use a map or a perfect hash is simply not noticeable.
    Sorry, my bad. I mean stringToProxy plus the checkedCast operation on the object proxy: the full cycle of obtaining a typed proxy from a string or identity.

    If you know that the proxy is of the correct type, you can use an unchecked cast, which doesn't go on the wire. So, the choice is yours: use a checked cast if you want the server to verify that the object is of the expected type, or use an unchecked cast if you are sure of the type, or if you are prepared to get an error later, when you make the first invocation on that proxy.
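As a sketch of the two options in C++ (the `Printer` interface and proxy names are invented; the pattern follows the Ice 3.x API):

```cpp
// Assumes a Slice-defined interface "Printer" with generated proxy
// type PrinterPrx.
Ice::ObjectPrx base = communicator->stringToProxy("printer:tcp -p 10000");

// checkedCast performs a remote type check: one extra round trip, and
// the cast returns null if the object is not a Printer.
PrinterPrx checked = PrinterPrx::checkedCast(base);

// uncheckedCast involves no network traffic; a wrong type surfaces only
// as an error on the first real invocation.
PrinterPrx unchecked = PrinterPrx::uncheckedCast(base);
```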

    Cheers,

    Michi.
  • michi wrote: »
    The question is whether using this pattern would actually improve Ice performance overall. I strongly suspect that it will not--the invocation path is already so short that the network latency completely dominates the time it takes to complete a remote call (if that call transmits little data). For larger requests that contain lots of data, performance is limited by network bandwidth. So, improving the way Ice uses internal data structures is not likely to have a noticeable effect because the performance difference of using an intrusive container would drown in the overall delay.

    Michi, I'm sorry, but I can't make head or tail of it... Try looking at it from the other side: you are just loading the server with unneeded work. If you have only one client, then yes, you are right, there is not much difference. But if you have to handle thousands of requests per second, this can surely become a bottleneck. For example, we see that under heavy load both kernel and user CPU time on the server side are rather high. The same thoughts apply to a perfect hash. Another point: once you have done IOCP, kernel time will be lower and user time will become the big issue... I just wonder: if we can improve something at low cost, why not do it? For example, many other frameworks (Windows RPC, many CORBA frameworks) already have these optimizations.

    Thanks for answers!

    PS I'm just writing down my thoughts, nothing more. Ice is a great project, but if we can improve it, why not? :)
  • Hi Andrew,

    I hear your plea and agree with you that wasted CPU cycles can be a big issue. But, to justify replacing the current split search with a perfect hash, you'd have to prove that this actually makes a measurable difference. Now, I cannot see how this can possibly be the case unless interfaces have a ridiculously large number of operations (in the hundreds).

    Moreover, doing a perfect hash may well make the lookup slower. That's because it requires a separate hash function for each interface, instead of being able to use the same search code for all interfaces. In turn, that can drive up working set size and negate any performance gain that might come from using a perfect hash.

    What I was suggesting in my previous post wasn't that performance improvements aren't worthwhile. They are, provided they make a difference. Off hand, I doubt that there would be an improvement that would be noticeable from making this change. It's a micro-optimization that isn't likely to make any difference in the majority of cases (namely, interfaces with fewer than one hundred operations or so).
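For scale, a dispatch table of the kind under discussion might look like the following sketch (operation names and handlers are invented). For an interface with a handful of operations, the `find` costs a few string comparisons, which is negligible next to a network round trip:

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical dispatch-by-operation-name table, the pattern a perfect
// hash would be competing against.
using Handler = std::function<int(int)>;

std::map<std::string, Handler> makeDispatchTable() {
    return {
        { "add",    [](int x) { return x + 1; } },
        { "negate", [](int x) { return -x; } },
    };
}

int dispatch(const std::map<std::string, Handler>& table,
             const std::string& op, int arg)
{
    auto it = table.find(op);
    if (it == table.end()) return 0;  // stand-in for "operation not found"
    return it->second(arg);
}
```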

    The same is the case for intrusive containers. For such an optimization to matter, it has to be on the critical path. If it isn't, we end up improving the performance of code that is executed a fraction of a percent by a fraction of a percent, and no-one will notice any difference.

    Cheers,

    Michi.
  • michi wrote: »
    Hi Andrew,
    The same is the case for intrusive containers. For such an optimization to matter, it has to be on the critical path. If it isn't, we end up improving the performance of code that is executed a fraction of a percent by a fraction of a percent, and no-one will notice any difference.

    Cheers,

    Michi.

    Michi, thanks for answers.
    In short, I think that we can't estimate the effects of a perfect hash or intrusive containers without tests... But I'm sure that with intrusive containers some of the internals of Ice would be simpler. For example, our implementation of a session manager is much simpler than yours in the article... There is no need to use a map to store sessions, and no need for a slow lookup.

    Speaking of lookups... BTW, maybe it's a good idea to introduce some kind of session tokens for objects and methods: on the first call we get an object/method address as a session token and transfer it to the other side, so on subsequent calls we can use these tokens to call object methods directly, without a lookup. I have already done this in my own RPC framework, and the results were very good. Maybe you have already done this; sorry, I only skimmed the Ice code and the protocol description...
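The session-token idea described above can be sketched as follows (a toy in-process model with invented names, not the Ice protocol): the first call resolves an operation by name and returns a small integer token, and later calls index directly into a table, skipping the string lookup entirely.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical token-based dispatcher.
class TokenDispatcher {
public:
    using Handler = std::function<int(int)>;

    // Server side: register an operation and assign it the next token.
    std::size_t registerOp(const std::string& name, Handler h) {
        byName_[name] = handlers_.size();
        handlers_.push_back(std::move(h));
        return handlers_.size() - 1;
    }

    // First call: resolve by name; the caller keeps the token.
    std::size_t tokenFor(const std::string& name) const {
        return byName_.at(name);
    }

    // Subsequent calls: O(1) vector index instead of a string lookup.
    int call(std::size_t token, int arg) const {
        return handlers_.at(token)(arg);
    }

private:
    std::map<std::string, std::size_t> byName_;
    std::vector<Handler> handlers_;
};
```

In a real protocol the token would have to be validated and scoped to a session, since a stale or forged token must not reach an arbitrary handler.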

    Thanks!

    --
    Andrew.
  • matthew
    matthew NL, Canada
    Remember, there is one important drawback of std::list-like containers: obtaining an iterator from the object has linear time complexity... But with intrusive containers such tasks always have constant complexity.

    I'm not quite sure what you mean here, but with an STL list you can keep iterators to members contained within the list, just as with a regular linked list.
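Matthew's point can be shown with a small sketch (names invented for illustration): `std::list` iterators remain valid until their element is erased, so storing the iterator alongside the object gives O(1) removal, much as an intrusive list would.

```cpp
#include <list>

// A callback object that remembers its own position in the registry list,
// so unregistering needs no traversal.
struct Callback {
    int id = 0;
    std::list<Callback*>::iterator pos{};
};

void registerCallback(std::list<Callback*>& registry, Callback* cb) {
    cb->pos = registry.insert(registry.end(), cb);  // remember our position
}

void unregisterCallback(std::list<Callback*>& registry, Callback* cb) {
    registry.erase(cb->pos);  // O(1): no traversal, no lookup
}
```

Storing the iterator inside the element is effectively a hand-rolled version of the intrusive pattern, at the cost of one per-node heap allocation that a true intrusive list avoids.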
  • matthew wrote: »
    I'm not quite sure what you mean here, but with an STL list you can keep iterators around to members contained within the list, just as with a regular linked list.

    Sometimes (when you keep iterators somewhere), yes, you can. But what if all you have is the contained item itself? (That's a common situation, for example when you use a C-style callback API.) Then the only options are a slow lookup or an additional link node that maps the item to its iterator. STL containers are sometimes more generic than really needed... :( For example, boost::multi_index containers can obtain an iterator from an item...

    But the lookup is not the only issue. Another is that with non-intrusive containers you often need additional memory allocations to store members (for example, in list, set, map). I agree there is a thin line between "use intrusive" and "use non-intrusive", but sometimes using an intrusive container is just easier...

    It is up to you, of course, to decide how to implement this; surely you have much more information than I do...