OneWay Method invocation NOT returning

Hello

I'm using Ice for my final thesis and am encountering a strange problem, which I would consider a bug.

I'm using Ice 1.2 on a Pentium 4 with Red Hat Linux.
The kernel version is 2.4.20 and the compiler I'm using is GCC 3.3.1.

When I test my program to see how it behaves under stress (20+ processes, 400+ Ice::Objects and A LOT of ongoing communication), I encounter a strange problem:
A method invocation on a oneway proxy does NOT return. As a result, a mutex remains locked and the whole program crashes. However, this happens ONLY when the program is under stress and A LOT OF communication occurs. Under normal circumstances the program runs without problems.

The relevant code looks something like this:

void ObjectX_I::method1() {
    IceUtil::RecMutex::Lock lock(_mutex);

    //MORE CODE HERE

    vector<ObjectZPrx>::iterator l;
    for(l = UpdateObjects.begin(); l != UpdateObjects.end(); l++) {
        try {
            //just to clarify matters
            ObjectZPrx tempProxy = ObjectZPrx::uncheckedCast(l->ice_oneway());
            tempProxy->method2(updateInfo);
        } catch(...) {
            //CLEANING UP
        }
    }
}

The stack trace looks like this:

pthread_start_thread () at manager.c:291
startHook () at Thread.cpp:471
IceInternal::ThreadPool::EventHandlerThread::run() at ThreadPool.cpp:737
IceInternal::ThreadPool::run() at ThreadPool.cpp:564
IceInternal::Connection::message() at Connection.cpp:1276
IceInternal::Incoming::invoke() at Incoming.cpp:200
ObjectX::__dispatch() at *SLICE2CPP GENERATED FILE*:10306
ObjectX::___method1 at *SLICE2CPP GENERATED FILE*:10121
ObjectX_I::method1 at *MY CODE*
IceProxy::ObjectZ::method2 at *SLICE2CPP GENERATED FILE*:4929
IceProxy::ObjectZ::method2 at *SLICE2CPP GENERATED FILE*:4940
IceProxy::Ice::Object::__getDelegate() at Proxy.cpp:803
IceDelegateM::Ice::Object::setup() at Proxy.cpp:1119
IceInternal::OutgoingConnectionFactory::create() at ConnectionFactory:258
IceInternal::Connection::validate() at Connection.cpp:86
__ICR_LIST__ () from libIce.so.12
__select () at __select:-1

Is this a known behaviour in certain circumstances, or do you know of a workaround to prevent it?

Thanx in advance

Gerald

Comments

  • marc
    marc Florida
    Perhaps you ran into a deadlock? Note that oneway methods can block if the send buffer of the TCP/IP stack is full. One possible scenario is that your server runs out of threads to answer the oneway request, so the client blocks and the oneway never returns.
  • If it is a deadlock, it can't be in my code, since there are no cyclic dependencies.
    It is possible, though, that the server runs out of threads due to the amount of traffic.
    If so, what could I do to prevent the blocking?
  • marc
    marc Florida
    If there are no cyclic dependencies, does this mean that all your clients are pure clients (i.e., they don't receive requests), and all your servers are pure servers (i.e., they don't send requests), and there are no callbacks?

    If so, then it's not possible that a server runs out of threads, because each method execution must eventually return (except if you block the dispatch thread for some reason other than sending a request).

    One typical scenario that can lead to deadlocks is if the server calls back the client under heavy load. In this case, the server thread pool in both the client and the server could be exhausted, meaning that both the server and the client will block, because there are no threads available to answer the callback.

    If an upper limit for the number of requests is predictable, then the solution is to simply increase the number of threads in the thread pool (or the maximum only, since the thread pool is dynamic). If this is not possible, then you should use AMD (Asynchronous Method Dispatch) and AMI (Asynchronous Method Invocation) for the implementation. (See the Ice manual for details.)

    However, it's very difficult to say what's going on without knowing the exact details of your application. But I doubt that there is a bug in oneways, as they are used heavily by one of our customers in a very-large-scale application.
  • marc
    marc Florida
    By the way, looking at the stack trace, I can see that the client hangs in connection validation. That's almost certainly caused by the server running out of threads, i.e., there is no thread left in the server to validate the connection.
  • Thank you

    No, my servers/clients aren't pure servers/clients.
    However, the synchronously called methods only do computation or, if they communicate, they do so with oneways to update other objects about changes being made.

    Therefore I thought I had no cyclic dependencies (at least not in my code).
    The scenario you described seems perfectly plausible, so I will give AMD a try.

    Thanx again..

    Gerald
  • marc
    marc Florida
    To find out what's going on, it might be helpful to set this property:

    Ice.ThreadPool.Server.SizeWarn

    With this property, you will get warnings if you run low on threads. Here is an example:

    # Initial and minimum number of threads
    Ice.ThreadPool.Server.Size=10

    # Maximum number of threads
    Ice.ThreadPool.Server.SizeMax=50

    # Print a warning if 40 or more threads are used
    Ice.ThreadPool.Server.SizeWarn=40
  • Yes, thanx again.

    It was exactly as you had described: under heavy load all threads were in use, so no connection validation was possible.

    Gerald