Freeze due to unheard reply

Since changing to Ice 3.0.0 we have experienced a strange lock-up behaviour.

Our client is stuck in a twoway call in IceInternal::Outgoing::invoke() on an infinite wait() (we deliberately don't have a timeout set up right now).

However, our server process has completed the call as usual, and according to our packet sniffer the client did receive the reply from the server.

All this happens while our client is in a critical region, so other incoming calls are blocked by a (recursive) mutex. (The client is also a server process and can get push-style updates via oneway invocations.) The client typically does receive such a call in the situations where it gets blocked.

All the server processes carry on as if nothing extraordinary has happened, and other connected clients continue unaffected (though they may later experience lock-ups of their own).

That is about all I can think of saying right now, but I'll be happy to provide stack traces and the like if that would be of any help.

Comments

  • benoit (Rennes, France)
    Note that you can also use --Ice.Trace.Protocol to trace requests/replies. It sounds like you have a deadlock in your client because there are no more threads (from the Ice client thread pool) available to read the reply.

    Is your client using bi-dir connections, Glacier2 or AMI? Calls received by a client over a bi-directional connection and AMI callbacks are executed by threads from the client thread pool. If all these threads are busy (waiting to acquire a mutex?) and if you didn't configure Ice to use a dynamic thread pool, replies from servers won't be read (and your twoway call will hang indefinitely if you didn't configure a timeout). By default the client thread pool has only 1 thread and it won't grow dynamically.

    You could try to increase Ice.ThreadPool.Client.Size and/or Ice.ThreadPool.Client.SizeMax and see if this prevents the deadlock from happening. If you post the stack traces and the configuration of your client here, we'll be happy to look at them to make sure this is indeed the problem. If possible, I would also recommend that you avoid making the twoway call from a critical section (it's not good for concurrency and can be deadlock-prone!).

    Also, if you haven't already read them, I would recommend taking a look at the "Avoiding Deadlocks" newsletter articles in issues 4 and 5!

    Benoit.
  • benoit wrote:
    Is your client using bi-dir connections, Glacier2 or AMI? Calls received by a client over a bi-directional connection and AMI callbacks are executed by threads from the client thread pool. If all these threads are busy (waiting to acquire a mutex?) and if you didn't configure Ice to use a dynamic thread pool, replies from servers won't be read (and your twoway call will hang indefinitely if you didn't configure a timeout). By default the client thread pool has only 1 thread and it won't grow dynamically.

    Yes, we are using Glacier2, and I think our problems started when we began routing callbacks back through the Glacier connection. We'll try to adjust the thread pool as suggested and see if that fixes the problem.
    benoit wrote:
    If possible, I would also recommend that you avoid making the twoway call from a critical section (it's not good for concurrency and can be deadlock-prone!).

    True, and we will be looking at that in any case, but it isn't currently a trivial fix, and it hasn't caused locking problems before (only performance problems).
    benoit wrote:
    Also, if you haven't already read them, I would recommend taking a look at the "Avoiding Deadlocks" newsletter articles in issues 4 and 5!

    Benoit.

    I'll look at those.

    Thanks for the quick reply.
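
For readers who find this thread later: the starvation Benoit describes can be reproduced without Ice at all. The sketch below is only an illustration under stated assumptions, not Ice code. FixedPool is a hypothetical stand-in for the client thread pool (a plain FIFO work queue), twowayCallCompletes is a hypothetical helper, and the 300 ms wait is an arbitrary value chosen so that the demo terminates instead of hanging as the real client did. The application thread enters the critical region, a oneway push update is dispatched first and blocks on the same recursive mutex, and the reply to the twoway call then needs a free pool thread to be read.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the Ice client thread pool: a fixed number of
// threads that run queued work items in FIFO order. It never grows.
class FixedPool {
public:
    explicit FixedPool(std::size_t size) {
        assert(size > 0);
        for (std::size_t i = 0; i < size; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~FixedPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();  // workers drain the queue first
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // shutting down, nothing left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the queue lock
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};

// Returns true if the simulated twoway call sees its reply before the
// (arbitrary) 300 ms timeout, false if the reply was never read in time.
bool twowayCallCompletes(std::size_t poolSize) {
    std::recursive_mutex state;  // the client's critical region
    FixedPool pool(poolSize);
    auto reply = std::make_shared<std::promise<void>>();
    std::future<void> replyRead = reply->get_future();

    // The application thread enters the critical region.
    std::unique_lock<std::recursive_mutex> critical(state);

    // 1. A oneway push update arrives; its handler needs the same mutex,
    //    so the pool thread that dispatches it blocks.
    pool.post([&state] { std::lock_guard<std::recursive_mutex> hold(state); });
    // 2. The server's reply arrives; a pool thread has to read it.
    pool.post([reply] { reply->set_value(); });

    // 3. The twoway call waits for the reply while still holding the mutex.
    bool ok = replyRead.wait_for(std::chrono::milliseconds(300)) ==
              std::future_status::ready;
    critical.unlock();  // leaving the critical region unblocks the handler
    return ok;
}
```

Under these assumptions, twowayCallCompletes(1) times out because the only pool thread is stuck in the push handler, while twowayCallCompletes(2) completes because a second thread is free to read the reply. That corresponds to raising Ice.ThreadPool.Client.Size (or SizeMax) as suggested above. Note that a larger pool only makes the problem less likely: if enough push updates arrive to occupy every thread, the same hang returns, which is why moving the twoway call out of the critical region remains the more robust fix.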