AssertionError in client thread pool

The Ice services on one of our back-end boxes became unresponsive recently. Tracing through the system error output revealed the following stack trace:

[ 8/18/09 01:54:02:913 Security: enabling SSL ciphersuites:
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
TLS_RSA_WITH_AES_256_CBC_SHA
TLS_RSA_WITH_AES_128_CBC_SHA ]
Exception in thread "HaloCentralGrid-Services-01-EventServices-Ice.ThreadPool.Client-79" java.lang.AssertionError
at IceInternal.ThreadPool.run(ThreadPool.java:499)
at IceInternal.ThreadPool.access$100(ThreadPool.java:12)
at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:1242)

Our Ice services became unresponsive because they almost always act as clients of another Ice service to fulfill any given request, and so they use the default Ice client thread pool (we don't configure adapter-specific client thread pools). It appears that this assertion error incapacitated the client thread pool for the entire service set, so any Ice service that received a request and invoked another Ice service while satisfying it simply hung.

I would appreciate a root-cause diagnosis of the error given the stack trace.

Thanks

Dirk

Comments

  • mes
    mes California
    Hi Dirk,

    This looks similar to an issue we've seen before. Are you still using Ice 3.2.x?

    Regards,
    Mark
  • Mark-
    Yes - Ice 3.2.1.

    Dirk
  • Mark-
    So is this a known bug? Is the fix to roll to Ice 3.3.x?

    Thanks

    Dirk
  • mes
    mes California
    Hi,

    I suppose you could call it a bug in Ice 3.2, but it would be more accurate to say that it's an undocumented "feature" of the JVM that Ice doesn't work around. :)

    Ice 3.3 includes a fix, but we can also provide a patch for Ice 3.2.x if you prefer.

    Regards,
    Mark
  • Mark-
    I will let you know how we want to address this - whether by patch or by rolling to Ice 3.3.

    Can you give me more information on this undocumented "feature"? I'd like to understand more of what is going on.

    Thanks

    Dirk
  • mes
    mes California
    Assuming it's the same issue (and I'm almost certain it is), the problem is caused by a rare spurious wakeup in Java's Selector class. The Ice thread pool implementation uses this class to monitor the activity on multiple sockets. The Selector's select method is only supposed to wake up when a socket is ready for I/O or when a timeout occurs.

    The thread pool in Ice 3.2 assumes that if the selector wakes up and no sockets are ready, then a timeout must have occurred. This is the reason for the assertion at line 499 in the code. When no timeout is set, Ice expects the call to select to block until a socket is ready. Unfortunately, sometimes the selector unexpectedly wakes up when no socket is ready for I/O, which triggers the assertion failure.

    The workaround is to sleep for a short time (1ms) and try again. The sleep helps to avoid a busy loop that we've noticed when we've been able to reproduce the problem.

    Mark
  • Mark-
    Could you provide me with the 3.2.1 patch?

    Thanks

    Dirk
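
For readers following the thread, the workaround described above (treat a wakeup with no ready sockets and no timeout as spurious, sleep 1 ms, and retry) can be sketched roughly as follows. This is only an illustration and not the actual Ice patch: the class name `SpuriousWakeupGuard` and the method `selectIgnoringSpuriousWakeups` are hypothetical, and the only real API used is `java.nio.channels.Selector`.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical helper for illustration only; this is NOT the Ice 3.2.x patch.
public class SpuriousWakeupGuard {

    /**
     * Waits until at least one registered channel is ready, or until the
     * timeout elapses. A timeoutMillis of 0 means "block with no timeout".
     * Returns the number of ready keys (0 only when a timeout was set).
     */
    static int selectIgnoringSpuriousWakeups(Selector selector, long timeoutMillis)
            throws IOException, InterruptedException {
        while (true) {
            int ready = (timeoutMillis > 0)
                    ? selector.select(timeoutMillis)
                    : selector.select();

            if (ready > 0) {
                return ready; // at least one channel is ready for I/O
            }
            if (timeoutMillis > 0) {
                // With a timeout set, zero ready keys is treated as a timeout,
                // which is the assumption the thread describes for Ice 3.2.
                return 0;
            }
            // No timeout was set and nothing is ready: rather than asserting,
            // treat this as a spurious wakeup. The short sleep avoids the busy
            // loop mentioned in the thread; then select again.
            Thread.sleep(1);
        }
    }
}
```

One consequence of this sketch is that a deliberate `Selector.wakeup()` call is also retried when no timeout is set, so a caller that relies on `wakeup()` to interrupt the loop would need an extra flag or check before sleeping.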