Archived

This forum has been archived. Please start a new discussion on GitHub.

IceGrid::ActivationTimedOut

Hi,

I'm having some problems with IceGrid. I have written a program to ask IceGrid to start and stop servers. All servers are configured with activation=manual in the node. Occasionally I see a server activation time out. At the moment I'm just testing with super-simple dummy servers, so I'm pretty certain there's no problem in the server itself. The servers always start instantly when I run them manually.

The symptom is that when my program calls startServer(), it hangs for 15sec (=IceGrid.Node.WaitTime), then throws a ServerStartException with reason="The server activation timed out". The state of the server (as reported by getServerState()) then shifts to ActivationTimedOut. Subsequent calls to startServer() return immediately with the same exception.

When I look at the output of the server, it seems that it did start normally. So my questions are:

1) What are the criteria for successful server activation? I.e., what is IceGrid looking for that it's not seeing? The documentation for IceGrid seems a little thin on this kind of thing.
2) What can my program do to recover from this? I.e., what can I do when the server state is ActivationTimedOut?


Thanks very much,

Alex

Comments

  • benoit (Rennes, France)
    Hi,

    IceGrid waits for all of the server's object adapters that have the "server-lifetime" attribute set to true (the default) to be activated before it changes the server's state to "Active". If these object adapters are not activated within the timeout period, IceGrid considers the server activation to have timed out.

    Does your server have registered adapters which are not activated or only activated after some time? If that's the case, you should set the server-lifetime attribute to "false" in the deployment descriptor if you don't want IceGrid to wait for the activation of these adapters. See here for more information on this attribute.
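    As a sketch, the relevant part of the deployment descriptor would look something like this (the server id, exe and adapter name here are placeholders, not from your deployment):

```xml
<server id="MyServer" exe="./server" activation="manual">
    <!-- IceGrid will not wait for this adapter's activation
         before marking the server Active -->
    <adapter name="StatusAdapter" endpoints="tcp" server-lifetime="false"/>
</server>
```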

    Cheers,
    Benoit.
  • Hi Benoit,

    OK, thanks. This explains what IceGrid is looking for, but I don't understand why it's seeing a problem.
    My servers have exactly one adapter, which they activate almost immediately.
    I have trace statements either side of the call to 'activate()', so I'm sure that this is being executed without an exception being thrown.

    I don't see anything interesting in the node's output, with IceGrid.Node.Trace.Activator=2.

    Do you have any ideas what else I might look for?


    Alex

    P.S. Also, I have one request for IceGrid Exceptions: Would it make sense to output the 'reason' field in the streaming operator?
    For several exception types I just want to print a meaningful error message and then have the code take the same action. For IceGrid::ServerStartException, for example, the streaming operator simply outputs "ServerStartException", even though more information is available in the exception's 'reason' field. The same is true for other exception types, e.g. NodeUnreachableException. To get a meaningful error message I have to catch each exception type explicitly and print the 'reason' field manually.
  • benoit (Rennes, France)
    You should set IceGrid.Node.Trace.Adapter=2 in the node configuration file and IceGrid.Registry.Trace.Locator=2 in the registry configuration file to check whether the object adapter is correctly seen as active by the IceGrid node. You should see trace messages when the object adapter is activated in your server. If you don't see any traces, your server isn't properly configured: either Ice.Default.Locator or the object adapter's AdapterId property is missing.

    Did you make sure the adapter name from your code (the one passed to createObjectAdapter()) is the same as the one from your deployment descriptor?

    If this doesn't give you more clues, could you perhaps try to reproduce the problem with one of the IceGrid demo and post the steps to reproduce it here? I'll be happy to take a closer look at this if I can reproduce it.

    I agree that it would be nice to print more information for IceGrid user exceptions. I'll discuss this internally to see if we can improve this.

    Cheers,
    Benoit.
  • Sorry for the confusion... with all the server starting and stopping I must have been looking at the wrong startServer instance (I have to run things for about half an hour on average to reproduce this fault): the server does _not_ activate successfully.

    The server hangs on the activate() call. Then IceGrid does the correct thing and terminates the failing-to-activate server. So the question is why the server hangs on activate().

    I have two nodes: xray and yankee. The xray node is colocated with the registry. Everything is running locally, in order to rule out the network as a problem.
    I have seen the problem on both xray and yankee.
    I set the trace levels you suggested, and see (long paths removed):
    [ 07/02/07 23:58:12.406 Server: changed server `xray.hibrick' state to `Activating' ]
    [ 07/02/07 23:58:12.406 Activator: activating server `xray.hibrick'
      path = xxxxx/TestSystem/orca2/bin/brick
      pwd = xxxxx/test
      uid/gid = 1000/1000
      args = xxxxx/TestSystem/orca2/bin/brick --Ice.Config=xxxxx/nodedb/servers/xray.hibrick/config/config ]
    [ 07/02/07 23:58:12.441 Server: changed server `xray.hibrick' state to `WaitForActivation' ]
    orca: Ice version: 3.2.0 libOrcaIce version: 2.3.0+
    orca: Brick: Loaded command line properties
    orca: Brick: Loaded component properties from 'xxxxx/test/nodedb/servers/xray.hibrick/config/config'
    orca: Brick: Loaded global properties from '/home/users/a.brooks/.orcarc'
    orca: Brick: Loaded factory default properties
    [ 07/02/07 23:58:12.459 Locator: registered server `yankee.lowbrick' process proxy: `8E2EC594-2668-4420-9E1E-8BFC36A93AC2 -t:tcp -h 172.17.7.39 -p 45210' ]
    [ 07/02/07 23:58:27.414 Server: changed server `xray.hibrick' state to `ActivationTimeout' ]
    [ 07/02/07 23:58:27.414 Server: server `xray.hibrick' activation timed out ]
    [ 07/02/07 23:58:27.414 Adapter: server `xray.hibrick' adapter `xray/hibrick' activation failed: server activation timed out ]
    
    ...
    
    [ 07/02/07 23:58:29.481 Server: changed server `xray.hibrick' state to `Deactivating' ]
    [ 07/02/07 23:58:29.481 Server: changed server `xray.hibrick' state to `DeactivatingWaitForProcess' ]
    [ 07/02/07 23:58:44.484 Activator: sent SIGKILL to server `xray.hibrick' (pid = 15050) ]
    [ 07/02/07 23:58:44.486 Activator: detected termination of server `xray.hibrick'
      signal = SIGKILL ]
    [ 07/02/07 23:58:44.486 Adapter: server `xray.hibrick' adapter `xray/hibrick' deactivated ]
    [ 07/02/07 23:58:44.487 Server: changed server `xray.hibrick' state to `Inactive' ]
    

    The SIGKILL is required because my program subsequently requested deactivation, and of course the server has hung on the activate() call.

    I think I can rule out configuration issues like name mismatches since it works 99% of the time. For comparison, here's what I see for a normal server activation:
    [ 07/02/07 23:58:10.233 Server: changed server `xray.midbrick' state to `Activating' ]
    [ 07/02/07 23:58:10.233 Activator: activating server `xray.midbrick'
      path = xxxxx/test/nodedb/distrib/TestSystem/orca2/bin/brick
      pwd = xxxxx/test
      uid/gid = 1000/1000
      args = xxxxx/test/nodedb/distrib/TestSystem/orca2/bin/brick --Ice.Config=xxxxx/test/nodedb/servers/xray.midbrick/config/config ]
    orca: Ice version: 3.2.0 libOrcaIce version: 2.3.0+
    orca: Brick: Loaded command line properties
    orca: Brick: Loaded component properties from 'xxxxx/test/nodedb/servers/xray.midbrick/config/config'
    orca: Brick: Loaded global properties from '/home/users/a.brooks/.orcarc'
    orca: Brick: Loaded factory default properties
    [ 07/02/07 23:58:10.279 Server: changed server `xray.midbrick' state to `WaitForActivation' ]
    [ 07/02/07 23:58:10.294 Server: changed server `xray.midbrick' state to `Active' ]
    [ 07/02/07 23:58:10.294 Adapter: server `xray.midbrick' adapter `xray/midbrick' activated: dummy -t:tcp -h 172.17.7.39 -p 37855 ]
    [ 07/02/07 23:58:10.300 Locator: registered adapter `xray/midbrick' endpoints: `dummy -t:tcp -h 172.17.7.39 -p 37855' ]
    [ 07/02/07 23:58:10.303 Locator: registered server `xray.midbrick' process proxy: `99B8A2DD-D9CE-4FA8-A91E-D10142943AA8 -t:tcp -h 172.17.7.39 -p 37855' ]
    

    Can you think of anything else I can do to give some clues about what's happening here before going to the trouble of reproducing with one of the IceGrid demos (my starting/stopping program relies on a lot of my infrastructure...)?


    Thanks,

    Alex
  • benoit (Rennes, France)
    From your traces:
    [ 07/02/07 23:58:12.441 Server: changed server `xray.hibrick' state to `WaitForActivation' ]
    ...
    [ 07/02/07 23:58:12.459 Locator: registered server `yankee.lowbrick' process proxy: `8E2EC594-2668-4420-9E1E-8BFC36A93AC2 -t:tcp -h 172.17.7.39 -p 45210' ]
    

    Do you think it's just a coincidence that the 'yankee.lowbrick' server registers its process proxy right after the activation of the 'xray.hibrick' server?

    I'm afraid I don't see why the object adapter activation would hang. The only possible cause would be a network issue. For example, the server could be trying to connect to an unreachable address to register the object adapter endpoints with the locator registry (setting Ice.Trace.Network=2 in the server configuration file will show whether that's the case).

    If you're sure there are no network issues, the best way to look into this would be to reproduce the hang and:
    • Attach the debugger to the server and take a look at the server threads.
    • If a thread from the server is blocked invoking on the locator registry to register the adapter endpoints (or server process proxy) then you should attach the debugger to the registry and get the stack traces of the registry.
    • If a thread from the registry is blocked invoking on an IceGrid node, attach the debugger to the IceGrid node and get the stack traces of the node.

    Or, to make it simple, just get the stack traces of the server, registry and node when you get the hang :) I'll be happy to look at these traces to see if I can figure out what's wrong.

    Also, could you remind us which platform/OS you're using? And do you use the IceGrid registry replication?

    Thanks,

    Cheers,
    Benoit.
  • After an uphill battle including working around a gdb bug, I've characterised this much better:

    I don't think it's an IceGrid-related problem. I can reproduce it by starting my server from a set of scripts which just 'manually' start and stop the server repeatedly (i.e. not through a node). It's just so infrequent that I'd never noticed it until I started stress-testing my IceGrid-based server starter/stopper.

    The problem seems to occur when my server tries to simultaneously call activate() and IceGrid::RegistryPrx::checkedCast(), from two different threads. Some kind of deadlock occurs occasionally, which causes the program to hang forever (note: I have small values set for Ice.Override.Timeout and Ice.Override.ConnectTimeout).

    I've tried to reproduce the problem by modifying one of the Ice demo programs, but no luck yet. I'm not sure if this is because there's something fundamentally different about my server or just because the timings are particular. I'll keep walking my program towards the demo code and vice-versa, but at the same time I wonder if you have any ideas about what might be going on? I think it should be OK to make these two calls simultaneously, shouldn't it?

    The stack-trace of my program is attached below. And to answer your questions:
    - I'm running Linux (debian etch) on i386 hardware.
    - I'm not using any registry replication.


    Thanks again,

    Alex
    (gdb) thread 1
    [Switching to thread 1 (Thread -1492038784 (LWP 4204))]#0  0xffffe410 in __kernel_vsyscall ()
    (gdb) bt
    #0  0xffffe410 in __kernel_vsyscall ()
    #1  0xa711f5d6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
    #2  0xa721e14d in pthread_cond_wait () from /lib/i686/cmov/libc.so.6
    #3  0xa7e48053 in IceUtil::Cond::waitImpl<IceUtil::RecMutex> (this=0x806c104, mutex=@0x806c134)
        at ../../include/IceUtil/Cond.h:203
    #4  0xa7e48123 in IceUtil::Monitor<IceUtil::RecMutex>::wait (this=0x806c104)
        at ../../include/IceUtil/Monitor.h:152
    #5  0xa7e457ff in IceInternal::ObjectAdapterFactory::waitForShutdown (this=0x806c0f8)
        at ObjectAdapterFactory.cpp:67
    #6  0xa7dbb368 in Ice::CommunicatorI::waitForShutdown (this=0x806a550) at CommunicatorI.cpp:119
    #7  0xa7725e6b in orcaice::Application::run (this=0xaf971a24, argc=2, argv=0xaf971ad4)
        at /home/users/a.brooks/orca2/src/libs/orcaice/application.cpp:208
    #8  0xa773a2db in orcaice::IceApplication::main (this=0xaf971a24, argc=2, argv=0xaf971ad4,
        initData=@0xaf9718d4) at /home/users/a.brooks/orca2/src/libs/orcaice/iceapplication.cpp:369
    #9  0xa77270a2 in orcaice::Application::main (this=0xaf971a24, argc=2, argv=0xaf971ad4)
        at /home/users/a.brooks/orca2/src/libs/orcaice/application.cpp:149
    #10 0x0804c2ba in main (argc=2, argv=0xaf971ad4)
        at /home/users/a.brooks/orca2/src/tests/brick/main.cpp:23
    (gdb) thread 2
    [Switching to thread 2 (Thread -1525617776 (LWP 4209))]#0  0xffffe410 in __kernel_vsyscall ()
    (gdb) bt
    #0  0xffffe410 in __kernel_vsyscall ()
    #1  0xa7121dbe in __lll_mutex_lock_wait () from /lib/i686/cmov/libpthread.so.0
    #2  0xa711dad4 in _L_mutex_lock_414 () from /lib/i686/cmov/libpthread.so.0
    #3  0xa711d632 in pthread_mutex_lock () from /lib/i686/cmov/libpthread.so.0
    #4  0xa721e2b6 in pthread_mutex_lock () from /lib/i686/cmov/libc.so.6
    #5  0xa7c6c192 in IceUtil::RecMutex::lock (this=0x806c134) at RecMutex.cpp:243
    #6  0xa7e46f94 in IceUtil::Monitor<IceUtil::RecMutex>::lock (this=0x806c104)
        at ../../include/IceUtil/Monitor.h:89
    #7  0xa7e46fdd in LockT (this=0xa510dc00, mutex=@0x806c104) at ../../include/IceUtil/Lock.h:49
    #8  0xa7e44bc6 in IceInternal::ObjectAdapterFactory::findObjectAdapter (this=0x806c0f8,
        proxy=@0xa510dc64) at ObjectAdapterFactory.cpp:172
    #9  0xa7e7a2e0 in IceProxy::Ice::Object::__getDelegate (this=0x806c244) at Proxy.cpp:1052
    #10 0xa7e354c8 in IceProxy::Ice::Locator::getRegistry (this=0x806c240, __ctx=0x0)
        at Locator.cpp:1100
    #11 0xa7e2c325 in IceProxy::Ice::Locator::getRegistry (this=0x806c240)
        at ../../include/Ice/Locator.h:466
    #12 0xa7e240d7 in IceInternal::LocatorInfo::getLocatorRegistry (this=0x806cd58)
        at LocatorInfo.cpp:296
    #13 0xa7e49373 in Ice::ObjectAdapterI::updateLocatorRegistry (this=0x806c568,
        locatorInfo=@0xa510dfbc, proxy=@0xa510dfcc, registerProcess=false) at ObjectAdapterI.cpp:1113
    #14 0xa7e4d86d in Ice::ObjectAdapterI::activate (this=0x806c568) at ObjectAdapterI.cpp:113
    #15 0xa772ac34 in orcaice::Component::activate (this=0xaf9719dc)
        at /home/users/a.brooks/orca2/src/libs/orcaice/component.cpp:209
    #16 0xa7735a70 in orcaice::Context::activate (this=0x806d010)
        at /home/users/a.brooks/orca2/src/libs/orcaice/context.cpp:42
    #17 0x0804e237 in brick::Handler::run (this=0x806cfd0)
        at /home/users/a.brooks/orca2/src/tests/brick/handler.cpp:47
    #18 0xa7c6e203 in startHook (arg=0x806cfd0) at Thread.cpp:357
    #19 0xa711b31b in start_thread () from /lib/i686/cmov/libpthread.so.0
    #20 0xa72118ee in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 3
    [Switching to thread 3 (Thread -1517225072 (LWP 4208))]#0  0xffffe410 in __kernel_vsyscall ()
    (gdb) bt
    #0  0xffffe410 in __kernel_vsyscall ()
    #1  0xa7121dbe in __lll_mutex_lock_wait () from /lib/i686/cmov/libpthread.so.0
    #2  0xa711dabc in _L_mutex_lock_88 () from /lib/i686/cmov/libpthread.so.0
    #3  0xa711d4ad in pthread_mutex_lock () from /lib/i686/cmov/libpthread.so.0
    #4  0xa721e2b6 in pthread_mutex_lock () from /lib/i686/cmov/libc.so.6
    #5  0xa7dc96d4 in IceUtil::Mutex::lock (this=0x806c250) at ../../include/IceUtil/Mutex.h:305
    #6  0xa7dde5fd in LockT (this=0xa590ea8c, mutex=@0x806c250) at ../../include/IceUtil/Lock.h:49
    #7  0xa7e7a21a in IceProxy::Ice::Object::__getDelegate (this=0x806c244) at Proxy.cpp:1042
    #8  0xa7e35934 in IceProxy::Ice::Locator::findObjectById (this=0x806c240, id=@0xa4800504, __ctx=0x0)
        at Locator.cpp:1050
    #9  0xa7e2b1d2 in IceProxy::Ice::Locator::findObjectById (this=0x806c240, id=@0xa4800504)
        at ../../include/Ice/Locator.h:436
    #10 0xa7e26447 in IceInternal::LocatorInfo::getEndpoints (this=0x806cd58, ref=@0xa590ee6c, ttl=-1,
        cached=@0xa590ee5f) at LocatorInfo.cpp:351
    #11 0xa7e4dec0 in Ice::ObjectAdapterI::isLocal (this=0x806c568, proxy=@0xa590ef74)
        at ObjectAdapterI.cpp:596
    #12 0xa7e44c39 in IceInternal::ObjectAdapterFactory::findObjectAdapter (this=0x806c0f8,
        proxy=@0xa590ef74) at ObjectAdapterFactory.cpp:183
    #13 0xa7e7a2e0 in IceProxy::Ice::Object::__getDelegate (this=0xa4800548) at Proxy.cpp:1052
    #14 0xa7e7b63f in IceProxy::Ice::Object::ice_isA (this=0xa4800548, typeId=@0xa7685768, context=0x0)
        at Proxy.cpp:139
    #15 0xa7e7f6f8 in IceProxy::Ice::Object::ice_isA (this=0xa4800548, typeId=@0xa7685768)
        at ../../include/Ice/Proxy.h:96
    #16 0xa7746192 in IceInternal::checkedCastImpl<IceInternal::ProxyHandle<IceProxy::IceGrid::Registry> > (b=@0xa590f1e4, context=0x0) at /opt/Ice-3.2/include/Ice/Proxy.h:436
    #17 0xa7746298 in IceInternal::checkedCastHelper<IceProxy::IceGrid::Registry, IceProxy::Ice::Object>
        (b=@0xa590f1e4, ctx=0x0) at /opt/Ice-3.2/include/Ice/ProxyHandle.h:91
    #18 0xa77462db in IceInternal::ProxyHandle<IceProxy::IceGrid::Registry>::checkedCast<IceProxy::Ice::Object> (r=@0xa590f1e4) at /opt/Ice-3.2/include/Ice/ProxyHandle.h:244
    #19 0xa7743838 in orcaice::ComponentThread::tryRegisterHome (this=0x806d730)
        at /home/users/a.brooks/orca2/src/libs/orcaice/detail/componentthread.cpp:84
    #20 0xa77442c5 in orcaice::ComponentThread::run (this=0x806d730)
        at /home/users/a.brooks/orca2/src/libs/orcaice/detail/componentthread.cpp:51
    #21 0xa7c6e203 in startHook (arg=0x806d730) at Thread.cpp:357
    #22 0xa711b31b in start_thread () from /lib/i686/cmov/libpthread.so.0
    #23 0xa72118ee in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 4
    [Switching to thread 4 (Thread -1508832368 (LWP 4207))]#0  0xffffe410 in __kernel_vsyscall ()
    (gdb) bt
    #0  0xffffe410 in __kernel_vsyscall ()
    #1  0xa7211f56 in epoll_wait () from /lib/i686/cmov/libc.so.6
    #2  0xa7eb9447 in IceInternal::ThreadPool::run (this=0x806ca30) at ThreadPool.cpp:462
    #3  0xa7ebb058 in IceInternal::ThreadPool::EventHandlerThread::run (this=0x806c7a8)
        at ThreadPool.cpp:1066
    #4  0xa7c6e203 in startHook (arg=0x806c7a8) at Thread.cpp:357
    #5  0xa711b31b in start_thread () from /lib/i686/cmov/libpthread.so.0
    #6  0xa72118ee in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 5
    [Switching to thread 5 (Thread -1500439664 (LWP 4206))]#0  0xffffe410 in __kernel_vsyscall ()
    (gdb) bt
    #0  0xffffe410 in __kernel_vsyscall ()
    #1  0xa711f85c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
    #2  0xa721e194 in pthread_cond_timedwait () from /lib/i686/cmov/libc.so.6
    #3  0xa7de09b5 in IceUtil::Cond::timedWaitImpl<IceUtil::Mutex> (this=0x806c44c, mutex=@0x806c47c,
        timeout=@0x806c49c) at ../../include/IceUtil/Cond.h:224
    #4  0xa7de0aaa in IceUtil::Monitor<IceUtil::Mutex>::timedWait (this=0x806c44c, timeout=@0x806c49c)
        at ../../include/IceUtil/Monitor.h:180
    #5  0xa7de0bfc in IceInternal::ConnectionMonitor::run (this=0x806c428) at ConnectionMonitor.cpp:78
    #6  0xa7c6e203 in startHook (arg=0x806c428) at Thread.cpp:357
    #7  0xa711b31b in start_thread () from /lib/i686/cmov/libpthread.so.0
    #8  0xa72118ee in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 6
    [Switching to thread 6 (Thread -1492046960 (LWP 4205))]#0  0xffffe410 in __kernel_vsyscall ()
    (gdb) bt
    #0  0xffffe410 in __kernel_vsyscall ()
    #1  0xa716b35e in ?? () from /lib/i686/cmov/libc.so.6
    #2  0xa716b3ff in sigwait () from /lib/i686/cmov/libc.so.6
    #3  0xa7c5276f in sigwaitThread () at CtrlCHandler.cpp:124
    #4  0xa711b31b in start_thread () from /lib/i686/cmov/libpthread.so.0
    #5  0xa72118ee in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 7
    Thread ID 7 not known.
    (gdb)
    

    [threads 2 and 3 are the ones that seem to be deadlocked.]
  • benoit (Rennes, France)
    Hi Alex,

    I'm afraid you found a deadlock in the Ice adapter activation code :(. We'll look into fixing this.

    You should be able to work-around the problem by explicitly disabling collocation optimization for the locator proxy. If you use Ice 3.2.0, you can disable collocation optimization for the locator proxy by setting the following property:
    Ice.Default.Locator.CollocationOptimization=0

    You can also disable it with the following code (you should do this before creating the object adapter):
       communicator->setDefaultLocator(Ice::LocatorPrx::uncheckedCast(
           communicator->getDefaultLocator()->ice_collocationOptimized(false)));
    

    Cheers,
    Benoit.
  • Hi Benoit,

    Great, I'm relieved that I can stop tracking this thing down!

    I used the second of your two suggestions, and my test system has been running for a few hours now without any failures.


    Thanks for all the help,

    Alex
  • Hi Benoit,

    We were using the workaround you'd suggested all the way through 3.2.1 and 3.3.0 without a problem.

    Now it seems that the problem has re-surfaced in 3.3.1. Can you think of a possible reason? Was the original problem fixed?

    Two threads try to 1) activate the adapter and 2) add an object to the registry. Around 40% of the time I see symptoms of a deadlock. A quick look at the debugger shows the functions associated with the two activities above.

    I've tried disabling the following workaround line, with no apparent change in behavior:
       communicator_->setDefaultLocator(Ice::LocatorPrx::uncheckedCast(
           communicator_->getDefaultLocator()->ice_collocationOptimized(false)));

    Any suggestions?

    Thanks,
    Alex

    Debian Linux, Ice 3.3.1 installed from packages.
  • benoit (Rennes, France)
    Hi Alex,

    Could you post the stack traces of your client when it appears to deadlock?

    Cheers,
    Benoit.
  • (gdb) thread 1
    [Switching to thread 1 (Thread 0xb72036d0 (LWP 22649))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb722cfb5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
    #2  0xb732df2d in pthread_cond_wait () from /lib/i686/cmov/libc.so.6
    #3  0xb7a26b49 in IceInternal::ObjectAdapterFactory::waitForShutdown (this=0x8f55b28) at ../../include/IceUtil/Cond.h:215
    #4  0xb7996b95 in Ice::CommunicatorI::waitForShutdown (this=0x8f55780) at CommunicatorI.cpp:119
    #5  0xb75ced69 in orcaice::Application::run (this=0xbf9a7d10, argc=1, argv=0xbf9a7de4) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/application.cpp:190
    #6  0xb79846bb in Ice::Application::mainInternal (this=0xbf9a7d10, argc=1, argv=0xbf9a7de4, initializationData=@0xbf9a7c48) at Application.cpp:576
    #7  0xb79851b8 in Ice::Application::main (this=0xbf9a7d10, argc=1, argv=0xbf9a7de4, initData=@0xbf9a7c48) at Application.cpp:331
    #8  0xb75d0fab in orcaice::Application::orcaMain (this=0xbf9a7d10, argc=1, argv=0xbf9a7de4) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/application.cpp:107
    #9  0x0805b689 in main (argc=1, argv=0xbf9a7de4) at /home/makara/svn/mrsrc-trunk/build/orca/src/components/lasermon/autogen/main.cpp:23
    (gdb) thread 2
    [Switching to thread 2 (Thread 0xb7202b90 (LWP 22650))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb722d2e2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
    #2  0xb732df84 in pthread_cond_timedwait () from /lib/i686/cmov/libc.so.6
    #3  0xb7b973bd in IceUtil::Timer::run (this=0x8f55fc0) at ../../include/IceUtil/Cond.h:241
    #4  0xb7b93f9b in startHook (arg=0x8f56058) at Thread.cpp:368
    #5  0xb72294e5 in start_thread () from /lib/i686/cmov/libpthread.so.0
    #6  0xb731f10e in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 3
    [Switching to thread 3 (Thread 0xb6a01b90 (LWP 22651))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb731f8d6 in epoll_wait () from /lib/i686/cmov/libc.so.6
    #2  0xb7ab7ef3 in IceInternal::ThreadPool::run (this=0x8f568f0) at ../Ice/Selector.h:259
    #3  0xb7ab974c in IceInternal::ThreadPool::EventHandlerThread::run (this=0x8f568b8) at ThreadPool.cpp:782
    #4  0xb7b93f9b in startHook (arg=0x8f568b8) at Thread.cpp:368
    #5  0xb72294e5 in start_thread () from /lib/i686/cmov/libpthread.so.0
    #6  0xb731f10e in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 4
    [Switching to thread 4 (Thread 0xb6200b90 (LWP 22652))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb722cfb5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
    #2  0xb732df2d in pthread_cond_wait () from /lib/i686/cmov/libc.so.6
    #3  0xb79c693d in IceInternal::EndpointHostResolver::run (this=0x8f57338) at ../../include/IceUtil/Cond.h:215
    #4  0xb7b93f9b in startHook (arg=0x8f57338) at Thread.cpp:368
    #5  0xb72294e5 in start_thread () from /lib/i686/cmov/libpthread.so.0
    #6  0xb731f10e in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 5
    [Switching to thread 5 (Thread 0xb59ffb90 (LWP 22653))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb722fbc9 in __lll_lock_wait () from /lib/i686/cmov/libpthread.so.0
    #2  0xb722aff9 in _L_lock_89 () from /lib/i686/cmov/libpthread.so.0
    #3  0xb722a902 in pthread_mutex_lock () from /lib/i686/cmov/libpthread.so.0
    #4  0xb732e0f6 in pthread_mutex_lock () from /lib/i686/cmov/libc.so.6
    #5  0xb7b554d5 in IceUtil::Mutex::lock (this=0x8f564d4) at /usr/include/IceUtil/Mutex.h:211
    #6  0xb7a06638 in IceInternal::LocatorInfo::finishRequest (this=0x8f564c8, ref=@0x8f5aae8, wellKnownRefs=@0x8f5ab44, proxy=@0xb59fee38, notRegistered=false) at ../../include/IceUtil/Lock.h:49
    #7  0xb7a08115 in IceInternal::LocatorInfo::Request::response (this=0x8f5aae0, proxy=@0xb59fee38) at LocatorInfo.cpp:485
    #8  0xb7a08314 in ice_response (this=0x8f5aae0, proxy=@0xb59fee38) at LocatorInfo.cpp:44
    #9  0xb7a16bc4 in Ice::AMI_Locator_findObjectById::__response (this=0x8f5ab5c, __ok=true) at Locator.cpp:626
    #10 0xb7a42e49 in IceInternal::OutgoingAsync::__finished (this=0x8f5ab5c, is=@0xb59ff174) at OutgoingAsync.cpp:395
    #11 0xb79b3795 in Ice::ConnectionI::message (this=0xb4001580, stream=@0xb59ff174, threadPool=@0xb59ff264) at ConnectionI.cpp:1097
    #12 0xb7ab8b33 in IceInternal::ThreadPool::run (this=0x8f58b80) at ThreadPool.cpp:523
    #13 0xb7ab974c in IceInternal::ThreadPool::EventHandlerThread::run (this=0x8f58e10) at ThreadPool.cpp:782
    #14 0xb7b93f9b in startHook (arg=0x8f58e10) at Thread.cpp:368
    #15 0xb72294e5 in start_thread () from /lib/i686/cmov/libpthread.so.0
    #16 0xb731f10e in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 6
    [Switching to thread 6 (Thread 0xb51feb90 (LWP 22654))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb731f8d6 in epoll_wait () from /lib/i686/cmov/libc.so.6
    #2  0xb7a956be in IceInternal::SelectorThread::run (this=0x8f58ee0) at ../Ice/Selector.h:259
    #3  0xb7a96868 in IceInternal::SelectorThread::HelperThread::run (this=0x8f55dd0) at SelectorThread.cpp:279
    #4  0xb7b93f9b in startHook (arg=0x8f55dd0) at Thread.cpp:368
    #5  0xb72294e5 in start_thread () from /lib/i686/cmov/libpthread.so.0
    #6  0xb731f10e in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 7
    [Switching to thread 7 (Thread 0xb49fdb90 (LWP 22655))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb722cfb5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
    #2  0xb732df2d in pthread_cond_wait () from /lib/i686/cmov/libc.so.6
    #3  0xb799baf1 in IceInternal::ConnectRequestHandler::getConnection (this=0x8f5a788, waitInit=true) at ../../include/IceUtil/Cond.h:215
    #4  0xb799d234 in IceInternal::ConnectRequestHandler::sendRequest (this=0x8f5a788, out=0xb49fcd7c) at ConnectRequestHandler.cpp:216
    #5  0xb7a49431 in IceInternal::Outgoing::invoke (this=0xb49fcd7c) at Outgoing.cpp:177
    #6  0xb7a72258 in IceDelegateM::Ice::Object::ice_isA (this=0x8f597f8, __id=@0xb787dbe8, context=0x0) at Proxy.cpp:1139
    #7  0xb7a70221 in IceProxy::Ice::Object::ice_isA (this=0x8f59d00, typeId=@0xb787dbe8, context=0x0) at Proxy.cpp:125
    warning: (Internal error: pc 0xb76029dd in read in psymtab, but not in symtab.)
    
    #8  0xb76029de in IceInternal::checkedCastImpl<IceInternal::ProxyHandle<IceProxy::IceGrid::Registry> > (b=warning: (Internal error: pc 0xb7602830 in read in psymtab, but not in symtab.)
    
    warning: (Internal error: pc 0xb76029dd in read in psymtab, but not in symtab.)
    
    @0xb49fd03c, context=0x0) at /usr/include/Ice/Proxy.h:92
    #9  0xb75fdda9 in orcaice::detail::registerHomeInterface (context=@0x8f5943c) at /usr/include/Ice/ProxyHandle.h:92
    #10 0xb75f4c7b in orcaice::detail::ComponentThread::tryRegisterHome (this=0x8f59410) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/detail/componentthread.cpp:87
    #11 0xb75f5211 in orcaice::detail::ComponentThread::walk (this=0x8f59410) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/detail/componentthread.cpp:60
    #12 0xb7b67804 in gbxiceutilacfr::SafeThread::run (this=0x8f59410) at /home/makara/svn/mrsrc-stable/gearbox/src/gbxsickacfr/gbxiceutilacfr/safethread.cpp:31
    #13 0xb7b93f9b in startHook (arg=0x8f59410) at Thread.cpp:368
    #14 0xb72294e5 in start_thread () from /lib/i686/cmov/libpthread.so.0
    #15 0xb731f10e in clone () from /lib/i686/cmov/libc.so.6
    (gdb) thread 8
    [Switching to thread 8 (Thread 0xb3fffb90 (LWP 22656))]#0  0xb7f8a424 in __kernel_vsyscall ()
    (gdb) where
    #0  0xb7f8a424 in __kernel_vsyscall ()
    #1  0xb722cfb5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
    #2  0xb732df2d in pthread_cond_wait () from /lib/i686/cmov/libc.so.6
    #3  0xb7a49a11 in IceInternal::Outgoing::invoke (this=0xb3ffe4bc) at ../../include/IceUtil/Cond.h:215
    #4  0xb7a1950c in IceDelegateM::Ice::Locator::getRegistry (this=0x8f59c10, __context=0x0) at Locator.cpp:1427
    #5  0xb7a11b86 in IceProxy::Ice::Locator::getRegistry (this=0x8f55e40, __ctx=0x0) at Locator.cpp:1158
    #6  0xb7a0302a in IceInternal::LocatorInfo::getLocatorRegistry (this=0x8f564c8) at ../../include/Ice/Locator.h:447
    #7  0xb7a34242 in Ice::ObjectAdapterI::updateLocatorRegistry (this=0x8f573f0, locatorInfo=@0xb3ffecb0, proxy=@0xb3ffeca0, registerProcess=false) at ObjectAdapterI.cpp:1140
    #8  0xb7a36927 in Ice::ObjectAdapterI::activate (this=0x8f573f0) at ObjectAdapterI.cpp:111
    #9  0xb75e4283 in orcaice::Context::activate (this=0x8f59b50) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/context.cpp:47
    #10 0xb75dc2a3 in orcaice::activate (context=@0x8f59b50, thread=0x8f59b18, subsysName=@0xb3fff160, retryInterval=2, retryNumber=-1) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/iceutils.cpp:145
    #11 0x0805390b in lasermon::MainThread::initialise (this=0x8f59b18) at /home/makara/svn/mrsrc-trunk/orca/src/components/lasermon/mainthread.cpp:40
    #12 0xb75d771c in orcaice::SubsystemThread::walk (this=0x8f59b18) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/subsystemthread.cpp:25
    #13 0xb75c8fcd in orcaice::SubstatusThread::run (this=0x8f59b18) at /home/makara/svn/mrsrc-trunk/orca/src/libs/orcaice/substatusthread.cpp:34
    #14 0xb7b93f9b in startHook (arg=0x8f59b18) at Thread.cpp:368
    #15 0xb72294e5 in start_thread () from /lib/i686/cmov/libpthread.so.0
    #16 0xb731f10e in clone () from /lib/i686/cmov/libc.so.6
    
  • benoit (Rennes, France)
    Hi Alex,

    I'm afraid this is a new problem :( It should only occur if a thread resolves the endpoints of an adapter or well-known object while another thread activates an indirect object adapter (only the first indirect object adapter).

    You should be able to work around it by increasing the client thread pool size or by ensuring that no threads invoke on an indirect proxy while the first indirect object adapter is being activated. We'll look into fixing this for the next release.
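    For the first workaround, the client thread pool size can be raised in the server's configuration file; the default size is 1, and the value below is just an example (what's appropriate depends on your application):

```
Ice.ThreadPool.Client.Size=2
```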

    Cheers,
    Benoit.
  • ok, thanks for checking it out.