stopping an icebox server via the admin proxy

alverson · June 2013

I have recently upgraded from ICE 3.4.1 to ICE 3.5. With ICE 3.4.1, when I remove an application via the admin proxy, the corresponding icebox server is stopped. This does not occur with ICE 3.5.

I can see that all my services are being destroyed via log messages. However, with ICE 3.5 it is not returning from the removeApplication call until the deactivation-timeout has expired (default=60 secs). I have also set breakpoints to verify the server has received the stop signal. It's just not terminating for some reason.

This worked fine with ICE 3.4.1. Have the assumptions changed with ICE 3.5.0?

I also verified the same behavior by calling stopServer(serverId) via the admin proxy.

My current workaround is to set a shorter deactivation-timeout on my icebox, which reduces the time it takes to terminate the server.

benoit · July 2013

Hi,

This should also work with Ice 3.5. It appears that your IceBox server is no longer shutting down gracefully but hangs when the IceBox server is stopped.

Did you try to attach to the IceBox server with the debugger to see the stack traces of all the threads? This might give us a hint about what the IceBox server is doing. Does it occur all the time? For example, if you start the IceBox server and then immediately try to stop it with the IceGrid admin stop command, does it also hang?

Cheers,
Benoit.

alverson · July 2013

This is very strange.

When I break into the icebox server at ServiceManagerI::stop and step through it, everything shuts down correctly and the process exits. But if I just let it run, it appears to wait on something to finish before exiting, and it never exits. Obviously, stopping the process in the debugger is changing the timing characteristics.

I have been able to verify that all my services are being "stopped" successfully via log messages.

I guess I will need to compile the ICE code so I can insert log messages there to determine where exactly it is blocking since breaking into it is causing the timing characteristics to change.

benoit · July 2013

Hi Dennis,

Did you also try to not run the process under the debugger but instead let it hang on shutdown and then attach the debugger? You should be able to get the stacks this way. Btw, which platform do you use to run the IceBox server?

Cheers,
Benoit.

alverson · July 2013

Though this does not explain why my application executed differently under 3.4.1 than it does under 3.5, I think I found the problem.

I have a logging singleton used by all of our back-end server code that contains an ICE thread to send log messages back to the GUI client. The ICE thread uses a monitor lock to receive data over a queue. I am attempting to terminate the thread in the destructor by notifying the monitor lock to unblock the thread so it can gracefully exit. Since the object is a singleton, the destructor is not called until the process is about to completely exit. By then the ICE communicator has been shut down, along with (I'm assuming) all ICE threads, resources, etc. associated with that communicator. Therefore, I am blocking on notify indefinitely.
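The queue-plus-monitor pattern described above can be sketched roughly as follows. This is only a minimal sketch, not the original code: it uses standard C++ primitives (std::mutex and std::condition_variable) as a stand-in for the Ice monitor, and the names LogQueue, push, pop, close, and queueExample are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the monitor-protected log queue.
class LogQueue {
public:
    // Enqueue a message and wake one waiting consumer.
    // Messages pushed after close() are dropped.
    void push(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return;
            items_.push_back(std::move(msg));
        }
        cond_.notify_one();
    }

    // Block until a message is available or the queue is closed.
    // Returns false once the queue is closed and fully drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    // Mark the queue closed and wake every waiting consumer so it can exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cond_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<std::string> items_;
    bool closed_ = false;
};

// Usage: a pending message is still delivered after close(),
// then pop() reports that the consumer should stop.
void queueExample() {
    LogQueue q;
    q.push("hello");
    q.close();
    std::string msg;
    bool got = q.pop(msg);
    assert(got && msg == "hello");
    got = q.pop(msg);
    assert(!got);
}
```

The point of the sketch is that the consumer thread only exits when something calls close() and the notification is actually delivered, so where and when that call happens matters.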

I appreciate the help.

benoit · July 2013

Hi Dennis,

It's not clear if you found the solution to your problem. It would be interesting to figure out what caused the change in behavior. If I understand it correctly, your destructor isn't being called anymore and as a result the thread stays around preventing the process from shutting down. This could possibly be caused by a memory management issue. Is the Ice communicator keeping a reference to this shared logger? Is this something you could perhaps easily reproduce in a small test case?

Cheers,
Benoit.

alverson · July 2013

I've uploaded a snapshot of the debugger showing where it is blocked. It is blocked in IceUtil::Cond::signal and the _cond variable seems to be null. I'm not sure if this is causing a problem. The queueing mechanism is working fine during normal operation. BTW, my solution is to not perform termination processing in the destructor, because the process is exiting.
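Moving the termination out of the destructor might look something like the sketch below. It rests on several assumptions: Logger, shutdown, stopService, delivered, and loggerExample are hypothetical names, standard C++ threads stand in for the ICE thread and monitor, and the step that sends a message to the GUI client is replaced by a counter. The idea is that shutdown() is called explicitly while the process is still fully alive (for example when the service is stopped), so by the time the singleton's destructor runs there is nothing left to signal.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical logging singleton with an explicit shutdown step.
class Logger {
public:
    static Logger& instance() {
        static Logger logger;
        return logger;
    }

    // Queue a message for the worker thread; dropped after shutdown().
    void log(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return;
            queue_.push_back(std::move(msg));
        }
        cond_.notify_one();
    }

    // Stop and join the worker. Safe to call more than once.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return;
            stopping_ = true;
        }
        cond_.notify_all();
        worker_.join();
    }

    std::size_t delivered() {
        std::lock_guard<std::mutex> lock(mutex_);
        return delivered_;
    }

    // Fallback only: if shutdown() already ran, this is a no-op.
    ~Logger() { shutdown(); }

private:
    Logger() : worker_(&Logger::run, this) {}

    // Worker loop: drain pending messages, exit once shutdown is requested.
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cond_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            while (!queue_.empty()) {
                queue_.pop_front();
                ++delivered_;  // stand-in for sending to the GUI client
            }
            if (stopping_) return;
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<std::string> queue_;
    std::size_t delivered_ = 0;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist first
};

// Hypothetical hook standing in for the service's stop callback.
void stopService() { Logger::instance().shutdown(); }

void loggerExample() {
    Logger::instance().log("service starting");
    Logger::instance().log("service stopping");
    stopService();  // worker drains both messages, then exits
    assert(Logger::instance().delivered() == 2);
}
```

Whether this avoids the hang seen with the Ice condition variable is untested here; it only illustrates doing the wake-up and join before process teardown begins.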

You can also see that the main thread is the only thread remaining because the destructor is being called right before the process exits.

You can also see the stack trace showing the sequence of calls. The logging thread that blocks waiting for data on the queue has already exited.

I will try and develop a smaller example demonstrating the same problem.

benoit · July 2013

Hi Dennis,

Which Windows version do you use? The VS2012 build of Ice now uses native Windows condition variables whereas the VS2010 build still uses semaphores. This could explain the change in behavior between Ice 3.4 and Ice 3.5. It's not clear to me why the wake-up of the condition variable hangs here, however. I was unable to reproduce the problem with one of our demos (a modified IceBox hello service). It would be great if you could send us a small test case demonstrating the problem.

Cheers,
Benoit.
