IceGrid server termination (stopServer) process

spsoni · April 2008

Hi There,

I have recently started using IceGrid for running my application.

My C++ server application uses CtrlCHandler to handle application termination (graceful server shutdown, cleaning up some data and threads).

When I manually find the server process (started automatically by IceGrid) and kill it with the Unix kill command, the CtrlCHandler is invoked and the server shuts down cleanly; IceGrid later restarts it. This is the expected behaviour and is fine.

But when I stop the running server from the IceGrid Admin GUI tool, my server seems to be immediately isolated from any further communication and later killed (possibly by kill -9), as I cannot see the debug message from the CtrlCHandler in my server.

My first question is: how does this "server stop" work from the IceGrid Admin GUI, or in general?

Second, how can I configure the IceGrid Admin tool to terminate my server application safely (possibly by just sending Ctrl+C, I mean the actual Unix function call for it, rather than the equivalent of kill -9)? I am missing async callbacks on my client, since the IceGrid Admin "server stop" seems to isolate my server application from any message communication first and then kill it. Please explain.

Thanks.

michi · April 2008

spsoni wrote: »

But when I stop the running server from the IceGrid Admin GUI tool, my server seems to be immediately isolated from any further communication and later killed (possibly by kill -9), as I cannot see the debug message from the CtrlCHandler in my server.

My first question is: how does this "server stop" work from the IceGrid Admin GUI, or in general?

IceGrid stops servers by invoking the shutdown operation on the Process interface. Servers that have their adapter's RegisterProcess property set (in 3.2.1 and earlier) provide a servant that implements the Process interface. When the shutdown operation is invoked, the server deactivates itself normally by shutting down the object adapters and destroying the communicator.

IceGrid will kill the server with a signal only if it doesn't shut down within its deactivation timeout (60 seconds, by default). If the server does not have RegisterProcess set, IceGrid first tries to kill the server with SIGTERM and then, if it doesn't go away within the deactivation timeout, sends it a SIGKILL. (Under Windows, it sends a Ctrl+Break event and, if the server does not go away after the timeout, kills the process outright.)
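The escalation described above (ask politely, wait for a timeout, then force the kill) can be sketched in Python as follows. This only illustrates the sequence on a POSIX system and is not IceGrid's actual implementation; `stop_process` and its `timeout` parameter are hypothetical names.

```python
import signal
import subprocess


def stop_process(proc, timeout=60.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL.

    `proc` is a subprocess.Popen object. Returns the name of the signal
    that was needed to stop it.
    """
    proc.send_signal(signal.SIGTERM)  # polite request: the process's handlers may run
    try:
        proc.wait(timeout=timeout)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()                   # SIGKILL on POSIX: cannot be caught or ignored
        proc.wait()
        return "SIGKILL"
```

A server that handles SIGTERM and exits within the timeout never sees the SIGKILL, which is why cleanup code in a signal handler only helps if it finishes before the timeout expires.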

Second, how can I configure the IceGrid Admin tool to terminate my server application safely (possibly by just sending Ctrl+C, I mean the actual Unix function call for it, rather than the equivalent of kill -9)? I am missing async callbacks on my client, since the IceGrid Admin "server stop" seems to isolate my server application from any message communication first and then kill it. Please explain.

If you want to intercept server deactivation for your application, you can clear RegisterProcess and register a CtrlCHandler. In that case, the handler's callback will be called when the signal arrives, and you can arrange for the application to clean up from there.
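The server in this thread is C++ and would use Ice's CtrlCHandler for this. As a language-neutral illustration of the same pattern (the handler only records the signal, and the main thread does the cleanup), here is a sketch using Python's standard signal module. `ShutdownFlag` and `run_server` are hypothetical names, not part of Ice.

```python
import signal
import threading


class ShutdownFlag:
    """Records that a termination signal arrived; cleanup happens elsewhere."""

    def __init__(self, signums=(signal.SIGINT, signal.SIGTERM)):
        self._event = threading.Event()
        self.received = None
        for signum in signums:
            # Must be called from the main thread.
            signal.signal(signum, self._on_signal)

    def _on_signal(self, signum, frame):
        # Keep the handler tiny: remember the signal and wake the waiter.
        self.received = signum
        self._event.set()

    def wait(self, timeout=None):
        return self._event.wait(timeout)


def run_server(flag, cleanup):
    """Block until a signal arrives, then run the cleanup callable once."""
    while not flag.wait(timeout=0.1):
        pass  # a real server would be serving requests on other threads
    cleanup()
```

SIGKILL cannot be intercepted this way, so the cleanup only runs for signals such as SIGINT and SIGTERM.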

As of Ice 3.3 beta, the implementation of this has changed somewhat; in particular, you can replace the default Process facet with one of your own that implements server shutdown.

See http://www.zeroc.com/doc/Ice-3.3.0b/manual/IceGrid.36.21.html and http://www.zeroc.com/doc/Ice-3.3.0b/manual/Adv_server.29.18.html#124997 for more detail.

Cheers,

Michi.

spsoni · May 2008

I am using FreeBSD with Ice 3.1.1, with C++ for the server and Python for the clients.

Thank you Michi.

If the server does not have RegisterProcess set, IceGrid first tries to kill the server with SIGTERM

I have implemented a CtrlCHandler to handle the termination signal from IceGrid and unchecked RegisterProcess so that IceGrid sends SIGTERM to my application. Now my application terminates smoothly and restarts on the next request from a client.

Now I am facing two problems:

1. Occasionally, when my server application is terminating (via stop from the IceGrid Admin tool), my log shows that the callback (ice_response) has been sent from the server, but my client application does not receive it. I even call adapter.deactivate() and adapter.waitForDeactivate() so that all callback responses are sent back to the clients. But I am a bit confused about how to trace whether a callback has been sent to the client successfully, as ice_response() does not throw any exception!

2. When I terminate my server application with the Unix kill command, using the process ID shown in the IceGrid Admin GUI tool, an exception is thrown directly to my client application, rather than IceGrid noticing the server termination and starting a new instance to serve the client. I tried different signal numbers with the kill command (1, 2, 3, 6, 9, 14, 15) and got NoEndpointException and ConnectionLostException with them. Please explain how I can get IceGrid to step in when the server terminates like this and start a new instance of the server for the clients.

My client connects to the server through the registry with the following code.

import traceback

import Ice
import IceGrid
import Dbice  # Slice-generated module for DbInterface (module name assumed from the usage below)


class ChannelManager:
    def __init__(self, argv, configFile=None):
        # Load the Ice configuration from a file, if one was given.
        initData = Ice.InitializationData()
        if configFile:
            initData.properties = Ice.createProperties()
            initData.properties.load(configFile)

        # Initialise the communicator itself.
        self.communicator = Ice.initialize(argv, initData)

    def _create_channel(self, stringified_proxy):
        channel = None  # stays None if the lookup fails with an unexpected error
        try:
            channel = Dbice.DbInterfacePrx.checkedCast(
                self.communicator.stringToProxy(stringified_proxy))
        except Ice.NotRegisteredException:
            # The locator does not know this proxy: fall back to a lookup
            # by identity through the IceGrid query interface.
            query = IceGrid.QueryPrx.checkedCast(
                self.communicator.stringToProxy("IceGrid/Query"))
            identity = Ice.Identity(stringified_proxy)
            channel = Dbice.DbInterfacePrx.checkedCast(query.findObjectById(identity))
        except (Ice.ConnectionRefusedException, Ice.ConnectionLostException):
            # Let connection failures propagate to the caller.
            raise
        except Exception:
            traceback.print_exc()

        if not channel:
            raise RuntimeError("invalid proxy")
        return channel

Thanks in advance.

spsoni · May 2008

2. When I terminate my server application with the Unix kill command, using the process ID shown in the IceGrid Admin GUI tool, an exception is thrown directly to my client application, rather than IceGrid noticing the server termination and starting a new instance to serve the client. I tried different signal numbers with the kill command (1, 2, 3, 6, 9, 14, 15) and got NoEndpointException and ConnectionLostException with them. Please explain how I can get IceGrid to step in when the server terminates like this and start a new instance of the server for the clients.

My apologies for the question above. To fix problem 1, I had deliberately added a 2-second delay to the server termination process to make sure any pending callbacks are sent. But my client was retrying the connection immediately after the server failed, and IceGrid will not restart the server until the first instance has fully terminated. Adding a delay between consecutive connection requests from the client solved the problem.
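The client-side fix described above, pausing between consecutive connection attempts, might look roughly like this. It is a minimal sketch: `call_with_retry` and its parameters are hypothetical names, and in the real client `retry_on` would presumably list the Ice exceptions mentioned in this thread (NoEndpointException, ConnectionLostException).

```python
import time


def call_with_retry(connect, attempts=5, delay=2.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call `connect()` until it succeeds, pausing `delay` seconds after each failure.

    Re-raises the last error if all `attempts` fail.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay)  # give the old server instance time to terminate fully
    raise last_error
```

The `sleep` parameter exists only so that the pause can be replaced in tests; normal callers would leave it at its default.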

But I am still worried about my problem 1. Any clues?

Thanks.

benoit · May 2008

Hi,

Could you upgrade to Ice 3.2.1 or 3.3b? We only provide free support on the forums for the latest Ice version. Also without more information, it's difficult to say what the problem could be. Did you try running your client & server with --Ice.Trace.Protocol=2 to see if the response was sent to the client and if the client receives it?

Cheers,
Benoit.

spsoni · May 2008

Thank you Benoit,

Now I am using Ice 3.2.1 on FreeBSD 6.2.

After setting the Ice.Trace.Protocol=2 property, I was able to see the following messages on the client and server respectively.

Client Protocol Trace.

[ 05/09/08 11:13:20.827 Protocol: sending asynchronous request
  message type = 0 (request)
  compression status = 0 (not compressed; do not compress response, if any)
  message size = 1052
  request id = 40
  identity = DbPutInterface
  facet =
  operation = put
  mode = 0 (normal)
  context =  ]
[ 05/09/08 11:13:20.842 Protocol: received close connection
  message type = 4 (close connection)
  compression status = 1 (not compressed; compress response, if any)
  message size = 14 ]

Server Protocol Trace.

[ 05/09/08 11:15:31.451 Protocol: received request during closing
  (ignored by server, client will retry)
  message type = 0 (request)
  compression status = 0 (not compressed; do not compress response, if any)
  message size = 1052
  request id = 40
  identity = DbPutInterface
  facet =
  operation = put
  mode = 0 (normal)

So, when I killed my server application, I thought the server was properly sending the callback to the client and the client was not receiving it. But after looking at the trace above, I learned that the server does not throw any exception to the client to let it know that the request arrived after closing had started.

My client is not receiving any exception from the Ice layer on receiving the following message:

[ 05/09/08 11:13:20.842 Protocol: received close connection

Since I am using IceGrid, on receiving the close connection message the client asks IceGrid to restart the server application. But how do I let the thread waiting on my previous async call know that there will be no callback, since the server has terminated and a new server instance will be started?

I do not wish to use a timed wait for the callback on my client, as it would affect overall performance. Please advise.

Thanks.

benoit · May 2008

Hi,

spsoni wrote: »

So, when I killed my server application, I thought the server was properly sending the callback to the client and the client was not receiving it. But after looking at the trace above, I learned that the server does not throw any exception to the client to let it know that the request arrived after closing had started.

My client is not receiving any exception from the Ice layer on receiving the following message:

[ 05/09/08 11:13:20.842 Protocol: received close connection

That's not quite correct. The Ice runtime does throw Ice::CloseConnectionException for your twoway invocation upon receiving the close connection message. However, you don't get this exception because the Ice runtime also transparently retries the invocation since it's safe to do so. You can enable Ice.Trace.Retry=1 to see the retry. If you disable retries with Ice.RetryIntervals=-1, you will get the exception.
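The two properties mentioned above would go in the client's Ice configuration file, roughly as shown here. Only the property names and values come from this thread; the comments are added for illustration, and in practice you would normally use one setting or the other depending on whether you want to observe the retries or disable them.

```
# Log each automatic retry performed by the Ice runtime.
Ice.Trace.Retry=1

# Disable automatic retries, so the invocation fails with the exception instead.
Ice.RetryIntervals=-1
```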

Since I am using IceGrid, on receiving the close connection message the client asks IceGrid to restart the server application. But how do I let the thread waiting on my previous async call know that there will be no callback, since the server has terminated and a new server instance will be started?

I do not wish to use a timed wait for the callback on my client, as it would affect overall performance. Please advise.

Thanks.

It's not clear to me what this "previous async call waiting thread" and callback are. Is this another twoway AMI request? Or are you waiting for a callback from the server on one of the client's Ice objects? Perhaps you could post the code of your client; it would help us understand what your client is doing.

Cheers,
Benoit.

spsoni · May 2008

Attached is the source code of my client application.

I am trying to build a higher-level API for other developers to use. The AMI call is wrapped inside my API so that the API user sees it as a synchronous call.

Every AMI invocation in the client application waits for its callback object's ice_response or ice_exception to be invoked before it can proceed. When I terminate my server application between an AMI call and its callback, occasionally the last message reaches the server after the termination process has started.
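For readers without the attachment, a wrapper of the kind described above might look roughly like the following sketch. It is not the attached client code: `BlockingCallback` and `call_synchronously` are hypothetical names, and `async_op` stands for any AMI-style operation that takes a callback object as its first argument.

```python
import threading


class BlockingCallback:
    """AMI-style callback object that lets the calling thread wait for the outcome."""

    def __init__(self):
        self._done = threading.Event()
        self._result = None
        self._error = None

    def ice_response(self, *result):
        self._result = result
        self._done.set()

    def ice_exception(self, ex):
        self._error = ex
        self._done.set()

    def wait(self, timeout=None):
        # With timeout=None this blocks until one of the two methods is called.
        if not self._done.wait(timeout):
            raise TimeoutError("no AMI callback within %s seconds" % timeout)
        if self._error is not None:
            raise self._error
        return self._result


def call_synchronously(async_op, *args, timeout=None):
    """Invoke `async_op(callback, *args)` and block until the callback fires."""
    cb = BlockingCallback()
    async_op(cb, *args)
    return cb.wait(timeout)
```

With this shape, the caller hangs whenever neither ice_response nor ice_exception is ever invoked, which is the symptom discussed in this thread.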

Since the put_async and get_async AMI calls on my server are not idempotent, I wonder why Ice would retry the invocation on receiving the close connection message. Instead, it should pass an exception to the callback object, saying that the AMI invocation was received after the server's closing process had started, or something similar.

I hope the client source code makes this clearer.

Cheers.

benoit · May 2008

See here in the Ice manual for the reason why Ice retries after getting the close connection message.

In short, it's safe to retry without breaking at-most-once semantics and even if the invocation is not idempotent because the client runtime received the close connection message from the server and this indicates that the request wasn't dispatched (it was instead silently ignored as indicated by the trace).

I better understand what your client is doing now, but it's still not clear to me what the problem is. Are you saying that the AMI callback ice_response/ice_exception methods are never called when this occurs? This should definitely not be the case since the Ice client runtime is supposed to retry the invocation and eventually call on the AMI callback if the request succeeded/failed after this retry. You should enable Ice.Trace.Protocol=2, Ice.Trace.Retry=1 and Ice.Trace.Network=2 to figure out what occurs when the Ice runtime retries the invocation.

Cheers,
Benoit.

spsoni · May 2008

Hi,

Ice.RetryIntervals=-1 seems to have solved the problem for me. I was not getting the AMI callback object's ice_response or ice_exception when Ice::CloseConnectionException was thrown, since the invocation was supposed to be retried.

One more question out of curiosity: does Ice keep the callback object passed to the AMI call, so that we can rely on Ice using the same callback object from the first invocation when it retries after receiving CloseConnectionException?

Thanks.

benoit · May 2008

Hi,

Yes, the Ice runtime uses the AMI callback object for the retried invocation.

The Ice runtime should always call the ice_response/ice_exception methods on the callback object. If the first invocation fails and is retried, the Ice runtime will call the AMI callback object only once the retried invocation fails (and can't be retried anymore) or succeeds. It's the same as regular 2-way calls: a 2-way call returns only once it failed and can't be retried or once the response is received.
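As a rough mental model of the behaviour described above (and not Ice's actual code), the callback is notified exactly once, after the final attempt. In this sketch `invoke_with_retries` and `RecordingCallback` are hypothetical names, and Python's built-in ConnectionError stands in for a retryable failure such as Ice::CloseConnectionException.

```python
class RecordingCallback:
    """Stand-in for an AMI callback object; records every call it receives."""

    def __init__(self):
        self.calls = []

    def ice_response(self, result):
        self.calls.append(("response", result))

    def ice_exception(self, ex):
        self.calls.append(("exception", ex))


def invoke_with_retries(send, callback, max_retries=1):
    """Toy model: retry `send()` silently, then report the final outcome once."""
    retries_left = max_retries
    while True:
        try:
            result = send()
        except ConnectionError as ex:   # stand-in for a retryable connection failure
            if retries_left > 0:
                retries_left -= 1       # retried transparently: callback not called
                continue
            callback.ice_exception(ex)  # out of retries: failure reported once
            return
        callback.ice_response(result)   # success reported once, to the same callback
        return
```

In this model, `max_retries=0` plays the role of disabling retries: the first failure goes straight to ice_exception.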

Note that under some circumstances, it might look like the AMI callback isn't called by the Ice runtime. This can occur for example if you don't use timeouts. In such a case it might take a very long time for the call to fail if the server is unreachable or unresponsive.

Cheers,
Benoit.
