Application Health/Status monitor in IceGrid

Something like this may already exist, but if it doesn't, this is what I'd like to see.

IceGrid could monitor the health and status of the applications it starts. One way to do this would be for IceGrid to call a “health” or “admin” interface on the application and pass it a “health/status” callback interface, which the application would then call at some configurable rate.

If the application stops calling this callback interface for a configurable timeout period, IceGrid would begin recovery procedures, such as calling the application's “admin” interface to attempt a clean shutdown. If that fails, it could kill the application and restart it.

This would detect an application that is still alive but not responding to requests. A test that only checks whether the process is still alive may not detect this.
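The timeout-and-escalation logic proposed above can be sketched in plain Python. This is only an illustration of the idea, not an IceGrid feature or API: `Watchdog`, `heartbeat`, `stale`, `recover`, and the `shutdown`/`kill`/`restart` callables are all hypothetical names, and the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Watchdog:
    """Hypothetical monitor: tracks each server's last heartbeat."""
    timeout: float                               # seconds of silence tolerated
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, server_id: str) -> None:
        # Called (directly or indirectly) whenever a server reports in.
        self.last_seen[server_id] = self.clock()

    def stale(self) -> List[str]:
        # Servers whose last heartbeat is older than the timeout.
        now = self.clock()
        return [s for s, t in self.last_seen.items() if now - t > self.timeout]

    def recover(self, server_id: str,
                shutdown: Callable[[str], bool],
                kill: Callable[[str], None],
                restart: Callable[[str], None]) -> str:
        # Step 1: ask for a clean shutdown; treat an exception as failure.
        try:
            clean = shutdown(server_id)
        except Exception:
            clean = False
        # Step 2: if the clean shutdown did not work, kill the process.
        if not clean:
            kill(server_id)
        # Step 3: restart and reset the heartbeat timestamp.
        restart(server_id)
        self.last_seen[server_id] = self.clock()
        return "shutdown" if clean else "killed"
```

A monitor loop would call `stale()` periodically and run `recover()` for each server it returns; how the three callables reach the real process is left open here.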

Comments

  • bernard (Jupiter, FL)
    Hi John,

    A server started by IceGrid is really 'forked' by an IceGrid node, and each IceGrid node monitors its child processes and will, for example, detect when one of them dies. There is, however, no automatic pinging from the child process to the IceGrid node (or vice versa).

    Each server managed with IceGrid also includes an admin object (see http://www.zeroc.com/doc/Ice-3.4.1-IceTouch/manual/Adv_server.33.18.html). IceGrid and your application can use this admin object for various purposes, for example to shut the server down cleanly, send a message to the server's stderr or stdout, or retrieve the server's properties (and more).

    If you want to write a health-monitoring service that checks the status of your servers, it would make a lot of sense to use this admin object in all your servers, and possibly add a new facet with additional functionality (e.g. the ability to provide a callback that will be called every x seconds).

    All the best,
    Bernard
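The "callback called every x seconds" idea from the reply above could look roughly like this on the application side. It is a plain-Python sketch under stated assumptions: `HeartbeatFacet`, `register`, and `stop` are hypothetical names, and the code does not use the Ice admin facet API. In a real server the callback would be a proxy supplied by the monitoring service.

```python
import threading
from typing import Callable, Optional


class HeartbeatFacet:
    """Hypothetical facet: invokes a registered callback at a fixed interval."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def register(self, callback: Callable[[], None], interval: float) -> None:
        # Replace any previous registration, then start a new heartbeat thread.
        self.stop()
        self._stop = threading.Event()
        stop = self._stop

        def run() -> None:
            # wait() returns False on timeout, True once stop is set.
            while not stop.wait(interval):
                callback()

        self._thread = threading.Thread(target=run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        # Signal the heartbeat thread to exit and wait for it to finish.
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None
```

One design caveat: a heartbeat sent from its own thread only shows that this thread is alive. To detect a server that has stopped answering requests, the callback would probably need to be driven from, or at least check on, the code path that handles requests.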