What is the best setup for minimizing the effect when a grid node is down

aozarov · March 2008

When a grid node is down (crash, maintenance), what is the best strategy to minimize the effect on clients and avoid useless connection attempts? I should mention that we would like to distribute the load across the replicated nodes as much as possible, and would therefore prefer not to cache the connection on the proxy and to use the random selection strategy (as described in section 37.3 of the manual). I am aware that we can configure frequent registry lookups to refresh the endpoints, but I wonder how quickly the registry becomes aware that a node is unavailable and stops including its endpoints. I also wonder whether there is a concept of "bad endpoints" that are temporarily not used after consecutive failures. Any other suggestions?
Thanks,
Arie.

benoit · March 2008

How quickly the registry detects that a node is down depends on the IceGrid.Registry.NodeSessionTimeout property; see the Ice manual for more information on this property. Once the registry detects that a node is down, it no longer returns endpoints for servers deployed on that node.

Ice doesn't have any concept of "bad endpoints". If you have a commercial interest in such a feature, please contact us at info@zeroc.com.
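
Since Ice has no such concept built in, an application that wants this behaviour would have to track failures itself. The following is only a rough sketch of that idea in plain Python: the class EndpointQuarantine, its parameters (max_failures, cooldown), and the endpoint strings are all hypothetical and are not part of the Ice API. It sets an endpoint aside after a number of consecutive failures and makes it eligible again once a cooldown has passed.

```python
import random
import time


class EndpointQuarantine:
    """Hypothetical application-level helper; not part of Ice."""

    def __init__(self, max_failures=3, cooldown=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before blocking
        self.cooldown = cooldown          # seconds an endpoint stays blocked
        self.clock = clock                # injectable so it can be tested
        self.failures = {}                # endpoint -> consecutive failures
        self.blocked_until = {}           # endpoint -> time it may be retried

    def record_failure(self, endpoint):
        count = self.failures.get(endpoint, 0) + 1
        self.failures[endpoint] = count
        if count >= self.max_failures:
            # The count is kept, so a failure right after the cooldown
            # expires blocks the endpoint again immediately.
            self.blocked_until[endpoint] = self.clock() + self.cooldown

    def record_success(self, endpoint):
        # Any success clears the endpoint's history.
        self.failures.pop(endpoint, None)
        self.blocked_until.pop(endpoint, None)

    def usable(self, endpoints):
        now = self.clock()
        return [e for e in endpoints if self.blocked_until.get(e, 0.0) <= now]

    def pick(self, endpoints):
        # Random choice among usable endpoints; if every endpoint is
        # blocked, fall back to the full list rather than fail outright.
        candidates = self.usable(endpoints) or list(endpoints)
        return random.choice(candidates)
```

The caller would invoke record_failure or record_success after each connection attempt and use pick to choose the next endpoint. How endpoints are identified and how failures are detected is left to the application.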

To minimize connection attempts to invalid endpoints, you'll need to configure the client to look up server endpoints with the registry often, and set the registry's node session timeout to a value low enough to detect nodes that have shut down in a timely manner.
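
As a rough illustration, the two settings could look something like the property fragments below. The host name, port, and timeout values are placeholders chosen for the example, not recommendations. Both timeouts are in seconds; check the property reference in the Ice manual for the exact semantics in your Ice version.

```
# Client configuration (values are illustrative assumptions)
Ice.Default.Locator=IceGrid/Locator:tcp -h registry.example.com -p 4061
# Ignore cached endpoints older than 5 seconds, so lookups go back to the registry
Ice.Default.LocatorCacheTimeout=5
# Pick randomly among the endpoints returned for a replicated object
Ice.Default.EndpointSelection=Random

# Registry configuration (value is an illustrative assumption)
# A shorter timeout lets the registry notice a dead node sooner
IceGrid.Registry.NodeSessionTimeout=10
```

Shorter timeouts mean faster detection, at the cost of more lookup and session traffic between clients, nodes, and the registry.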

Cheers,
Benoit.

aozarov · March 2008

Thanks!
I assume that once the node is up again, its registration with the registry is immediate (provided icegridnode is configured to start when the machine reboots)?

benoit · March 2008

Yes, the node establishes a session with the registry on startup.

Cheers,
Benoit.

Archived
