When shutting down one node in IceGrid, the client waits a long time to reconnect

We configured IceGrid with two servers (nodes).
If one of them is shut down, the client that was connected to it reconnects to the other server the next time it uses the existing proxy to invoke an operation. This reconnection takes about 20 seconds, which seems too long.

I tried setting "IceGrid.Registry.NodeSessionTimeout=5", but it only helped when the client and the original server were on the same computer.

How can we shorten the reconnection time?

Thanks

Comments

  • benoit (Rennes, France)
    Hi,

    It's difficult to give you an answer without knowing a little more about your client and server. Could you post some sample code showing what your client is doing? What exactly do you shut down for your test: the Ice server, the IceGrid node, or the machine?

    If you could also provide the network and protocol tracing from your client, it would help to figure out where the client Ice runtime spends its time. To enable this tracing, run your client with the following properties:
    --Ice.Trace.Network=2 --Ice.Trace.Protocol --Ice.Logger.Timestamp

    Cheers,
    Benoit.
  • We used demo/icegrid/simple for this test and only call the sayHello function.
    We shut down the server by shutting down its IceGrid node.

    I will run the trace later and post it here.

    Thanks
  • benoit (Rennes, France)
    Ok. In addition to the tracing, could you please also post the deployment descriptor that you're using for the test?

    Thanks.

    Cheers,
    Benoit
  • Here are the trace and the deployment descriptor.
    Thanks for taking a further look~:)
    [ 06/07/06 21:40:50.265 Network: trying to establish tcp connection to 192.168.101.87:10000 ]
    [ 06/07/06 21:40:50.281 Network: tcp connection established
      local address = 192.168.101.131:4340
      remote address = 192.168.101.87:10000 ]
    [ 06/07/06 21:40:50.281 Protocol: received validate connection
      message type = 3 (validate connection)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 14 ]
    [ 06/07/06 21:40:50.281 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 65
      request id = 1
      identity = IceGrid/Locator
      facet =
      operation = findObjectById
      mode = 1 (nonmutating)
      context =  ]
    [ 06/07/06 21:40:50.296 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 68
      request id = 1
      reply status = 0 (ok) ]
    [ 06/07/06 21:40:50.296 Network: trying to establish tcp connection to 192.168.101.87:3340 ]
    [ 06/07/06 21:40:50.312 Network: tcp connection established
      local address = 192.168.101.131:4343
      remote address = 192.168.101.87:3340 ]
    [ 06/07/06 21:40:50.312 Protocol: received validate connection
      message type = 3 (validate connection)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 14 ]
    [ 06/07/06 21:40:50.312 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 56
      request id = 1
      identity = hello
      facet =
      operation = ice_isA
      mode = 1 (nonmutating)
      context =  ]
    [ 06/07/06 21:40:50.312 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 26
      request id = 1
      reply status = 0 (ok) ]
    usage:
    t: send greeting
    s: shutdown server
    x: exit
    ?: help
    ==> t
    [ 06/07/06 21:41:02.468 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 43
      request id = 2
      identity = hello
      facet =
      operation = sayHello
      mode = 1 (nonmutating)
      context =  ]
    [ 06/07/06 21:41:02.484 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 25
      request id = 2
      reply status = 0 (ok) ]
    ==> [ 06/07/06 21:41:18.234 Protocol: received close connection
      message type = 4 (close connection)
      compression status = 1 (not compressed; compress response, if any)
      message size = 14 ]
    [ 06/07/06 21:41:18.234 Network: shutting down tcp connection for writing
      local address = 192.168.101.131:4343
      remote address = 192.168.101.87:3340 ]
    [ 06/07/06 21:41:18.234 Network: closing tcp connection
      local address = 192.168.101.131:4343
      remote address = 192.168.101.87:3340 ]
    t
    [ 06/07/06 21:41:21.843 Network: trying to establish tcp connection to 192.168.101.87:3340 ]
    [ 06/07/06 21:41:42.859 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 65
      request id = 2
      identity = IceGrid/Locator
      facet =
      operation = findObjectById
      mode = 1 (nonmutating)
      context =  ]
    [ 06/07/06 21:41:44.546 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 69
      request id = 2
      reply status = 0 (ok) ]
    [ 06/07/06 21:41:44.562 Network: trying to establish tcp connection to 192.168.101.131:4346 ]
    [ 06/07/06 21:41:44.562 Network: tcp connection established
      local address = 192.168.101.131:4354
      remote address = 192.168.101.131:4346 ]
    [ 06/07/06 21:41:44.562 Protocol: received validate connection
      message type = 3 (validate connection)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 14 ]
    [ 06/07/06 21:41:44.562 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 43
      request id = 1
      identity = hello
      facet =
      operation = sayHello
      mode = 1 (nonmutating)
      context =  ]
    [ 06/07/06 21:41:44.562 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 25
      request id = 1
      reply status = 0 (ok) ]
    ==>
    


    XML deployment descriptor for the nodes:
    <icegrid>
      <application name="Simple">
        <replica-group id="HelloAdapters">
          <load-balancing type="round-robin"/>
          <object identity="hello" type="::Demo::Hello"/>
        </replica-group>
        <server-template id="HelloNodeTemplate">
          <parameter name="index"/>
          <parameter name="exepath" default="./server"/>
          <server id="SimpleServer${index}" exe="${exepath}" activation="on-demand">
            <adapter name="Hello" replica-group="HelloAdapters"
                     register-process="true" endpoints="tcp"/>
            <property name="Identity" value="hello"/>
          </server>
        </server-template>
        <node name="HelloNode1">
          <server-instance template="HelloNodeTemplate" index="1"/>
        </node>
        <node name="HelloNode2">
          <server-instance template="HelloNodeTemplate" index="2"/>
        </node>
      </application>
    </icegrid>
    
  • benoit (Rennes, France)
    From the following traces:
    [ 06/07/06 21:41:21.843 Network: trying to establish tcp connection to 192.168.101.87:3340 ]
    [ 06/07/06 21:41:42.859 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 65
      request id = 2
      identity = IceGrid/Locator
      facet =
      operation = findObjectById
      mode = 1 (nonmutating)
      context =  ]
    

    It's clear that the Ice runtime takes about 21 seconds to give up on the connection attempt before it contacts the IceGrid locator again to get the endpoints of the other replica. However, I don't know why the connection attempt takes so long. Is the machine's IP address still reachable after you shut down the IceGrid node? Do the node and server shut down cleanly and in a timely manner? Which operating system do you use on your two machines?

    Cheers,
    Benoit.
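
The "about 21 seconds" figure can be checked directly from the two quoted trace lines. The snippet below is only an illustrative sketch: the helper name `elapsed_seconds` and the assumed month/day/year timestamp layout are not part of Ice, they are choices made here to do the arithmetic.

```python
from datetime import datetime

# Timestamps copied from the two trace lines quoted above.
CONNECT_ATTEMPT = "06/07/06 21:41:21.843"   # "trying to establish tcp connection"
LOCATOR_REQUEST = "06/07/06 21:41:42.859"   # "sending request" to IceGrid/Locator

# Assumption: the logger prints month/day/year followed by a millisecond time.
# The date order does not affect the result because both lines share one date.
TRACE_FORMAT = "%m/%d/%y %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two trace timestamps."""
    t0 = datetime.strptime(start, TRACE_FORMAT)
    t1 = datetime.strptime(end, TRACE_FORMAT)
    return (t1 - t0).total_seconds()


gap = elapsed_seconds(CONNECT_ATTEMPT, LOCATOR_REQUEST)
print(f"{gap:.3f} s between the connection attempt and the locator request")
```

The same helper can be applied to any other pair of trace lines to see where the client spends its time.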
  • We are using Windows XP SP2. After turning off the firewall on the Server1 machine, the reconnection time became much shorter (trace below).
    We tried several times with the same result, so we are sure the delay is caused by the firewall.

    Another question: if we turn off all the firewalls, the system may become unsafe. How can we solve this problem?
    [ 06/08/06 09:15:09.406 Network: trying to establish tcp connection to 192.168.101.87:10000 ]
    [ 06/08/06 09:15:09.421 Network: tcp connection established
      local address = 192.168.101.131:1097
      remote address = 192.168.101.87:10000 ]
    [ 06/08/06 09:15:09.421 Protocol: received validate connection
      message type = 3 (validate connection)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 14 ]
    [ 06/08/06 09:15:09.421 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 65
      request id = 1
      identity = IceGrid/Locator
      facet =
      operation = findObjectById
      mode = 1 (nonmutating)
      context =  ]
    [ 06/08/06 09:15:09.468 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 68
      request id = 1
      reply status = 0 (ok) ]
    [ 06/08/06 09:15:09.484 Network: trying to establish tcp connection to 192.168.101.87:3559 ]
    [ 06/08/06 09:15:09.484 Network: tcp connection established
      local address = 192.168.101.131:1100
      remote address = 192.168.101.87:3559 ]
    [ 06/08/06 09:15:09.500 Protocol: received validate connection
      message type = 3 (validate connection)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 14 ]
    [ 06/08/06 09:15:09.500 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 56
      request id = 1
      identity = hello
      facet =
      operation = ice_isA
      mode = 1 (nonmutating)
      context =  ]
    [ 06/08/06 09:15:09.500 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 26
      request id = 1
      reply status = 0 (ok) ]
    usage:
    t: send greeting
    s: shutdown server
    x: exit
    ?: help
    ==> t
    [ 06/08/06 09:15:13.312 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 43
      request id = 2
      identity = hello
      facet =
      operation = sayHello
      mode = 1 (nonmutating)
      context =  ]
    [ 06/08/06 09:15:13.328 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 25
      request id = 2
      reply status = 0 (ok) ]
    ==> [ 06/08/06 09:15:41.484 Protocol: received close connection
      message type = 4 (close connection)
      compression status = 1 (not compressed; compress response, if any)
      message size = 14 ]
    [ 06/08/06 09:15:41.484 Network: shutting down tcp connection for writing
      local address = 192.168.101.131:1100
      remote address = 192.168.101.87:3559 ]
    [ 06/08/06 09:15:41.484 Network: closing tcp connection
      local address = 192.168.101.131:1100
      remote address = 192.168.101.87:3559 ]
    t
    [ 06/08/06 09:15:46.531 Network: trying to establish tcp connection to 192.168.101.87:3559 ]
    [ 06/08/06 09:15:47.515 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 65
      request id = 2
      identity = IceGrid/Locator
      facet =
      operation = findObjectById
      mode = 1 (nonmutating)
      context =  ]
    [ 06/08/06 09:15:48.468 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 69
      request id = 2
      reply status = 0 (ok) ]
    [ 06/08/06 09:15:48.468 Network: trying to establish tcp connection to 192.168.101.131:1109 ]
    [ 06/08/06 09:15:48.468 Network: tcp connection established
      local address = 192.168.101.131:1118
      remote address = 192.168.101.131:1109 ]
    [ 06/08/06 09:15:48.468 Protocol: received validate connection
      message type = 3 (validate connection)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 14 ]
    [ 06/08/06 09:15:48.468 Protocol: sending request
      message type = 0 (request)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 43
      request id = 1
      identity = hello
      facet =
      operation = sayHello
      mode = 1 (nonmutating)
      context =  ]
    [ 06/08/06 09:15:48.484 Protocol: received reply
      message type = 2 (reply)
      compression status = 0 (not compressed; do not compress response, if any)
      message size = 25
      request id = 1
      reply status = 0 (ok) ]
    ==>
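
One plausible explanation for the difference between the two traces, offered here as an assumption and not confirmed anywhere in this thread: when a firewall silently drops the connection request, the client's TCP stack keeps retransmitting it with a doubling wait before giving up, whereas a host without the firewall can refuse the connection right away. The sketch below only does the arithmetic for that retry schedule. The initial wait of 3 seconds and the 2 retransmissions are assumed values commonly cited as Windows XP defaults; they were not verified for these machines, and the function name is hypothetical.

```python
def connect_timeout_seconds(initial_wait: float, retransmissions: int) -> float:
    """Total wait before a TCP connect gives up when every request is dropped.

    The first attempt waits initial_wait seconds, and each retransmission
    doubles the previous wait.
    """
    return sum(initial_wait * 2 ** attempt for attempt in range(retransmissions + 1))


# Assumed values (not taken from this thread): 3 s initial wait, 2 retransmissions.
ASSUMED_INITIAL_WAIT = 3.0
ASSUMED_RETRANSMISSIONS = 2

total = connect_timeout_seconds(ASSUMED_INITIAL_WAIT, ASSUMED_RETRANSMISSIONS)
print(f"connect attempt would give up after {total:.0f} s")
```

Under these assumptions the waits are 3 s, 6 s and 12 s, which is close to the roughly 21 seconds seen in the first trace; in the trace above, with the firewall off, the failed attempt took about one second.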
    
  • benoit (Rennes, France)
    Hi,
    leya wrote:
    Another question: if we turn off all the firewalls, the system may become unsafe. How can we solve this problem?

    I'm not very familiar with the Windows XP SP2 firewall, but instead of adding an "exception" for the server program, you could configure your server to listen on a fixed port and configure the firewall to accept connections on that port (that is, add a port "exception" in Windows firewall terms). For example, to listen on port 12345, set the adapter endpoints to "tcp -p 12345" instead of just "tcp" in the XML descriptor.

    In any case, firewall setup doesn't really have anything to do with Ice. I would recommend checking the Microsoft forums for more information on setting up the Windows XP firewall.

    Cheers,
    Benoit.
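
As a rough illustration of the descriptor change suggested above, the following sketch applies it to a trimmed copy of the server template posted earlier in the thread. It uses Python's standard `xml.etree.ElementTree` purely to show the edit; the port 12345 is the example value from the reply, and editing the file by hand works just as well.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the server template from the descriptor posted earlier.
TEMPLATE = """
<server-template id="HelloNodeTemplate">
  <parameter name="index"/>
  <parameter name="exepath" default="./server"/>
  <server id="SimpleServer${index}" exe="${exepath}" activation="on-demand">
    <adapter name="Hello" replica-group="HelloAdapters"
             register-process="true" endpoints="tcp"/>
    <property name="Identity" value="hello"/>
  </server>
</server-template>
"""

FIXED_PORT = 12345  # example port from the reply above; any free port would do

root = ET.fromstring(TEMPLATE)

# Locate the "Hello" adapter and replace the system-chosen port with a fixed one.
adapter = root.find("./server/adapter[@name='Hello']")
adapter.set("endpoints", f"tcp -p {FIXED_PORT}")

# Print the modified adapter element so it can be compared with the original.
print(ET.tostring(adapter, encoding="unicode").strip())
```

Because both server instances are created from the same template, this assumes each IceGrid node runs on its own machine, as in the traces; two instances on one host could not both bind the same port.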
  • Thank you very much~~:) :)