Same client, different result (Linux and Windows)

Hi,

I'm testing the simple IceGrid demo in the following scenario:

* Linux (CentOS 4.4) (172.20.0.43) --> Registry + Node1 + Java client
* Windows XP (172.20.0.64) --> Node2 + Java client

If I run the same test with the Java client on Windows and then on Linux, I don't get the same result. The Java client works fine on Windows, but the same Java client on Linux has a problem.

First, here are the configuration files for both machines:

******************************
Linux (172.20.0.43) config.grid
******************************

IceGrid.InstanceName=DemoIceGrid

Ice.Default.Locator=DemoIceGrid/Locator:default -h 172.20.0.43 -p 12000

IceGrid.Registry.Client.Endpoints=default -h 172.20.0.43 -p 12000
IceGrid.Registry.Server.Endpoints=default -h 172.20.0.43
IceGrid.Registry.Internal.Endpoints=default -h 172.20.0.43
IceGrid.Registry.Admin.Endpoints=default -h 172.20.0.43
IceGrid.Registry.Data=db/registry
IceGrid.Registry.PermissionsVerifier=DemoIceGrid/NullPermissionsVerifier
IceGrid.Registry.AdminPermissionsVerifier=DemoIceGrid/NullPermissionsVerifier

IceGrid.Node.Name=Node1
IceGrid.Node.Endpoints=default
IceGrid.Node.Data=db/node
IceGrid.Node.CollocateRegistry=1

IceGrid.Node.Trace.Activator=2
IceGrid.Node.Trace.Patch=1
IceGrid.Node.Trace.Adapter=3
IceGrid.Node.Trace.Server=3
IceGrid.Registry.Trace.Locator=3


*****************************
Windows (172.20.0.64) config.node2
*****************************

IceGrid.InstanceName=DemoIceGrid

Ice.Default.Locator=DemoIceGrid/Locator:default -h 172.20.0.43 -p 12000

IceGrid.Node.Name=Node2
IceGrid.Node.Endpoints=default
IceGrid.Node.Data=db/node
IceGrid.Node.CollocateRegistry=0

IceGrid.Node.Trace.Activator=2
IceGrid.Node.Trace.Patch=1
IceGrid.Node.Trace.Adapter=3
IceGrid.Node.Trace.Server=3



There is a Hello adapter in both servers:


[user@speedy IceGrid_Demo_Simple]$ icegridadmin --Ice.Config=config.grid
Ice 3.1.1 Copyright 2003-2006 ZeroC, Inc.
>>> adapter list
ReplicatedHelloAdapter
SimpleServer-1.Hello
SimpleServer-2.Hello
>>> server list
SimpleServer-1
SimpleServer-2
>>> server describe SimpleServer-1
server `SimpleServer-1'
{
application = `Simple'
node = `Node1'
exe = `java'
activation = `on-demand'
options = `Server'
properties
{
Hello.Endpoints = `tcp'
Identity = `hello'
}
adapter `Hello'
{
id = `SimpleServer-1.Hello'
replica group id = `ReplicatedHelloAdapter'
endpoints = `tcp'
register process = `true'
wait for activation = `true'
}
}
>>> server describe SimpleServer-2
server `SimpleServer-2'
{
application = `Simple'
node = `Node2'
exe = `java'
activation = `on-demand'
options = `Server'
properties
{
Hello.Endpoints = `tcp'
Identity = `hello'
}
adapter `Hello'
{
id = `SimpleServer-2.Hello'
replica group id = `ReplicatedHelloAdapter'
endpoints = `tcp'
register process = `true'
wait for activation = `true'
}
}
>>>


When I run the Java client on Windows, it connects to the Linux server. I then shut down the server (s key) and send a greeting (t key), and the Windows server responds. I shut down that server too (s key) and send another greeting (t key), and the Linux server responds without any problem.

The same test with the Java client on Linux behaves differently. When I run the Linux Java client, it connects to the Linux server. I then shut down the server (s key) and send a greeting (t key), and the Windows server responds. I shut down that server too (s key) and send another greeting (t key), and the Linux server responds 3 minutes later!!! :eek:

This is the last part of the Linux Java client trace (at this point it is connected to the Windows server):

usage:
t: send greeting
s: shutdown server
x: exit
?: help

==> s
[ 01/18/07 11:58:40:196 Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 43
request id = 2
identity = hello
facet =
operation = shutdown
mode = 2 (idempotent)
context = ]
[ 01/18/07 11:58:40:201 Protocol: received reply
message type = 2 (reply)
compression status = 0 (not compressed; do not compress response, if any)
message size = 25
request id = 2
reply status = 0 (ok) ]
==> [ 01/18/07 11:58:40:202 Protocol: received close connection
message type = 4 (close connection)
compression status = 0 (not compressed; do not compress response, if any)
message size = 14 ]
[ 01/18/07 11:58:40:202 Network: shutting down tcp connection for writing
local address = 172.20.0.43:42145
remote address = 172.20.0.64:2184 ]
[ 01/18/07 11:58:40:202 Network: closing tcp connection
local address = 172.20.0.43:42145
remote address = 172.20.0.64:2184 ]
t
[ 01/18/07 11:58:48:569 Location: found endpoints in locator table
object = hello
endpoints = tcp -h 172.20.0.64 -p 2184 ]
[ 01/18/07 11:58:48:570 Network: trying to establish tcp connection to 172.20.0.64:2184 ]

(> WAITING 1 MINUTE!!!)

[ 01/18/07 11:59:42:311 Protocol: sending close connection
message type = 4 (close connection)
compression status = 0 (not compressed; do not compress response, if any)
message size = 14 ]
[ 01/18/07 11:59:42:311 Network: shutting down tcp connection for writing
local address = 172.20.0.43:42126
remote address = 172.20.0.43:12000 ]
[ 01/18/07 11:59:42:312 Network: closing tcp connection
local address = 172.20.0.43:42126
remote address = 172.20.0.43:12000 ]

(> WAITING 2 MINUTES!!!)


[ 01/18/07 12:01:57:635 Location: removed endpoints from locator table
object = hello
endpoints = tcp -h 172.20.0.64 -p 2184 ]
[ 01/18/07 12:01:57:635 Location: searching for object by id
object = hello ]
[ 01/18/07 12:01:57:636 Network: trying to establish tcp connection to 172.20.0.43:12000 ]
[ 01/18/07 12:01:57:637 Network: tcp connection established
local address = 172.20.0.43:42147
remote address = 172.20.0.43:12000 ]
[ 01/18/07 12:01:57:637 Protocol: received validate connection
message type = 3 (validate connection)
compression status = 0 (not compressed; do not compress response, if any)
message size = 14 ]
[ 01/18/07 12:01:57:641 Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 69
request id = 1
identity = DemoIceGrid/Locator
facet =
operation = findObjectById
mode = 1 (nonmutating)
context = ]
[ 01/18/07 12:01:57:938 Protocol: received reply
message type = 2 (reply)
compression status = 0 (not compressed; do not compress response, if any)
message size = 65
request id = 1
reply status = 0 (ok) ]
[ 01/18/07 12:01:57:939 Location: retrieved endpoints from locator, adding to locator table
object = hello
endpoints = tcp -h 172.20.0.43 -p 42153 ]
[ 01/18/07 12:01:57:939 Network: trying to establish tcp connection to 172.20.0.43:42153 ]
[ 01/18/07 12:01:57:939 Network: tcp connection established
local address = 172.20.0.43:42159
remote address = 172.20.0.43:42153 ]
[ 01/18/07 12:01:57:959 Protocol: received validate connection
message type = 3 (validate connection)
compression status = 0 (not compressed; do not compress response, if any)
message size = 14 ]
[ 01/18/07 12:01:57:960 Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 43
request id = 1
identity = hello
facet =
operation = sayHello
mode = 1 (nonmutating)
context = ]
[ 01/18/07 12:01:57:970 Protocol: received reply
message type = 2 (reply)
compression status = 0 (not compressed; do not compress response, if any)
message size = 25
request id = 1
reply status = 0 (ok) ]
==>


This is the last part of the Windows server trace:

SimpleServer-2 says Hello World!
SimpleServer-2 shutting down...
[ icegridnode: Adapter: server `SimpleServer-2' adapter `SimpleServer-2.Hello' deactivated ]
[ icegridnode: Activator: detected termination of server `SimpleServer-2' ]
[ icegridnode: Server: changed server `SimpleServer-2' state to `Deactivating' ]
[ icegridnode: Adapter: server `SimpleServer-2' adapter `SimpleServer-2.Hello' deactivated ]
[ icegridnode: Server: changed server `SimpleServer-2' state to `Inactive' ]


And this is the last part of the Linux server trace:


[ icegridnode: Locator: registered replicated adapter `SimpleServer-2.Hello' endpoints: `' ]

(> WAITING 3 MINUTES)

[ icegridnode: Adapter: waiting for activation of server `SimpleServer-1' adapter `SimpleServer-1.Hello' ]
[ icegridnode: Server: changed server `SimpleServer-1' state to `Activating' ]
[ icegridnode: Activator: activating server `SimpleServer-1'
path = java
pwd = /home/jrubio/devel/IceGrid_Demo_Simple
uid/gid = 500/500
args = java Server --Ice.Config=/home/jrubio/devel/IceGrid_Demo_Simple/db/node/servers/SimpleServer-1/config/config ]
[ icegridnode: Server: changed server `SimpleServer-1' state to `WaitForActivation' ]
[ icegridnode: Locator: registered replicated adapter `SimpleServer-1.Hello' endpoints: `dummy -t:tcp -h 172.20.0.43 -p 42153' ]
[ icegridnode: Server: changed server `SimpleServer-1' state to `Active' ]
[ icegridnode: Adapter: server `SimpleServer-1' adapter `SimpleServer-1.Hello' activated: dummy -t:tcp -h 172.20.0.43 -p 42153 ]
[ icegridnode: Locator: registered server `SimpleServer-1' process proxy: `"7f:0:0:1:-2fb32b1f:11034dec036:-8000" -t:tcp -h 172.20.0.43 -p 42153' ]
SimpleServer-1 says Hello World!


What am I doing wrong? I don't understand why I have no problems with Windows, while the same scenario with Linux has this annoying delay...

Thanks a lot for helping me!

Comments

  • benoit (Rennes, France)
    Hi,

    The problem is that the connection establishment to your Windows machine, on the port where the server was previously running, hangs (for 3 minutes).

    The Ice client caches the endpoints of the server, so once you have shut down the server, the client will try again to establish a connection to the endpoints from the cache. If that fails, it removes the endpoints from the cache and queries the locator again to get the new endpoints.

    Anyway, the key problem is that the connection establishment attempt hangs for 3 minutes. This is generally caused by the Windows firewall. Did you try completely disabling the Windows firewall to see if it helps?

    You could also disable the locator cache (with Ice.Default.LocatorCacheTimeout) to avoid reconnecting to an old endpoint, but this could have some performance impact on your application.

    Cheers,
    Benoit.
  • Hi benoit,

    %&*/$ firewall... :cool:

    Yes, the Windows Firewall was the problem... Now both scenarios work fine!!

    Thanks a lot, again ;)

    Cheers!
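
A note on diagnosing this kind of hang: in the client trace above, the client stalls right after "trying to establish tcp connection to 172.20.0.64:2184". That fits a firewall silently dropping the connection attempt rather than the host refusing it. The following is a minimal Java sketch, not part of Ice or the IceGrid demo, that tries a plain TCP connection with a short timeout and reports whether it was accepted, refused, or timed out. The class name ConnectProbe, its Result enum, and the 5-second timeout are assumptions made for illustration; the default host and port are simply the ones that appear in the trace.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper for illustration only; it does not use any Ice API.
public class ConnectProbe {

    public enum Result { OPEN, REFUSED, TIMED_OUT, OTHER_ERROR }

    // Attempts a TCP connection and classifies the outcome.
    public static Result probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Result.OPEN;          // something accepted the connection
        } catch (SocketTimeoutException e) {
            return Result.TIMED_OUT;     // no answer at all, typical of filtered packets
        } catch (ConnectException e) {
            return Result.REFUSED;       // usually an immediate "connection refused"
        } catch (IOException e) {
            return Result.OTHER_ERROR;   // e.g. unknown host or no route to host
        }
    }

    public static void main(String[] args) {
        // Defaults are the Windows endpoint from the trace; override on the command line.
        String host = args.length > 0 ? args[0] : "172.20.0.64";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2184;
        System.out.println(host + ":" + port + " -> " + probe(host, port, 5000));
    }
}
```

Run it from the Linux machine after the Windows server has shut down, for example "java ConnectProbe 172.20.0.64 2184". A quick REFUSED is the case where the Ice client can drop the cached endpoint and go back to the locator immediately; TIMED_OUT suggests the packets are being filtered, which is what the firewall was doing here. If the firewall cannot be changed, the alternative benoit mentions would be a client property along the lines of Ice.Default.LocatorCacheTimeout=0 (assuming 0 is the value that turns the cache off in your Ice version; check the property reference).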