Starting a server after the clients

Hello.

There are cases in which a server is started after its clients. In those scenarios, RPC invocations on the client side fail with an Ice.LocalException (typically ConnectionRefusedException) until the server is up and running again. If I understand correctly, Ice takes care of all the details of connection establishment, reconnection and so on. This implies that, to handle this situation, the client can simply keep retrying remote invocations until they stop throwing exceptions, at which point the connection is up and running.
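A minimal sketch of that retry idea, using only the Java standard library. The class name RetryUntilUp, the method invokeWithRetry and the delay values are hypothetical and not part of Ice; the Callable stands in for a remote invocation, and a RuntimeException stands in for the local exception thrown while the server is down.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Ice): repeats a call until it stops throwing. */
final class RetryUntilUp {

    /**
     * Runs the call until it succeeds or maxAttempts is reached.
     * The pause between attempts doubles after each failure, capped at maxDelayMs.
     */
    static <T> T invokeWithRetry(Callable<T> call, int maxAttempts,
                                 long initialDelayMs, long maxDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (RuntimeException e) {
                // Assumption: the failure is transient (server not up yet).
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay = Math.min(delay * 2, maxDelayMs);
                }
            }
        }
        throw last; // every attempt failed; report the most recent failure
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated remote call: fails twice, then succeeds.
        String reply = invokeWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("connection refused (simulated)");
            }
            return "pong";
        }, 5, 10, 100);
        System.out.println(reply + " after " + calls[0] + " attempts");
    }
}
```

Pausing between attempts keeps a client from spinning while the server is down. Whether a bounded number of attempts or an endless loop fits better depends on the application.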

I have tried this, and the approach basically works. However, sometimes when I start the server (with the clients already running), communication hangs for some time (about 60 seconds) with no sign of activity on either the server or the client side. So I guess that somewhere in the code something is waiting for an event (a timeout?), after which communication returns to normal.

My questions:
- What is the cause of this timeout?
- What is the correct way to initialize and configure the server and the clients in a scenario with a server running intermittently? Should I be paying attention to something special?

Regards.

Comments

  • benoit (Rennes, France)
    Hi Jose,

    Such a hang isn't expected; the client should connect to the server in a timely fashion once the server is up. Can you tell us a little more about the platform and Ice version you're using?

    Can you try running both the server and the client with network tracing enabled to see what exception is being raised after 60s? Use the properties Ice.Trace.Network=2 and Ice.Trace.Retry=1 to enable the tracing.

    Cheers,
    Benoit.
  • Hi Benoit. Thanks for your answer, and sorry for my late reply.

    Here are some details about what I have here.
    Kubuntu 15.10
    Ice-3.6.1
    Using SSL connections only (IceSSL plugin), with the certificates from the examples provided by ZeroC.

    Client (in essence): it contains one communicator and two proxies to two remote objects, each with a couple of interfaces.
    The flow is as follows: the main thread instantiates a communicator, starts another thread to initialize the communicator, waits for the initializer thread to complete and starts an endless loop.
    The proxies are initialized using checked casts, so the client already tries to contact the server during initialization. If this connection fails, the proxies remain uninitialized, while the communicator is already valid.
    Within the loop, all the methods of the two remote objects are called one by one inside a try-catch block that catches all exceptions. Before every such call, if the corresponding proxy is not initialized, we try to initialize it again using a checked cast. If this fails, the RPC is not executed in that cycle. The exceptions are all printed out and ignored.
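    The per-cycle re-initialization described above could look roughly like the following sketch. It uses only the Java standard library: LazyProxy is a hypothetical class, the Supplier stands in for the checked cast (which contacts the server), and a RuntimeException stands in for the exception thrown while the server is unreachable.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical holder: retries proxy creation on each use until one attempt succeeds. */
final class LazyProxy<T> {
    private final Supplier<T> factory; // stand-in for a checked cast
    private T proxy;                   // null until the first successful initialization

    LazyProxy(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the proxy, or empty if it could not be initialized in this cycle. */
    synchronized Optional<T> get() {
        if (proxy == null) {
            try {
                proxy = factory.get();
            } catch (RuntimeException e) {
                // Printed and ignored; the caller skips the RPC for this cycle.
                System.err.println("proxy not ready: " + e.getMessage());
            }
        }
        return Optional.ofNullable(proxy);
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated checked cast: fails twice, then yields a proxy.
        LazyProxy<String> hello = new LazyProxy<>(() -> {
            if (++attempts[0] < 3) {
                throw new IllegalStateException("connection refused (simulated)");
            }
            return "HelloPrx";
        });
        for (int cycle = 1; cycle <= 4; cycle++) {
            final int c = cycle;
            hello.get().ifPresent(p -> System.out.println("cycle " + c + ": invoking on " + p));
        }
    }
}
```

    Once the factory succeeds, the result is cached and the factory is not called again, so later cycles go straight to the invocation.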

    Server (in essence): created by extending Ice.Application with one object adapter and two servants, one for each object.

    I attach the logs corresponding to the situation described in my previous post. In this case I have two independent processes, each running one instance of the client described above. Their respective log files are attached (client*_l.txt). Both clients are started before the server.

    From server_l.txt we can see that the server binds to the SSL socket at 16:26:03. From then until 16:27:37 the server keeps trying to accept the connection but fails with Ice.ConnectionLostException (error = 0). Meanwhile, the clients keep trying to establish an SSL connection. Before the server starts, the exception thrown in the clients is Ice.ConnectionRefusedException; after the server starts, it becomes Ice.ConnectTimeoutException. Also attached is a dump of netstat -tonp showing many connections in CLOSE_WAIT and FIN_WAIT. Their number grows the longer the connections fail to be established properly.

    As a side note, reproducing this over localhost is much more difficult: there, the connections take at most a couple of seconds to be established.

    Obviously I am missing something. Any hints are very welcome.

    Regards.
  • benoit (Rennes, France)
    Hi,

    According to the logs, the clients fail with a connection timeout after one second. How did you configure timeouts? Note that timeouts are typically configured in milliseconds.

    A one-second timeout can be too short for SSL connection establishment, so you should try configuring a larger timeout (5 to 10 seconds, for example).

    Cheers,
    Benoit.
  • Hi.

    Thanks for pointing that out. I just changed it to 10 seconds, but the symptoms are still very similar.

    Just to add some more info: the situation that I described is not always reproducible. Usually I have to start and shut down the server a couple of times (while the clients are already running) until I see the temporary freeze. I stop the server by pressing Ctrl-C (which is supposed to be handled by the shutdown hook of Ice.Application, which itself destroys the communicator). Stopping the server (and clients) also sometimes seems to take a long time when the communications are not properly initialized.

    Regards.
  • benoit (Rennes, France)
    Hi,

    Does this problem also show up if you use TCP instead of SSL? It would be good if you could get a thread dump of the server and client threads when the hang occurs. It might show where the client or server is hanging.

    Cheers,
    Benoit.
  • Hi.

    With TCP instead of SSL the symptoms are the same. I am trying to get thread dumps, but I cannot reproduce the problem when running the server and clients in the debugger of the IDE (IntelliJ). I will keep exploring and trying. I will also try running on Windows to see if it makes any difference, and running the clients and server on different machines. In the meantime, if you have any suggestions about what else I can try, please let me know.
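    One way to get a thread dump without attaching a debugger is to have the process print its own stacks, for example when a timeout is caught. The sketch below uses only the Java standard library. The class ThreadDump is hypothetical, and it assumes the client and server are Java programs.

```java
import java.util.Map;

/** Hypothetical helper: formats the stack trace of every live thread in this JVM. */
final class ThreadDump {

    static String capture() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append("\" state=")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Print the dump to stderr so it does not mix with regular output.
        System.err.print(capture());
    }
}
```

    The JDK's jstack tool (jstack followed by the process id) gives a similar dump from outside the process, which avoids changing the code at all.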

    Regards.
  • Hello.

    Some more info on this issue.

    - If I run the same tests at home instead of at the office, I cannot reproduce the problem.
    - Back at the office, if I run the same tests on Windows instead of Linux, I cannot reproduce the problem.
    - If I add an extra router between my computer and my office's main router (creating a private network), I cannot reproduce the problem on Linux.

    This makes me suspect that, for some reason unknown to me, the main router at my office does not get on very well with my Linux installation and frequently drops connection attempts. This suspicion is consistent with other experiences I have had with browsers on my Linux machine, which every now and then time out while waiting for connections to servers to be opened. I don't know how esoteric this explanation might sound, but I cannot think of anything else.

    So, the issue that I am experiencing does not seem to be related to Ice.

    Regards.