Server unexpectedly stopping TcpAcceptor

Hi folks

We are using Ice 3.2.1

Our server runs for a period of time then we see something in STDERR like: "Network: stopping to accept tcp connections at .........." after which the server stops listening on the designated port (basically rendering it useless). A thread dump reveals that the server thread pool threads and connection manager threads are running as usual.

Any idea why this is happening? We have raised Ice.Trace.Network from 1 to 2 to see whether anything else shows up that would explain what is going on.

We are running the exact same process on different kit (on a different network) with no issues.

Is it possible that a low-level unhandled (or logged) error in the Ice runtime is causing the issue?

Many thanks
Simon

Comments

  • benoit
    benoit Rennes, France
    Hi Simon,

    This is most likely a spurious wake-up problem with the JDK selector. Under rare circumstances, the Java selector select() method returns even if there are no FDs ready for reading. If select() returns an empty key set, Ice interprets it as "no activity" and shuts down the object adapters (as if Ice.ServerIdleTime was set to a non-0 value).

    We added a workaround for this JDK problem in Ice 3.3.x. I'll email you a patch for Ice 3.2.1.

    Cheers,
    Benoit.
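
To illustrate the failure mode described above, here is a minimal C++ sketch of one way to tell a genuine idle timeout from a spurious wake-up: an empty ready set only counts as "no activity" if the full idle timeout has actually elapsed. This is not the Ice 3.3.x workaround (the thread does not show it), and `isGenuineIdleTimeout` and its parameters are hypothetical names chosen for this example.

```cpp
#include <chrono>
#include <cstddef>

using Clock = std::chrono::steady_clock;

// Hypothetical helper (not part of Ice): decides whether a select-style call
// that reported `readyCount` ready descriptors really hit its idle timeout.
// An empty ready set returned before the deadline is treated as a spurious
// wake-up, so the caller should simply call select again.
bool isGenuineIdleTimeout(std::size_t readyCount,
                          Clock::time_point waitStarted,
                          Clock::time_point now,
                          std::chrono::milliseconds idleTimeout)
{
    if (readyCount > 0)
    {
        return false; // There is activity, so this is not an idle timeout.
    }
    return (now - waitStarted) >= idleTimeout;
}
```

A caller would record `Clock::now()` before waiting, pass the number of ready descriptors afterwards, and only start an idle shutdown when the function returns true.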
  • Network: stopping to accept tcp connections...

    I am having the same issue as described above.
    How do I get a patch for 3.2.x?
    Or, better yet, should I consider upgrading to 3.3.x?
  • bernard
    bernard Jupiter, FL
    Hi Vitali,

    Welcome to our forums! I will email you this patch together with patch instructions.

    It could also make more sense for you to upgrade to the latest Ice version available, currently Ice 3.4.1.

    Best regards,
    Bernard
  • Network : stopping to accept tcp connections

    Hello,
    I have been using Ice for a while and am currently on 3.5.
    Everything has been really fine and smart so far; thank you for the great job.
    Today I applied a recent Red Hat 6.3 patch for recvmmsg (a system crash issue). The patch itself works, but now I have problems binding to a TCP port through Ice.

    Here is my code:
            Ice::PropertiesPtr properties = Ice::createProperties();
            properties->setProperty("Ice.IPv6", "0");
            properties->setProperty("Ice.Override.Timeout", "5000");
            properties->setProperty("Ice.RetryIntervals", "-1");
            properties->setProperty("Ice.ThreadPool.Server.SizeMax", "4");
            properties->setProperty("Ice.Trace.ThreadPool", "1");
    
            //
            // Network Tracing
            //
            // 0 = no network tracing
            // 1 = trace connection establishment and closure
            // 2 = like 1, but more detailed
            // 3 = like 2, but also trace data transfer
            //
            properties->setProperty("Ice.Trace.Network", "2");
    
            Ice::InitializationData initData;
            initData.properties = properties;
            initData.logger = new LoggerI();
            ic_ = Ice::initialize(initData);
            if (!ic_)
            {
                FATAL("Ice::initialize failed");
                return -1;
            }
    
            INFO2("Bind Ice on source [%1] with port [%2]", sourceName_, listenport_);
    
            adapter_ = ic_->createObjectAdapterWithEndpoints(sourceName_, std::string("default -p ") + listenport_);
    

    Here is some output:
    BEFORE patching:
    2014-01-07 17:53:19,606003 INFO Starting Ice...
    2014-01-07 17:53:19,608015 DEBUG [ICE.LOGGER] ThreadPool : creating Ice.ThreadPool.Client: Size = 1, SizeMax = 1, SizeWarn = 0
    2014-01-07 17:53:19,608147 INFO Bind Ice on source [TESTLIBFID5] with port [57579]
    2014-01-07 17:53:19,608433 DEBUG [ICE.LOGGER] Network : attempting to bind to tcp socket 0.0.0.0:57579
    2014-01-07 17:53:19,608638 DEBUG [ICE.LOGGER] Network : listening for tcp connections at 0.0.0.0:57579
    local interfaces: 10.147.53.33, 127.0.0.1
    2014-01-07 17:53:19,608829 DEBUG [ICE.LOGGER] ThreadPool : creating Ice.ThreadPool.Server: Size = 1, SizeMax = 4, SizeWarn = 0
    2014-01-07 17:53:19,608954 DEBUG [ICE.LOGGER] Network : published endpoints for object adapter `TESTLIBFID5':
    tcp -h 10.147.53.33 -p 57579 -t 5000

    AFTER patch:
    2014-01-07 18:25:33,440446 INFO Starting Ice...
    2014-01-07 18:25:33,442481 DEBUG [ICE.LOGGER] ThreadPool : creating Ice.ThreadPool.Client: Size = 1, SizeMax = 1, SizeWarn = 0
    2014-01-07 18:25:33,442617 INFO Bind Ice on source [TESTLIBFID5] with port [57579]
    2014-01-07 18:25:33,442907 DEBUG [ICE.LOGGER] Network : attempting to bind to tcp socket 0.0.0.0:57579
    2014-01-07 18:25:33,443230 DEBUG [ICE.LOGGER] Network : listening for tcp connections at 0.0.0.0:57579
    2014-01-07 18:25:33,443262 DEBUG [ICE.LOGGER] Network : stopping to accept tcp connections at 0.0.0.0:57579
    2014-01-07 18:25:33,443483 FATAL Ice exception catched [Network.cpp:279: Ice::SocketException:
    socket exception: Invalid argument]

    Binding the same TCP port through ACE works fine.
    I also started a TCP router and ran clients (bidirectional connections), and that works fine too.
    If someone can give me some hints about what to check with Red Hat to find the root of the issue, I would appreciate it ;)
  • mes
    mes California
    Hi Philippe,

    Based on the source of the exception (Network.cpp:279), it appears that the call to getifaddrs is now failing. I don't have an explanation for this failure yet. Can you give us more information about this RedHat patch so that we can try it ourselves?

    Thanks,
    Mark
  • mes wrote: »
    Hi Philippe,

    Based on the source of the exception (Network.cpp:279), it appears that the call to getifaddrs is now failing. I don't have an explanation for this failure yet. Can you give us more information about this RedHat patch so that we can try it ourselves?

    Thanks,
    Mark

    Hi Mark,

    Many thanks for your reply; I will have a look at the Ice source code as well :)
    Here is the information about the Red Hat Enterprise Linux Server release 6.3 (Santiago) patches.
    Before:
    Linux 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

    After:
    Linux 2.6.32-431.1.2.el6.00988367.x86_64 #1 SMP Fri Jan 3 12:54:01 BRST 2014 x86_64 x86_64 x86_64 GNU/Linux

    From Red Hat:
    A Red Hat engineer has found that the 'Oops' condition appears to be resolved by an upstream patch that is not currently incorporated into our kernel.

    The patch is this specific commit:
    1be374a0518a288147c6a7398792583200a67261 - net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

    They have also listed two other interesting fixes that may help alleviate the situation, although they believe the first one mentioned to be the true fix:
    71c5c1595c04852d6fbf3c4882b47b30b61a4d32 - net: Add MSG_WAITFORONE flag to recvmmsg
    b9eb8b8752804cecbacdb4d24b52e823cf07f107 - net: recvmmsg: Strip MSG_WAITFORONE when calling recvmsg.

    Please note that this is a custom kernel provided by Red Hat for this fix ;)

    EDIT: I sent info to Red Hat about the getifaddrs issue.
  • benoit
    benoit Rennes, France
    Hi,

    I don't see what Ice could be doing wrong with the call to getifaddrs; this sounds like a problem with your kernel. You could check whether you can reproduce it with this small C program:
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <ifaddrs.h>
    #include <stdio.h>
    #include <errno.h>
    
    int
    main(int argc, char** argv)
    {
        struct ifaddrs* ifap;
        if(getifaddrs(&ifap) == -1)
        {
            printf("getifaddrs failed with errno = %d\n", errno);
            return 1;
        }
        freeifaddrs(ifap);
        return 0;
    }
    

    If you need to work around the problem, you could add the "-h <IP or hostname>" option to the object adapter endpoint. Ice won't call getifaddrs if you explicitly specify the interfaces to listen on in the object adapter endpoints.

    Cheers,
    Benoit.
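
The "-h" workaround above amounts to naming the interface in the endpoint string instead of passing only "default -p <port>". The following is a minimal sketch, assuming the port is held as a string as in the code posted earlier; `makeEndpoint` is a hypothetical helper, not an Ice API.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of Ice): builds an endpoint string that names
// the listening interface explicitly, e.g. "default -h 10.147.53.33 -p 57579".
// With an empty host it falls back to the original "default -p <port>" form.
std::string makeEndpoint(const std::string& host, const std::string& port)
{
    std::ostringstream os;
    os << "default";
    if (!host.empty())
    {
        os << " -h " << host; // Explicit interface, as suggested above.
    }
    os << " -p " << port;
    return os.str();
}
```

With the code posted earlier, the call might then read `ic_->createObjectAdapterWithEndpoints(sourceName_, makeEndpoint("10.147.53.33", listenport_))`, where the address is the one shown in the "local interfaces" trace line and `listenport_` is assumed to be a string.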
  • benoit wrote: »
    Hi,

    I don't see what Ice could be doing wrong with the call to getifaddrs; this sounds like a problem with your kernel. You could check whether you can reproduce it with this small C program:
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <ifaddrs.h>
    #include <stdio.h>
    #include <errno.h>
    
    int
    main(int argc, char** argv)
    {
        struct ifaddrs* ifap;
        if(getifaddrs(&ifap) == -1)
        {
            printf("getifaddrs failed with errno = %d\n", errno);
            return 1;
        }
        freeifaddrs(ifap);
        return 0;
    }
    

    If you need to work around the problem, you could add the "-h <IP or hostname>" option to the object adapter endpoint. Ice won't call getifaddrs if you explicitly specify the interfaces to listen on in the object adapter endpoints.

    Cheers,
    Benoit.


    Hi Benoit, I wrote a very similar test program, but it works...
    You are right, Ice is not the root of the issue.
    I don't want to bother you with this any further; you gave me a first hint to chase, and Red Hat gave me another task to do with strace.
    I will let you know.
    Thanks
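
Since a bare getifaddrs call succeeds here, a slightly more thorough check may help when comparing kernels: walk the returned list and convert each IPv4 address, which is roughly what the "local interfaces" trace line reports. The sketch below is an assumption about what is worth testing, not a copy of the Ice code at Network.cpp:279; `listIPv4Addresses` is a hypothetical helper name.

```cpp
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cerrno>
#include <string>
#include <vector>

// Hypothetical helper (not part of Ice): collects the IPv4 addresses reported
// by getifaddrs. Returns true on success; on failure it returns false and
// stores the errno value from getifaddrs in `err`.
bool listIPv4Addresses(std::vector<std::string>& out, int& err)
{
    out.clear();
    err = 0;

    struct ifaddrs* ifap = nullptr;
    if (getifaddrs(&ifap) == -1)
    {
        err = errno;
        return false;
    }

    for (struct ifaddrs* p = ifap; p != nullptr; p = p->ifa_next)
    {
        // ifa_addr may be null for interfaces without an address.
        if (p->ifa_addr == nullptr || p->ifa_addr->sa_family != AF_INET)
        {
            continue;
        }
        const auto* sin = reinterpret_cast<const struct sockaddr_in*>(p->ifa_addr);
        char buf[INET_ADDRSTRLEN];
        if (inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf)) != nullptr)
        {
            out.push_back(buf);
        }
    }

    freeifaddrs(ifap);
    return true;
}
```

On failure, printing `std::strerror(err)` (from `<cstring>`) gives the same kind of text as the "Invalid argument" message in the log above, which makes the output easier to compare with the strace results.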