Archived

This forum has been archived. Please start a new discussion on GitHub.

IceStorm behind a private net

Comments

  • Ok, I have attached a simple example that reproduces the problem. I have attached only the important code that produces the exception, so the application does not do anything interesting.

    The steps to run the application are:
    - Compile it. I have a debian distribution with a kernel 2.6.16-1-k7, and I use C++. There is a Makefile to compile it.
    > icebox --ice.Config=config.service
    > ./sessionserver
    > glacier2router --Ice.Config=config.glacier2
    > ./subscriber market quiniela [market and quiniela are the topics]

    At this point the subscriber has subscribed to "quiniela" and "market" in IceStorm. There is no publisher in this example because it is not important.
    Now we can stop the subscriber with Ctrl-C and see what happens in the subscriber.

    I hope this example helps you.

    Thanks
  • dwayne
    dwayne St. John's, Newfoundland
    Hi,

    There are two things wrong in your subscriber. First, your ping thread just sends pings as fast as it can in a tight loop. You should instead do a timed wait and only send a ping often enough that your session does not time out on you. As your code is now it never waits, which means destroy() can never obtain the lock, so your program hangs on shutdown.
    class SessionPingThread : public IceUtil::Thread, public IceUtil::Monitor<IceUtil::Mutex>
    {
    public:
    
        SessionPingThread(const Glacier2::SessionPrx& session) :
            _session(session),
            _timeout(IceUtil::Time::seconds(20)),
            _destroy(false)
        {
        }
    
        virtual void
        run()
        {
            Lock sync(*this);
            while(!_destroy)
            {
                timedWait(_timeout);
                if(_destroy)
                {
                    break;
                }
                try
                {
                    _session->ice_ping();
                }
                catch(const Ice::Exception&)
                {
                    break;
                }
            }
        }
    
        void
        destroy()
        {
            Lock sync(*this);
            _destroy = true;
            notify();
        }
    
    private:
    
        const Glacier2::SessionPrx _session;
        const IceUtil::Time _timeout;
        bool _destroy;
    };
    
    

    The second issue is that you are destroying the session too soon. You cannot destroy the session until after you have unsubscribed from the IceStorm topics. If you destroy the session first, then the unsubscribes cannot succeed since the Glacier2 session they are using to communicate with IceStorm will no longer exist.
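
    As a rough sketch of that order (hedged: `manager`, `subscribers`, `ping` and `session` are placeholders for whatever names your code uses), the cleanup could look like this:

        // 1. Unsubscribe from the IceStorm topics while the Glacier2 session
        //    still exists, so the requests can still be routed.
        for(map<string, Ice::ObjectPrx>::const_iterator q = subscribers.begin(); q != subscribers.end(); ++q)
        {
            IceStorm::TopicPrx topic = manager->retrieve(q->first);
            topic->unsubscribe(q->second);
        }

        // 2. Stop the ping thread and wait for it to terminate.
        ping->destroy();
        ping->getThreadControl().join();

        // 3. Only now destroy the Glacier2 session.
        session->destroy();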

    Regards,
    Dwayne
  • I have made the changes you suggested, and now I get the following exception. Benoit told me it was a bug and that it would be fixed in the next version.
    [ Network: tcp connection established
    local address = 138.4.9.76:57218
    remote address = 138.4.9.76:10000 ]
    This demo accepts any user-id / password combination.
    user id: a
    password: a
    Step0
    Step1
    Step5
    Step6
    subscriber: ObjectAdapterI.cpp:767: IceInternal::ServantManagerPtr Ice::ObjectAdapterI::getServantManager() const: Assertion `_instance' failed.
    Abortado

    Thanks for your help
  • benoit
    benoit Rennes, France
    Hi,

    Before we look into a patch for this bug (it was fixed together with other changes on our mainline, so it's not easy to provide a patch just for this problem for 3.0.1), you could try the following workaround: instead of retrieving the topics again from the topic manager after the communicator is shut down, store the IceStorm topic proxies in a map when you retrieve them for subscribing.

    In the for loop where you subscribe the subscribers:
      map<string, IceStorm::TopicPrx> topicProxies;
      for(Ice::StringSeq::iterator p = topics.begin(); p != topics.end(); ++p) 
      {
          ...
          IceStorm::TopicPrx topic = manager->retrieve(*p);
          topicProxies[*p] = topic; // Store the topic proxy
          topic->subscribe(qos, object);
          ...
       }
    

    In the for loop where you unsubscribe the subscribers:
      for(map<string,Ice::ObjectPrx>::const_iterator q = subscribers.begin(); q != subscribers.end(); ++q)
      {
         ... 
          IceStorm::TopicPrx topic = topicProxies[q->first];
          topic->unsubscribe(q->second);
          ...
      }
    

    This should work around the problem (which shows up when you invoke a proxy that has no associated connection and the invocation is made after the communicator is shut down).

    In any case, as I stated in a previous post, I don't recommend doing this cleanup in the client: it's not reliable and it will end up leaking subscribers if the client is killed or the connection between your client and Glacier2 is dropped before the client gets a chance to unsubscribe. Instead, you should implement a session manager and make each client session responsible for cleaning up its subscribers when it's destroyed; a sketch of such a session servant is shown below.
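
    As a rough sketch only (not a complete design; `SubscriberSessionI` and the `_subscriptions` bookkeeping are hypothetical names, and the subscriptions would be recorded by whatever session operation performs the subscribe):

        // Hypothetical Glacier2 session servant; Glacier2 invokes destroy()
        // when the session ends (logout, timeout or dropped connection).
        class SubscriberSessionI : public Glacier2::Session
        {
        public:

            virtual void
            destroy(const Ice::Current& current)
            {
                // Unsubscribe every object this session registered with IceStorm.
                for(vector<pair<IceStorm::TopicPrx, Ice::ObjectPrx> >::const_iterator p = _subscriptions.begin();
                    p != _subscriptions.end(); ++p)
                {
                    try
                    {
                        p->first->unsubscribe(p->second);
                    }
                    catch(const Ice::Exception&)
                    {
                        // Ignore; the topic may already be gone.
                    }
                }
                current.adapter->remove(current.id);
            }

        private:

            // Filled in by the session operation that performs the subscribe.
            vector<pair<IceStorm::TopicPrx, Ice::ObjectPrx> > _subscriptions;
        };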

    Cheers,
    Benoit.
  • Hi,
    I've continued working with Glacier2, and I have a problem.
    In the diagram I've attached you can see the general structure of the application.
    In one PC there are three "datastorers", each one associated with one topic (I have called them "0", "1" and "2"); a "Manager" in charge of telling the datastorers when they have to subscribe to or unsubscribe from a topic; and a Glacier2 router.
    In another PC, behind a private net, there is an application called "Modem" which establishes a session with the Glacier2 router and waits for incoming requests from the datastorers.
    When a subscribe operation is invoked on the modem, it subscribes one of its own objects to the topic in IceStorm, and when information on that topic is published, the modem receives it and forwards it to the associated datastorer. The modem knows the proxy of each datastorer.
    If I subscribe only one datastorer there is no problem, the datastorer receives the information correctly; but if more than one datastorer is subscribed, all of them are correctly subscribed in IceStorm, yet only one of them receives all the information, and it is always the one for the topic with the lowest number, independently of the order in which I subscribed or created them.
    I do not know if this is related to Glacier2, but if I do not use Glacier2 I do not have this problem.

    Thanks for your time
  • benoit
    benoit Rennes, France
    Hi,

    Just a guess... are you using different identities for your datastorer objects? If not, you should :) I believe you would get the behavior you describe if the datastorer objects have the same identity.
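
    For example (a minimal sketch; `DataStorerI` and the identity names are just placeholders), you can give each servant a distinct identity when you add it to the adapter, or let Ice generate one:

        // Distinct, explicit identity per datastorer ("datastorer0", "datastorer1", ...).
        Ice::Identity id;
        id.name = "datastorer0";
        Ice::ObjectPrx obj = adapter->add(new DataStorerI, id);

        // ...or simply let Ice generate a unique identity for each one.
        Ice::ObjectPrx obj2 = adapter->addWithUUID(new DataStorerI);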

    Otherwise, I'm afraid it's difficult to figure out what the problem is without more details. Perhaps you could post the code showing the registration and subscription of your "datastorer" objects to IceStorm through the Modem object?

    Cheers,
    Benoit.
  • Hi Benoit,
    we are cesartovic and fgarcia; we are working on the same project, and as your help has been very important for our work, we want to wish you a happy birthday. Sorry for the delay.

    "Joyeux anniversaire
    joyeux anniversaire
    joyeux anniversaire Benoit
    joyeux anniversaire et voilà, le tour est joué!!!"

    :D:D:D:D:D:D:D

    Cesartovic & fgarcia

    PS: Thanks for all your help
  • benoit
    benoit Rennes, France
    Merci! And good luck with your project! :D

    Benoit.
  • Hi,
    I have solved my last problem with the identities of the Datastorer.
    Now, with the same layout as in the diagram I've posted, I want the Datastorer to get some information from an application on the modem side, called "Database". I have defined the following interface:
    interface TC65Getter
    {
        allNotifications getNotificationTC65ByDate(string topic, string end, string begin);
        allNotifications getNotificationTC65(string topic, parameters param);
    };
    in which "allNotifications" is a sequence of "Notification" structs (defined in another Slice file).
    The "Datastorer" invokes the operations on the "Modem", which requests the information from the "Database" with these operations:
    interface Getter
    {
        allNotifications getNotificationByDate(string end, string begin);
        allNotifications getNotification(parameters param);
    };

    The problem is that sometimes these operations fail, but not always, and I don't know why. In both cases (success or failure) the "Modem" invokes the operation on the "Database" and the "Database" executes all the code. The only difference in the traces between the two cases is that in the successful one I get this trace
    [Network: accepted tcp connection
    local address = 138.4.9.76:10030
    remote address = 138.4.9.76:46286 ]
    and not in the failure case. Port 10030 is the one where the "Modem" is listening for the operations of the "Datastorer". But independently of this trace, the "Modem" invokes the operation on the "Database", so I suppose the session is well established.
    I suppose it is a problem with Glacier2 and the session, because without Glacier2 it works.
    Do you know any solution for this? If necessary I can post some code.

    Thanks
  • benoit
    benoit Rennes, France
    Hi,

    How does the operation fail, with an exception? If that's the case, which exception?

    If I understand it correctly, the Modem server is the one that establishes the session. Why is this server listening on port 10030? It shouldn't need to listen on any endpoints if it only receives requests from Glacier2 (because the connection between the "Modem" and Glacier2 is bidirectional).
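
    As a rough sketch of that setup (hedged: "Callback" is a hypothetical adapter name, and this assumes the default router set on the communicator is your Glacier2 router), a callback adapter can be created without any endpoints and bound to the router, so the requests Glacier2 forwards come back over the Modem's own outgoing connection:

        // Reuse the Glacier2 router already configured as the default router.
        Ice::RouterPrx defaultRouter = communicator()->getDefaultRouter();

        // No endpoints needed: the adapter is associated with the router, so
        // callbacks arrive over the existing bidirectional connection.
        Ice::ObjectAdapterPtr callbackAdapter =
            communicator()->createObjectAdapterWithRouter("Callback", defaultRouter);
        callbackAdapter->activate();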

    Cheers,
    Benoit.
  • Hi, I've solved the previous question.
    Now I have a problem with the subscription to IceStorm.
    In the application "Modem" I obtain the proxy to the TopicManager as usual:
    const string proxyProperty = "IceStorm.TopicManager.Proxy";
    string proxy = properties->getProperty(proxyProperty);
    Ice::ObjectPrx base = communicator()->stringToProxy(proxy);
    IceStorm::TopicManagerPrx manager = IceStorm::TopicManagerPrx::checkedCast(base);
    Then I create an adapter with a servant that contains this proxy to use in a later invocation.
    Ice::ObjectAdapterPtr adapter = communicator()->createObjectAdapter("TC65Operations.Modem");
    TC65OperationsPtr TC65operations = new TC65OperationsI(getterProxies, manager, adapterTC65Storer, properties);
    TC65OperationsPrx TC65operationsProxy = TC65OperationsPrx::uncheckedCast(adapter->add(TC65operations, TC65OperationsIdent));

    The code of TC65OperationsI where I use the manager is the following:
    int
    TC65OperationsI::subscribeTC65(const string& topic, const Ice::Current&)
    {
        IceStorm::QoS qos;
        qos["reliability"] = "batch";

        const string stProxy = "Storer.DataStorer_" + topic + ".Proxy";
        string storerProxy = _properties->getProperty(stProxy);
        if(storerProxy.empty())
        {
            cerr << "property `" << stProxy << "' does not exist in the configuration file" << endl;
            return 1;
        }
        Ice::CommunicatorPtr communicator = _adapterTC65->getCommunicator();
        Ice::ObjectPrx base = communicator->stringToProxy(storerProxy);

        StorerPrx storer = StorerPrx::uncheckedCast(base);
        if(!storer)
        {
            cerr << "invalid proxy" << endl;
            return 1;
        }
        Ice::ObjectPtr TC65storer = new TC65StorerI(storer);
        try
        {
            Ice::ObjectPrx object = _adapterTC65->addWithUUID(TC65storer);
            IceStorm::TopicPrx ttopic = _manager->retrieve(topic);
            ttopic->subscribe(qos, object);
            subscribers[topic] = object;
        }
        catch(const IceStorm::NoSuchTopic& e)
        {
            cerr << e << " name: " << e.name << endl;
            return 1;
        }
        _adapterTC65->activate();
        return 0;
    }
    But when the operation is invoked, the application hangs waiting for "_manager->retrieve(topic)" to finish, without getting any error.

    I have printed the contents of "manager" on the screen (ManagerPrx: RFIceStorm/TopicManager -t:tcp -h 127.0.0.1 -p 10005), and it seems to be right. What is the problem?

    Thanks
  • benoit
    benoit Rennes, France
    It's difficult to say without more information. If I understand correctly, the modem process talks to the IceStorm service directly and these requests do not go through Glacier2, is this correct?

    Could you attach the debugger to your modem process and get a stack trace of each thread? We should be able to see why the retrieve() operation hangs with the stack traces.

    Also, did you configure IceStorm or your modem application to use the thread per connection concurrency model or are you using the default thread pool model?

    Cheers,
    Benoit.
  • The modem process talks to IceStorm directly when another application ("gestor") invokes the "subscribe" operation on the modem. The Glacier2 router is between the modem and the "gestor". The "modem" subscribes one of its own objects, and all the information sent to this adapter is forwarded to the "gestor".

    I have configured the Glacier2 router with the property Ice.ThreadPerConnection.StackSize=262144 as in the demos; is this enough to use the thread per connection concurrency model?

    What do you mean when you say to attach a debugger to the modem? Do you refer to the Ice.Trace.Network property?
    These are the traces of the modem:
    fgarcia@iris:~/Desktop/ICE/Ice-3.0.0/rfranco/gestion/Tut_base14.2$ ./modem
    [ Network: tcp connection established
    local address = 138.4.9.76:36547
    remote address = 138.4.9.76:10000 ]
    user id: 1
    password: 1
    !!!sessionReg: A6A6AD98-EFC4-4DA7-BF80-B004397327B8 -t:tcp -h 138.4.9.76 -p 10011
    [ Network: accepting tcp connections at 127.0.0.1:59742 ]
    [ Network: accepting tcp connections at 138.4.9.76:57195 ]
    !!!Proxy of the TC65operationsProxy: "ROjlo\/GYVO@-9nj\/v&k1/TC65operations" -t:tcp -h 138.4.9.76 -p 37523
    The messages with !!! at the beginning are mine, and show proxies.

    Thanks
  • benoit
    benoit Rennes, France
    Hi,

    Glacier2 always uses thread per connection; you don't have to configure anything for it. I was asking more about the other processes, such as the modem server or the IceStorm server (by default Ice uses the thread pool model; you have to set the Ice.ThreadPerConnection property to use the thread per connection model).
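
    For example (just a sketch; config.modem is simply the configuration file name used elsewhere in this thread), enabling it for a process is a single property in its configuration file:

        # config.modem
        Ice.ThreadPerConnection=1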

    I meant using the debugger to debug the modem process. Which OS/platform do you use? On Linux you can attach to a running process, knowing its PID, with the gdb debugger. That allows you to get the stack trace of each thread. Seeing the stack traces would help figure out what the modem process is doing exactly when it hangs on the call to IceStorm.

    Cheers,
    Benoit.
  • Hi,
    I have set Ice.ThreadPerConnection and it does not work. I have another very similar application with the same problem, so I am going to post that one.

    This is the slice definition:
    #ifndef MANAGEMENT_ICE
    #define MANAGEMENT_ICE

    #include <Clases.ice>

    module Management
    {
        interface Storer
        {
            void storeNotification(Notification notif);
        };

        interface Getter
        {
            allNotifications getNotificationByDate(string end, string begin);
            allNotifications getNotification(parameters param);
        };
    };

    #endif

    And here is Clases.ice:
    #ifndef CLASES_ICE
    #define CLASES_ICE

    module Management
    {
        struct Notification
        {
            string idMachine;
            string idGame;
            string clase;
            string name;
            string info;
            string timeStamp;
        };

        sequence<Notification> allNotifications;
        dictionary<string, string> parameters;
    };

    #endif

    Using this interface, one application invokes the operation "getNotificationByDate", implemented as follows:
    Management::allNotifications
    GetterI::getNotificationByDate(const string& end, const string& begin, const Ice::Current&)
    {
        Management::allNotifications tmpNoti;
        if(begin == "")
        {
            tmpNoti = _dataBase->getNotification(end);
        }
        else
        {
            tmpNoti = _dataBase->getNotification(end, begin);
        }
        Management::allNotifications::iterator i = tmpNoti.begin();
        cout << "!!!!!First element: " << i->name << endl;
        return tmpNoti;
    }

    The operation executes correctly, and "tmpNoti" is not empty, but the application that invokes the operation hangs after the invocation, waiting for the response.

    The traces that I get are the following:
    * In the application that makes the invocation:
    [ Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 109
    request id = 16
    identity = getter_incidencias
    facet =
    operation = getNotificationByDate
    mode = 0 (normal)
    context = ]
    [ Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 74
    request id = 17
    identity = A0CD86ED-281D-475F-942A-CB5A4235244A
    facet =
    operation = ice_ping
    mode = 1 (nonmutating)
    context = ]

    *And in the other application:
    [ Protocol: received request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 109
    request id = 2
    identity = getter_incidencias
    facet =
    operation = getNotificationByDate
    mode = 0 (normal)
    context = ]
    SELECT * FROM RF_INCIDENCIAS WHERE TS<"2010-01-01 00:00:00" AND TS >"1990-01-01 00:00:00"
    !!!!!First element: encendido
    [ Protocol: sending reply
    message type = 2 (reply)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 150
    request id = 2
    reply status = 0 (ok) ]
    [ Network: closing tcp connection
    local address = 138.4.9.76:10031
    remote address = 138.4.9.76:42800 ]

    I do not know what is happening; I have written another very similar application and I have never had this problem.

    Thanks for your time
  • benoit
    benoit Rennes, France
    Hi,

    Are you perhaps using AMI? If you're using AMI, it's possible that the client thread pool (responsible for reading replies) might be exhausted because of some deadlocks in your AMI callbacks.

    Again, when a program hangs, the first thing to try is to attach to the process with the debugger. Seeing the stack trace of each thread in the client should give us a good idea of where the invocation is hanging. From the traces you sent, the only thing I can say is that the server sent the reply but for some reason the client is not reading it.

    Could you try to attach to the client process with the debugger and post the stack traces? Let us know if you need more information on how to do this (please specify which platform your client is running on).

    Of course, if you can reproduce this hang with a small test case that you could post here, we'll be happy to take a look at it.

    Cheers,
    Benoit.
  • Hi,
    I am not using AMI. The client and all the applications are running on Linux.
    I have debugged both applications, the "modem" (which invokes the operation) and the "localStorer" (which executes the operation), with gdb, and these are the results.
    I set a breakpoint at the beginning of the function that invokes the operation "getNotificationByDate", and another breakpoint at the beginning of "getNotificationByDate" itself. I executed both step by step and saved the results in the files "gdb_modem.txt" and "gdb_localStorer.txt" (I have only attached the traces after the breakpoints).
    The "modem" application invokes "getNotificationByDate" on the "localStorer" and waits for the response, as you can see in the traces in gdb_modem.txt. The file gdb_localStorer.txt contains the traces showing the execution of the operation "getNotificationByDate". You can see that after the last trace the application hangs and nothing is returned to the "modem".

    If this is not enough, I can try to develop a small test that reproduces the error; let me know if it is necessary.

    Thanks
  • benoit
    benoit Rennes, France
    Hi,

    Could you try the following instead?
    • Run the modem application in the debugger.
    • Get it to hang.
    • Hit Ctrl-<C> when it hangs.
    • Run the "thread apply all bt" command under the debugger to dump the stack traces of each thread.
    • Post the stack traces here or in a text file.

    Also, which component is invoking the getNotificationTC65ByDate operation on the modem application? Is it the Glacier2 router?

    Cheers,
    Benoit.
  • These are the stack traces of the threads in the "modem" just before the invocation of the operation.
    33 tmpNoti = _getterProxies[topic]->getNotificationByDate(end,begin);
    (gdb) thread apply all bt

    Thread 7 (Thread -1533367376 (LWP 12722)):
    #0 0xa75e5327 in select () from /lib/tls/libc.so.6
    #1 0xa7b2179c in IceInternal::doAccept (fd=8, timeout=-1) at Network.cpp:741
    #2 0xa7b81471 in IceInternal::TcpAcceptor::accept (this=0x80a6dc0, timeout=-1) at TcpAcceptor.cpp:65
    #3 0xa7aa5dfe in IceInternal::IncomingConnectionFactory::run (this=0x80a6cb0) at ConnectionFactory.cpp:1044
    #4 0xa7aa6580 in IceInternal::IncomingConnectionFactory::ThreadPerIncomingConnectionFactory::run (this=0x80a6df8)
    at ConnectionFactory.cpp:1155
    #5 0xa79759e9 in startHook (arg=0x80a6df8) at Thread.cpp:482
    #6 0xa7933ced in start_thread () from /lib/tls/libpthread.so.0
    #7 0xa75ecdee in clone () from /lib/tls/libc.so.6

    Thread 6 (Thread -1524978768 (LWP 12721)):
    #0 0xa75e5327 in select () from /lib/tls/libc.so.6
    #1 0xa7b2179c in IceInternal::doAccept (fd=7, timeout=-1) at Network.cpp:741
    #2 0xa7b81471 in IceInternal::TcpAcceptor::accept (this=0x80a6a60, timeout=-1) at TcpAcceptor.cpp:65
    #3 0xa7aa5dfe in IceInternal::IncomingConnectionFactory::run (this=0x80a6928) at ConnectionFactory.cpp:1044
    #4 0xa7aa6580 in IceInternal::IncomingConnectionFactory::ThreadPerIncomingConnectionFactory::run (this=0x80a6b30)
    at ConnectionFactory.cpp:1155
    #5 0xa79759e9 in startHook (arg=0x80a6b30) at Thread.cpp:482
    #6 0xa7933ced in start_thread () from /lib/tls/libpthread.so.0
    #7 0xa75ecdee in clone () from /lib/tls/libc.so.6

    Thread 5 (Thread -1516487760 (LWP 12719)):
    #0 0xa7935b81 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
    #1 0xa7aad8f8 in IceUtil::Cond::waitImpl<IceUtil::Mutex> (this=0xa59c31fc, mutex=@0xa59c322c)
    at ../../include/IceUtil/Cond.h:203
    #2 0xa7aad9c0 in IceUtil::Monitor<IceUtil::Mutex>::wait (this=0xa59c31fc) at ../../include/IceUtil/Monitor.h:152
    #3 0xa7b3f9dd in IceInternal::Outgoing::invoke (this=0xa59c31f8) at Outgoing.cpp:164
    #4 0xa7b549e6 in IceDelegateM::Ice::Object::ice_ping (this=0x80a8034, __context=@0x80a5ce0) at Proxy.cpp:911
    #5 0xa7b53735 in IceProxy::Ice::Object::ice_ping (this=0x80a59b4, __context=@0x80a5ce0) at Proxy.cpp:192
    #6 0xa7b53871 in IceProxy::Ice::Object::ice_ping (this=0x80a59b4) at Proxy.cpp:180
    #7 0x080816dd in SessionPingThread::run (this=0x80a5888) at Modem.cpp:41
    #8 0xa79759e9 in startHook (arg=0x80a5888) at Thread.cpp:482
    #9 0xa7933ced in start_thread () from /lib/tls/libpthread.so.0
    ---Type <return> to continue, or q <return> to quit---
    #10 0xa75ecdee in clone () from /lib/tls/libc.so.6

    Thread 4 (Thread -1508099152 (LWP 12715)):
    #0 TC65OperationsI::getNotificationTC65ByDate (this=0x80a7fa8, topic=@0xa61c2b20, end=@0xa61c2b1c, begin=@0xa61c2b18)
    at TC65OperationsI.cpp:33
    #1 0x0806b5e3 in Management::TC65Operations::___getNotificationTC65ByDate (this=0x80a7fa8, __inS=@0xa61c2e70,
    __current=@0xa61c2e70) at TC65Operations.cpp:743
    #2 0x0806b8d4 in Management::TC65Operations::__dispatch (this=0x80a7fa8, in=@0xa61c2e70, current=@0xa61c2e70)
    at TC65Operations.cpp:818
    #3 0xa7ae6de9 in IceInternal::Incoming::invoke (this=0xa61c2e70, servantManager=@0xa61c3298) at Incoming.cpp:171
    #4 0xa7abee97 in Ice::ConnectionI::invokeAll (this=0x80a55f8, stream=@0xa61c315c, invokeNum=1, requestId=1,
    compress=0 '\0', servantManager=@0xa61c3298, adapter=@0xa61c3294) at ConnectionI.cpp:2234
    #5 0xa7ac06bc in Ice::ConnectionI::run (this=0x80a55f8) at ConnectionI.cpp:2530
    #6 0xa7ac0a6a in Ice::ConnectionI::ThreadPerConnection::run (this=0x80a52c0) at ConnectionI.cpp:2560
    #7 0xa79759e9 in startHook (arg=0x80a52c0) at Thread.cpp:482
    #8 0xa7933ced in start_thread () from /lib/tls/libpthread.so.0
    #9 0xa75ecdee in clone () from /lib/tls/libc.so.6

    Thread 3 (Thread -1499710544 (LWP 12714)):
    #0 0xa7935de2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
    #1 0x08081523 in IceUtil::Cond::timedWaitImpl<IceUtil::Mutex> (this=0x80a53fc, mutex=@0x80a542c, timeout=@0x80a544c)
    at ../../../include/IceUtil/Cond.h:224
    #2 0x08081618 in IceUtil::Monitor<IceUtil::Mutex>::timedWait (this=0x80a53fc, timeout=@0x80a544c)
    at ../../../include/IceUtil/Monitor.h:180
    #3 0xa7ac8a45 in IceInternal::ConnectionMonitor::run (this=0x80a53d8) at ConnectionMonitor.cpp:78
    #4 0xa79759e9 in startHook (arg=0x80a53d8) at Thread.cpp:482
    #5 0xa7933ced in start_thread () from /lib/tls/libpthread.so.0
    #6 0xa75ecdee in clone () from /lib/tls/libc.so.6

    Thread 2 (Thread -1491321936 (LWP 12713)):
    #0 0xa7939419 in do_sigwait () from /lib/tls/libpthread.so.0
    #1 0xa79394af in sigwait () from /lib/tls/libpthread.so.0
    #2 0xa795e29c in sigwaitThread () at CtrlCHandler.cpp:124
    #3 0xa7933ced in start_thread () from /lib/tls/libpthread.so.0
    #4 0xa75ecdee in clone () from /lib/tls/libc.so.6
    ---Type <return> to continue, or q <return> to quit---

    Thread 1 (Thread -1491319104 (LWP 12710)):
    #0 0xa7935b81 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
    #1 0xa7b24a72 in IceUtil::Cond::waitImpl<IceUtil::RecMutex> (this=0x80a50ac, mutex=@0x80a50dc)
    at ../../include/IceUtil/Cond.h:203
    #2 0xa7b24b3a in IceUtil::Monitor<IceUtil::RecMutex>::wait (this=0x80a50ac) at ../../include/IceUtil/Monitor.h:152
    #3 0xa7b23317 in IceInternal::ObjectAdapterFactory::waitForShutdown (this=0x80a50a0) at ObjectAdapterFactory.cpp:66
    #4 0xa7a9f138 in Ice::CommunicatorI::waitForShutdown (this=0x80a4ba0) at CommunicatorI.cpp:119
    #5 0x0807f7d9 in Modem::run (this=0xafb64fc8, argc=1, argv=0xafb65054) at Modem.cpp:213
    #6 0xa7a80611 in Ice::Application::main (this=0xafb64fc8, argc=1, argv=0xafb65054, configFile=0x808da8a "config.modem",
    logger=@0xafb64fcc) at Application.cpp:266
    #7 0x0807da9d in main (argc=1, argv=0xafb65054) at Modem.cpp:78
    33 tmpNoti = _getterProxies[topic]->getNotificationByDate(end,begin);
    (gdb) n
    [ Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 109
    request id = 17
    identity = getter_incidencias
    facet =
    operation = getNotificationByDate
    mode = 0 (normal)
    context = ]

    I can't attach the traces after hitting Ctrl-<C> because I get this
    [ Network: accepted tcp connection
    local address = 127.0.0.1:44297
    remote address = 127.0.0.1:37518 ]
    [ Network: closing tcp connection
    local address = 127.0.0.1:44297
    remote address = 127.0.0.1:37518 ]
    [ Network: stopping to accept tcp connections at 127.0.0.1:44297 ]
    [ Network: accepted tcp connection
    local address = 138.4.9.76:47099
    remote address = 138.4.9.76:48623 ][Thread -1524978768 (LWP 12721) exited]

    [ Network: closing tcp connection
    local address = 138.4.9.76:47099
    remote address = 138.4.9.76:48623 ]
    [ Network: stopping to accept tcp connections at 138.4.9.76:47099 ]
    [Thread -1533367376 (LWP 12722) exited]
    [ Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 58
    request id = 18
    identity = Glacier2/router
    facet =
    operation = destroySession
    mode = 0 (normal)
    context = ]
    and then it hangs and doesn't let me do anything.

    The component which invokes the "getNotificationTC65ByDate" operation is the Glacier2 router.

    Thanks and sorry for this huge post
  • benoit
    benoit Rennes, France
    Hi,

    Sorry, using Ctrl-C in the debugger wasn't a good idea: the signal is intercepted by the application and it starts the shutdown. The best approach is to attach to the process when it hangs, as I suggested in my previous posts.

    To attach to the modem process with gdb, you need to figure out its pid with the "ps" command and then you can run gdb with the path of the executable and the pid as arguments. You could also use the gdb "attach <pid>" command.
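
    For example (the PID 12710 is just the one that appears in your earlier traces; use whatever ps reports for the hanging process):

        $ ps aux | grep modem
        $ gdb ./modem 12710
        (gdb) thread apply all bt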

    In any case, I suspect I know what the problem is. You're using thread per connection with a bidirectional connection, and it's not possible to have nested callbacks with such a configuration. You shouldn't use --Ice.ThreadPerConnection for your modem application. Instead, you should use the default thread pool model and adjust the size of the client thread pool to allow nested callbacks. I recommend you take a look at the section "30.9 The Ice Threading Models" in the manual; it explains the limitations of the different threading models and how to set up the thread pool sizes to allow nested callbacks.
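
    As a sketch of what that can look like in the modem's configuration (the sizes below are only illustrative values, not a recommendation tuned for your application):

        # Default thread pool model; more than one client thread allows a
        # limited number of nested callbacks over the bidirectional connection.
        Ice.ThreadPool.Client.Size=2
        Ice.ThreadPool.Client.SizeMax=5
        Ice.ThreadPool.Server.Size=2
        Ice.ThreadPool.Server.SizeMax=5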

    I would also recommend taking a look at Bernard's newsletter articles "Avoiding Deadlocks". These articles also detail the potential issues with nested callbacks.

    Cheers,
    Benoit.
  • Hi,
    I have fixed the problem: as you suggested, I changed to the default thread pool model and it worked.

    Thank you very much!!!
  • Hi, continuing with the application I've described above, I've had some problems moving it to several PCs. If I run the whole application on my PC (server, client and Glacier2) I have no problems, but if I run the client and Glacier2 on one PC and the server on another, it does not work.
    The problem arises when the server tries to do a checkedCast on the proxy; this is the line:
    IceStorm::TopicManagerPrx manager = IceStorm::TopicManagerPrx::checkedCast(base);
    because it seems that it tries to do it through the glacier2router.

    These are the traces of the server and the glacier2router, respectively:
    [ Network: tcp connection established
    local address = 138.4.9.115:54650
    remote address = 138.4.9.76:10000 ]
    ...
    !!!!!RouterPrx: Glacier2/router -t:tcp -h 138.4.9.76 -p 10000
    ...
    !!!!!RegisterSessionPrx: 1E60E482-004C-45A2-BFE3-8B54CB70681B -t:tcp -h 138.4.9.76 -p 10011
    [ Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 58
    request id = 3
    identity = Glacier2/router
    facet =
    operation = getClientProxy
    mode = 1 (nonmutating)
    context = ]
    [ Protocol: received reply
    message type = 2 (reply)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 64
    request id = 3
    reply status = 0 (ok) ]
    [ Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 107
    request id = 4
    identity = Glacier2/router
    facet =
    operation = addProxy
    mode = 2 (idempotent)
    context = ]
    [ Protocol: received reply
    message type = 2 (reply)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 25
    request id = 4
    reply status = 0 (ok) ]
    [ Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 84
    request id = 5
    identity = RFIceStorm/TopicManager
    facet =
    operation = ice_isA
    mode = 1 (nonmutating)
    context = ]
    [ Protocol: received reply
    message type = 2 (reply)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 108
    request id = 5
    reply status = 5 (unknown local exception)
    unknown = Network.cpp:669: Ice::ConnectionRefusedException:
    connection refused: Connection refused ]
    ./emodem: Outgoing.cpp:415: Ice::UnknownLocalException:
    unknown local exception:
    Network.cpp:669: Ice::ConnectionRefusedException:
    connection refused: Connection refused
    [ Protocol: sending close connection
    message type = 4 (close connection)
    compression status = 1 (not compressed; compress response, if any)
    message size = 14 ]
    [ Network: closing tcp connection
    local address = 138.4.9.115:54650
    remote address = 138.4.9.76:10000 ]
    [ glacier2router: Network: accepted tcp connection
    local address = 138.4.9.76:10000
    remote address = 138.4.9.115:60503 ]
    [ glacier2router: Network: closing tcp connection
    local address = 138.4.9.76:10000
    remote address = 138.4.9.115:60503 ]
    [ glacier2router: Network: accepted tcp connection
    local address = 138.4.9.76:10000
    remote address = 138.4.9.115:60504 ]
    glacier2router: warning: dispatch exception: Network.cpp:669: Ice::ConnectionRefusedException:
    connection refused: Connection refused
    identity: RFIceStorm/TopicManager
    facet:
    operation: ice_isA
    [ glacier2router: Network: closing tcp connection
    local address = 138.4.9.76:10000
    remote address = 138.4.9.115:60504 ]

    Thanks for your help
  • Try running Glacier2 with Ice.Trace.Network=2; then you will also see connection attempts. This should give you enough information to find out why the connection was refused. See also:

    http://www.zeroc.com/faq/connectionRefusedException.html
  • After running Glacier2 and the server with "Ice.Trace.Network=2" I have got these results:

    Server traces
    [ Network: trying to establish tcp connection to 138.4.9.76:10000 ]
    [ Network: tcp connection established
    local address = 138.4.9.115:42786
    remote address = 138.4.9.76:10000 ]
    !!!!!RouterPrx: Glacier2/router -t:tcp -h 138.4.9.76 -p 10000
    !!!!!RegisterSessionPrx: 1FFACC2B-1E65-4A97-9199-E922239DF447 -t:tcp -h 138.4.9.76 -p 10011
    !!!!!TopicManagerProxy: RFIceStorm/TopicManager:tcp -h 127.0.0.1 -p 10005
    ./emodem: Outgoing.cpp:415: Ice::UnknownLocalException:
    unknown local exception:
    Network.cpp:669: Ice::ConnectionRefusedException:
    connection refused: Connection refused
    [ Network: shutting down tcp connection for writing
    local address = 138.4.9.115:42786
    remote address = 138.4.9.76:10000 ]
    [ Network: closing tcp connection
    local address = 138.4.9.115:42786
    remote address = 138.4.9.76:10000 ]

    Glacier2 traces:
    ...
    ...iniciando router Glacier2
    [ glacier2router: Network: attempting to bind to tcp socket 138.4.9.76:10000 ]
    [ glacier2router: Network: accepting tcp connections at 138.4.9.76:10000 ]
    [ glacier2router: Network: attempting to bind to tcp socket 138.4.9.76:0 ]
    [ glacier2router: Network: accepting tcp connections at 138.4.9.76:36072 ]
    [ glacier2router: Network: trying to establish tcp connection to 138.4.9.76:10011 ]
    [ glacier2router: Network: tcp connection established
    local address = 138.4.9.76:60198
    remote address = 138.4.9.76:10011 ]
    ...
    [ glacier2router: Network: accepted tcp connection
    local address = 138.4.9.76:10000
    remote address = 138.4.9.115:42786 ]
    [ glacier2router: Network: trying to establish tcp connection to 127.0.0.1:10005 ]
    glacier2router: warning: dispatch exception: Network.cpp:669: Ice::ConnectionRefusedException:
    connection refused: Connection refused
    identity: RFIceStorm/TopicManager
    facet:
    operation: ice_isA

    [ glacier2router: Network: shutting down tcp connection for reading and writing
    local address = 138.4.9.76:10000
    remote address = 138.4.9.115:42786 ]
    [ glacier2router: Network: closing tcp connection
    local address = 138.4.9.76:10000
    remote address = 138.4.9.115:42786 ]

    It seems that when the server tries to do a checkedCast of the TopicManager proxy ("IceStorm::TopicManagerPrx manager = IceStorm::TopicManagerPrx::checkedCast(base)") it sends the request to the glacier2router, and of course the glacier2router is unable to reach it, because the endpoint of the IceStorm manager is on the server's PC and not on the PC where Glacier2 and the client are running.

    Do you know how I can fix this?

    Thanks
  • benoit
    benoit Rennes, France
    Hi,

    When you set a default router on the communicator (either with the Ice.Default.Router property or programmatically with the setDefaultRouter method of the Ice::Communicator interface), all the client invocations from that communicator will be sent to this router.

    So if you don't want the invocations to your IceStorm server to go through the router, you should either explicitly disable the router on the IceStorm topic manager proxy (and other IceStorm proxies) or not use a default router and set the router on a per-proxy basis.

    To set or unset a router on a proxy, you can use the proxy's ice_router operation; see the Ice manual for more information (Section 30.10).
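
    A minimal sketch of the per-proxy approach (reusing the `proxy` string you already read from the IceStorm.TopicManager.Proxy property):

        // Strip the default router from this proxy so the invocation goes
        // directly to IceStorm instead of through Glacier2.
        Ice::ObjectPrx base = communicator()->stringToProxy(proxy)->ice_router(0);
        IceStorm::TopicManagerPrx manager = IceStorm::TopicManagerPrx::checkedCast(base);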

    Cheers,
    Benoit.
  • Thanks, I have used the ice_router(0) operation to unset the router on the proxies I wanted, and now it works.

    Thanks
  • Is there any way of checking the connectivity between two PCs? Initially I check the connectivity to the adapter with the ice_ping operation, but I also want to check it every minute. Then, at a random moment, I unplug the Ethernet interface, but it seems that the ice_ping operation has no effect once the initial check has succeeded.

    How can I check this?
  • You can call ice_ping() periodically from a separate thread. There is in general no reliable way to detect a loss of connectivity with TCP/IP, other than trying to send data. For example, if you just "cut the cable", the server has no chance to send a TCP "Fin" message, and therefore the client will never notice that the server has gone away, unless it tries to send data to the server. This has nothing to do with Ice, but this is simply how TCP/IP works in general.
  • Could I set a timeout on the operation, so that it is aborted if it takes too long?
  • Yes, of course!
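
    For example (a sketch; the 5000 ms value is arbitrary), you can create a copy of the proxy with a timeout, so the ping throws Ice::TimeoutException instead of blocking indefinitely:

        // If no reply arrives within 5 seconds, ice_ping() throws
        // Ice::TimeoutException and you can treat the peer as unreachable.
        Ice::ObjectPrx timedProxy = proxy->ice_timeout(5000);
        try
        {
            timedProxy->ice_ping();
        }
        catch(const Ice::TimeoutException&)
        {
            // Connectivity to the peer is considered lost.
        }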