Archived

This forum has been archived. Please start a new discussion on GitHub.

IceStorm drops subscriber

Dear Experts,

I am running IceStorm deployed through IceGrid. Most of the time everything is running fine with the exception that one of my subscribers is regularly unregistered by IceStorm.

It is interesting that subscriber receives a number of messages before it gets unsubscribed. Other subscribers to this topic stay subscribed.

Here is my configuration in a nutshell.
Ice.Default.Locator=TestIceGrid/Locator:default -h 192.168.4.227 -p 14061:default -h 192.168.4.227 -p 14062:default -h 192.168.4.227 -p 14063:default -h 192.168.4.227 -p 14064

and seemingly relevant IceStorm timeouts:
<property name="${service}.Send.Timeout" value="60"/> <!-- was 10-->
<property name="${service}.Discard.Interval" value="60"/> <!-- was 1-->

I enabled network tracing but I do not see anything suspicious.

Am I missing some magic "Trace" or some timeout? I am at a loss :(

Thank you,
Aleksey

Comments

  • mes
    mes California
    Hi,

    Try setting the property IceStorm.Trace.Subscriber=2. This will cause IceStorm to log messages regarding subscriber activity and might explain what's happening with the problematic subscriber.

    If you're still having trouble, please provide more details about your platforms, programming languages, and Ice version.

    Regards,
    Mark
  • Hi Mark,

    thank you for your prompt response.
    mes wrote: »
    Try setting the property IceStorm.Trace.Subscriber=2. This will cause IceStorm to log messages regarding subscriber activity and might explain what's happening with the problematic subscriber.

    I did enable Subscriber tracing but that did not lead to any clues. I am attaching application.xml.
    mes wrote: »
    If you're still having trouble, please provide more details about your platforms, programming languages, and Ice version.

    Ice version 3.5.0
    IceStorm Server OS: Centos 2.6.32-220.el6.x86_64
    Subscriber Machine OS: same as IceStorm
    Subscriber C++, icpc
    Publisher C#, Windows

    Anything else?

    Thank you,
    Aleksey
  • mes
    mes California
    I'd like to see the IceStorm log output. I'd also like to see the subscriber's log output with Ice.Trace.Network=2.

    Mark
  • Here it is.

    IceStorm ip 192.168.4.227
    Subscriber ip 192.168.161.25

    Thank you,
    Aleksey
  • mes
    mes California
    The attached file appears to be a log for the subscriber. Can you provide the IceStorm log too?

    Also, from a programming perspective, it's important for a subscriber to activate its object adapter before subscribing to a topic, as demonstrated in our sample code. Otherwise, there is a risk of the following situation occurring:
    • Subscriber subscribes to a topic
    • Around the same time, a message is published on that topic
    • IceStorm attempts to deliver the message to this new subscriber
    • The subscriber's object adapter is not listening yet, so IceStorm receives an error during delivery
    • IceStorm removes the subscriber
    • Subscriber activates its object adapter (but too late!)
    I noticed in the subscriber log that there were several instances of the "PubSubBaseAdapter" being activated. I don't know if each occurrence was a separate execution of the subscriber program, or if the subscriber is creating and destroying this adapter multiple times during a single execution. I would be concerned if it was the latter case.

    Regards,
    Mark
  • You might be right! I will test if this is indeed the case.

    Meanwhile I am attaching IceStorm logs.
    Publisher ip 192.168.4.47.

    Thank you,
    Aleksey
  • mes
    mes California
    Thanks for providing the logs. I see entries in the first file indicating that a subscriber was dropped due to a connection timeout:
    -- 11/08/13 09:53:46.806 BelvedereIceStorm-1-IceStorm: Retry: cannot retry operation call because retry limit has been exceeded
       ConnectionI.cpp:1818: Ice::ConnectTimeoutException:
       timeout while establishing a connection
    -- 11/08/13 09:53:46.806 BelvedereIceStorm-1-IceStorm: Subscriber: endpoints: "tcp -h 192.168.161.25 -p 52815 -t 5000" transition from: online to: error
    -- 11/08/13 09:53:46.806 BelvedereIceStorm-1-IceStorm: Subscriber: BDD627EF-0CE9-4640-A56E-30679749B28A endpoints: "tcp -h 192.168.161.25 -p 52815 -t 5000" subscriber errored out: ConnectionI.cpp:1818: Ice::ConnectTimeoutException:
       timeout while establishing a connection retry: 0/0
    
    The next step would be to determine why IceStorm was no longer able to connect to 192.168.161.25 port 52815. Comparing the time stamps of the log entries against the activity in the subscriber's log should provide more clues.

    Mark
  • Hi Mark,

    That error is coming from IceStorm trying to deliver a message to the old subscriber that was previously unsubscribed by IceStorm.

    I also verified that the publisher waits plenty of time for subscriber to become active.

    While looking at the code I noticed that the number of messages that are send per unit of time by the publisher is very large. I increase IceStorm.Send.Timeout=600. This seem to have fixed the issue.

    I got time to test it only a few times using full system and a number of time using test subscriber/publisher.

    I will be able to confirm on Monday if that worked...

    Thank you,
    Aleksey
  • Hi Mark,

    It appears that the reason for those timeouts was a very large number of messages sent by the publisher and inadequately small value of IceStorm.Send.Timeout parameter.


    Thank you for your very prompt help!
    Aleksey
  • mes
    mes California
    No problem, glad to hear you worked it out.

    Mark