IceStorm Subscribe to All Topics?

Hi,

We have been using IceStorm pub/sub to publish events from lots of processes to a smaller set of interested subscribers for some time. Each publisher creates a unique topic, and each event has some data which is transmitted.

We are now trying to add admin-style monitoring of all events: a replicated process would subscribe to all topics and, based on certain criteria, trigger a specific action depending on the event data and on which process (topic) the event came from.

Ultimately we would like to subscribe to all available topics and process each event to decide whether an action is required. As there could potentially be hundreds of thousands of topics, we of course want to do this in a replicated manner.

So far we have tried subscribing to IceStorm with a replicated indirect proxy. However, when multiple subscribers use this indirect proxy for the same topic, IceStorm throws IceStorm::AlreadySubscribed; I am guessing it does a lookup on the proxy and the given topic name.

The other solution we can think of is to create replicated agents in a master-slave arrangement. Each agent would use the evictor pattern to subscribe to each topic individually, and the agents would share a map of subscribed topics.

Before I go too much further, I just wanted to check: is there any easier approach to subscribing to all topics using replicated processes without creating a thread per topic?

Cheers,
Chris

Comments

  • I've been continuing my investigations into this and have come up with one possible solution: federate all the topics we want to catch into a single topic, and then have replicated subscribers subscribe to this topic with the replica group load balanced. The intention would be that IceStorm sends each message to only one subscriber, as dictated by the load balancer. The federation part seems pretty straightforward, but I'm not sure whether replicated subscribers are possible. To achieve this, all of the subscribers would need to subscribe with the same proxy, subscribers@replicagroup. I think this might bring me back to the same issue: the subscribers won't be able to subscribe with the same indirect proxy, as they will throw AlreadySubscribed.

    Any help would be much appreciated.

    Cheers
    Chris
  • benoit (Rennes, France)
    Hi Chris,

    If your goal is to receive the events on only one of the replicated subscribers, you should subscribe only once with the replicated proxy (or subscribe from all the subscribers and simply ignore the AlreadySubscribed exception). As far as IceStorm is concerned, it's dealing with a single subscriber and doesn't need to know that the subscriber proxy points to replicas. On the first invocation on the subscriber proxy, IceStorm will resolve the proxy to one of the replicas (according to the load balancing policy) and send the events to this subscriber replica.

    Cheers,
    Benoit.
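
The "subscribe from every replica and ignore AlreadySubscribed" approach described above can be sketched as follows. This is a minimal, self-contained illustration and not real IceStorm code: `AlreadySubscribed`, `FakeTopic`, `subscribe_once` and the proxy string `agent@Exchange` are hypothetical stand-ins that only mimic the behaviour discussed in this thread.

```python
class AlreadySubscribed(Exception):
    """Hypothetical stand-in for the IceStorm AlreadySubscribed exception."""


class FakeTopic:
    """Hypothetical in-memory topic that only tracks subscriber proxy strings."""

    def __init__(self, name):
        self.name = name
        self.subscribers = set()

    def subscribe(self, proxy):
        # A second subscription with an identical proxy is rejected,
        # mirroring the behaviour reported in the thread.
        if proxy in self.subscribers:
            raise AlreadySubscribed(proxy)
        self.subscribers.add(proxy)


def subscribe_once(topic, proxy):
    """Subscribe and treat AlreadySubscribed as success.

    Returns True if this call created the subscription, False if it
    already existed (for example, because another replica got there first).
    """
    try:
        topic.subscribe(proxy)
        return True
    except AlreadySubscribed:
        return False


# Five replicas all try to subscribe with the same replicated proxy.
topic = FakeTopic("Test")
results = [subscribe_once(topic, "agent@Exchange") for _ in range(5)]
```

Only the first call creates the subscription; the others are harmless no-ops, so every replica can run the same start-up code without coordinating with the others.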



  • Hi Benoit,

    Thanks for the reply.

    I have been testing with a replica group of load balanced subscribers. The load balancing is set to random for the group (to give a more dynamic result when testing on a single server):
    <replica-group id="Exchange">
        <load-balancing type="round-robin" n-replicas="5"/>
        <object identity="${agent-id}" type="::TestInterface::Agent"/>
    </replica-group>

    As you suggested, if I just ignore the AlreadySubscribed exception and subscribe to IceStorm with the replica group proxy, it works. However, I am struggling to work out what the load balancer is doing, because I'm observing some odd behaviour.

    Scenario:
    - Ten publishers which publish messages to topics named 'Test-1' through to 'Test-10'.
    - The ten topics are federated into a single topic named 'Test'.
    - Five replicated subscriber agents running in the grid.
    - I subscribe once using the replica group proxy to topic 'Test'.

    In this scenario all messages from the publishers arrive at two randomly selected agents. It is always two; messages are never assigned to more than two agents at a time.
    04/09/2015 04:09:33.646095 [trace] Agency-3| New MSG: Test-3: test 41.
    04/09/2015 04:09:34.570226 [trace] Agency-4| New MSG: Test-1: test 61.

    Now if I stop all agents except Agent-1, all messages start arriving at Agent-1 as expected, so automatic failover to the other subscriber agents looks good. When I restart the other agents, though, the messages are not redistributed; they continue to arrive at Agent-1 until I restart Agent-1.
    04/09/2015 04:10:59.761379 [trace] Agency-1| New MSG: Test-3: test 127.
    04/09/2015 04:11:00.694020 [trace] Agency-1| New MSG: Test-1: test 147.

    I also tested with round-robin load balancing and the results were pretty much the same. Messages are never distributed to more than two agents.

    I'm struggling to figure out what is happening here. My guess is that it has something to do with when IceStorm updates the list of endpoints for the replica group proxy?

    The endgame I'm trying to achieve is that messages are somewhat evenly distributed among subscriber agents.


    Cheers
    Chris
  • benoit (Rennes, France)
    Hi Chris,

    I'm afraid IceStorm doesn't support this use-case right now. It does support the fail-over scenario when publishing events to a replicated proxy but not a scenario where it evenly distributes the events to all the members of the replica group.

    Right now, once IceStorm establishes a connection to a given subscriber (a member of the replica group), it will continue sending events to this subscriber until the connection to this subscriber is closed (in which case, IceStorm might pick another subscriber when it tries to establish another connection). IceStorm doesn't support sending events with the replicated proxy using "per-request" load balancing (as described here).

    This is something we could support in the future with an additional quality of service setting. I'll add this to our TODO list!

    Cheers,
    Benoit.
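
The difference between connection-bound delivery and per-request load balancing described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it does not use Ice, the function names `deliver_sticky` and `deliver_per_request` are hypothetical, and the per-request variant models behaviour that, according to the reply above, IceStorm does not currently offer.

```python
from collections import Counter
from itertools import cycle


def deliver_sticky(events, replicas, alive):
    """Connection-bound delivery: keep the current replica until it is unavailable."""
    counts = Counter()
    current = None
    for _ in range(events):
        if current is None or current not in alive:
            # (Re)connect: pick the first available replica in list order.
            current = next(r for r in replicas if r in alive)
        counts[current] += 1
    return counts


def deliver_per_request(events, replicas, alive):
    """Per-request round-robin: every event goes to the next available replica."""
    counts = Counter()
    ring = cycle([r for r in replicas if r in alive])
    for _ in range(events):
        counts[next(ring)] += 1
    return counts


replicas = ["Agent-1", "Agent-2", "Agent-3", "Agent-4", "Agent-5"]
alive = set(replicas)

sticky = deliver_sticky(100, replicas, alive)       # all events on one replica
spread = deliver_per_request(100, replicas, alive)  # events spread evenly
```

In the sticky model, restarting the other replicas changes nothing until the chosen replica itself becomes unavailable, which matches the lack of redistribution reported earlier in the thread.
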
  • Hi Benoit,

    I think this would be a pretty powerful addition to IceStorm.

    In the meantime I can probably work something out by subscribing to each topic individually within each agent, and working out a process to ensure each topic is subscribed to exactly once, by one agent.

    Thanks for your help.

    Cheers
    Chris
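
One possible way to ensure each topic is subscribed to by exactly one agent, without a shared map, is to have every agent compute the owner of a topic deterministically. The sketch below uses rendezvous (highest-random-weight) hashing, which is a technique swapped in for illustration and not something proposed in the thread. The names `owner` and `topics_for` are hypothetical, and it assumes all agents agree on the current list of agent names.

```python
import hashlib


def owner(topic, agents):
    """Return the agent with the highest hash score for this topic."""
    def score(agent):
        digest = hashlib.sha256(f"{agent}|{topic}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return max(agents, key=score)


def topics_for(agent, topics, agents):
    """Topics that the given agent should subscribe to."""
    return [t for t in topics if owner(t, agents) == agent]


agents = [f"Agent-{i}" for i in range(1, 6)]
topics = [f"Test-{i}" for i in range(1, 11)]

# Each agent runs the same computation locally and subscribes only to its own share.
assignment = {a: topics_for(a, topics, agents) for a in agents}

# If an agent leaves, only the topics it owned are reassigned.
remaining = [a for a in agents if a != "Agent-3"]
```

Because the owner depends only on the topic name and the agent list, a topic moves to another agent only when its current owner disappears; how the agents learn about membership changes is left open here.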