pauses in data distribution

Greetings

We have developed a distributed system (a diagnostic testbed for an
electrical system) based on the IceStorm publish-subscribe service.
While running our system, we frequently observe pauses in data
transmission to all subscribers when a new agent (publisher/subscriber)
subscribes to or unsubscribes from the IceStorm server.

To evaluate these delays, we performed an experiment using
the Clock example included with the IceStorm examples.

We made the following changes to Publisher.java:

// Replaced the for loop with an endless while loop
while (true)
{
    clock.tick();
    try {
        Thread.sleep(300); // publish a tick roughly every 300 ms
    } catch (InterruptedException tex) {
        System.out.print("interrupted");
    }
}


We also added an integer to the printout in ClockI.java to better
monitor the progress of the output:

public class ClockI extends _ClockDisp
{
    int m_counter = 0;

    public void tick(Ice.Current current)
    {
        m_counter = m_counter % 10; // wrap so the printed counter cycles 0..9
        System.out.println("tick" + m_counter++);
    }
}
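
As a possible refinement of the counter above, the pauses could be measured rather than read off the console. The following is only a sketch and not part of the Clock demo: TickGapMonitor, onTick and the 1000 ms threshold are hypothetical names and values chosen for illustration. It records the arrival time of each tick and reports any gap longer than the threshold.

```java
// Hypothetical helper (not part of the Ice Clock demo): flags long gaps between ticks.
final class TickGapMonitor {
    private final long thresholdMillis;
    private boolean seenTick = false;   // false until the first tick arrives
    private long lastTickMillis = 0;
    private long longestGapMillis = 0;

    TickGapMonitor(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Records a tick at nowMillis. Returns the gap since the previous tick
    // if it exceeds the threshold, or -1 otherwise (including the first tick).
    synchronized long onTick(long nowMillis) {
        long gap = seenTick ? nowMillis - lastTickMillis : 0;
        seenTick = true;
        lastTickMillis = nowMillis;
        longestGapMillis = Math.max(longestGapMillis, gap);
        return gap > thresholdMillis ? gap : -1;
    }

    synchronized long longestGapMillis() {
        return longestGapMillis;
    }

    public static void main(String[] args) {
        // Simulated arrival times in ms: regular 300 ms ticks with one 20 s stall.
        TickGapMonitor monitor = new TickGapMonitor(1000);
        long[] arrivals = {0, 300, 600, 20600, 20900};
        for (long t : arrivals) {
            long gap = monitor.onTick(t);
            if (gap >= 0) {
                System.out.println("pause of " + gap + " ms before tick at " + t);
            }
        }
    }
}
```

In the subscriber, tick() would presumably call monitor.onTick(System.currentTimeMillis()) and log any non-negative result, so that pauses from several subscribers can be compared by timestamp.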

In addition, we changed the clock proxy to twoway in Subscriber.java to
better reflect our setup in our own architecture.


We ran one publisher and up to 30 subscribers. When starting a new
Subscriber we often noticed a pause in data flow (~20 s) to all
subscribers, but we could not identify a pattern. Sometimes we were able
to start several subscribers without a pause, and at other times
pauses occurred more frequently. We also noticed pauses
when shutting down Subscribers (Ctrl-C) about 50% of the time.
Below is a list of outputs from the IceStorm server:


// Starting a new subscriber -- no difference between cases that started
// normally and cases that interrupted data flow.
[ icebox-IceStorm: Topic: Subscribe: 8f:e8:e4:c:77fce653:10dc230ed58:-8000 ]


// Unsubscribing a Subscriber without interruption of data flow.

[ icebox-IceStorm: Topic: Unsubscribe:
8f:e8:e4:c:7b4a383f:10dc2287e1b:-8000 ]

[ icebox-IceStorm: Subscriber: Unsubscribe
8f:e8:e4:c:7b4a383f:10dc2287e1b:-8000 ]


// Unsubscribing a Subscriber that resulted in interruption of data flow.

[ icebox-IceStorm: Subscriber: 8f:e8:e4:c:199048f2:10dc2287e9d:-8000:
publish failed: .\Network.cpp:669: Ice::ConnectionRefusedException:
connection refused: WSAECONNREFUSED ]

[ icebox-IceStorm: Topic: Unsubscribe:
8f:e8:e4:c:199048f2:10dc2287e9d:-8000 ]

[ icebox-IceStorm: Topic: 8f:e8:e4:c:199048f2:10dc2287e9d:-8000: not
subscribed. ]


We have not observed any pauses when starting or stopping additional Publishers.

Does anybody have any clues why the delays happen? For our distributed
system, we have sometimes observed delays of more than a minute,
something that is not acceptable for our "real-time" electrical system
health monitoring system.

We would appreciate any information that may help us solve this problem.

Christian Neukom

Comments

  • matthew (NL, Canada)
    A very similar question was asked by a member of your research group :) http://www.zeroc.com/vbulletin/showthread.php?t=2474

    I suspect the same problem. If you are using the default configuration then you can get pauses when connecting. Also note that when using Java you can expect some pauses for garbage collection.
  • pauses in data distribution

    Hello,

    We continued our testing, this time using the C++ clock example. We also increased the thread pool sizes:

    Ice.ThreadPool.Server.Size=50
    Ice.ThreadPool.Client.Size=10

    We see the same problems. Any other suggestions? Thanks.

    -Chris
  • matthew (NL, Canada)
    50 threads in the pool is too high; 10 is likely a suitable number. Do you have unreachable endpoints in your proxy configuration (for example, from running VMware or similar on your host)?

    You should run IceStorm with Ice.Trace.Network=2 to see the network connections that IceStorm is attempting to establish.
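
    For illustration, the two suggestions in this reply might be combined in the IceStorm server's property file roughly as follows. This is a sketch only: which file holds these properties depends on the deployment, and the pool size of 10 is the value suggested here, not a measured optimum.

    ```
    # Server thread pool size suggested in this thread (50 was considered too high)
    Ice.ThreadPool.Server.Size=10
    # Network tracing level suggested in this thread, to show connection attempts
    Ice.Trace.Network=2
    ```

    With tracing enabled, connection attempts to endpoints that cannot be reached should show up in the IceStorm output around the time of each pause.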