Load balancing and fault tolerance

Hi,

I am currently trying to get practical experience with Load Balancing (LB) and Fault Tolerance (FT) in Ice. I have run into a few problems and questions, and would appreciate any help or suggestions.

The first problem is related to FT: I have three instances of a simple server (just one interface with one method) running simultaneously, with endpoints configured for localhost on three different ports (10000 to 10002). Here is the endpoint record from one of the three configuration files:
TEST.Endpoints=tcp -p 10000
There is also a client application that creates a proxy with multiple endpoints, configured as follows:
TESTProcessor.Proxy=TESTProcessor:tcp -p 10000:tcp -p 10001:tcp -p 10002

What I want to test is the case where the client periodically calls the server, the server is then “crashed” (I simply press Ctrl-C), and communication should continue with the next available server. With a twoway proxy, everything works exactly as expected. But with oneway calls, when the server goes away, the client also fails with the following error message:
ConnectionI.cpp:1993: Ice::CloseConnectionException:
protocol error: connection closed
instead of switching to the next server as in twoway case.

The following is the code fragment which illustrates the way I am creating corresponding proxies (actually copy/pasted from one of the examples):
Ice::ObjectPrx base = communicator->stringToProxy(proxy);
TEST::TestProcessorPrx twoway =
    TEST::TestProcessorPrx::checkedCast(base->ice_twoway()->ice_timeout(-1)->ice_secure(false));
TEST::TestProcessorPrx oneway =
    TEST::TestProcessorPrx::uncheckedCast(twoway->ice_oneway());

Am I doing something wrong, or is there a bug related to oneways? Or is this expected behavior? If so, what is the motivation behind it?

The second problem with this scenario concerns LB. I have read in this forum that if multiple endpoints are specified, Ice picks one at random and tries to use it to communicate with the server. However, in the scenario described above, the client always picks the second one. The same happens when I run multiple clients simultaneously: they all start talking to the second server. As a result, there is no load balancing. So my question is whether I am misunderstanding something here, or is this a bug?


Thank you,
Andrey.

Comments

  • marc (Florida)
    As for the first part of your question, this is a known bug in Ice 2.1.0, which will be fixed in the next release. Please have a look at this thread for more information:

    http://www.zeroc.com/vbulletin/showthread.php?t=1271

    (If you cannot wait for 2.1.1 but need a patch now, please contact us at support@zeroc.com)

    As for the second problem, I have not encountered this behavior before. Which operating system and C++ compiler are you using?

    Just FYI, Ice for C++ uses STL's std::random_shuffle() to randomize endpoints, and then tries them sequentially.
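
The "shuffle, then try sequentially" strategy described above can be sketched in plain C++ without Ice. This is only an illustration under stated assumptions, not Ice's implementation: `connectToAny` and the `tryConnect` callback are hypothetical names, and `std::shuffle` with an explicit `std::mt19937` is swapped in for `std::random_shuffle()`, which was removed in C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper (not part of Ice): shuffle the endpoint list, then try
// each endpoint in the shuffled order until one accepts a connection.
// tryConnect stands in for a real connection attempt.
// Returns the endpoint that accepted, or an empty string if none did.
std::string connectToAny(
    std::vector<std::string> endpoints, // taken by value: the caller's order is untouched
    const std::function<bool(const std::string&)>& tryConnect,
    std::mt19937& rng)
{
    assert(tryConnect); // a connection callback must be supplied

    // Randomize the order so that different clients tend to start
    // with different servers.
    std::shuffle(endpoints.begin(), endpoints.end(), rng);

    // Walk the shuffled list sequentially; the first success wins.
    for (const std::string& endpoint : endpoints)
    {
        if (tryConnect(endpoint))
        {
            return endpoint;
        }
    }
    return std::string(); // every endpoint refused the connection
}
```

With this shape, load balancing depends entirely on the quality of the shuffle: if the generator behind it produces the same sequence in every process, every client ends up preferring the same server.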
  • Hi Marc,

    Thank you for the quick response!
    marc wrote:
    As for the first part of your question, this is a known bug in Ice 2.1.0, which will be fixed in the next release. Please have a look at this thread for more information:

    http://www.zeroc.com/vbulletin/showthread.php?t=1271

    (If you cannot wait for 2.1.1 but need a patch now, please contact us at support@zeroc.com)

    If it is a known problem and is planned to be fixed in the next version, then I can wait. It is not critical for me now.
    marc wrote:
    As for the second problem, I have not encountered this behavior before. Which operating system and C++ compiler are you using?

    Just FYI, Ice for C++ uses STL's std::random_shuffle() to randomize endpoints, and then tries them sequentially.

    I am using Linux with a 2.6.8 kernel. The compiler is gcc 3.3.4.
    Sorry for not providing the complete information at the beginning.

    I believe I saw the same behavior on Windows with MSVC 7.1, but I am not sure. I will check and let you know in a couple of hours.

    Please let me know if you are interested in taking a look at this small test application.

    Thank you,
    Andrey.
  • marc (Florida)
    A small test application would help. We have used this feature extensively in several projects and haven't had any problems so far.
  • marc (Florida)
    marc wrote:
    A small test application would help. We have used this feature extensively in several projects and haven't had any problems so far.

    We can reproduce this behavior under Linux, so there is no need to send us a test application. We are looking into the problem now. I know that it always worked perfectly with Visual C++ and STLport, so perhaps there is a problem with random_shuffle() in GCC.
  • dwayne (St. John's, Newfoundland)
    The problem with random endpoint selection being less than random has been fixed for the 2.1.1 release. The issue was that gcc's random_shuffle() uses lrand48() rather than rand(), and we were not properly seeding it. Until 2.1.1 is released you can just add a call to srand48() in the main() of your application to get the correct behavior. Of course once 2.1.1 is released this will no longer be necessary.

    Thanks for the bug report,
    Dwayne
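
The workaround described above can be sketched as follows, assuming a POSIX system where `srand48()` and `lrand48()` are available. The helper names `drawsAfterSeed` and `seedRandomGenerators` are hypothetical, and mixing the clock with the process ID is an assumption of this sketch; the post only says to add a call to `srand48()` in `main()`.

```cpp
#include <cassert>
#include <cstdlib>  // srand48, lrand48 (POSIX), std::srand
#include <ctime>    // std::time
#include <vector>
#include <unistd.h> // getpid

// Returns the first `count` values lrand48() produces after seeding with
// `seed`. The same seed always yields the same values, which is why a
// generator that is never seeded behaves identically in every process.
std::vector<long> drawsAfterSeed(long seed, int count)
{
    assert(count >= 0);
    srand48(seed);
    std::vector<long> values;
    for (int i = 0; i < count; ++i)
    {
        values.push_back(lrand48());
    }
    return values;
}

// Hypothetical helper to call once at the top of main(), before the
// communicator is created. Mixing in the process ID (an assumption of this
// sketch) keeps clients started within the same second from sharing a seed.
void seedRandomGenerators()
{
    const long seed =
        static_cast<long>(std::time(nullptr)) ^ static_cast<long>(getpid());
    srand48(seed);                           // feeds lrand48(), used by gcc's random_shuffle()
    std::srand(static_cast<unsigned>(seed)); // feeds rand(), used by other implementations
}
```

Calling `seedRandomGenerators()` as the first statement of `main()` would be enough for this workaround; per the post, it should no longer be needed once 2.1.1 is released.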
  • Thank you guys for the quick reaction and fix!

    Regards
    Andrey.
  • Random Endpoints Not Random

    Hi,

    We're seeing this same situation in 2.1 with a C# client on XP connecting to a Solaris server built with gcc. Has this been universally fixed, or was it just something that was addressed for Linux?

    I should add that the behavior is somewhat different from that in the above post. We are seeing perhaps 95% of connections go to the second endpoint. This *could* be the result of a random sequence, but it gets less likely each day :)

    Thanks,

    -- Andrew Bell
    andrew.bell.ia@gmail.com
  • C# doesn't have a shuffle() method for its data structures, so I had to write one of my own. It's possible that I screwed up with that -- I'll have a look at this today and check whether there is a problem with the algorithm I used. (I sincerely hope that there isn't a problem with the underlying random number generator, which comes from .NET.)

    Cheers,

    Michi.
  • My bad: an off-by-one error in the shuffle code meant that, for sequences of length 2, no shuffling was taking place. To fix it, change line 454 in Reference.cs to read:

    for(int i = 0; i < endpoints.Count - 1; ++i)

    Or use the following context diff for a patch:
    diff -c -r1.25 Reference.cs
    *** Reference.cs        2 May 2005 21:16:46 -0000       1.25
    --- Reference.cs        30 May 2005 04:24:39 -0000
    ***************
    *** 451,457 ****
                //
                // Randomize the order of endpoints.
                //
    !           for(int i = 0; i < endpoints.Count - 2; ++i)
                {
                    int r = _rand.Next(endpoints.Count - i) + i;
                    Debug.Assert(r >= i && r < endpoints.Count);
    --- 451,457 ----
                //
                // Randomize the order of endpoints.
                //
    !           for(int i = 0; i < endpoints.Count - 1; ++i)
                {
                    int r = _rand.Next(endpoints.Count - i) + i;
                    Debug.Assert(r >= i && r < endpoints.Count);
    

    Thanks for pointing this out!

    Cheers,

    Michi.
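
To see why the loop bound in the patch above matters, here is a small C++ sketch of the same swap-based shuffle with both bounds. It is an illustration only, not the Reference.cs code: `shuffleUpTo`, `shuffleBuggy` and `shuffleFixed` are hypothetical names, and `std::mt19937` stands in for the .NET generator.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Swap-based (Fisher-Yates) shuffle over positions [0, limit): position i is
// swapped with a uniformly chosen position r in [i, size).
template <typename T>
void shuffleUpTo(std::vector<T>& items, std::size_t limit, std::mt19937& rng)
{
    for (std::size_t i = 0; i < limit; ++i)
    {
        std::uniform_int_distribution<std::size_t> pick(i, items.size() - 1);
        const std::size_t r = pick(rng);
        assert(r >= i && r < items.size()); // same invariant as the Debug.Assert in the diff
        std::swap(items[i], items[r]);
    }
}

// Original bound (Count - 2): a two-element list gets zero iterations,
// so its order never changes.
template <typename T>
void shuffleBuggy(std::vector<T>& items, std::mt19937& rng)
{
    if (items.size() < 2) return; // guard against unsigned underflow
    shuffleUpTo(items, items.size() - 2, rng);
}

// Corrected bound (Count - 1): every position except the last gets a swap;
// the last position has nothing left to swap with.
template <typename T>
void shuffleFixed(std::vector<T>& items, std::mt19937& rng)
{
    if (items.size() < 2) return;
    shuffleUpTo(items, items.size() - 1, rng);
}
```

With two endpoints, `shuffleBuggy` always leaves the first endpoint in front, whereas `shuffleFixed` puts either endpoint first with equal probability. For longer lists, the original bound also means the last two positions are never swapped with each other.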
  • Didn't have the source

    Sorry,

    I should have found and fixed this myself, but I didn't realize that there was C# code in a separate source distribution. I thought you used the C++ code through a DLL that had a pInvoke/IJW wrapper. Next time I'll just send the fix :)

    Thanks again!

    -- Andrew Bell
    andrew.bell.ia@gmail.com
  • acbell wrote:
    I thought you used the C++ code through a DLL that had a pInvoke/IJW wrapper.

    No, Ice for C# is a stand-alone implementation in C#. It doesn't depend on the Ice for C++ DLLs in any way. (We use pInvoke in only a few places, to work around bugs in the .NET libraries and to do protocol compression with bzip2; but none of those calls go to the Ice for C++ libraries.)

    Cheers,

    Michi.