My program hangs in checkedCast, help!

ralf3ralf3 Member hexunOrganization: chinaProject: spine
I use Ice 1.5.1 and gcc 3.2.2.

The program hangs after printing "check spine".
What can cause this? Please tell me, thanks!

This is my program:

Ice::CommunicatorPtr _ic;
bool res = true;
try {
    _ic = Ice::initialize( _argc, _argv );
    Ice::ObjectPrx base = _ic->stringToProxy("server:tcp -h 192.168.5.64 -p 10000");
    cout << "check spine" << endl;
    // checkedCast() makes a remote call to the server, so it can block here.
    SpineCommPrx _spine = SpineCommPrx::checkedCast(base);
    cout << "check spine ok" << endl;
    if( !_spine )
        throw "Invalid SPINE proxy!";
    wholeinfo.reg = _spine->regActor( wholeinfo.basic );
    cout << "actorid is " << wholeinfo.reg.id << endl;
}
catch( const Ice::Exception & ex ) {
    cerr << ex << endl;
    res = false;
}
catch( const char *msg ) {
    cerr << msg << endl;
    res = false;
}
// Destroy the communicator on every path, not only on success.
if (_ic)
    _ic->destroy();
return res;
}

Comments

  • marcmarc FloridaAdministrators, ZeroC Staff Marc LaukienOrganization: ZeroC, Inc.Project: The Internet Communications Engine ZeroC Staff
    checkedCast() sends a request to the server, so the most likely reason is that the server does not respond. Since no timeout is set, checkedCast() blocks forever. You can verify this by looking at the protocol tracing messages with --Ice.Trace.Protocol=1.

    Why the server does not respond, I cannot tell. Perhaps it runs out of threads? Running the server with protocol tracing will also give you more information.
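The blocking behaviour described above can be reproduced without Ice. The following is a minimal sketch using plain POSIX sockets, not Ice's implementation: a loopback socket that listens but never calls accept() stands in for a server that does not respond, and the client only gets control back because it waits with a timeout. The helper names `listen_without_accept` and `request_with_timeout` are hypothetical and exist only for this illustration.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: opens a loopback TCP socket that listens but never
// calls accept(), standing in for a server that does not respond.
// Returns the listening descriptor (or -1) and stores the port in `port`.
int listen_without_accept(unsigned short& port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the kernel pick a free port
    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
        listen(fd, 1) != 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) {
        close(fd);
        return -1;
    }
    port = ntohs(addr.sin_port);
    return fd;
}

// Hypothetical helper: connects, sends one byte, and waits up to timeout_ms
// for any reply. Returns 1 if the socket became readable, 0 on timeout,
// -1 on error. Without the timeout, a blocking read would wait indefinitely.
int request_with_timeout(unsigned short port, int timeout_ms) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
        send(fd, "x", 1, 0) != 1) {
        close(fd);
        return -1;
    }
    pollfd pfd{fd, POLLIN, 0};
    int ready = poll(&pfd, 1, timeout_ms);  // 0 means no reply arrived in time
    close(fd);
    return ready > 0 ? 1 : ready;
}
```

Calling `request_with_timeout(port, 200)` against the port returned by `listen_without_accept` should report a timeout, because the connection is established by the kernel but nothing ever answers the request.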
  • DeepDiverDeepDiver Member Thomas MuellerOrganization: Freelance Software DeveloperProject: Project depend on Customers ✭✭✭
    Hi,

    Also have a look at your server code and make sure the object adapter has
    been activated.

    I always forget to do this!

    In this case the call "hangs" on the server side.

    cu tom
  • ralf3ralf3 Member hexunOrganization: chinaProject: spine
    Thank you all, I will try it.
  • rc_hzrc_hz Member Eric RCOrganization: www.genband.comProject: No project yet ✭✭✭
    Originally posted by DeepDiver
    Hi,

    Also have a look at your server code and make sure the object adapter has
    been activated.

    I always forget to do this!

    In this case the call "hangs" on the server side.

    cu tom

    Really? If so, that's too bad!
  • alexjalexj Member Alex JeffreyOrganization: Softopt LImitedProject: Distributed control systems
    Hi,

    I have encountered an issue over the last few days in which the main thread appears to be blocked. In a debugger, I found that the block was within a call to checkedCast(), which never returned. The code around all of our checkedCast() calls expects one of four results:

    1. The cast fails (in which case we give up)
    2. The cast passes
    3. Ice::ConnectFailedException is thrown (we will retry later)
    4. Any other Ice::Exception is thrown (give up)

    [We use this approach in (3) as we don't use IceGrid and we cannot guarantee the startup order of our clients and servers]

    When this error occurs, checkedCast() does not return or throw. We are expecting the result to be (3) as the server with the endpoint specified in the proxy property is not running. We don't have timeouts set for checkedCast() but I presume this would only be an issue if the server was located but did not respond to the isA() request.

    As a result of reading this thread, I have set Ice.Override.Timeout and Ice.Override.ConnectTimeout to -1 and the behaviour has not changed. Enabling Protocol and Network trace does not help (no output once blocked).

    Is my assumption regarding the failure of checkedCast() correct, i.e. if the server is not running, should I get a ConnectFailedException even though I have no timeouts?

    I should add that the application works as expected on some platforms but this is a new one, so I am looking for clues as to what the issue could be or pointers on how to investigate.

    Thanks for any help,

    Alex.
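The four expected results listed in the post above can be sketched as a small classification function. This is an illustration only: `FakeIceException` and `FakeConnectFailed` are stand-ins invented for this sketch so that it compiles without Ice (real code would catch Ice::ConnectFailedException and Ice::Exception), and `cast` is a hypothetical callable that wraps the checkedCast() call and reports whether the resulting proxy is non-null.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Stand-ins for the Ice exception hierarchy (assumptions for this sketch).
struct FakeIceException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct FakeConnectFailed : FakeIceException {
    using FakeIceException::FakeIceException;
};

enum class CastOutcome { CastFailed, CastOk, RetryLater, GiveUp };

// Maps the result of one cast attempt onto the four outcomes from the post.
// `cast` returns true for a non-null proxy, false for a null one, or throws.
CastOutcome classify(const std::function<bool()>& cast) {
    try {
        return cast() ? CastOutcome::CastOk : CastOutcome::CastFailed;
    } catch (const FakeConnectFailed&) {
        return CastOutcome::RetryLater;  // result 3: server not running yet
    } catch (const FakeIceException&) {
        return CastOutcome::GiveUp;      // result 4: any other Ice error
    }
}
```

A call that never returns produces none of these outcomes, which is the situation described in the post; only a timeout turns that case into an exception the code can classify.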
  • alexjalexj Member Alex JeffreyOrganization: Softopt LImitedProject: Distributed control systems
    Apologies, I just realised I made a mistake in setting the timeouts to -1. I have now set them to a sensible value and I get a ConnectionTimeout exception.

    Do you have any ideas as to why I get a ConnectionTimeout exception on this platform whereas I normally get a ConnectFailed exception?

    Thanks.
  • benoitbenoit Rennes, FranceAdministrators, ZeroC Staff Benoit FoucherOrganization: ZeroC, Inc.Project: Ice ZeroC Staff
    Hi Alex,

    There are many possible reasons why connection establishment might hang for a while instead of failing right away. It can depend on the client machine or the server machine.

    If the client tries to connect to an IP address for which no host is running, connection establishment will typically hang for several minutes. If the client tries to connect to a valid IP address but the server doesn't listen on the specified port, the connection establishment usually fails quickly. If the client machine or remote machine has a firewall enabled, it can also affect connection establishment failures.

    You would need to work out in detail how the failure scenarios differ in order to explain why connection establishment fails quickly in one case and hangs for a long time in the other.

    Cheers,
    Benoit.
  • alexjalexj Member Alex JeffreyOrganization: Softopt LImitedProject: Distributed control systems
    Thanks for the reply Benoit.

    I appreciate there are a number of reasons for the failure but I have a very specific scenario which exhibits different behavior on different platforms and I wanted to see if there may be an obvious reason why the two platforms may behave differently.

    Both platforms are Linux based; one native (Intel64) and one ARM based. In both cases, the application code is the same and all clients/servers are running on the same machine and all proxy endpoints have "-h localhost". If I run Client A, which attempts to create a proxy to a servant on Server B when Server B is NOT running, I get different results. The native platform always throws a 'ConnectFailed' exception whereas the ARM one always blocks. If I set a nominal timeout (e.g. 10ms) for Ice.Override.Timeout and Ice.Override.ConnectTimeout, the ARM one throws ConnectionTimeout exception.

    Is there something in particular that could be different between the platforms to cause such predictably different behaviour, especially as the server is not running in either case? What is the difference between ConnectFailed and ConnectionTimeout?

    Thanks again,

    Alex.
  • benoitbenoit Rennes, FranceAdministrators, ZeroC Staff Benoit FoucherOrganization: ZeroC, Inc.Project: Ice ZeroC Staff
    Hi Alex,

    When you get a connection failure with Ice::ConnectFailedException, it indicates that the socket connect() system call reported an error to Ice and the connection establishment is done. A connection timeout indicates that the connect() system call didn't return any result within the timeout period and is still in progress (Ice uses non-blocking sockets, so the connection establishment doesn't consume any threads, but it is still in progress after Ice has raised the timeout exception).

    Did you let the ARM client hang for several minutes (with timeouts disabled) to see whether it eventually failed, and what the error was? It can sometimes take a while for the OS to report an error to the connect() system call.

    As to why it behaves differently on the ARM platform, this would require more investigation. Is the Linux firewall disabled on the ARM platform? Can you check with "netstat -an | grep <server port number>" the state of the network connections while the connection establishment hangs?

    Cheers,
    Benoit.
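The distinction drawn above between a connect() error and a connect timeout can be illustrated with plain POSIX sockets. This is a minimal sketch of the general non-blocking connect pattern, not Ice's actual code; the function name `connect_with_timeout` and the `ConnectResult` enum are hypothetical names chosen for this example.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

enum class ConnectResult { Connected, Failed, TimedOut };

// Hypothetical helper: tries to connect to ip:port, waiting at most
// timeout_ms. On Failed, `err` holds the error reported for the attempt.
ConnectResult connect_with_timeout(const char* ip, unsigned short port,
                                   int timeout_ms, int& err) {
    err = 0;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { err = errno; return ConnectResult::Failed; }
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        close(fd);
        err = EINVAL;
        return ConnectResult::Failed;
    }

    ConnectResult result = ConnectResult::Connected;
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
        if (errno != EINPROGRESS) {
            err = errno;  // connect() reported an error immediately
            result = ConnectResult::Failed;
        } else {
            pollfd pfd{fd, POLLOUT, 0};
            int ready = poll(&pfd, 1, timeout_ms);
            if (ready == 0) {
                // No result yet: the attempt is still in progress.
                result = ConnectResult::TimedOut;
            } else if (ready < 0) {
                err = errno;
                result = ConnectResult::Failed;
            } else {
                // The attempt finished; SO_ERROR says whether it succeeded.
                socklen_t len = sizeof(err);
                getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
                if (err != 0) result = ConnectResult::Failed;
            }
        }
    }
    close(fd);
    return result;
}
```

In this sketch, `Failed` corresponds to the case where the OS reports an error for connect() (for example, a refused connection on a port with no listener), while `TimedOut` corresponds to the case where the OS has not reported anything by the deadline.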
  • alexjalexj Member Alex JeffreyOrganization: Softopt LImitedProject: Distributed control systems
    Hi Benoit,

    That's all very useful.

    When it was in the 'blocking' state, i.e. with no connection timeout set, it would eventually continue after some period of time (I can't be exact, but in the region of 4 to 6 minutes). I don't know what error it returned when it finally unblocked. I'd like to understand the difference between the platforms, so I'll investigate further when I get time, but for now, setting the timeout has allowed me to progress.

    Thanks again,

    Alex.