possible thread starvation issue, proxy hangs

In my application there is an MFC UI and a Linux server app. Normally the server just updates the UI via proxy, but occasionally the UI thread needs to call something on the server because the user changed something. All callbacks are protected by mutexes, and 99% of the time everything works fine, but once in a while, when many things are happening at once, a call on one of the proxies will throw an exception, or the call executes on the server but the proxy just hangs. The mutex handling around the callbacks has been double- and triple-checked and looks fine.
I was under the impression that with the default configuration all calls are basically serialized, but it seems like Ice has problems when many things are happening at once. By the way, there are no nested callbacks anywhere. Will simply raising Ice.ThreadPool.Server.SizeMax or Ice.ThreadPool.Client.SizeMax have a chance of solving this? What reason could there be for a proxy to hang in the UI after the call has seemingly executed successfully on the server side? If too many requests come in within too little time, will the proxies ever throw exceptions, or will they just wait? What is this thread starvation/deadlock risk? Is it even possible without nested callbacks? Any help is much appreciated.
Thanks,
Peter
Which Ice version are you using?
Increasing the number of thread pool threads could be a solution if the problem is caused by thread starvation, but before doing this you should make sure that this is actually the case.
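For example, something along these lines would raise the limits programmatically before the communicator is created (the property names are the standard thread pool properties; the value of 4 is just a placeholder):

#include <Ice/Ice.h>

int main(int argc, char* argv[])
{
    Ice::InitializationData initData;
    initData.properties = Ice::createProperties(argc, argv);

    // Allow both pools to grow beyond the default of a single thread.
    initData.properties->setProperty("Ice.ThreadPool.Server.SizeMax", "4");
    initData.properties->setProperty("Ice.ThreadPool.Client.SizeMax", "4");

    Ice::CommunicatorPtr communicator = Ice::initialize(initData);
    // ... create object adapters and proxies as usual ...
    communicator->destroy();
    return 0;
}

You can of course set the same properties on the command line or in a configuration file instead.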
If the client request still hangs after it was dispatched by the server and the server sent the response, this usually indicates that the client thread pool thread is busy doing something else instead of reading the server's response from the outgoing connection. This can occur if you're using bidirectional connections or AMI. Is that the case?
The best way to investigate deadlock or hang issues is to attach to the process with the debugger and check the stack trace of each thread. If you post the traces here, we'll be happy to take a look.
Cheers,
Benoit.
There is a possibly related issue where, on startup, many proxies are registered at once by the UI. Each proxy has its own thread in the UI. Rarely, only a few will register and then Ice will hang. If thread starvation could be a possible cause, which should I change: the Ice.ThreadPool.Client or the Ice.ThreadPool.Server settings?
Thanks,
Peter
What do you mean by that? Do you mean that the UI calls on the server to register its callback objects?
I'm afraid I don't understand what you mean here either. Is this a thread that you allocate? What does this thread do, and why do you want to devote a thread to a proxy?
It sounds like perhaps what is occurring is that you sometimes get a callback on one of the previously registered objects before all callbacks have been registered. That is, typically the client does something like this:
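// Illustrative sketch only; the proxy and callback names are placeholders.
server->registerCallback(cb1);
server->registerCallback(cb2);
// ... one registration per callback object ...
server->registerCallback(cbN);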
Usually this completes without any callback being made before the entire group has been registered. However, if a callback arrives during the registration process, it causes a hang.
Is the object adapter activated in the UI at the point where you are making these calls? If not, the calls from the server will hang, blocking the calling thread. If the callbacks are made from threads allocated from the server-side thread pool, and there is only a single thread in that pool, then no further invocations can be handled and all calls on the server will block.
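For example, the callback side typically has to do something like the following before handing any callback proxy to the server (a sketch only; the adapter name, servant class, and registerCallback operation are placeholders, not your actual interfaces):

// In the UI process, before registering any callbacks with the server:
Ice::ObjectAdapterPtr adapter =
    communicator->createObjectAdapterWithEndpoints("UICallbacks", "default");

// Add the callback servant; addWithUUID returns a proxy for it.
CallbackPrx cb = CallbackPrx::uncheckedCast(adapter->addWithUUID(new CallbackI));

// Without this call, incoming callbacks from the server will hang.
adapter->activate();

// Only now is it safe to hand the callback proxy to the server.
server->registerCallback(cb);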
The solution here is to either:
- increase the size of the server-side thread pool (Ice.ThreadPool.Server.Size set to some number > 1)
OR
- make the callbacks to the UI using some thread other than a thread from the server-side thread pool. You can do this using a work queue -- see demo/IceUtil/workqueue for an example.
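For the second option, here is a rough sketch of such a work queue (using standard C++ threads rather than the IceUtil classes the bundled demo uses; all names are illustrative): the dispatch thread just queues a job and returns, and a dedicated worker thread performs the actual callback to the UI client.

#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

class WorkQueue
{
public:
    WorkQueue() : _done(false), _thread(&WorkQueue::run, this)
    {
    }

    ~WorkQueue()
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _done = true;
        }
        _cond.notify_one();
        _thread.join();
    }

    // Called from the Ice dispatch thread; returns without blocking on the client.
    void add(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _queue.push_back(std::move(job));
        }
        _cond.notify_one();
    }

private:
    void run()
    {
        for(;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(_mutex);
                _cond.wait(lock, [this]{ return _done || !_queue.empty(); });
                if(_done && _queue.empty())
                {
                    return;
                }
                job = std::move(_queue.front());
                _queue.pop_front();
            }
            job(); // e.g. invoke the callback on the client's proxy
        }
    }

    std::mutex _mutex;
    std::condition_variable _cond;
    std::deque<std::function<void()>> _queue;
    bool _done;
    std::thread _thread;
};

A servant's dispatch method would then do something like workQueue->add([client, params]{ client->updateParameters(params, false); }); and return immediately, freeing the thread pool thread.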
You might also want to review your UI code to ensure that you are not updating the UI directly from callbacks. This is, in general, not safe! I wrote a series of 4 articles on integrating UIs with Ice, starting in issue 12 of Connections - http://www.zeroc.com/newsletter/issue12.pdf. You might also want to look at our bundled MFC demo - demo/Ice/MFC.
What do you mean by that? Do you mean that the UI calls on the server to register its callback objects?
Yes, a typical interface looks like this:
// Kicker
interface IKickerInstanceClient {
    void updateParameters(XMLKickerElement parameters, bool playSound);
};

interface IKickerInstance {
    void registerClient(IKickerInstanceClient* client);
    void updateParameters(XMLKickerElement parameters);
    XMLKickerElement getParameters();
};

sequence<Object*> KickerInstanceSeq;
The server side implements IKickerInstance and the client implements IKickerInstanceClient.
I'm afraid I don't understand what you mean here either. Is this a thread that you allocate? What does this thread do, and why do you want to devote a thread to a proxy?
Each dialog is modal and contained in a class. There is a thread that runs the modal dialog and terminates on exit. This way dialogs can be created and destroyed dynamically by non-MFC code. Each UI has a corresponding thread and a proxy to its server.
It sounds like perhaps what is occurring is that you sometimes get a callback on one of the previously registered objects before all callbacks have been registered.
Hmm, quite possibly. This still doesn't explain the random times Ice hangs after things have been working right for hours.
You might also want to review your UI code to ensure that you are not updating the UI directly from callbacks.
I only "invalidate", in MFC terms, on callbacks. This returns immediately and just tells Windows to repaint the next time it loops around. I think the problem lies somewhere in the fact that there are mutexes for basically every callback, and in some other parts of the UI code, to prevent the parameters being passed back and forth from being corrupted. I can't see any way of taking the mutexes out of the UI code without the risk that Ice and the UI will try to edit the parameters at the same time. Once again, the mutex code has been double-checked and works properly 99% of the time.
Possibly there are just too many threads trying to use Ice at once, which causes Ice to hang on rare occasions? Some of the callbacks do a little computation, and all of them take mutexes, so they aren't lightning fast. Either way, I have upped the SizeMax of the thread pools on both the UI and the server without any issues...
Too many threads using Ice will not cause random hangs. Hangs are most typically caused by deadlocks in your code (thread A locking mutex M1 and then trying to acquire M2, while thread B has locked M2 and is trying to acquire M1).
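To illustrate the pattern (this is not code from your application, just the classic lock-ordering deadlock and one way to avoid it):

#include <mutex>

std::mutex uiMutex;     // hypothetical: guards the UI-side copy of the parameters
std::mutex paramMutex;  // hypothetical: guards the shared parameter data

void updateFromUi()     // e.g. runs on the UI thread
{
    std::lock_guard<std::mutex> a(uiMutex);
    std::lock_guard<std::mutex> b(paramMutex);  // order: uiMutex, then paramMutex
    // ...
}

void callbackFromIce()  // e.g. runs on an Ice dispatch thread
{
    std::lock_guard<std::mutex> a(paramMutex);
    std::lock_guard<std::mutex> b(uiMutex);     // reversed order -- can deadlock
    // ...
}

// Fix: acquire the mutexes in the same order everywhere, or lock both at once:
//     std::lock(uiMutex, paramMutex);
//     std::lock_guard<std::mutex> a(uiMutex, std::adopt_lock);
//     std::lock_guard<std::mutex> b(paramMutex, std::adopt_lock);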
What is this ping for? Is it so the server can detect the client going away? Since you are sending callbacks, why do you need to do that? You'll know the client has disappeared when a callback fails. If the ping is there for the client to detect the server going away, you should probably ping from the client side.
At any rate, 20 pings a second is certainly excessive.
No, it doesn't sound likely. The best way to find out the reason for the hang is to break your application in a debugger when the hang occurs. Then you will find out exactly what is occurring.
What I am going to do to address this is change everything involving the UI mutex to use trylock instead of lock, and just return safely if the lock isn't acquired. This way, only the server mutex is blocking, and if there is ever a deadlock it will have to be on the server. What is the proper way to use the TryLock helper object? If I just use IceUtil::Mutex::TryLock lock(uimutex_); how am I to tell whether the lock was acquired? Is there no exception-safe implementation of trylock?
What is this ping for? Is it so the server can detect the client going away? Since you are sending callbacks, why do you need to do that? You'll know the client has disappeared when a callback fails. If the ping is there for the client to detect the server going away, you should probably ping from the client side.
At any rate, 20 pings a second is certainly excessive. If this ping from the server is really necessary, you should probably look at moving to a session model, where the session is responsible for pinging. Look at demo/Ice/session for an example.
The ping is so the server can detect the client going away. The computers running the server and client are in separate locations, and if the connection goes down the server needs to know immediately. I will definitely look into the session model as an alternative. Is there any possibility of a ping and a proxy call occurring at the same time causing problems? Thanks again.
You should call acquired() on the TryLock object to find out whether the lock was obtained.
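For example (a sketch; uimutex_ stands in for your UI mutex):

#include <IceUtil/Mutex.h>

IceUtil::Mutex uimutex_;  // stands in for the UI mutex from your post

void onCallback()
{
    IceUtil::Mutex::TryLock lock(uimutex_);  // attempts the lock without blocking
    if(!lock.acquired())
    {
        return;  // lock not obtained; skip this update
    }
    // ... touch the shared parameters; the mutex is released automatically
    // when 'lock' goes out of scope, so it is exception-safe ...
}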
However, this does not sound like a very good solution. Surely you don't want to lose updates from the server? If I were you, I would figure out why you are really getting unexpected deadlocks and fix the source of the problem.
Typically you would ping from the client to the server, and use a timeout on the server side to detect the client disappearing. See demo/Ice/session for an example.
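A rough sketch of such a client-side keep-alive loop (the proxy, interval, and stop flag are placeholders; ice_ping is the built-in operation available on every proxy):

#include <Ice/Ice.h>
#include <atomic>
#include <chrono>
#include <thread>

// Runs in its own thread on the client; the server times the client out
// if it stops receiving these pings.
void keepAlive(const Ice::ObjectPrx& session, std::atomic<bool>& stop)
{
    while(!stop)
    {
        try
        {
            session->ice_ping();
        }
        catch(const Ice::Exception&)
        {
            break;  // server unreachable; let the client handle the disconnect
        }
        std::this_thread::sleep_for(std::chrono::seconds(5));
    }
}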
Ice has no problems with concurrent calls.