
CPU idle is very high

Hi all:
I have written an ["ami", "amd"] program, but the client's CPU idle is very high, more than 50%; it can't use the full CPU. My question: when I make an AMI invocation, does Ice's TCP communication use blocking or non-blocking I/O? How can I adjust parameters so that Ice uses the full CPU?

The following is my test environment:

Hardware:
server -> IBM X340, 2 x PIII 1.0 GHz CPUs
client -> PC, PIV 1400 MHz
OS:
server -> Red Hat Advanced Server 3.0
client -> Fedora Core 1, Red Hat Linux 9.0

Slice file:
#ifndef _DBENGINE_ICE
#define _DBENGINE_ICE

module pnote
{
    sequence<byte> ContentList;

    exception ShutDownError
    {
        string reason;
    };

    /* DCS 0 - english; 8 - unicode */
    struct DBEngineData
    {
        long ID;
        //string PhoneNo;
        short GroupID;
        short RuleID;
        bool ContentFlag;
        short ContentID;
        short DCS;
        ContentList content;
    };

    interface DBEngineHome
    {
        ["ami", "amd"] nonmutating DBEngineData find(long ID, string strPhoneNo) throws ShutDownError;
    };
};

#endif // _DBENGINE_ICE

Comments

  • When I start 1 client (client name loadTest), the CPU idle is 60%.
    loadTest uses only 29.2% of the CPU:

    CPU states: cpu user nice system irq softirq iowait idle
    total 15.5% 0.0% 20.8% 0.0% 0.0% 0.0% 63.6%
    Mem: 255488k av, 251044k used, 4444k free, 0k shrd, 30788k buff
    45508k active, 190256k inactive
    Swap: 522104k av, 97180k used, 424924k free 108784k cached

    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    2877 smsc 21 0 2572 2572 2324 S 29.2 1.0 9:07 0 loadTest
    2934 smsc 17 0 1076 1076 880 R 0.3 0.4 0:02 0 top
    1 root 16 0 420 420 360 S 0.0 0.1 0:04 0 init
    2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
    3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd
    4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
    6 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
    5 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kswapd
    7 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kupdated
    8 root 23 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
    11 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kreiserfsd
    72 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 khubd
    601 root 20 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
    602 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
  • When I run 2 clients, the CPU idle is about the same as with 1 client, only a little lower: 50%.
    loadTest 1 uses only 21.9% of the CPU
    loadTest 2 uses only 19.6% of the CPU

    CPU states: cpu user nice system irq softirq iowait idle
    total 25.1% 0.0% 23.9% 0.0% 0.0% 0.0% 50.8%
    Mem: 255488k av, 250928k used, 4560k free, 0k shrd, 31836k buff
    45792k active, 190172k inactive
    Swap: 522104k av, 97340k used, 424764k free 108924k cached

    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    2986 smsc 21 0 2548 2548 2300 S 21.9 0.9 0:25 0 loadTest
    2877 smsc 21 0 2572 2572 2324 S 19.6 1.0 9:41 0 loadTest
    2934 smsc 17 0 1076 1076 880 R 0.5 0.4 0:03 0 top
    1 root 16 0 420 420 360 S 0.0 0.1 0:04 0 init
    2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
    3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd
    4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
    6 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
    5 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kswapd
  • The following is my client program. How can I improve my CPU usage?

    AMI_DBEngineHome_findPtr cbPtr = new AMI_DBEngineHome_findI(_iThreadIndex);
    cout << "SenderThread id->" << _iThreadIndex << " is running in async mode ..." << endl;
    while(!isInterrupted && iMaxSendNum > 0)
    {
        getPhoneNo();
        try
        {
            // AMI call: returns as soon as the request has been passed to
            // the Ice run time; the reply is delivered through cbPtr.
            homePrx->find_async(cbPtr, _id, _phoneno);
            ++vect_lInCount[_iThreadIndex];
        }
        catch(const ShutDownError& ex)
        {
            isInterrupted = true;
            cerr << "caught ShutDownError: " << ex.reason << endl;
        }
        catch(const Ice::Exception& ex)
        {
            cerr << "send async data error: " << ex << endl;
            isInterrupted = true;
        }

        if(!isRunningLongTime)
        {
            --iMaxSendNum;
        }
    }
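
    For reference, AMI_DBEngineHome_findI is not shown above; a minimal sketch following the standard slice2cpp AMI pattern (the counter vect_lOutCount is hypothetical, added only for illustration) might look like this:

    // Assumed shape of the AMI callback class (not shown in the original post).
    // slice2cpp generates pnote::AMI_DBEngineHome_find with two pure virtual
    // member functions.
    class AMI_DBEngineHome_findI : public pnote::AMI_DBEngineHome_find
    {
    public:
        AMI_DBEngineHome_findI(int threadIndex) : _threadIndex(threadIndex) {}

        // Invoked by a client-side thread-pool thread when the reply arrives.
        virtual void ice_response(const pnote::DBEngineData& data)
        {
            // e.g. count the completed request for this sender thread
            ++vect_lOutCount[_threadIndex];   // hypothetical counter
        }

        // Invoked if the call fails, including the user exception ShutDownError.
        virtual void ice_exception(const Ice::Exception& ex)
        {
            cerr << "find failed: " << ex << endl;
        }

    private:
        const int _threadIndex;
    };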
  • The following is the server's CPU idle. My server is named pnotedb; it uses only 16.8% of the CPU:

    17:42:08 up 21 days, 54 min, 9 users, load average: 4.23, 5.08, 5.08
    101 processes: 100 sleeping, 1 running, 0 zombie, 0 stopped
    CPU states: cpu user nice system irq softirq iowait idle
    total 9.7% 0.0% 7.2% 2.3% 1.7% 0.7% 77.9%
    cpu00 13.3% 0.0% 9.3% 2.7% 2.7% 0.7% 70.8%
    cpu01 6.1% 0.0% 5.1% 1.9% 0.7% 0.7% 85.0%
    Mem: 1028520k av, 1007616k used, 20904k free, 0k shrd, 29812k buff
    203132k actv, 0k in_d, 16768k in_c
    Swap: 1048120k av, 90784k used, 957336k free 149960k cached

    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    2611 smsc 23 0 4220 4220 2868 S 16.8 0.4 1594m 0 pnotedb
    3242 smsc 15 0 1208 1208 844 R 0.1 0.1 0:00 1 top
    1 root 15 0 472 440 416 S 0.0 0.0 0:24 0 init
    2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
    3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
    4 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
    5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
    6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
    9 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
    7 root 15 0 0 0 0 SW 0.0 0.0 1:58 0 kswapd
    8 root 15 0 0 0 0 SW 0.0 0.0 0:23 1 kscand
    10 root 15 0 0 0 0 SW 0.0 0.0 0:23 1 kupdated
    11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
    17 root 15 0 0 0 0 SW 0.0 0.0 0:00 1 ahc_dv_0
    18 root 15 0 0 0 0 SW 0.0 0.0 0:00 1 ahc_dv_1
    19 root 25 0 0 0 0 SW 0.0 0.0 0:00 1 scsi_eh_0
    20 root 25 0 0 0 0 SW 0.0 0.0 0:00 1 scsi_eh_1
  • You don't show enough of your code for me to tell exactly why your CPU utilization is so high. But, fundamentally, Ice consumes no CPU if a client or server is idle. The Ice run time simply sits in a select() that waits for new network activity. While blocked in select(), no CPU is used.

    In your client code, it seems that you are sending lots of async calls to the server in a tight loop. Obviously, that will keep the client busy most of the time, which explains the high CPU utilization you are seeing.

    I'm not sure what you are trying to achieve in your code, so I can't give any detailed advice. But I'd have a look at whether you really have to use async calls to do what you are doing.

    Cheers,

    Michi.
  • Thank you very much for your quick response.

    I think you misunderstood my question because of my poor wording, sorry. I used the term "CPU idle is high".

    My question is that I want to use up all the CPU cycles I have on my machine.

    I have already used the async call, and I make it in a tight loop with no wait. In theory I could use 100% of the CPU on my machine. It seems that the Ice middleware is the bottleneck: maybe there is a blocking operation somewhere on the async path, or I have not tuned or configured Ice correctly.

    I have written programs using ACE (not Ice) and TCP/IP that make the async call in a tight loop, and they can use up all my CPU.

    No matter how many processes I run in the tight loop using Ice, the CPU usage remains constant. It looks like going through Ice introduces a bottleneck that caps the CPU usage at a constant of about 50%. If I use a more powerful machine, the CPU usage is even lower.

    My question is: why can't Ice use up all my CPU in the tight loop?
  • When the TCP/IP stack fills up faster than data can be sent, then the CPU will be idle. This is nothing Ice specific, it's simply a matter of how fast data can be sent using TCP/IP. The threads that send data will block in the send() operations until the TCP/IP stack is ready to accept more data.

    There is no difference between AMI calls and regular calls in this respect. An AMI call also has to send data on a TCP/IP connection, and must block until the data can be sent. The only difference is that an AMI call doesn't have to wait for the response from the server. Instead, a callback is invoked once the response from the server arrives.

    I have no idea what your ACE application is doing, but fundamentally, if you send data over TCP/IP and get 100% CPU utilization, then I would consider this a bug: usually the CPU can load the TCP/IP stack much faster than the data can actually be sent (unless you have a super-fast network and a very slow CPU).
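
    To illustrate the point with plain POSIX sockets (a minimal sketch, not Ice code): once the kernel send buffer is full, a blocking send() simply puts the calling thread to sleep until TCP frees up space, which shows up as idle CPU rather than work:

    // Minimal sketch (plain POSIX sockets, not Ice). Writing faster than
    // the network drains the socket buffer makes send() block; while
    // blocked, the thread sleeps and consumes no CPU at all.
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <cstdio>

    void pump(int fd, const char* buf, size_t len)
    {
        for(;;)
        {
            // Blocks once the kernel send buffer is full; resumes only
            // when TCP has acknowledged enough data to make room.
            ssize_t n = send(fd, buf, len, 0);
            if(n < 0)
            {
                std::perror("send");
                break;
            }
        }
    }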
  • Hi Marc,

    Thank you very much. I appreciate your help.

    You said that "An AMI call also has to send data on a TCP/IP connection, and must block until the data can be sent."

    Is it possible to make this non-blocking, so that the call does not have to block until all the data is sent? If that were done, I think the throughput would be much higher and the CPU usage could go up to 100%.

    I think the bottleneck is that this is a blocking call. We tested this on both slow and fast machines with multiple CPUs and saw that on the slow CPU the usage is about 50%, and on the fast CPU the usage is about 17%. It does not matter how many client processes you run; the CPU usage is constant.

    If there is a blocking operation on the socket for the AMI or AMD invocation, then the AMI/AMD path is not truly asynchronous.

    Our application requires that every invocation be non-blocking.
  • Originally posted by dragzhb
    Is it possible to make this non-blocking, so that the call does not have to block until all the data is sent? If that were done, I think the throughput would be much higher and the CPU usage could go up to 100%.

    I think the bottleneck is that this is a blocking call. We tested this on both slow and fast machines with multiple CPUs and saw that on the slow CPU the usage is about 50%, and on the fast CPU the usage is about 17%. It does not matter how many client processes you run; the CPU usage is constant.

    If there is a blocking operation on the socket for the AMI or AMD invocation, then the AMI/AMD path is not truly asynchronous.

    Our application requires that every invocation be non-blocking.

    I'm afraid I don't understand. If you want to send data, but the TCP/IP stack cannot accept more data, what exactly do you want Ice to do? It cannot magically push more data through the TCP/IP stack than the network can handle.
  • I see, thanks.
  • Another option is to use oneway invocations. With that approach, the client does not wait for each invocation to complete, which can improve throughput. (Obviously, even for oneway invocations, the client may still block if it sends data faster than the TCP/IP stack can accept the data.)

    If all you are interested in is getting the maximum data transfer rate from client to server, oneway invocations may be a better choice than asynchronous invocations. Async invocations are useful if the client wants to keep processing while the reply for a request is still outstanding. But if all the client is interested in is getting data across to the server as efficiently as possible, async invocations are overkill (see the sketch below).

    Cheers,

    Michi.
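
    A minimal sketch of this oneway approach (ice_oneway() is the standard proxy factory method; the void operation store is hypothetical, since find returns data and therefore cannot be invoked oneway):

    // Sketch only: oneway works only for void operations with no
    // out-parameters. 'store' is a hypothetical operation added for
    // illustration; the real 'find' returns DBEngineData and must stay twoway.
    pnote::DBEngineHomePrx oneway =
        pnote::DBEngineHomePrx::uncheckedCast(homePrx->ice_oneway());

    while(!isInterrupted && iMaxSendNum > 0)
    {
        // Returns as soon as the request is handed to TCP; no reply is awaited.
        oneway->store(_id, _phoneno);
        --iMaxSendNum;
    }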
  • Hi Michi:
    Thanks for your response.

    But I need the server to send data back to the client, not only speed, so I can't use the oneway method.

    This project is for telecom, so it needs very high speed. But Ice can't use the full CPU (on either the client or the server), and its speed can't be improved further.

    May I suggest that you change the AMI interface to improve the speed? This is a key application, especially for telecom.

    I think the AMI interface could be changed like this:

    AMI_DBEngineHome_findPtr cbPtr = new AMI_DBEngineHome_findI();

    // new method to register the callback object once, up front
    homePrx->setCallback(cbPtr);

    while(!isInterrupted && iMaxSendNum > 0)
    {
        // AMI call with input parameters only; the callback has been
        // moved into the setCallback method
        homePrx->find_async(_id, _phoneno);
    }

    The client could then run without blocking.

    I think this would improve the speed a great deal.

    Another question: can I use AMI in a batch oneway invocation?
  • I'm afraid I can only repeat what I already wrote before. You cannot improve the speed by artificially keeping the CPU busy. Data can only be sent as fast as the TCP/IP stack and your network allows you to do so. No interface or programming method in the world can change this.

    If you called send() on a non-blocking socket in a busy loop, your CPU utilization would be 100%, but you would not gain any speed whatsoever. You would just burn CPU cycles for nothing.

    As for your second question, an AMI call is inherently twoway, so it cannot be sent as a batch oneway.
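
    For illustration, a minimal sketch of batched oneways without AMI (ice_batchOneway() is the standard proxy factory method; the void operation store is again hypothetical, and the exact flushing API may differ between Ice versions):

    // Sketch: a batch oneway proxy queues invocations locally and sends
    // them in a single message when the batch is flushed. Void operations
    // only, and never combined with AMI.
    pnote::DBEngineHomePrx batch =
        pnote::DBEngineHomePrx::uncheckedCast(homePrx->ice_batchOneway());

    for(int i = 0; i < 100; ++i)
    {
        batch->store(_id, _phoneno);    // queued in the client, nothing sent yet
    }
    // 'communicator' is the Ice::CommunicatorPtr created at startup.
    communicator->flushBatchRequests(); // one write sends the whole batch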
  • I see, thank you very much.
  • Hi Marc:

    I have tested that the oneway method is almost 80% faster than twoway. Could you add a oneway option to AMI and AMD in the future? With TCP/IP it would still have highly reliable transfer. I think we will need more and more speed in many places (e.g. in telecom), not only stability.
  • Oneway is faster than twoway because no response is sent: the client just pushes the data onto the TCP/IP stack and then moves on to other things, without waiting for the server to respond.

    AMI is defined as having the response from the server delivered asynchronously (in a callback). Since a oneway call has no response, there is no point in sending a oneway call with AMI.

    For AMD calls, a server can receive both oneway and twoway calls; it makes no difference to the AMD server implementation (see the sketch below).
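
    A minimal sketch of the AMD dispatch on the server (following the standard slice2cpp pattern for an ["amd"] operation; the exact signature depends on the Slice qualifiers used, e.g. nonmutating):

    // Sketch of the server-side AMD dispatch for find. slice2cpp generates
    // find_async, which takes an AMD callback instead of returning the result.
    class DBEngineHomeI : public pnote::DBEngineHome
    {
    public:
        virtual void find_async(const pnote::AMD_DBEngineHome_findPtr& cb,
                                Ice::Long id, const std::string& phoneNo,
                                const Ice::Current&)
        {
            // Typically the request is queued for a worker thread so the
            // dispatch thread can serve the next request. When the result
            // is ready, some thread completes the call:
            pnote::DBEngineData data;
            // ... look up the record for id / phoneNo ...
            cb->ice_response(data);  // or cb->ice_exception(pnote::ShutDownError())
        }
    };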
  • Hi Marc:
    Thanks for your response, but I have another question about AMI.
    I think AMI should be implemented like a oneway call. Why? I think AMI should send the data in one thread and return quickly, while another thread receives the data from the server and invokes the callback function. That is, sending and receiving must happen on two different threads, not one. If a single thread sends the data, waits for the response, and then invokes the callback function, that is not an asynchronous invocation; it is effectively a synchronous one. Is that right?
    I have printed out the thread IDs and found that the callback runs in a different thread than the sending thread, but I can't confirm which thread receives the data.
  • That's already the case. The caller thread sends the data but does not receive the response. A thread from the client-side thread pool receives the response and invokes the callback.
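
    The client-side thread pool mentioned here can be sized with standard Ice properties; a minimal sketch (the property names are standard Ice, the values arbitrary, and the exact communicator-initialization call varies by Ice version):

    // Sketch: grow the client-side thread pool so more reply callbacks
    // can be dispatched concurrently. Pass these properties when the
    // communicator is initialized.
    Ice::PropertiesPtr props = Ice::createProperties();
    props->setProperty("Ice.ThreadPool.Client.Size", "2");     // threads at start
    props->setProperty("Ice.ThreadPool.Client.SizeMax", "8");  // growth ceiling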
  • I see, thanks.