
CPU idle is very high

Hi all:
I have written an ["ami", "amd"] program, but the client's CPU idle is very high, more than 50%; it can't use the full CPU. My question: when I make an AMI invocation, does Ice's TCP communication use blocking or non-blocking I/O? How can I adjust parameters so that Ice uses the full CPU?

The following is my test environment:

Hardware:
server -> IBM X340, 2 x PIII 1.0 GHz CPUs
client -> PC, PIV 1400 MHz
OS:
server -> Red Hat Advanced Server 3.0
client -> Fedora Core 1, Red Hat Linux 9.0

Slice file:
#ifndef _DBENGINE_ICE
#define _DBENGINE_ICE

module pnote
{
    sequence<byte> ContentList;

    exception ShutDownError
    {
        string reason;
    };

    /* DCS 0 - english; 8 - unicode */
    struct DBEngineData
    {
        long ID;
        //string PhoneNo;
        short GroupID;
        short RuleID;
        bool ContentFlag;
        short ContentID;
        short DCS;
        ContentList content;
    };

    interface DBEngineHome
    {
        ["ami", "amd"] nonmutating DBEngineData find(long ID, string strPhoneNo) throws ShutDownError;
    };
};

#endif // _DBENGINE_ICE

Comments

  • When I start 1 client (client name loadTest), the CPU idle is 60%.
    loadTest uses only 29.2% of the CPU:

    CPU states: cpu user nice system irq softirq iowait idle
    total 15.5% 0.0% 20.8% 0.0% 0.0% 0.0% 63.6%
    Mem: 255488k av, 251044k used, 4444k free, 0k shrd, 30788k buff
    45508k active, 190256k inactive
    Swap: 522104k av, 97180k used, 424924k free 108784k cached

    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    2877 smsc 21 0 2572 2572 2324 S 29.2 1.0 9:07 0 loadTest
    2934 smsc 17 0 1076 1076 880 R 0.3 0.4 0:02 0 top
    1 root 16 0 420 420 360 S 0.0 0.1 0:04 0 init
    2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
    3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd
    4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
    6 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
    5 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kswapd
    7 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kupdated
    8 root 23 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
    11 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kreiserfsd
    72 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 khubd
    601 root 20 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
    602 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
  • When I run 2 clients, the CPU idle is about the same as with 1 client, only a little lower: 50%.
    loadTest 1 uses only 21.9% of the CPU
    loadTest 2 uses only 19.6% of the CPU

    CPU states: cpu user nice system irq softirq iowait idle
    total 25.1% 0.0% 23.9% 0.0% 0.0% 0.0% 50.8%
    Mem: 255488k av, 250928k used, 4560k free, 0k shrd, 31836k buff
    45792k active, 190172k inactive
    Swap: 522104k av, 97340k used, 424764k free 108924k cached

    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    2986 smsc 21 0 2548 2548 2300 S 21.9 0.9 0:25 0 loadTest
    2877 smsc 21 0 2572 2572 2324 S 19.6 1.0 9:41 0 loadTest
    2934 smsc 17 0 1076 1076 880 R 0.5 0.4 0:03 0 top
    1 root 16 0 420 420 360 S 0.0 0.1 0:04 0 init
    2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
    3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd
    4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
    6 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
    5 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kswapd
  • The following is my client program. How can I improve my CPU usage?

    AMI_DBEngineHome_findPtr cbPtr = new AMI_DBEngineHome_findI(_iThreadIndex);
    cout << "SenderThread id->" << _iThreadIndex << " is running in async mode ..." << endl;
    while(!isInterrupted && iMaxSendNum > 0)
    {
        getPhoneNo();
        try
        {
            // AMI call: returns as soon as the request has been passed to
            // the Ice run time; the reply is delivered through cbPtr.
            homePrx->find_async(cbPtr, _id, _phoneno);
            ++vect_lInCount[_iThreadIndex];
        }
        catch(const ShutDownError& ex)
        {
            isInterrupted = true;
            cerr << "caught ShutDownError: " << ex.reason << endl;
        }
        catch(const Ice::Exception& ex)
        {
            cerr << "send async data error: " << ex << endl;
            isInterrupted = true;
        }

        if(!isRunningLongTime)
        {
            --iMaxSendNum;
        }
    }
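
    For reference, AMI_DBEngineHome_findI is not shown above; a minimal sketch following the standard slice2cpp AMI pattern (the counter vect_lOutCount is hypothetical, added only for illustration) might look like this:

    // Assumed shape of the AMI callback class (not shown in the original post).
    // slice2cpp generates pnote::AMI_DBEngineHome_find with two pure virtual
    // member functions.
    class AMI_DBEngineHome_findI : public pnote::AMI_DBEngineHome_find
    {
    public:
        AMI_DBEngineHome_findI(int threadIndex) : _threadIndex(threadIndex) {}

        // Invoked by a client-side thread-pool thread when the reply arrives.
        virtual void ice_response(const pnote::DBEngineData& data)
        {
            // e.g. count the completed request for this sender thread
            ++vect_lOutCount[_threadIndex];   // hypothetical counter
        }

        // Invoked if the call fails, including the user exception ShutDownError.
        virtual void ice_exception(const Ice::Exception& ex)
        {
            cerr << "find failed: " << ex << endl;
        }

    private:
        const int _threadIndex;
    };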
  • The following is the server's CPU idle. My server is named pnotedb; it uses only 16.8% of the CPU:

    17:42:08 up 21 days, 54 min, 9 users, load average: 4.23, 5.08, 5.08
    101 processes: 100 sleeping, 1 running, 0 zombie, 0 stopped
    CPU states: cpu user nice system irq softirq iowait idle
    total 9.7% 0.0% 7.2% 2.3% 1.7% 0.7% 77.9%
    cpu00 13.3% 0.0% 9.3% 2.7% 2.7% 0.7% 70.8%
    cpu01 6.1% 0.0% 5.1% 1.9% 0.7% 0.7% 85.0%
    Mem: 1028520k av, 1007616k used, 20904k free, 0k shrd, 29812k buff
    203132k actv, 0k in_d, 16768k in_c
    Swap: 1048120k av, 90784k used, 957336k free 149960k cached

    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    2611 smsc 23 0 4220 4220 2868 S 16.8 0.4 1594m 0 pnotedb
    3242 smsc 15 0 1208 1208 844 R 0.1 0.1 0:00 1 top
    1 root 15 0 472 440 416 S 0.0 0.0 0:24 0 init
    2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
    3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
    4 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
    5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
    6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
    9 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
    7 root 15 0 0 0 0 SW 0.0 0.0 1:58 0 kswapd
    8 root 15 0 0 0 0 SW 0.0 0.0 0:23 1 kscand
    10 root 15 0 0 0 0 SW 0.0 0.0 0:23 1 kupdated
    11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
    17 root 15 0 0 0 0 SW 0.0 0.0 0:00 1 ahc_dv_0
    18 root 15 0 0 0 0 SW 0.0 0.0 0:00 1 ahc_dv_1
    19 root 25 0 0 0 0 SW 0.0 0.0 0:00 1 scsi_eh_0
    20 root 25 0 0 0 0 SW 0.0 0.0 0:00 1 scsi_eh_1
  • You don't show enough of your code for me to tell exactly why your CPU utilization is so high. But, fundamentally, Ice consumes no CPU if a client or server is idle. The Ice run time simply sits in a select() that waits for new network activity. While blocked in select(), no CPU is used.

    In your client code, it seems that you are sending lots of async calls to the server in a tight loop. Obviously, that will keep the client busy most of the time, which explains the high CPU utilization you are seeing.

    I'm not sure what you are trying to achieve in your code, so I can't give any detailed advice. But I'd have a look at whether you really have to use async calls to do what you are doing.

    Cheers,

    Michi.
  • Thank you very much for your quick response.

    I think you misunderstood my question because of my poor wording, sorry. I used the term "CPU idle is high".

    My question is that I want to use up all the CPU cycles I have on my machine.

    I have already used the async call, and I make it in a tight loop with no wait. In theory I could use 100% of the CPU on my machine. It seems that the Ice middleware is the bottleneck: maybe there is a blocking operation somewhere on the async path, or I have not tuned or configured Ice correctly.

    I have written programs using ACE (not Ice) and TCP/IP that make the async call in a tight loop, and they can use up all my CPU.

    No matter how many processes I run in the tight loop using Ice, the CPU usage remains constant. It looks like going through Ice introduces a bottleneck that caps the CPU usage at a constant of about 50%. If I use a more powerful machine, the CPU usage is even lower.

    My question is: why can't Ice use up all my CPU in the tight loop?
  • When the TCP/IP stack fills up faster than data can be sent, then the CPU will be idle. This is nothing Ice specific, it's simply a matter of how fast data can be sent using TCP/IP. The threads that send data will block in the send() operations until the TCP/IP stack is ready to accept more data.

    There is no difference between AMI calls and regular calls in this respect. An AMI call also has to send data on a TCP/IP connection, and must block until the data can be sent. The only difference is that an AMI call doesn't have to wait for the response from the server. Instead, a callback is invoked once the response from the server arrives.

    I have no idea what your ACE application is doing, but fundamentally, if you send data over TCP/IP and get 100% CPU utilization, then I would consider this a bug: usually the CPU can load the TCP/IP stack much faster than the data can actually be sent (unless you have a super-fast network and a very slow CPU).
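
    To illustrate the point with plain POSIX sockets (a minimal sketch, not Ice code): once the kernel send buffer is full, a blocking send() simply puts the calling thread to sleep until TCP frees up space, which shows up as idle CPU rather than work:

    // Minimal sketch (plain POSIX sockets, not Ice). Writing faster than
    // the network drains the socket buffer makes send() block; while
    // blocked, the thread sleeps and consumes no CPU at all.
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <cstdio>

    void pump(int fd, const char* buf, size_t len)
    {
        for(;;)
        {
            // Blocks once the kernel send buffer is full; resumes only
            // when TCP has acknowledged enough data to make room.
            ssize_t n = send(fd, buf, len, 0);
            if(n < 0)
            {
                std::perror("send");
                break;
            }
        }
    }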
  • Hi Marc,

    Thank you very much. I appreciate your help.

    You said that "An AMI call also has to send data on a TCP/IP connection, and must block until the data can be sent."

    Is it possible to make this non-blocking, so that the call does not have to block until all the data is sent? If that were done, I think the throughput would be much higher and the CPU usage could go up to 100%.

    I think the bottleneck is that this is a blocking call. We tested this on both slow and fast machines with multiple CPUs and saw that on the slow CPU the usage is about 50%, and on the fast CPU the usage is about 17%. It does not matter how many client processes you run; the CPU usage is constant.

    If there is a blocking operation on the socket for the AMI or AMD invocation, then the AMI/AMD path is not truly asynchronous.

    Our application requires that every invocation be non-blocking.
  • Originally posted by dragzhb
    Is it possible to make this non-blocking, so that the call does not have to block until all the data is sent? If that were done, I think the throughput would be much higher and the CPU usage could go up to 100%.

    I think the bottleneck is that this is a blocking call. We tested this on both slow and fast machines with multiple CPUs and saw that on the slow CPU the usage is about 50%, and on the fast CPU the usage is about 17%. It does not matter how many client processes you run; the CPU usage is constant.

    If there is a blocking operation on the socket for the AMI or AMD invocation, then the AMI/AMD path is not truly asynchronous.

    Our application requires that every invocation be non-blocking.

    I'm afraid I don't understand. If you want to send data, but the TCP/IP stack cannot accept more data, what exactly do you want Ice to do? It cannot magically push more data through the TCP/IP stack than the network can handle.
  • I see, thanks.
  • Another option is to use oneway invocations. With that approach, the client does not wait for each invocation to complete, which can improve throughput. (Obviously, even for oneway invocations, the client may still block if it sends data faster than the TCP/IP stack can accept the data.)

    If all you are interested in is getting the maximum data transfer rate from client to server, oneway invocations may be a better choice than asynchronous invocations. Async invocations are useful if the client wants to keep processing while the reply for a request is still outstanding. But if all the client is interested in is getting data across to the server as efficiently as possible, async invocations are overkill (see the sketch below).

    Cheers,

    Michi.
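
    A minimal sketch of this oneway approach (ice_oneway() is the standard proxy factory method; the void operation store is hypothetical, since find returns data and therefore cannot be invoked oneway):

    // Sketch only: oneway works only for void operations with no
    // out-parameters. 'store' is a hypothetical operation added for
    // illustration; the real 'find' returns DBEngineData and must stay twoway.
    pnote::DBEngineHomePrx oneway =
        pnote::DBEngineHomePrx::uncheckedCast(homePrx->ice_oneway());

    while(!isInterrupted && iMaxSendNum > 0)
    {
        // Returns as soon as the request is handed to TCP; no reply is awaited.
        oneway->store(_id, _phoneno);
        --iMaxSendNum;
    }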
  • Hi Michi:
    Thanks for your response.

    But I need the server to send data back to the client, not only speed, so I can't use the oneway method.

    This project is for telecom, so it needs very high speed. But Ice can't use the full CPU (on either the client or the server), and its speed can't be improved further.

    May I suggest that you change the AMI interface to improve the speed? This is a key application, especially for telecom.

    I think the AMI interface could be changed like this:

    AMI_DBEngineHome_findPtr cbPtr = new AMI_DBEngineHome_findI();

    // new method to register the callback object once, up front
    homePrx->setCallback(cbPtr);

    while(!isInterrupted && iMaxSendNum > 0)
    {
        // AMI call with input parameters only; the callback has been
        // moved into the setCallback method
        homePrx->find_async(_id, _phoneno);
    }

    The client could then run without blocking.

    I think this would improve the speed a great deal.

    Another question: can I use AMI in a batch oneway invocation?
  • I'm afraid I can only repeat what I already wrote before. You cannot improve the speed by artificially keeping the CPU busy. Data can only be sent as fast as the TCP/IP stack and your network allows you to do so. No interface or programming method in the world can change this.

    If you called send() on a non-blocking socket in a busy loop, your CPU utilization would be 100%, but you would not gain any speed whatsoever. You would just burn CPU cycles for nothing.

    As for your second question, an AMI call is inherently twoway, so it cannot be sent as a batch oneway.
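
    For illustration, a minimal sketch of batched oneways without AMI (ice_batchOneway() is the standard proxy factory method; the void operation store is again hypothetical, and the exact flushing API may differ between Ice versions):

    // Sketch: a batch oneway proxy queues invocations locally and sends
    // them in a single message when the batch is flushed. Void operations
    // only, and never combined with AMI.
    pnote::DBEngineHomePrx batch =
        pnote::DBEngineHomePrx::uncheckedCast(homePrx->ice_batchOneway());

    for(int i = 0; i < 100; ++i)
    {
        batch->store(_id, _phoneno);    // queued in the client, nothing sent yet
    }
    // 'communicator' is the Ice::CommunicatorPtr created at startup.
    communicator->flushBatchRequests(); // one write sends the whole batch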
  • I see, thank you very much.
  • Hi Marc:

    I have tested that the oneway method is almost 80% faster than twoway. Could you add a oneway option to AMI and AMD in the future? With TCP/IP it would still have highly reliable transfer. I think we will need more and more speed in many places (e.g. in telecom), not only stability.
  • Oneway is faster than twoway because no response is sent: the client just pushes the data onto the TCP/IP stack and then moves on to other things, without waiting for the server to respond.

    AMI is defined as having the response from the server delivered asynchronously (in a callback). Since a oneway call has no response, there is no point in sending a oneway call with AMI.

    For AMD calls, a server can receive both oneway and twoway calls; it makes no difference to the AMD server implementation (see the sketch below).
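
    A minimal sketch of the AMD dispatch on the server (following the standard slice2cpp pattern for an ["amd"] operation; the exact signature depends on the Slice qualifiers used, e.g. nonmutating):

    // Sketch of the server-side AMD dispatch for find. slice2cpp generates
    // find_async, which takes an AMD callback instead of returning the result.
    class DBEngineHomeI : public pnote::DBEngineHome
    {
    public:
        virtual void find_async(const pnote::AMD_DBEngineHome_findPtr& cb,
                                Ice::Long id, const std::string& phoneNo,
                                const Ice::Current&)
        {
            // Typically the request is queued for a worker thread so the
            // dispatch thread can serve the next request. When the result
            // is ready, some thread completes the call:
            pnote::DBEngineData data;
            // ... look up the record for id / phoneNo ...
            cb->ice_response(data);  // or cb->ice_exception(pnote::ShutDownError())
        }
    };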
  • Hi Marc:
    Thanks for your response, but I have another question about AMI.
    I think AMI should be implemented like a oneway call. Why? I think AMI should send the data in one thread and return quickly, while another thread receives the data from the server and invokes the callback function. That is, sending and receiving must happen on two different threads, not one. If a single thread sends the data, waits for the response, and then invokes the callback function, that is not an asynchronous invocation; it is effectively a synchronous one. Is that right?
    I have printed out the thread IDs and found that the callback runs in a different thread than the sending thread, but I can't confirm which thread receives the data.
  • That's already the case. The caller thread sends the data but does not receive the response. A thread from the client-side thread pool receives the response and invokes the callback.
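
    The client-side thread pool mentioned here can be sized with standard Ice properties; a minimal sketch (the property names are standard Ice, the values arbitrary, and the exact communicator-initialization call varies by Ice version):

    // Sketch: grow the client-side thread pool so more reply callbacks
    // can be dispatched concurrently. Pass these properties when the
    // communicator is initialized.
    Ice::PropertiesPtr props = Ice::createProperties();
    props->setProperty("Ice.ThreadPool.Client.Size", "2");     // threads at start
    props->setProperty("Ice.ThreadPool.Client.SizeMax", "8");  // growth ceiling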
  • I see, thanks.