Throughput Performance in Java

I am currently using IceJava to develop a project (Java 1.4.2) and am wondering what performance to expect. I am experimenting with the throughput test included with the IceJava distribution:

IceJ-3.1.1-java2/demo/Ice/throughput

In my environment, I see these results for sequences of bytes, strings, and structs:
==> 3
using variable-length struct sequences
==> t
sending 100 variable-length struct sequences of size 50000...
time for 100 sequences: 12029.0ms
time per sequence: 120.29ms
throughput: 43.23Mbps
==> 2
using string sequences
==> t
sending 100 string sequences of size 50000...
time for 100 sequences: 7392.0ms
time per sequence: 73.92ms
throughput: 27.06Mbps
==> 1
using byte sequences
==> t
sending 100 byte sequences of size 500000...
time for 100 sequences: 421.0ms
time per sequence: 4.21ms
throughput: 950.12Mbps

Can you explain why sending a sequence of strings is about 35 times slower (by throughput) than sending a sequence of bytes? A sequence of structs is about 22 times slower.

In my application, I am sending a data structure that contains a sequence<byte> in addition to some strings, bools, and a long. The struct has a bit of hierarchy (i.e., one struct contains another struct). I am using synchronous method invocation and (so far) one Ice thread.

The majority of the data is contained in the sequence of bytes. The other data is an order of magnitude (or two) smaller.

Given the above, should I expect my performance numbers to be more like the results for the byte test or the struct test? I am seeing actual performance that does not come close to the sequence of bytes test.

In my tests on my actual system, I notice that Ice performance appears to degrade when sending large amounts of data. There seems to be more Ice overhead when sending 20MByte of data than when sending 2MByte of data.

Thanks for your help.

Comments

  • I'm just guessing here, but much of the issue may have to do with object creation. A sequence of bytes is represented in Java as an array of bytes, which, I think, is one object. A sequence of strings is an array of strings. You not only have to create the array, but you also have to create all the strings. I would think this is pretty expensive. There is potentially additional GC as well, though that seems an unlikely issue unless you are short on memory.
  • mes
    Hi John,

    The large difference between the throughput performance of byte sequences and other structured types is explained by the cost of marshaling and unmarshaling. In the case of a byte sequence, the Ice run time only needs to make a copy of the data in the message buffer in order to transfer the parameter to the application; the other cases require the Ice run time to do much more work.

    Java poses some pretty significant obstacles to obtaining good throughput. For example, it's always a good idea to "warm up" the just-in-time (JIT) compiler before making any benchmark measurements, and the JIT typically needs thousands of iterations before it kicks in. You can find an example of this in the latency demo, where it does make a noticeable difference.

    Strings also present a problem because of the amount of conversion that's necessary between Java's native representation (16-bit Unicode) and Ice's on-the-wire representation (UTF-8). This conversion unfortunately requires extra allocations and copying, so longer strings can really impact throughput. Structs also require additional allocation, at least when compared to C++.

    Finally, significant allocations can affect performance measurements due to random interference from the garbage collector.

    "Given the above, should I expect my performance numbers to be more like the results for the byte test or the struct test? I am seeing actual performance that does not come close to the sequence of bytes test."

    You should not expect your results to be as good as the byte test because the data you're transferring is (structurally) more complex than what the throughput demo is sending.

    "There seems to be more Ice overhead when sending 20MByte of data than when sending 2MByte of data."

    Naturally that would depend highly on the structure of the data being transmitted. If you can post a concrete example we will take a look at it.

    Take care,
    - Mark
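
The warm-up advice in the reply above can be tried out with a small stand-alone harness. This is only a rough sketch: it does not call any Ice API, the class name `WarmUpSketch` and the array-copy workload are hypothetical stand-ins, and it assumes a newer JDK than the 1.4.2 mentioned in the question (`System.nanoTime` and `Arrays.copyOf` are not available there). It times the same loop once cold and once after a few thousand warm-up iterations.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; it does not use any Ice API.
class WarmUpSketch {
    // Written after each timed loop so the JIT cannot discard the work as unused.
    static volatile long sink;

    // Stand-in workload: copy a byte array, similar in spirit to handing a
    // byte sequence from a message buffer to the application.
    static int workload(byte[] data) {
        byte[] copy = Arrays.copyOf(data, data.length);
        return copy.length;
    }

    // Runs the workload `iterations` times and returns the elapsed nanoseconds.
    static long timeRuns(byte[] data, int iterations) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += workload(data);
        }
        long elapsed = System.nanoTime() - start;
        sink = total;
        return elapsed;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[500000]; // same size as the byte test in the question
        int runs = 100;

        long cold = timeRuns(payload, runs); // measured before the JIT has warmed up
        timeRuns(payload, 5000);             // warm-up pass, result ignored
        long warm = timeRuns(payload, runs); // measured after warm-up

        System.out.println("cold: " + (cold / 1e6 / runs) + "ms per sequence");
        System.out.println("warm: " + (warm / 1e6 / runs) + "ms per sequence");
    }
}
```

The absolute numbers depend on the JVM and machine, and garbage collection can still perturb individual runs, so repeating the measured pass several times gives a more trustworthy figure.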
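
To make the marshaling argument in the reply above more concrete, here is a minimal sketch of why a byte sequence is cheap and a string sequence is not. It is not Ice's actual wire encoding or its generated code: the class `MarshalSketch`, its methods, and the fixed 4-byte length prefixes are all assumptions chosen for illustration. The point is only that bytes go into the buffer with one bulk copy, while every string needs its own UTF-16 to UTF-8 conversion and a temporary array.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is NOT the Ice encoding.
class MarshalSketch {
    // Byte sequence: one length prefix and a single bulk copy.
    static ByteBuffer marshalBytes(byte[] seq) {
        ByteBuffer buf = ByteBuffer.allocate(4 + seq.length);
        buf.putInt(seq.length);
        buf.put(seq);
        buf.flip();
        return buf;
    }

    // String sequence: each element is converted to UTF-8 separately,
    // which allocates one temporary byte[] per string.
    static ByteBuffer marshalStrings(String[] seq) {
        byte[][] encoded = new byte[seq.length][];
        int total = 4; // element count
        for (int i = 0; i < seq.length; i++) {
            encoded[i] = seq[i].getBytes(StandardCharsets.UTF_8);
            total += 4 + encoded[i].length; // per-string length prefix + payload
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        buf.putInt(seq.length);
        for (byte[] e : encoded) {
            buf.putInt(e.length);
            buf.put(e);
        }
        buf.flip();
        return buf;
    }

    // The receiving side pays again: one byte[] and one String per element.
    static String[] unmarshalStrings(ByteBuffer buf) {
        String[] seq = new String[buf.getInt()];
        for (int i = 0; i < seq.length; i++) {
            byte[] raw = new byte[buf.getInt()];
            buf.get(raw);
            seq[i] = new String(raw, StandardCharsets.UTF_8);
        }
        return seq;
    }

    public static void main(String[] args) {
        String[] names = { "hello", "caf\u00e9" };
        ByteBuffer wire = marshalStrings(names);
        System.out.println("string sequence bytes: " + wire.remaining());
        System.out.println("round trip ok: "
                + names[1].equals(unmarshalStrings(wire)[1]));
        System.out.println("byte sequence bytes: "
                + marshalBytes(new byte[50]).remaining());
    }
}
```

A struct that mixes a large byte sequence with a few strings would, under these assumptions, sit somewhere between the two cases: the byte payload still moves in bulk, and the per-field work grows with the number of strings and nested members rather than with the payload size.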