Archived
This forum has been archived. Please start a new discussion on GitHub.
RFC: Patch Ice-3.5.1 to add class lookup caching to BasicStream (java)
I've been experiencing severe performance issues decoding complex message in a java-hosted application.
Consider a slice definition such as:
Now consider a message that passes an ExampleSeq with thousands of entries.
I found in such a case, BasicStream#getConcreteClass would be called several times for each of these thousands of entries:
"IceCompactId.TypeId_5" (lookup fails)
"my.package.name.IceCompactId.TypeId_5" (lookup succeeds)
"my.package.name.Example" (lookup succeeds)
The accumulated total run time of all these lookups would result in 60+ second decode times for one message in real world test case I have. Profiling showed 137,350 calls to java.lang.ClassLoader.loadClass. After adopting a caching strategy for these lookups, the run time was reduced to under one second. Calls to loadClass dropped to about 379.
In my patch, I create a ConcurrentHashMap per instance of BasicStream, not knowing if BasicStream instances are ever shared between threads. I chose to wrap the Class instances in WeakReference so the cache doesn't prevent these classes from being unloaded. To get the desired performance benefit, I had to also cache lookup misses. This part worries me because if the class is later loaded, the cache would be returning false negatives.
I thought that ideally the cache would only be around for the decoding of a single message, but I found that BasicStream instances are created on a per-thread basis and reused.
In summary, I believe this patch works correctly for my purposes but is not an ideal general case solution. I would appreciate any feedback. Thanks!
Consider a slice definition such as:
class Example(5) { ... }; sequence<Example> ExampleSeq;
Now consider a message that passes an ExampleSeq with thousands of entries.
I found in such a case, BasicStream#getConcreteClass would be called several times for each of these thousands of entries:
"IceCompactId.TypeId_5" (lookup fails)
"my.package.name.IceCompactId.TypeId_5" (lookup succeeds)
"my.package.name.Example" (lookup succeeds)
The accumulated total run time of all these lookups would result in 60+ second decode times for one message in real world test case I have. Profiling showed 137,350 calls to java.lang.ClassLoader.loadClass. After adopting a caching strategy for these lookups, the run time was reduced to under one second. Calls to loadClass dropped to about 379.
In my patch, I create a ConcurrentHashMap per instance of BasicStream, not knowing if BasicStream instances are ever shared between threads. I chose to wrap the Class instances in WeakReference so the cache doesn't prevent these classes from being unloaded. To get the desired performance benefit, I had to also cache lookup misses. This part worries me because if the class is later loaded, the cache would be returning false negatives.
I thought that ideally the cache would only be around for the decoding of a single message, but I found that BasicStream instances are created on a per-thread basis and reused.
In summary, I believe this patch works correctly for my purposes but is not an ideal general case solution. I would appreciate any feedback. Thanks!
0
Comments
-
Hi Robert,
Just to follow up on this, we've added a fix for this issue in our 3.6 branch. It will be included in the upcoming 3.6.2 release.
Note that the performance issue only affected Java programs that unmarshal objects-by-value whose definitions use compact type IDs.
Regards,
Mark0