Java locate registry

sinofool (Member) Bochun Bai, Organization: sinofool.com, Project: http://sinofool.com/
I recently upgraded Ice from 3.0.1 to 3.1.1,
and I am now seeing errors in the Java client log that never appeared before.
The log shows Java threads waiting in createIndirectProxy with no timeout.
This does not happen often, but it uses up the Java thread pool one thread at a time.

I found this in INSTALL file of IceJava:
On Linux x86_64, we recommend using JRE 1.5.0_07 or an earlier version.
JRE 1.5.0_08 and JRE 1.5.0_09 introduced a bug that affects Freeze maps.

Will this bug affect an IceJava client, or does it only affect IceJava on the server side?

Comments

  • benoit (ZeroC Staff, Administrator) Benoit Foucher, Rennes, France, Organization: ZeroC, Inc., Project: Ice
    Hi,

    I don't think this bug is related to your problem. It affects Freeze maps, so it shouldn't affect the Ice client or server runtime.

    We would need more information to be able to help you with this issue. Which OS are you using? Which Java version? Can you post the thread dump of the Java client when it hangs?

    Thanks,

    Cheers,
    Benoit.
  • sinofool (Member)
    This is what Resin reports:
    "resin-tcp-connection-*:80-345" daemon prio=1 tid=0x0000002b26396680 nid=0x54d7 in Object.wait() [0x00000000688d5000..0x00000000688d6e30]
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:474)
    at IceInternal.Outgoing.invoke(Outgoing.java:137)
    - locked <0x0000002addfc0540> (a IceInternal.Outgoing)
    at Ice._LocatorDelM.findAdapterById(_LocatorDelM.java:33)
    at Ice.LocatorPrxHelper.findAdapterById(LocatorPrxHelper.java:36)
    at Ice.LocatorPrxHelper.findAdapterById(LocatorPrxHelper.java:21)
    at IceInternal.LocatorInfo.getEndpoints(LocatorInfo.java:99)
    at IceInternal.IndirectReference.getConnection(IndirectReference.java:168)
    at Ice._ObjectDelM.setup(_ObjectDelM.java:258)
    at Ice.ObjectPrxHelperBase.__getDelegate(ObjectPrxHelperBase.java:905)
    - locked <0x0000002ade2c8f30> (a ustb.UserManagerPrxHelper)
    at ustb.UserManagerPrxHelper.reloadUser(Unknown Source)
    ......

    This is an AMD64 server running CentOS 4.4 x86_64.
    The JDK version is jdk-1.5.0_09-b03.
    The Ice version is Ice-3.1.1-java5.

    I am using ReplicatedGroup with this mod: http://www.zeroc.com/vbulletin/showthread.php?t=2726

    The problem usually appears under heavy load.
  • benoit (ZeroC Staff)
    For some reason, the locate request to the locator hangs. It could be, for example, because the server being located hangs on activation. Could this be the case here?

    If not, you should try to reproduce the problem with more tracing. You could try with the following traces:
    • IceGridRegistry
      IceGrid.Registry.Trace.Locator=2
      IceGrid.Registry.Trace.Server=2
      
    • IceGridNode
      IceGrid.Node.Trace.Server=2
      IceGrid.Node.Trace.Adapter=2
      

    Please post the traces here and I'll take a look at them!

    Btw, could you also make sure that you have this patch (forum thread 2745) applied?

    Cheers,
    Benoit.
  • sinofool (Member)
    Yes, I have applied the locator patch. I was the one who first found that bug.

    I cannot reproduce the problem at this time.
    It mainly happened when running "server stop" in the admin console.

    Could it be related to this mod?
    http://www.zeroc.com/vbulletin/showthread.php?t=2726
    I changed the AdapterCache code to create a hot backup for every service.
  • benoit (ZeroC Staff)
    Sorry, I don't know if it's related to your change. The best way to figure it out is to try without the change.

    Cheers,
    Benoit.
  • sinofool (Member)
    I will try to reproduce it.
    Thanks!
  • sinofool (Member)
    I have reproduced the problem

    I experienced the problem again.
    The Java thread dump indicates that the thread is waiting at line 137 of Outgoing.java.
    This is the code:
    if(timedOut)
    {
    	//
    	// Must be called outside the synchronization of
    	// this object
    	//
    	_connection.exception(new Ice.TimeoutException()); // highlighted in red by the poster
    
    	//
    	// We must wait until the exception set above has
    	// propagated to this Outgoing object.
    	//
    	synchronized(this)
    	{
    		while(_state == StateInProgress)
    		{
    			try
    			{
    				wait(); // highlighted in red by the poster
    			}
    			catch(InterruptedException ex)
    			{
    			}
    		}
    	}
    }
    

    It seems the exception passed to _connection.exception was not propagated back.

    _connection.exception is a synchronized method, and the synchronized(this) block that follows is also a bottleneck.

    Is this a performance problem, or a bug that shows up under heavy load?
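Editor's sketch: the loop quoted above calls wait() with no time limit, so if _state never leaves StateInProgress (for example because no thread ever delivers the exception), the calling thread stays blocked indefinitely, which matches the thread dump. The following is a minimal, self-contained illustration of that guarded-wait pattern with a deadline added. It does not use Ice, it is not Ice's Outgoing class, and it is not a proposed patch; PendingCall, fail and awaitFailure are hypothetical names chosen for the example.

```java
/**
 * Hypothetical stand-in for an in-flight request (not Ice's Outgoing class).
 * It shows a guarded wait that gives up after a deadline instead of
 * blocking forever when nobody reports the state change.
 */
final class PendingCall {
    private static final int IN_PROGRESS = 0;
    private static final int FAILED = 1;

    private int state = IN_PROGRESS;

    /** Called by whichever thread reports the failure to the waiter. */
    synchronized void fail() {
        state = FAILED;
        notifyAll();
    }

    /**
     * Waits until fail() has been called, for at most maxWaitMillis.
     * Returns true if the state change was observed, false if the
     * deadline passed or the thread was interrupted.
     */
    synchronized boolean awaitFailure(long maxWaitMillis) {
        long deadline = System.nanoTime() + maxWaitMillis * 1000000L;
        while (state == IN_PROGRESS) {
            long remainingMillis = (deadline - System.nanoTime()) / 1000000L;
            if (remainingMillis <= 0) {
                return false; // wait(0) would block forever, so stop here
            }
            try {
                wait(remainingMillis);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // keep the interrupt visible
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Nobody calls fail(): an unbounded wait() would hang at this point.
        PendingCall lost = new PendingCall();
        System.out.println("no notifier: " + lost.awaitFailure(200));

        // Another thread reports the failure, so the waiter is released.
        final PendingCall reported = new PendingCall();
        new Thread(new Runnable() {
            public void run() {
                reported.fail();
            }
        }).start();
        System.out.println("with notifier: " + reported.awaitFailure(5000));
    }
}
```

Whether a bounded wait would be appropriate inside Ice itself is a separate question. The sketch only shows why a missing notification leaves the caller stuck at the wait() call.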
  • benoit (ZeroC Staff)
    Sorry, it's impossible to say without more information. Could you please post the dump of the stack traces?

    Cheers,
    Benoit.
  • sinofool (Member)
    The stack trace is the same as the one I posted earlier in #3.
    I cannot restart icegridnode to add trace parameters at this time.

    Is the Java stack trace not enough to identify the problem?
    I am looking for an opportunity to add the trace options and restart icegridnode.

    Thanks
  • benoit (ZeroC Staff)
    Hi,

    It would be better to post the stack traces of all the threads as there might be other threads involved with the issue.

    The stack trace you posted indicates that the outgoing call is waiting for the exception to be propagated back. As you noticed, it's not being propagated back in a timely manner. This might be because the thread pool is busy doing something else (assuming you're using the thread pool concurrency model). That's why it would be helpful to see the stack traces of the other threads. Could you also please confirm which concurrency model you're using (thread pool or thread per connection)?

    Cheers,
    Benoit.
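Editor's sketch: a full dump of all threads, as requested above, can usually be obtained on Linux by sending SIGQUIT to the JVM process (kill -QUIT followed by the process id), which makes the JVM print every thread's stack to its standard output. Where that is inconvenient, the same information can be collected from inside the application with the standard Thread.getAllStackTraces() method, available since Java 5. The class name ThreadDumpSketch and the output layout below are illustrative assumptions, and the output is less detailed than a JVM dump (it does not list held locks).

```java
import java.util.Map;

/** Illustrative helper that formats the stack traces of all live threads. */
final class ThreadDumpSketch {

    /** Returns one block per live thread: name, daemon flag, state, frames. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"');
            if (thread.isDaemon()) {
                out.append(" daemon");
            }
            out.append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Calling dumpAllThreads() from a diagnostic page or a scheduled task while the client is hung would show what the other threads, including any thread pool threads, are doing at that moment.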