Replicated IceStorm: Freeze::TransactionAlreadyInProgressException

Dear Experts,

I am seeing a Freeze::TransactionAlreadyInProgressException in my IceStorm logs and I am not sure how to fix it.

My setup in a nutshell: I am running three replicas of the IceStorm service (see configs below), all on the same machine. Most of the time it runs smoothly, but sometimes my client fails to subscribe and I see this error from one of the replicas:
-! 11/06/12 10:40:11.175 icebox-IceStorm: warning: dispatch exception: 
Outgoing.cpp:480: Ice::UnknownLocalException:   unknown local exception:   
ConnectionI.cpp:29: Freeze::TransactionAlreadyInProgressException:   transaction already in progress   identity: MyIceStorm/testTopic facet:    operation: subscribeAndGetPublisher   remote host: 192.168.4.235 remote port: 48958



This is one of my three icebox config files (config.ib1):
IceBox.ServiceManager.Endpoints=tcp -h 192.168.4.227 -p 9990 
#other two replicas: 9991 and 9992
IceBox.Service.IceStorm=IceStormService,34:createIceStorm --Ice.Config=config.s1

IceStorm.PublisherPool.Size=30
IceStorm.ThreadPool.Size=30

And here is the corresponding config.s1:
IceStorm.NodeId=0
IceStorm.Node.Endpoints=default -h 192.168.4.227 -p 13000

IceStorm.Nodes.0=MyIceStorm/node0:default -h 192.168.4.227 -p 13000
IceStorm.Nodes.1=MyIceStorm/node1:default -h 192.168.4.227 -p 13010
IceStorm.Nodes.2=MyIceStorm/node2:default -h 192.168.4.227 -p 13020

IceStorm.Trace.Election=1
IceStorm.Trace.Replication=1

IceStorm.Election.MasterTimeout=2
IceStorm.Election.ElectionTimeout=2

IceStorm.TopicManager.Endpoints=default -h 192.168.4.227 -p 10000

IceStorm.ReplicatedTopicManagerEndpoints=default -h 192.168.4.227 -p 10000:default -h 192.168.4.227 -p 10010:default -h 192.168.4.227 -p 10020

IceStorm.InstanceName=MyIceStorm

IceStorm.Publish.Endpoints=default -h 192.168.4.227 -p 10001:udp -p 10001

IceStorm.ReplicatedPublishEndpoints=default -h 192.168.4.227 -p 10001:default -h 192.168.4.227 -p 10011:default -h 192.168.4.227 -p 10021:udp -p 10001:udp -p 10011:udp -p 10021

IceStorm.Flush.Timeout=2000

Freeze.DbEnv.IceStorm.DbHome=db1


ThreadPool.Client.Serialize=1
ThreadPool.Client.Size=10
ThreadPool.Client.SizeMax=30
ThreadPool.Client.ThreadIdleTime=0

ThreadPool.Server.Serialize=1
ThreadPool.Server.Size=10
ThreadPool.Server.SizeMax=30
ThreadPool.Server.ThreadIdleTime=0

IceStorm.Trace.ThreadPool=1

Ice.MessageSizeMax=10000

I feel lost and would greatly appreciate your help. Please let me know if you need more log messages, including those from the other replicas.

Thank you,
Aleksey

Comments

  • bernard (Jupiter, FL)
    Hi Aleksey,

    Can you try to apply this patch:
    http://www.zeroc.com/forums/patches/5781-patch-4-ice-3-4-2-icestorm-assert-bug-fix.html

    and let us know if it solves your issue?

    Best regards,
    Bernard
  • Thank you, Bernard!

    I will give it a try. It might take a bit of time since I have never had to compile Ice from source.

    Aleksey
  • I applied the patch and recompiled the binaries. Unfortunately, I am still seeing the same errors.

    Thank you,
    Aleksey
  • benoit (Rennes, France)
    Hi Aleksey,

    I don't see how this exception could be occurring after applying the patch. Could you ensure that you have copied the right file after re-building the IceStorm service? You should copy the libIceStormService library (not the libIceStorm library) for each replica and restart them.

    If this still doesn't work, could you try to reproduce the problem using the demo/IceStorm/replicated demo from your Ice distribution so that we can investigate this?

    Cheers,
    Benoit.
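
One way to confirm that every replica received the rebuilt libIceStormService library is to compare file digests before restarting. This is only a minimal sketch in Python: the function names are hypothetical, and the paths of the rebuilt library and of each replica's copy have to be supplied by whoever runs it.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stale_copies(rebuilt, deployed):
    """Return the deployed paths whose contents differ from the rebuilt file."""
    expected = sha256_of(rebuilt)
    return [path for path in deployed if sha256_of(path) != expected]
```

An empty list from stale_copies means every listed copy is byte-identical to the rebuilt file. It does not show which file a running icebox process has actually loaded.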
  • You are right: I checked /proc/<pid>/maps and, indeed, icebox had loaded the original libraries. I should be able to retest a bit later today.

    Thank you,
    Aleksey
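
The /proc/<pid>/maps check mentioned above can be sketched in Python roughly as follows. It assumes Linux and the usual maps line layout (address, permissions, offset, device, inode, optional pathname); the function names are hypothetical and not part of Ice.

```python
def loaded_libraries(maps_text, name_fragment):
    """Return the sorted, de-duplicated mapped file paths containing name_fragment."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and name_fragment in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


def loaded_libraries_for_pid(pid, name_fragment):
    """Read /proc/<pid>/maps (Linux only) and filter it by name_fragment."""
    with open("/proc/%d/maps" % pid) as f:
        return loaded_libraries(f.read(), name_fragment)
```

Calling loaded_libraries_for_pid with the icebox process ID and "libIceStormService" lists the library files that process has mapped. Comparing those paths with the location of the rebuilt library shows whether the patched file is the one in use.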
  • Dear Benoit and Bernard,

    It seems to have worked: I have not been able to reproduce the error, and I will keep my tests running for the next few days.

    Thank you!
    Aleksey