socket exception: The semaphore timeout period has expired.

Hello,

I got this message from a slave IceGridRegistry. It had been running happily for weeks, but then it suddenly stopped, and its log file ended like this:

-- 08/24/12 15:11:13.947 Registry: Replica: sending keep alive message to master replica
-- 08/24/12 15:11:14.661 Registry: Node: node `SERVER1' keep alive (load = 0.2125, 0.216, 0.190611)
-- 08/24/12 15:11:15.243 Registry: Node: node `SERVER7' keep alive (load = 0.0108333, 0.0095, 0.0135)
-- 08/24/12 15:11:15.676 Registry: Node: node `SERVER3' keep alive (load = 0.149167, 0.139167, 0.135611)
-- 08/24/12 15:11:16.023 Registry: Node: node `SERVER6' keep alive (load = 0.0025, 0.00516667, 0.00811111)
-- 08/24/12 15:11:16.051 Registry: Node: node `SERVER4' keep alive (load = 0.00666667, 0.00583333, 0.00705556)
-- 08/24/12 15:11:16.461 Registry: Node: node `SERVER2' keep alive (load = 0.3425, 0.346667, 0.348667)
-- 08/24/12 15:11:18.731 Registry: Node: node `SERVER5' keep alive (load = 0.005, 0.00633333, 0.00488889)
-- 08/24/12 15:11:26.471 Registry: Node: node `SERVER8' keep alive (load = 0.0025, 0.002, 0.00227778)
-- 08/24/12 15:11:28.947 Registry: Replica: sending keep alive message to master replica
-- 08/24/12 15:11:29.661 Registry: Node: node `SERVER1' keep alive (load = 0.221667, 0.217667, 0.191167)
-- 08/24/12 15:11:44.715 Registry: Node: node `SERVER1' keep alive (load = 0.2075, 0.215, 0.191167)
-- 08/24/12 15:11:47.413 Registry: Node: node `SERVER9' down
-- 08/24/12 15:11:47.446 Registry: Node: node `SERVER6' down
-- 08/24/12 15:11:47.474 Registry: Node: node `SERVER4' down
-- 08/24/12 15:11:47.504 Registry: Node: node `SERVER7' down
-- 08/24/12 15:11:47.535 Registry: Node: node `SERVER2' down
-- 08/24/12 15:11:47.569 Registry: Node: node `SERVER3' down
-- 08/24/12 15:11:47.845 Registry: Replica: lost session with master replica:
TcpTransceiver.cpp:420: Ice::SocketException:
socket exception: The semaphore timeout period has expired.

Apparently there was a temporary network outage just before the crash. I am using the latest Ice release, 3.4.2. Any ideas why the registry couldn't recover?
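
In case it helps with diagnosis, here is a minimal Python sketch for measuring how long each node had been silent before the registry logged it as down. It assumes the log line format shown above and month/day/year timestamps. The function name `silence_before_down` and the regular expression are my own illustration and are not part of Ice.

```python
import re
from datetime import datetime

# Matches registry log lines such as:
#   -- 08/24/12 15:11:16.023 Registry: Node: node `SERVER6' keep alive (load = ...)
#   -- 08/24/12 15:11:47.446 Registry: Node: node `SERVER6' down
# The pattern is an assumption based on the excerpt above.
LINE_RE = re.compile(
    r"^-- (\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) Registry: Node: "
    r"node `([^']+)' (keep alive|down)"
)


def silence_before_down(log_lines):
    """Map each node reported 'down' to the seconds elapsed since its last
    keep alive line. Nodes with no keep alive in the input map to None."""
    last_seen = {}
    result = {}
    for line in log_lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # replica messages, exception text, blank lines
        stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S.%f")
        node, event = match.group(2), match.group(3)
        if event == "keep alive":
            last_seen[node] = stamp
        else:
            previous = last_seen.get(node)
            result[node] = (
                None if previous is None else (stamp - previous).total_seconds()
            )
    return result
```

Calling `silence_before_down(open("registry.log"))` (the file name is hypothetical) returns a dictionary keyed by node name, which makes it easy to see which nodes missed a keep alive before being marked down.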

Kind Regards,
Marian