Glacier2 keeps sockets in keepalive state after session expiry

Hi, I use Glacier2Router v3.5.0. Sometimes, after the connection with a client is lost and Glacier2.SessionTimeout elapses, the session expires on the server (I enabled tracing and it shows "Glacier2: expiring session"), but the connection stays in keepalive state: the socket is not released until the keepalive time runs out and the system closes it.
Could this be a bug in Glacier2 v3.5.0 that was fixed in v3.5.1?
Thanks

Comments

  • benoit, Rennes, France
    Hi,

    Glacier2 doesn't immediately close the network connection when the session expires. The connection is closed by active connection management (ACM) after being inactive for a period corresponding to the configured "Glacier2.Client.ACM" timeout.

    By default, this timeout is set to be twice the session timeout. If the session timeout is 30 seconds (the default value for Glacier2.SessionTimeout), the connection will be closed 60 seconds after being inactive. You can set a smaller value for Glacier2.Client.ACM but you should make sure connections aren't closed prematurely so its value should be larger than the session timeout.
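
    A minimal sketch of how this might look in the Glacier2 router configuration, assuming Ice 3.5 property names and a session timeout of 60 seconds. The value 90 is only an illustrative choice, not a recommendation:

    ```properties
    # Sessions expire after 60 seconds without client activity.
    Glacier2.SessionTimeout=60

    # Optional: ACM timeout (in seconds) for client connections.
    # If omitted, Glacier2 uses twice the session timeout (120 here).
    # Keep it larger than Glacier2.SessionTimeout.
    Glacier2.Client.ACM=90
    ```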

    If this doesn't fully answer your question, could you perhaps copy/paste the network and session tracing of your Glacier2 router and show us the unexpected traces?

    Cheers,
    Benoit.
  • Hi, thanks for your fast response.

    I have set the following timeouts:

    Config Server:
    - Glacier2Router:
    Glacier2.SessionTimeout=60
    - Server (based on Ice.Application):
    No timeouts set.

    Config Client:
    Ice.ACM.Client=0


    For example:
    I have multiple connections of the form Client -> Server (Glacier2Router + server based on Ice.Application).
    The clients lost their connection to the server due to an external cause (at 8:45). I then stopped all clients and did not reconnect them. However, even though the sessions expire, Glacier2 leaves some sockets open. The system closes them only later, when the system keepalive timeout (2 hours) expires for each connection (in keepalive state).
  • Information provided by the command ss -o
    At 8:58
    ESTAB 0 69 172.31.24.250:4063 54.171.19.134:51924 timer: (on,49sec,11)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52129 timer: (keepalive,89min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52260 timer: (keepalive,98min,0)
    ESTAB 0 69 172.31.24.250:4063 54.171.19.134:51931 timer: (on,37sec,9)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,82min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51930 timer: (keepalive,83min,0)

    At 9:01
    ESTAB 0 69 172.31.24.250:4063 54.171.19.134:51924 timer: (on,1min57sec,13)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52129 timer: (keepalive,86min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52260 timer: (keepalive,95min,0)
    ESTAB 0 69 172.31.24.250:4063 54.171.19.134:51931 timer: (on,1min45sec,11)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,79min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51930 timer: (keepalive,80min,0)

    At 9:20
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52129 timer: (keepalive,67min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52260 timer: (keepalive,76min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,60min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51930 timer: (keepalive,61min,0)

    At 9:37
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52129 timer: (keepalive,49min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52260 timer: (keepalive,59min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,43min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51930 timer: (keepalive,44min,0)

    At 10:36
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52129 timer: (keepalive,12min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52260 timer: (keepalive,12min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,14min,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51930 timer: (keepalive,12min,0)

    At 10:50
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52129 timer: (keepalive,19sec,1)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52260 timer: (keepalive,41sec,2)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,18sec,0)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51930 timer: (keepalive,1min9sec,2)

    At 11:00
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:52129 timer: (keepalive,33sec,9)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,32sec,8)
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51930 timer: (keepalive,9.044ms,9)

    At 11:02
    ESTAB 0 0 172.31.24.250:4063 54.171.19.134:51820 timer: (keepalive,32sec,9)
  • Information provided by traces of GlacierRouter:

    -- 02/10/15 08:20:25.996 /usr/bin/glacier2router: Glacier2: created session
    id =
    category = HI(2p`VSDbQD'#T+3cA;
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51820
    -- 02/10/15 08:20:26.003 /usr/bin/glacier2router: Glacier2: adding proxy to routing table:
    Z41Announce -t -e 1.0:ssl -h 54.171.112.132 -p 8008
    -- 02/10/15 08:21:31.520 /usr/bin/glacier2router: Glacier2: created session
    id =
    category = "{t0\K8dC/[gSe#sFDED
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51924
    -- 02/10/15 08:21:31.566 /usr/bin/glacier2router: Glacier2: adding proxy to routing table:
    Z41Announce -t -e 1.0:ssl -h 54.171.112.132 -p 8008
    -- 02/10/15 08:21:37.475 /usr/bin/glacier2router: Glacier2: created session
    id =
    category = ~wZh']!&mw|d41e-t731
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51930
    -- 02/10/15 08:21:37.564 /usr/bin/glacier2router: Glacier2: created session
    id =
    category = =dU>^,<L^>..>+=7ohP$
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51931
    -- 02/10/15 08:21:37.574 /usr/bin/glacier2router: Glacier2: adding proxy to routing table:
    Z41Announce -t -e 1.0:ssl -h 54.171.112.132 -p 8008
    -- 02/10/15 08:21:37.629 /usr/bin/glacier2router: Glacier2: adding proxy to routing table:
    Z41Announce -t -e 1.0:ssl -h 54.171.112.132 -p 8008
    -- 02/10/15 08:27:13.523 /usr/bin/glacier2router: Glacier2: created session
    id =
    category = LX#1V;e)~FA89$8S<gd7
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:52129
    -- 02/10/15 08:27:13.677 /usr/bin/glacier2router: Glacier2: adding proxy to routing table:
    Z41Announce -t -e 1.0:ssl -h 54.171.112.132 -p 8008
    -- 02/10/15 08:36:10.150 /usr/bin/glacier2router: Glacier2: created session
    id =
    category = {"_aZnvJ$aME!Ti]9)7:
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:52260
    -- 02/10/15 08:36:10.209 /usr/bin/glacier2router: Glacier2: adding proxy to routing table:
    Z41Announce -t -e 1.0:ssl -h 54.171.112.132 -p 8008
    -- 02/10/15 08:48:34.715 /usr/bin/glacier2router: Glacier2: expiring session
    id =
    category = ~wZh']!&mw|d41e-t731
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51930
    -- 02/10/15 08:49:04.750 /usr/bin/glacier2router: Glacier2: expiring session
    id =
    category = LX#1V;e)~FA89$8S<gd7
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:52129
    -- 02/10/15 08:50:19.772 /usr/bin/glacier2router: Glacier2: expiring session
    id =
    category = "{t0\K8dC/[gSe#sFDED
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51924
    -- 02/10/15 08:50:19.772 /usr/bin/glacier2router: Glacier2: expiring session
    id =
    category = HI(2p`VSDbQD'#T+3cA;
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51820
    -- 02/10/15 08:50:34.773 /usr/bin/glacier2router: Glacier2: expiring session
    id =
    category = =dU>^,<L^>..>+=7ohP$
    local address = 172.31.24.250:4063
    remote address = 54.171.19.134:51931
  • benoit, Rennes, France
    Hi,

    Do you set a timeout on the Glacier2 client endpoint? If you don't, it would explain why it takes so long for the connections to be closed. Glacier2 ACM tries to gracefully close the connection. If the peer doesn't respond, the connection will be closed only once the TCP/IP stack detects the connection loss and this can take a long time. Setting a timeout on the Glacier2 client endpoint should fix this problem.

    It's also possible to set a specific close timeout with Ice.Override.CloseTimeout. This timeout can be shorter than the endpoint timeout as it only applies to connection closure. Note that you should still set a timeout on the client endpoint to make sure the connection still times out in a timely manner if a network failure occurs during the lifetime of the connection.
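
    A sketch of both settings in the router configuration. The protocol and the timeout values are assumptions for illustration; port 4063 matches the local address shown in the traces above, but the endpoint's protocol isn't shown in this thread. Both timeouts are in milliseconds:

    ```properties
    # Timeout on the Glacier2 client endpoint (-t, in milliseconds).
    # "tcp" is an assumption; use the protocol your endpoint already has.
    Glacier2.Client.Endpoints=tcp -p 4063 -t 10000

    # Shorter timeout that applies only to connection closure (milliseconds).
    Ice.Override.CloseTimeout=5000
    ```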

    Cheers,
    Benoit.
  • It is quite possible that this solves the problem, as you say. I'm running some tests and it works. Thank you very much.
    One more question: which should be larger, Glacier2.SessionTimeout or Ice.Override.CloseTimeout?
  • benoit, Rennes, France
    Hi,

    These 2 timeouts are independent.

    You can set Ice.Override.CloseTimeout to a small value (a few seconds, for example) if you want Glacier2 to forcefully close connections quickly when the peers don't close their side of the connection after Glacier2 has sent the graceful connection closure message.

    The timeout that must be larger than the Glacier2 session timeout is the client-side ACM timeout (Glacier2.Client.ACM with Ice 3.5 or Glacier2.Client.ACM.Timeout with Ice 3.6b), because you don't want active connection management to close the connections prematurely. You don't really need to change the default value of this property: by default, Glacier2 sets it to twice the session timeout.

    To summarize, if a client connection dies:
    • Glacier2 will destroy the session if it doesn't receive a keep alive within the Glacier2.SessionTimeout period.
    • Glacier2 ACM will gracefully close the Ice connection if it's inactive during the Glacier2.Client.ACM period.
    • Gracefully closed connections are forcefully closed after Ice.Override.CloseTimeout if the client doesn't respond to the graceful connection closure.
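
    As a rough worked example of this sequence, assume the last client activity happens at t = 0 and Glacier2.SessionTimeout=60, as in the configuration posted earlier in the thread. The 5-second close timeout is an illustrative assumption, and the times are lower bounds because the actual moment depends on how often the router checks for expired sessions and idle connections:

    ```properties
    # t = 0 s      last message received from the client
    # t >= 60 s    session destroyed ("expiring session" in the traces)
    Glacier2.SessionTimeout=60

    # t >= 120 s   ACM starts a graceful close of the idle connection
    #              (120 is the default here: twice the session timeout)
    Glacier2.Client.ACM=120

    # About 5 s after the graceful close starts, the connection is closed
    # forcefully if the client has not responded (value in milliseconds).
    Ice.Override.CloseTimeout=5000
    ```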

    Cheers,
    Benoit.