Patch to Network.cpp for FreeBSD for ECONNRESET on close(2) problem

We occasionally see the following error on FreeBSD:
!! 07/14/11 09:39:17.357 Server: error: unexpected connection exception:
   Network.cpp:720: Ice::SocketException:
   socket exception: Connection reset by peer
   local address = 10.10.10.27:48402
   remote address = 10.10.10.32:4062

This has only happened a few times over a couple of weeks, and only when some external interaction was involved (e.g. when calling a client script that makes a few calls with a reasonable payload over the network). That is still too often, and it mostly affects remote connections. After doing some research and reading the source code, I figured out that:
  1. This happens on a call to close(2) when the connection changes to StateFinished in ConnectionI.cpp (the call to close itself is in Network.cpp of course)
  2. No resources are leaked, because this is the last step in the connection state model and the state is set regardless
  3. This is specific to FreeBSD (and maybe Darwin, but the Mac OS X man page of close(2) states otherwise). Since FreeBSD 6.3 (2006), close(2) on FreeBSD can fail with errno ECONNRESET. While this additional information might be useful to some clients, it is quite annoying because it seems to differ from all other supported platforms. The man page still claims POSIX compliance (I don't have the standard here, so I don't want to judge it by that), but in this case it definitely behaves differently from Linux, and probably from Mac OS X.
  4. Other projects (e.g. Ruby) have seen this problem as well and applied similar patches (e.g. Ruby 1.9 - Bug #3515: FreeBSD wrongly raises ECONNRESET on close(2) - Ruby Issue Tracking System)
  5. There are also open bugs that discuss this behavior in the FreeBSD PR system (e.g. kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly), but it doesn't seem likely that this will be changed anytime soon

The easy (and safe) solution is to apply a simple patch so that errno == ECONNRESET returned from close(2) is no longer treated as an error on FreeBSD:
--- cpp.orig/src/Ice/Network.cpp	2011-06-15 21:43:58.000000000 +0200
+++ cpp/src/Ice/Network.cpp	2011-07-15 23:40:26.000000000 +0200
@@ -715,7 +715,11 @@
     WSASetLastError(error);
 #else
     int error = errno;
-    if(close(fd) == SOCKET_ERROR)
+    if(close(fd) == SOCKET_ERROR
+#  if defined(__FreeBSD__)
+    && getSocketErrno() != ECONNRESET
+#  endif
+    )
     {
         SocketException ex(__FILE__, __LINE__);
         ex.error = getSocketErrno();

Please find the same patch attached as well.
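
For anyone who wants to try the idea outside of Ice, here is a minimal standalone sketch of the same logic. The function names closeSocketTolerant and isIgnorableCloseError are made up for illustration and are not part of Ice; the sketch only assumes a POSIX system and, like the patch, ignores ECONNRESET from close(2) on FreeBSD only.

```cpp
#include <cerrno>
#include <unistd.h>
#include <sys/socket.h>

// Hypothetical helper (not part of Ice): decides whether an errno value
// reported by close(2) can be ignored on the current platform.
inline bool isIgnorableCloseError(int err)
{
#if defined(__FreeBSD__)
    // FreeBSD may report ECONNRESET from close(2); the descriptor is
    // released anyway, so there is nothing left to clean up.
    return err == ECONNRESET;
#else
    (void)err;
    return false;
#endif
}

// Hypothetical wrapper (not part of Ice): closes fd and returns 0 on
// success or on an ignorable error, otherwise the errno set by close(2).
// The caller's errno is preserved when 0 is returned.
inline int closeSocketTolerant(int fd)
{
    const int saved = errno;
    if(::close(fd) == -1)
    {
        const int err = errno;
        if(!isIgnorableCloseError(err))
        {
            return err;
        }
    }
    errno = saved;
    return 0;
}
```

A caller would check the return value and raise its own exception only when it is non-zero, which is roughly what the patched condition in Network.cpp achieves inline.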

Cheers
Michael

Comments

  • GNU Darwin (and Mac OS X) might be affected by this as well

    The source is the same as in FreeBSD, check:

    tcp_usrreq.c

    So even though the Mac OS X man page of close(2) doesn't state this is the case, at least the underlying implementation seems to be the same.

    It's not that easy to provoke this using Ice; we have only experienced it on multi-core machines with a reasonably sized Ice grid. It would be nice, though, to have a small and simple setup to reproduce it.

    In the meantime I opened PRs for the devel/ice port and for the underlying POSIX incompatibility:

    ports/159031: [PATCH] devel/Ice: Fix close socket and incorporate security patch for IceGrid

    kern/159179: [libc] close(2) emitting ECONNRESET is not POSIX compliant
  • Patch is now in the FreeBSD ports tree

    Since it will take a while until this patch is incorporated into a future Ice release, it is now part of the FreeBSD devel/ice port. You can upgrade to port version 3.4.2_1 by running:
    sudo portsnap fetch update
    sudo portupgrade devel/ice