[icinga-devel] Make sockets non-blocking

Stephen Gran steve at lobefin.net
Wed May 5 15:25:06 CEST 2010


On Sat, May 01, 2010 at 05:03:59PM +0100, Stephen Gran said:
> Hi there,
> 
> We use NDO for network communication with a custom bit of perl to pass
> status updates around.  Recently, we've seen that a network flap can
> make ndo hang the entire nagios process, which is possibly imperfect :)
> 
> I think I've tracked it down to the write() call in io.c when sending
> the actual update to the remote server.  The attached patch is a
> relatively naive attempt at making this write nonblocking for network
> sockets.
> 
> This is a patch against the CVS - if you prefer a git-am style patch,
> that's fine.  I tried to clone the git off of sourceforge this morning,
> and got an empty repo.  If there's a better place to clone from, let me
> know and I'll fix it up for that.

So, it turned out my initial attempt to keep the patch small had some
limitations.  Working patch attached.

To recap, the main problem is that I/O operations are blocking.  This is
less important to local file or unix sockets, but can block the main
nagios process when the I/O operations are tcp based.

My first attempt merely marked the socket as non-blocking, and added
the optional return code to the list handled in the error path.  What I
found during testing was that this had a few problems.

First, the error path adds the return of write() to tbytes.  If write
returns -1, tbytes was being decremented, resulting in an infinite loop
because the loop termination condition became unsatisfiable.  Even when
correcting the return to 0 before addition, there was still no loop
termination condition when write could not succeed.  I've hackishly
corrected this with a hardcoded maximum number of loops.

To make things a little nicer, we don't even want to enter the write()
loop if we know we can't write().  We do this with a zero second select()
to check if we can write before entering the loop.  This is admittedly
racy, but I'd frankly rather return early than block the nagios main
process.

Back to socket creation.  We could mark the socket as non-blocking
after connect() returns, so that we know that we have a valid fd before
carrying on.  The problem with this is that the default connect() timeout
on the Redhat 5 machine I tested this on is 3 minutes.  That is, in my
opinion, again too long a time to block the main nagios process for.

What I've done instead is to mark the socket as non-blocking before
calling connect().  In the connect() routine, if connect() sets errno
to EINPROGRESS, we select() on the socket for 15 seconds to see if it
succeeds.  If it does not, we enter the normal error path for connect()
failures.

Arguably my choices of 10 loops for termination in write() and 15
seconds for connect() are not right for everyone.  They could be moved
to configuration options, or they could be taken from existing options,
or something.  At this point, it Works For Me(TM), which is good enough
at this point.  If people have any specific objections, I would of course
be happy to tidy it up for them.

Cheers,

diff -Nru ndoutils-1.4b7-bak/src/io.c ndoutils-1.4b7/src/io.c
--- ndoutils-1.4b7-bak/src/io.c 2007-01-08 00:35:50.000000000 +0000
+++ ndoutils-1.4b7/src/io.c     2010-05-05 13:41:42.000000000 +0100
@@ -203,12 +203,34 @@
                server_address_i.sin_family=AF_INET; 
                server_address_i.sin_port=htons(port);
 
+               int x;
+               x=fcntl(newfd,F_GETFL,0);
+               fcntl(newfd,F_SETFL,x | O_NONBLOCK);
+
                /* connect to the socket */
                if((connect(newfd,(struct sockaddr *)&server_address_i,sizeof(server_address_i)))){
-                       close(newfd);
-                       return NDO_ERROR;
-                       }
+                       if(errno==EINPROGRESS) {
+                               fd_set wfds;
+                               int retval = 0;
+                               struct timeval tv;
+
+                               tv.tv_sec = 15;
+                               tv.tv_usec = 0;
+                               FD_ZERO(&wfds);
+                               FD_SET(newfd, &wfds);
+
+                               retval=select(newfd+1, NULL, &wfds, NULL, &tv);
+
+                               if(!FD_ISSET(newfd, &wfds)) {
+                                       close(newfd);
+                                       return NDO_ERROR;
+                               }
+                       } else {
+                               close(newfd);
+                               return NDO_ERROR;
+                       }
                }
+       }
 
        /* unknown sink type */
        else
@@ -225,13 +247,31 @@
 int ndo_sink_write(int fd, char *buf, int buflen){
        int tbytes=0;
        int result=0;
+       int a = 0;
+       int retval = 0;
+       fd_set wfds;
+       struct timeval tv;
+
+       /* Don't wait - just check availability. */
+       tv.tv_sec = 0;
+       tv.tv_usec = 0;
+
+       FD_ZERO(&wfds);
+       FD_SET(fd, &wfds);
 
        if(buf==NULL)
                return NDO_ERROR;
        if(buflen<=0)
                return 0;
 
-       while(tbytes<buflen){
+       retval=select(fd+1, NULL, &wfds, NULL, &tv);
+
+       if (retval==-1 && errno!=EINTR)
+               return NDO_ERROR;
+       if(!FD_ISSET(fd, &wfds)) 
+               return NDO_ERROR;
+
+       while(tbytes<buflen && a<10){ /* 10 is completely arbitrary - we need some loop terminator, though */
 
                /* try to write everything we have left */
                result=write(fd,buf+tbytes,buflen-tbytes);
@@ -240,16 +280,18 @@
                if(result==-1){
 
                        /* unless we encountered a recoverable error, bail out */
-                       if(errno!=EAGAIN && errno!=EINTR)
+                       if(errno!=EAGAIN && errno!=EINTR && errno!=EWOULDBLOCK)
                                return NDO_ERROR;
-                       }
+                       result=0;
+               }
 
-               /* update the number of bytes we've written */
+       /* update the number of bytes we've written */
                tbytes+=result;
-               }
+               a++;
+        }
 
        return tbytes;
-        }
+}
 
 
 /* writes a newline to data sink */

-- 
 --------------------------------------------------------------------------
|  Stephen Gran                  | Break into jail and claim police        |
|  steve at lobefin.net             | brutality.                              |
|  http://www.lobefin.net/~steve |                                         |
 --------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.icinga.org/pipermail/icinga-devel/attachments/20100505/a497f36a/attachment.sig>


More information about the icinga-devel mailing list