Hi,

Yes. It looks like sendalot is being set incorrectly in the SACK rexmit
path. The problem is that if there's a lot of data in the SACK scoreboard,
and if cwnd allows re-sending only part of it, sendalot gets set to 1 on
every pass through the loop, whether or not there is anything to send
(len > 0), and we loop around.

The patch referenced below will cause just one SACK segment to be
retransmitted in a call to tcp_output(), even if cwnd allows us to
retransmit more segments. This will depress throughput in the face of
packet loss.

I think it might be better to set sendalot only when len > 0 instead.
This will cause us to retransmit as many SACK segments as we're allowed
to send (per cwin) in a call to tcp_output() before we send new data.

I'm referencing 6.x here, but this applies to 5.3 as well.

crabapple.corp.yahoo.com> p4 diff -dc tcp_output.c
==== //depot/mohans/freebsd6_nfs/sys/netinet/tcp_output.c#1 - /homes/mohans/p4/mohans/freebsd6_nfs/sys/netinet/tcp_output.c ====
***************
*** 231,242 ****
                                    tp->snd_recover - p->rxmit));
                        } else
                                len = ((long)ulmin(cwin, p->end - p->rxmit));
-                       sack_rxmit = 1;
-                       sendalot = 1;
                        off = p->rxmit - tp->snd_una;
                        KASSERT(off >= 0,("%s: sack block to the left of una : %d",
                            __func__, off));
                        if (len > 0) {
                                tcpstat.tcps_sack_rexmits++;
                                tcpstat.tcps_sack_rexmit_bytes +=
                                    min(len, tp->t_maxseg);
--- 231,242 ----
                                    tp->snd_recover - p->rxmit));
                        } else
                                len = ((long)ulmin(cwin, p->end - p->rxmit));
                        off = p->rxmit - tp->snd_una;
                        KASSERT(off >= 0,("%s: sack block to the left of una : %d",
                            __func__, off));
                        if (len > 0) {
+                               sack_rxmit = 1;
+                               sendalot = 1;
                                tcpstat.tcps_sack_rexmits++;
                                tcpstat.tcps_sack_rexmit_bytes +=
                                    min(len, tp->t_maxseg);

Let me know what you think.

thanks,
mohan

Paul Saab (ps@yahoo-inc.com) wrote:
>
> From: Peter Losher
> Organization: ISC
> To: Robert Watson
> Subject: Re: [Fwd: Re: 5.3 stability?]
> Date: Wed, 27 Oct 2004 00:35:28 -0700
> Cc: Scott Long,
>     "George V.Neville-Neil", ps@freebsd.org,
>     re@freebsd.org, dhartmei@freebsd.org
>
> On Tuesday 26 October 2004 06:13 am, Robert Watson wrote:
>
> > > Off to sleep... :)  Thanks again for your suggestions and advice.
> >
> > Daniel Hartmeier has created the following patch that tweaks the logic
> > for SACK retransmit to match what it is in NetBSD/OpenBSD, and he
> > believes may prevent a spinning scenario with behavior that strongly
> > resembles what we're seeing.  Assuming that the SACK disabling caused
> > the problem to disappear, it sounds like a good place to begin:
> >
> > http://www.benzedrine.cx/sendalot.diff
>
> Look like that did it; the patched kernel (w/ TCP_SACK turned back on)
> has been running for 11+ hours w/ no problems so far.
>
> -=-
> % uptime
>  7:32AM  up 11:10, 1 user, load averages: 8.11, 7.51, 6.20
> -=-
>
> I am heading for bed now to catch up on the sleep I lost last night. :)
> Talk to you all later this morning. :)
>
> Best Wishes - Peter
> --
> Peter_Losher@isc.org | ISC | OpenPGP Key E8048D08 | "The bits must flow"
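
For illustration, the three sendalot behaviours discussed at the top of
this mail can be compared in a small user-space toy model. This is only a
sketch under stated assumptions, not the kernel code: toy_output(),
struct toy_hole, the toy_policy values and TOY_MAX_PASSES are all made-up
names, the model tracks a single SACK hole, and TOY_NEVER reflects only
the effect described above (one SACK segment per call), not the actual
contents of sendalot.diff. How the real tcp_output() behaves once len
reaches 0 depends on code outside the quoted diff.

```c
#include <assert.h>

/* Toy model only: none of these names exist in the FreeBSD sources. */
#define TOY_MAX_PASSES 1000	/* cap so the spinning case terminates here */

enum toy_policy {
	TOY_ALWAYS,		/* set sendalot whenever a hole is found (pre-patch) */
	TOY_NEVER,		/* never set it in the SACK path: one segment per call */
	TOY_LEN_POSITIVE	/* set it only when len > 0 (the change proposed above) */
};

struct toy_hole {
	long rxmit;	/* next byte to retransmit */
	long end;	/* one past the last byte of the hole */
};

static long
toy_min(long a, long b)
{
	return (a < b ? a : b);
}

/*
 * Simulate one call to a simplified output routine.  Returns the number
 * of passes through the loop; *sent receives the bytes retransmitted.
 */
static int
toy_output(struct toy_hole *p, long cwin, long maxseg,
    enum toy_policy policy, long *sent)
{
	int passes = 0;

	assert(maxseg > 0);
	*sent = 0;
	while (passes < TOY_MAX_PASSES) {
		int sendalot = 0;
		long len;

		passes++;
		if (p->rxmit >= p->end)
			break;		/* hole fully retransmitted */
		len = toy_min(cwin, p->end - p->rxmit);
		if (policy == TOY_ALWAYS)
			sendalot = 1;	/* set even when len == 0 */
		if (len > 0) {
			if (policy == TOY_LEN_POSITIVE)
				sendalot = 1;
			len = toy_min(len, maxseg);	/* one segment per pass */
			p->rxmit += len;
			cwin -= len;
			*sent += len;
		}
		if (!sendalot)
			break;
	}
	return (passes);
}
```

With a 5000-byte hole, cwin of 3000 and maxseg of 1000, TOY_NEVER
retransmits a single segment, TOY_LEN_POSITIVE retransmits the three
segments cwin allows and then stops, and TOY_ALWAYS retransmits the same
three segments but keeps looping until the artificial cap is hit.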