comp.protocols.tcp-ip TCP and IP network protocols.

Connections hung TIME_WAIT state

01/05/2006, 15:02   #1
Kami

Here goes the problem.

There are two servers, server A and server B. Server A is running
Apache, and server B is running memcached (database query result
caching).

Server A connects to server B on a specified port. I'm using ab to
generate requests on server A locally, so that a high-load situation
can be simulated for the caching server.
The problem is that server B cleans up its connections properly, but
on server A the connections keep hanging in TIME_WAIT state, their
number keeps increasing, and eventually I get connection timeouts to
server B.

Here is a list of variables, along with the values I changed them to,
in an attempt to forcefully kill the TIME_WAIT connections:

net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait=1
net.ipv4.tcp_fin_timeout=1

net.ipv4.netfilter.ip_conntrack_tcp_timeout_close=1
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait=1
net.ipv4.netfilter.ip_conntrack_tcp_timeout_last_ack=1
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait=1
net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait=1

net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_recv=2
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_sent=2

net.ipv4.tcp_fin_timeout=1

These variables were changed using sysctl -w, but to no avail.

The operating system used is CentOS 4.2.

Any ideas on how to effectively change these values so that the
connections are cleaned up properly?

01/05/2006, 18:38   #2
Rick Jones

Kami <mkamrannisar@gmail.com> wrote:
> Here goes the problem.

> There are two servers, server A and server B. Server A is running
> Apache, and server B is running memcached (database query result
> caching).

> Server A connects to server B on a specified port. I'm using ab to
> generate requests on server A locally, so that a high-load situation
> can be simulated for the caching server.
> The problem is that server B cleans up its connections properly, but
> on server A the connections keep hanging in TIME_WAIT state, their
> number keeps increasing, and eventually I get connection timeouts to
> server B.


How do you know the connections are "hung" in TIME_WAIT?
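
One way to answer that on Linux is to count sockets per state instead of eyeballing netstat output. A minimal sketch, assuming the usual /proc/net/tcp layout (fourth column is the hex state code, with 06 meaning TIME_WAIT); the function name is made up for illustration:

```python
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

Feeding it the contents of /proc/net/tcp every second or so during the ab run would show whether the TIME_WAIT count levels off (normal expiry) or climbs until the client port space is exhausted.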

I believe the "problem" is that the application software is trying to
establish and tear-down TCP connections "too fast" where too fast is

>= sizeof(clientportspace)/lengthof(TIME_WAIT)



> Here is a list of variables, along with the values I changed them to,
> in an attempt to forcefully kill the TIME_WAIT connections:

> net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait=1
> net.ipv4.tcp_fin_timeout=1

> net.ipv4.netfilter.ip_conntrack_tcp_timeout_close=1
> net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait=1
> net.ipv4.netfilter.ip_conntrack_tcp_timeout_last_ack=1
> net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait=1
> net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait=1

> net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_recv=2
> net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_sent=2

> net.ipv4.tcp_fin_timeout=1

> These variables were changed using sysctl -w, but to no avail.


Indeed, because none of them is directly related to the problem your
applications have. I hope you set those things back.

TIME_WAIT is an integral part of TCP's correctness algorithms. It is
there to protect new connections by the same "name" from inadvertently
accepting segments from old connections and thus corrupting data.

Strictly speaking, TIME_WAIT is supposed to last as long as four
minutes, so the connection rate that could result in attempts to reuse
a TCP connection name (local/remote IP, local/remote port) that is
still in TIME_WAIT would be:

sizeof(portspace)/240

If your client application is allowing the stack to pick the local
port number (eg is not calling bind() to pick a port number itself),
then likely as not, the range of ports it gets will be 49152 to 65535
or ~16384 port numbers:

16384/240

or 68 connections per second.
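
The same division works for any port range and TIME_WAIT length. A small sketch of the arithmetic (the function name is illustrative, not from any library):

```python
def max_connection_rate(first_port, last_port, time_wait_seconds):
    """Connections per second to one remote IP/port before a local port
    has to be reused while its old connection is still in TIME_WAIT."""
    port_count = last_port - first_port + 1
    return port_count / time_wait_seconds

# The 49152-65535 range with a four-minute TIME_WAIT: 16384 / 240
rate = max_connection_rate(49152, 65535, 240)
print(f"{rate:.1f} connections per second")
```

Anything that opens and closes connections faster than this for a sustained period will eventually run into a port that is still in TIME_WAIT.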

The best "fix" is to get your applications to use long-lived TCP
connections. The next best fix after that is to broaden the number of
ports (and perhaps IP addresses) involved. One way to do that is to
have the application attempt to bind() to port numbers in the range of
say 5000 to 65535. That would increase the rate before attempted
TIME_WAIT reuse to

65000/240

or ~270 connections per second.
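
A rough sketch of what "bind() before connect()" looks like on the client, assuming IPv4 and a plain blocking socket. The helper name and its defaults are hypothetical, and which errno the stack returns for a port it refuses can vary by platform:

```python
import errno
import socket

def connect_from_port_range(remote_addr, first_port=5000, last_port=65535):
    """Connect to remote_addr, choosing the local port ourselves.

    Tries each port in [first_port, last_port] in turn and skips any that
    the stack refuses (already in use, or still tied up in TIME_WAIT).
    """
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind(("", port))      # pick the local port explicitly
            sock.connect(remote_addr)
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno not in (errno.EADDRINUSE, errno.EADDRNOTAVAIL):
                raise                  # a real failure, e.g. connection refused
    raise OSError(errno.EADDRNOTAVAIL, "no free local port in range")
```

A real client would remember where it left off instead of restarting at first_port on every call, otherwise each connect has to walk past all the ports that are still in TIME_WAIT.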

You could achieve similar results by spreading the traffic across a
larger number of IP addresses - on the client, the server, or both.

A much more distant fourth option is to decrease the length of TIME_WAIT to
say 60 seconds (math left as an exercise to the reader).

However, you should "never" take steps to make there be no TIME_WAIT
state at all, such as using an "abortive" close.

rick jones
--
firebug n, the idiot who tosses a lit cigarette out his car window
these opinions are mine, all mine; HP might not want them anyway...
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
02/05/2006, 14:32   #3
Michael Wojcik


In article <jIr5g.7019$E85.7012@news.cpqcorp.net>, Rick Jones <rick.jones2@hp.com> writes:
> Kami <mkamrannisar@gmail.com> wrote:
>
> > Server A connects to server B on a specified port. I'm using ab to
> > generate requests on server A locally, so that a high-load situation
> > can be simulated for the caching server.
> > The problem is that server B cleans up its connections properly, but
> > on server A the connections keep hanging in TIME_WAIT state, their
> > number keeps increasing, and eventually I get connection timeouts to
> > server B.

>
> How do you know the connections are "hung" in TIME_WAIT?


It's a pity that the TCP designers didn't give the state a more
explanatory name, like I_HAVE_TO_WAIT_FOR_A_CERTAIN_LENGTH_OF_TIME.

> I believe the "problem" is that the application software is trying to
> establish and tear-down TCP connections "too fast" where too fast is
>
> >= sizeof(clientportspace)/lengthof(TIME_WAIT)


Y'know, I just went through this with one of our testers, who had
a similar "load test" that fired SOAP requests at our server as
fast as it could using hundreds of client threads. Fortunately,
he took an Ethereal trace before reporting a problem, so I could
show him right where he started reusing client ports that hadn't
finished TIME_WAIT yet.

> If your client application is allowing the stack to pick the local
> port number (eg is not calling bind() to pick a port number itself),
> then likely as not, the range of ports it gets will be 49152 to 65535
> or ~16384 port numbers:
>
> 16384/240
>
> or 68 connections per second.
>
> The best "fix" is to get your applications to use long-lived TCP
> connections. The next best fix after that is to broaden the number of
> ports (and perhaps IP addresses) involved. One way to do that is to
> have the application attempt to bind() to port numbers in the range of
> say 5000 to 65535. That would increase the rate before attempted
> TIME_WAIT reuse to
>
> 65000/240
>
> or ~270 connections per second.


Er, 60536/240, or ~252 connections per second. Though of course
the principle is correct.

On some platforms you can change the range of ports the system
will assign for ephemeral use, rather than making the application
bind explicitly on the client side, though of course at some loss
of portability.
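
On Linux, for instance, that range is the net.ipv4.ip_local_port_range sysctl, readable from /proc/sys/net/ipv4/ip_local_port_range as two numbers. A sketch that parses it and reports how many ephemeral ports are available (helper names are illustrative; other platforms expose this differently, if at all):

```python
def parse_port_range(text):
    """Parse the two whitespace-separated numbers in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high

def ephemeral_port_count(low, high):
    """Number of local ports the stack can hand out, both ends inclusive."""
    return high - low + 1

def read_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the current range from procfs (Linux only)."""
    with open(path) as f:
        return parse_port_range(f.read())
```

Dividing the resulting count by the TIME_WAIT length gives the sustainable connection rate for that box. Widening the range would be done with something like sysctl -w net.ipv4.ip_local_port_range="5000 65535", which is exactly the kind of administrator-side setting that does not travel with the application.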

Also, when testing over a loopback connection, if the server could
use a port in the client's port space, watch out for self-connect,
if the platform's stack supports it. *That* can produce some odd
errors in testing.

See eg
http://groups.google.com/group/comp....0ce279fd1e2db0
or
http://tinyurl.com/owkhd
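
A cheap guard against self-connect in a test client is to compare the two ends of the socket after connect() returns. A minimal sketch, assuming a connected IPv4 TCP socket (the function name is hypothetical):

```python
import socket

def is_self_connected(sock):
    """True if a connected socket's local and remote endpoints are identical,
    i.e. the client happened to be given the very port it was connecting to."""
    return sock.getsockname() == sock.getpeername()
```

A load generator could call this right after connecting over loopback and simply close and retry when it returns True, rather than sending a request that it will then read back itself.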

--
Michael Wojcik michael.wojcik@microfocus.com

Most people believe that anything that is true is true for a reason.
These theorems show that some things are true for no reason at all,
i.e., accidentally, or at random. -- G J Chaitin
02/05/2006, 17:33   #4
Rick Jones


>> How do you know the connections are "hung" in TIME_WAIT?


> It's a pity that the TCP designers didn't give the state a more
> explanatory name, like I_HAVE_TO_WAIT_FOR_A_CERTAIN_LENGTH_OF_TIME.


Then people would have been complaining about how much space that
consumed in netstat output...

FWIW, the purpose of TIME_WAIT is discussed in the TCP RFCs and just
about any decent book on TCP out there.


> On some platforms you can change the range of ports the system will
> assign for ephemeral use, rather than making the application bind
> explicitly on the client side, though of course at some loss of
> portability.


Indeed. My preference is to have ways to configure the application to
do the right thing without relying on the system administrator.

rick jones
08/05/2006, 08:00   #5
Kami

Thank you all for your replies.
I wasn't able to come online for the past few days, and will be going
through your replies in detail now.
