[MUD-Dev] A User's Guide to TCP Windows

J C Lawrence claw at kanga.nu
Wed Apr 4 22:53:46 CEST 2001


Rather a good/useful article:

--<cut>--
A User's Guide to TCP Windows



     What is the TCP window size? 
     Computing the TCP window size 
     Setting the TCP window size 
     Testing bandwidth 
     Adjusting the TCP window size 



What is the TCP Window Size?

The TCP window size is by far the most important parameter to adjust
for achieving maximum bandwidth across high-performance
networks. Properly setting the TCP window size can often more than
double the achieved bandwidth.

Technically, the TCP window size is the maximum amount of
unacknowledged data that can be in the network at any time for a
single connection. (It is the upper limit of the TCP congestion
window.)

Think of a water hose. To achieve maximum water flow, the hose
should be full. As the hose increases in diameter and length, the
volume of water to keep it full increases. In networks, diameter
equates to bandwidth, length is measured as round-trip time, and the
TCP window size is analogous to the volume of water necessary to
keep the hose full. On fast networks with large round-trip times,
the TCP window size must be increased to achieve maximum TCP
bandwidth.
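
To put a number on the hose analogy, here is a minimal sketch of the
ceiling a given window places on throughput: at most one window of
data can be delivered per round trip. The function name is made up
for illustration and is not part of the original article.

```c
#include <assert.h>

/* Upper bound on TCP throughput, in bits per second, when at most
 * window_bytes of unacknowledged data may be in flight and an
 * acknowledgement takes rtt_seconds to come back.
 * (Illustrative helper, not from the original article.) */
double max_throughput_bps(double window_bytes, double rtt_seconds)
{
    assert(rtt_seconds > 0.0);
    return window_bytes * 8.0 / rtt_seconds;
}
```

For example, a 64 KB window (65536 bytes) over an assumed 30 ms
round trip cannot exceed roughly 17.5 Mbit/sec, however fast the
underlying links are.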

Computing the TCP Window Size

Theoretically the TCP window size should be set to the bandwidth
delay product, which is the volume of data that can be in the
network between two machines. The bandwidth delay product is:

     bottleneck bandwidth * round-trip time

To compute the bandwidth delay product for a pair of hosts, first
estimate what the slowest link between them is. Often this is the
100 Mbit/sec ethernet the machine is connected to, or the 45
Mbit/sec DS3 link from the campus to the wide-area.  Then use ping
to find the round-trip time. For example, if the slowest link is a
45 Mbit/sec DS3 link, and the round-trip time is 30 milliseconds:

     45 Mbit/sec * 30 ms
     = 45e6 bits/sec * 30e-3 sec
     = 1,350,000 bits
     = 1,350,000 / 8 / 1024 KBytes
     = 165 KBytes
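
The same arithmetic as a small C helper, in case you want to compute
it in a program. This is only a sketch; the function name and the
choice of units (bits per second in, bytes out) are assumptions made
for illustration.

```c
#include <assert.h>

/* Bandwidth delay product in bytes.
 * bottleneck_bps: bandwidth of the slowest link, in bits per second.
 * rtt_seconds:    round-trip time, in seconds (e.g. from ping).
 * (Illustrative helper, not from the original article.) */
double bdp_bytes(double bottleneck_bps, double rtt_seconds)
{
    assert(bottleneck_bps >= 0.0 && rtt_seconds >= 0.0);
    return bottleneck_bps * rtt_seconds / 8.0;
}
```

For the DS3 example, bdp_bytes(45e6, 0.030) gives 168,750 bytes,
which is about 165 KBytes after dividing by 1024.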

Setting the TCP Window Size

The TCP window size can be set on a per-connection basis, as
detailed below. For setting the default TCP window size on a host,
and for other factors important to high-performance networking, see
PSC's page on Enabling High Performance Data Transfers on Hosts.

Most OSes and hosts have upper limits on the TCP window size. These
may be as low as 64 KB, or as high as several MB. Because the window
field in the TCP header is only 16 bits wide, TCP large window
extensions (RFC 1323) must be enabled to use TCP window sizes larger
than 64 KB. See PSC's page above for OSes that implement it.
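
As a rough illustration of what the large window extensions do: RFC
1323 lets each side advertise a shift count, up to 14, by which the
16-bit window value is scaled. The sketch below computes the
smallest shift that can represent a given window. The function name
is made up, and real TCP stacks choose the shift themselves
(typically from the socket buffer size), so treat this as a way to
reason about the numbers rather than as anything you need to call.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAX_UNSCALED_WINDOW 65535u /* largest 16-bit window value */
#define TCP_MAX_WINDOW_SHIFT    14     /* largest shift RFC 1323 allows */

/* Smallest shift count s such that (65535 << s) >= window_bytes,
 * capped at the RFC 1323 maximum of 14.
 * (Illustrative helper, not from the original article.) */
int window_scale_shift(uint32_t window_bytes)
{
    int shift = 0;

    while (shift < TCP_MAX_WINDOW_SHIFT &&
           ((uint64_t)TCP_MAX_UNSCALED_WINDOW << shift) < window_bytes)
        shift++;
    assert(shift >= 0 && shift <= TCP_MAX_WINDOW_SHIFT);
    return shift;
}
```

A window of 65535 bytes or less needs no scaling (shift 0), while the
165 KByte window computed earlier needs a shift of 2.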

Since TCP is a reliable transport, if any data is lost in
transmission, TCP must be able to retransmit it. Thus TCP remembers
all the sent data in a buffer until the other side acknowledges
receiving it. The size of this buffer is the TCP window size.

The TCP window size is implemented by send and receive buffers on
each end of the connection. To set these buffers, use the SO_SNDBUF
and SO_RCVBUF socket options. Both ends of the connection must set
these options. For example:

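     int window = 128 * 1024; /* example: 128 KB */
     /* sock is a TCP socket descriptor returned by socket() */
     if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &window, sizeof(window)) < 0)
         perror("setsockopt SO_SNDBUF");
     if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &window, sizeof(window)) < 0)
         perror("setsockopt SO_RCVBUF");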

This must occur before the listen() or connect() call for windows
larger than 64 KB to be effective, because the window scale factor
is negotiated only during connection setup. Please see the sample
code for a complete implementation with proper error checking and
support for various operating systems. The sample code includes
special cases for UNICOS and AIX.
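
Since the OS may grant a different buffer size than the one
requested, it can help to read the value back with getsockopt().
The following is a minimal sketch for POSIX systems only; the
function name and its out-parameters are assumptions made for
illustration, and it is not the sample code referred to above.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Request send and receive buffers of `window` bytes on `sock`, then
 * read back what the OS actually granted. Returns 0 on success and
 * -1 on error (errno is set by the failing call). Call this before
 * connect() or listen().
 * (Illustrative helper, not from the original article.) */
int set_tcp_window(int sock, int window, int *granted_snd, int *granted_rcv)
{
    socklen_t len;

    assert(granted_snd != NULL && granted_rcv != NULL);

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &window, sizeof(window)) < 0)
        return -1;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &window, sizeof(window)) < 0)
        return -1;

    len = sizeof(*granted_snd);
    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, granted_snd, &len) < 0)
        return -1;
    len = sizeof(*granted_rcv);
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, granted_rcv, &len) < 0)
        return -1;
    return 0;
}
```

Do not expect the granted values to equal the request exactly. The
OS may clamp the request to a system-wide limit, and Linux reports
back double the value it was given to account for bookkeeping
overhead.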

Testing Bandwidth

Here is a simple example of testing the network bandwidth with
several different TCP window sizes. First start the Iperf server on
one machine (here, cyclops), then start the client on another
machine (modi4).

Using the system default 60 KByte TCP window size:

     cyclops> iperf -s
     ------------------------------------------------------------
     Server listening on TCP port 5001
     TCP window size: 60.0 KByte (default)
     ------------------------------------------------------------
     [  4] local 172.31.178.168 port 5001 connected with 172.16.7.4 port 2357
     [ ID] Interval       Transfer     Bandwidth
     [  4]  0.0-10.1 sec   6.5 MBytes   5.2 Mbits/sec

     modi4> iperf -c cyclops
     ------------------------------------------------------------
     Client connecting to cyclops, TCP port 5001
     TCP window size: 59.9 KByte (default)
     ------------------------------------------------------------
     [  3] local 172.16.7.4 port 2357 connected with 172.31.178.168 port 5001
     [ ID] Interval       Transfer     Bandwidth
     [  3]  0.0-10.0 sec   6.5 MBytes   5.2 Mbits/sec

With the TCP window size set to 130 KBytes, bandwidth increases
from 5.2 to 15.7 Mbits/sec:

     cyclops> iperf -s -w 130k
     ------------------------------------------------------------
     Server listening on TCP port 5001
     TCP window size:  130 KByte
     ------------------------------------------------------------
     [  4] local 172.31.178.168 port 5001 connected with 172.16.7.4 port 2530
     [ ID] Interval       Transfer     Bandwidth
     [  4]  0.0-10.1 sec  19.7 MBytes  15.7 Mbits/sec

     modi4> iperf -c cyclops -w 130k
     ------------------------------------------------------------
     Client connecting to cyclops, TCP port 5001
     TCP window size:  129 KByte (WARNING: requested  130 KByte)
     ------------------------------------------------------------
     [  3] local 172.16.7.4 port 2530 connected with 172.31.178.168 port 5001
     [ ID] Interval       Transfer     Bandwidth
     [  3]  0.0-10.0 sec  19.7 MBytes  15.8 Mbits/sec

The Iperf documentation has many more examples of testing the
bandwidth. You should also test bandwidth in your application, since
it will behave differently than Iperf. For instance, FTP must read
its data from disk, which slows it down substantially.
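
One simple way to measure bandwidth inside your own application is
to time the transfer and divide. The helpers below are a sketch with
made-up names; they assume you record a start and end time yourself,
for instance with clock_gettime(CLOCK_MONOTONIC, ...) around the
transfer loop, and count the bytes moved. Here 1 Mbit is taken as
10^6 bits, which may not match the units Iperf prints.

```c
#include <assert.h>
#include <time.h>

/* Seconds elapsed between two clock readings.
 * (Illustrative helper, not from the original article.) */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Achieved bandwidth in Mbits/sec, taking 1 Mbit = 10^6 bits.
 * (Illustrative helper, not from the original article.) */
double mbits_per_sec(double bytes_transferred, double seconds)
{
    assert(seconds > 0.0);
    return bytes_transferred * 8.0 / seconds / 1e6;
}
```

Measuring this way includes disk reads and any other work the
application does per byte, which is exactly the difference from an
Iperf run that you want to see.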

Adjusting the TCP window size

While the bandwidth delay product gives the theoretical value for
the TCP window size, that is not always the best value in practice,
because the OS's TCP implementation may have bugs or the network may
have deficiencies. Start by trying values 10% above and below the
calculated TCP window size. If one of those is better, try values
above and below that, repeating until the maximum bandwidth is
reached. Remember that there will be some variability in bandwidth
due to competing network traffic. In some cases, OS and network
problems may be so bad that deliberately setting the TCP window size
low increases performance because it masks the other problems. Talk
to your network engineers if that is the case.
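
The try-above-and-below procedure can be sketched as a small search
loop. Everything here is hypothetical: measure_fn stands for
whatever you use to run one transfer at a given window size and
report its bandwidth (an Iperf run, or your own application), and
synthetic_measure is a made-up stand-in with a peak at 200,000
bytes so that the loop can be tried without a network.

```c
#include <assert.h>

/* Hypothetical callback: runs one transfer with the given window size
 * (in bytes) and returns the measured bandwidth; higher is better. */
typedef double (*measure_fn)(long window_bytes);

/* Starting from `start` (e.g. the bandwidth delay product), try windows
 * 10% below and above the current best and move to whichever measures
 * higher, stopping when neither neighbour improves on the current best
 * or after max_steps rounds. Returns the best window found. */
long tune_window(long start, measure_fn measure, int max_steps)
{
    long best = start;
    double best_bw;
    int step;

    assert(start > 0 && measure != 0);
    best_bw = measure(best);
    for (step = 0; step < max_steps; step++) {
        long lower = best - best / 10;
        long upper = best + best / 10;
        double lower_bw = measure(lower);
        double upper_bw = measure(upper);

        if (lower_bw <= best_bw && upper_bw <= best_bw)
            break;                  /* neither neighbour is better */
        if (upper_bw >= lower_bw) {
            best = upper;
            best_bw = upper_bw;
        } else {
            best = lower;
            best_bw = lower_bw;
        }
    }
    return best;
}

/* Made-up stand-in for a real measurement: the score falls off
 * linearly on either side of a 200,000 byte window. */
double synthetic_measure(long window_bytes)
{
    long distance = window_bytes - 200000L;

    if (distance < 0)
        distance = -distance;
    return -(double)distance;
}
```

Starting from the 168,750 byte bandwidth delay product computed
earlier, tune_window(168750, synthetic_measure, 20) moves upward in
10% steps and stops close to the synthetic peak. Real measurements
are noisy, so in practice each call to the measurement function
should probably average several runs before the results are
compared.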

Over time, network topology and routing change, which changes the
bandwidth delay product. For instance, the connection between
cyclops and modi4 above changed from using the vBNS to using the
Abilene network, causing an increase in delay of about 10
ms. Therefore, you should periodically retest the TCP window size to
confirm that you are still getting maximum performance.

Acknowledgments

This page grew out of an earlier page written by Von Welch at NCSA.

Last modified: Mon Jan 24 13:26:12 CST 2000 
--<cut>--

--
J C Lawrence                                       claw at kanga.nu
---------(*)                          http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--