A packet can be sent out immediately, without violating Nagle’s rules, if any of the following holds:
- It is full sized.
- Or it contains FIN. (Already checked by caller)
- Or TCP_NODELAY was set.
- Or TCP_CORK is not set, and all sent packets are ACKed. With Minshall's modification: all sent small packets are ACKed.
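The four conditions above can be sketched as a single predicate. This is a simplified model for illustration only, not the kernel's code; the function name and all parameter names are hypothetical.

```python
def nagle_allows_send(seg_len, mss, has_fin, nodelay, corked, unacked_small_packets):
    """Return True if the segment may be sent immediately (simplified model)."""
    if seg_len >= mss:
        return True   # it is full sized
    if has_fin:
        return True   # it contains FIN
    if nodelay:
        return True   # TCP_NODELAY was set
    if not corked and not unacked_small_packets:
        return True   # TCP_CORK not set and all sent small packets are ACKed
    return False      # otherwise the segment is held back


# A 952-byte segment with earlier data still unacknowledged is held back.
print(nagle_allows_send(952, 1448, False, False, False, True))   # False
# The same segment goes out at once when TCP_NODELAY is set.
print(nagle_allows_send(952, 1448, False, True, False, True))    # True
```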
The reason for the lower network performance in Linux 2.6.20.9 is as follows.
Suppose TSO is enabled and we have 8KB of data to be transmitted. In this case the TCP/IP stack does the following.
STEP1: The 8KB of data is copied from the application to the kernel buffer.
STEP2: Before the TCP segment is formed, Nagle’s test is performed, but only if tso_segs is equal to 1. This variable tells the H/W how many chunks the 8KB has to be divided into when it is sent out on the wire. For example, if 8KB of data is to be transmitted and the MSS is 1448, then tso_segs will be 6 (i.e. 8192 / 1448 ≈ 5.66, rounded up to 6).
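The tso_segs figure is a ceiling division of the data length by the MSS. A minimal sketch follows; the function name is hypothetical and only mirrors the arithmetic in the example above.

```python
def tso_segs(data_len, mss):
    """Number of MSS-sized chunks needed to carry data_len bytes (rounded up)."""
    return (data_len + mss - 1) // mss


print(tso_segs(8192, 1448))  # 6
print(tso_segs(952, 1448))   # 1
```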
STEP3: The TCP/IP stack hands down only a length of data that is a multiple of the MSS. That is, with 8KB of data and a maximum segment size of 1448, the amount of data sent out is calculated as follows:
Amount of data to be sent out = 8192 – (8192 mod 1448)
= 8192 – 952
= 7240
As a result, only 7240 of the 8192 bytes are sent out now. The remaining 952 bytes are sent out later as a new TCP/IP segment, which goes through the process again from step 2.
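The trimming arithmetic above can be checked with a few lines. The helper name is hypothetical; it only reproduces the calculation shown, not the kernel routine.

```python
def split_at_mss_boundary(data_len, mss):
    """Return (bytes sent now, bytes left over) after trimming to a multiple of mss."""
    left_over = data_len % mss
    return data_len - left_over, left_over


sent_now, left_over = split_at_mss_boundary(8192, 1448)
print(sent_now, left_over)  # 7240 952
```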
STEP4: The TCP/IP header is formed for the 7240 bytes of data, and the segment is given to the network driver to be sent out on the wire.
The stack then tries to send the remaining 952 bytes, starting from step 2 again. This time the tso_segs value is 1, so the segment has to go through Nagle’s test. Since it is not a full-sized packet and the 7240 bytes just sent have not yet been ACKed, Nagle’s test fails and the packet is delayed. This is why we were getting lower performance in 2.6.20.9.
This can be avoided by omitting the trimming of data in step 3, as our H/W has the feature of handling the entire 8KB even if the last chunk is smaller than the MSS, or by disabling Nagle’s algorithm.
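The whole sequence, and the effect of the two workarounds, can be traced with a toy model. This is a sketch under simplifying assumptions (nothing is ACKed during the run, no FIN, TCP_CORK not set); the function and its `nodelay` and `trim_to_mss` flags are hypothetical and do not correspond to kernel symbols.

```python
def transmit(data_len, mss, nodelay=False, trim_to_mss=True):
    """Trace one write through the steps above; returns a list of (length, action)."""
    events = []
    in_flight = False          # True once data has been sent and is not yet ACKed
    remaining = data_len
    while remaining > 0:
        segs = -(-remaining // mss)            # tso_segs, rounded up
        if segs == 1:
            # Nagle's test applies only to a single-segment send.
            full_sized = remaining >= mss
            if not (full_sized or nodelay or not in_flight):
                events.append((remaining, "delayed"))
                break
            chunk = remaining
        else:
            # Step 3: trim to a multiple of the MSS unless trimming is skipped.
            chunk = remaining - remaining % mss if trim_to_mss else remaining
        events.append((chunk, "sent"))
        in_flight = True
        remaining -= chunk
    return events


print(transmit(8192, 1448))                     # [(7240, 'sent'), (952, 'delayed')]
print(transmit(8192, 1448, trim_to_mss=False))  # [(8192, 'sent')]
print(transmit(8192, 1448, nodelay=True))       # [(7240, 'sent'), (952, 'sent')]
```

The first call reproduces the delayed 952-byte tail; the second and third show that either workaround removes the delay in this model.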
Nagle’s algorithm can be disabled by setting the TCP_NODELAY option with setsockopt.
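As a minimal sketch, here is the same option set through Python's standard socket module, which wraps the setsockopt system call. The helper name `disable_nagle` is hypothetical.

```python
import socket


def disable_nagle(sock):
    """Set TCP_NODELAY on a TCP socket so small segments are not held back."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
disable_nagle(sock)
# getsockopt reports a non-zero value once the option is set.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True
sock.close()
```

The option is set per socket, so it has to be applied to every connection that sends small tail segments.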