Monday, September 22, 2008

TCP Offload Engine support in Linux

TCP Offload Engines (TOE) are specialized hardware that handles TCP connections entirely on the network card itself, instead of in the kernel. 10 Gbps Ethernet cards are becoming the industry standard in the high-end server market. A common rule of thumb says that TCP processing needs about 1 Hz of CPU for every 1 bit of TCP data handled per second. By that measure, a 10GigE card running at line rate needs roughly 10 GHz of CPU, so it can quickly eat the CPU like there is no tomorrow. Even with multiple CPUs and multiple cores per CPU, the impact is significant.
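To put numbers on that rule of thumb, here is a minimal sketch in Python. The 1 Hz per bit/s ratio is the heuristic quoted above, not a measurement; the 3 GHz core clock and the function names are assumptions chosen purely for illustration.

```python
# Rule of thumb from the post: ~1 Hz of CPU per 1 bit/s of TCP traffic.
# This is a rough heuristic, not a benchmark result.
HZ_PER_BIT_PER_SECOND = 1

# Assumed core clock for illustration only (3 GHz).
ASSUMED_CORE_HZ = 3_000_000_000


def cpu_hz_for_link(bits_per_second: int) -> int:
    """CPU cycles per second needed to saturate the link, per the heuristic."""
    return bits_per_second * HZ_PER_BIT_PER_SECOND


def cores_needed(bits_per_second: int, core_hz: int = ASSUMED_CORE_HZ) -> int:
    """Whole cores needed at the given clock (integer ceiling division)."""
    needed_hz = cpu_hz_for_link(bits_per_second)
    return -(-needed_hz // core_hz)


if __name__ == "__main__":
    links = {
        "1GigE": 1_000_000_000,
        "10GigE": 10_000_000_000,
        "40GigE": 40_000_000_000,
        "100GigE": 100_000_000_000,
    }
    for name, rate in links.items():
        ghz = cpu_hz_for_link(rate) / 1e9
        print(f"{name}: ~{ghz:.0f} GHz, ~{cores_needed(rate)} cores at 3 GHz")
```

Under these assumptions a saturated 10GigE link would tie up about four 3 GHz cores for TCP processing alone, before the application does any work.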

On top of this, IEEE standards are already being prepared for 40 Gbps and 100 Gbps Ethernet. In this situation, using TOE becomes inevitable. Many TCP functions are already done in hardware, such as checksumming, LRO (Large Receive Offload) and LSO (Large Send Offload). But a TOE provides a complete end-to-end solution.
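As an illustration of the kind of per-packet work that checksum offload takes away from the CPU, here is a minimal sketch of the 16-bit ones' complement Internet checksum (as described in RFC 1071) in Python. It is only meant to show the arithmetic; it is not how the kernel or any NIC implements it, and a real TCP checksum also covers a pseudo-header that is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over data (RFC 1071 style)."""
    # Pad to an even number of bytes so the data splits into 16-bit words.
    if len(data) % 2:
        data += b"\x00"

    # Sum all big-endian 16-bit words.
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # The checksum is the ones' complement of the folded sum.
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")
    checksum = internet_checksum(payload)
    print(f"checksum: 0x{checksum:04x}")

    # A receiver that checksums payload plus checksum should get zero.
    verified = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
    print("verified:", verified)
```

Doing this loop over every byte of every segment at 10 Gbps is exactly the sort of work that is cheap in hardware and expensive in software.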

TCP Offload Engines were never a hit with the Linux networking community. Linux kernel maintainers, especially David Miller, have been against the idea of TOE for various valid reasons, such as the reduced maintainability of the code. The kernel maintainers also argue that TOE is only a stopgap solution until CPU speeds catch up with the load, citing past cases where TOE was implemented even for 100 Mbps links. More details about their position can be found in this article: Linux and TCP Offload Engines. As a result, 10GigE vendors like Chelsio are forced to maintain their TOE patches to the Linux kernel out-of-tree. This makes the code inherently less stable, since it has to keep up with kernel changes without being part of the kernel tree.

Anyway, end users do not lose any functionality, since the vendor-provided patches to the Linux kernel can be used to build kernel modules that support TOE hardware on Linux machines. This is what I like about Open Source. You do not have to be bound by what others think. You leave that decision to time.
