Analysis of the Linux Kernel TCP MSS Mechanism
Overview
Last week, Linux fixed four kernel CVE vulnerabilities [1]. Among them, CVE-2019-11477 struck me as a particularly powerful DoS vulnerability. Other work got in the way, so my research progressed slowly, and by now several related analysis articles have already appeared online [2][3].
While trying to reproduce CVE-2019-11477, I ran into a problem at the very first step: setting the MSS did not give the expected result. The published analyses do not elaborate on this part, so this article analyzes the TCP MSS mechanism through the Linux kernel source code.
Test Environment
1. Targets with Vulnerabilities
OS: Ubuntu 18.04
Kernel: 4.15.0-20-generic
IP address: 192.168.11.112
Kernel Source Code:
sudo apt install linux-source-4.15.0
Kernel Binary with symbols:
cat /etc/apt/sources.list.d/ddebs.list
Disable Kernel Address Space Layout Randomization (KASLR):
# Because the kernel is booted by GRUB, we can edit the GRUB config to add "nokaslr" to the kernel boot arguments.
Use Nginx for testing:
sudo apt install nginx
2. Host
OS: macOS
Wireshark: Capture traffic
VM: VMware Fusion 11
Use the VM to debug Linux:
cat ubuntu_18.04_server_test.vmx|grep debug
Compile gdb:
$ ./configure --build=x86_64-apple-darwin --target=x86_64-linux --with-python=/usr/local/bin/python3
Use gdb for remote debugging:
$ gdb vmlinux-4.15.0-20-generic
3. Attacker
OS: Linux
IP Address: 192.168.11.111
If you are comfortable with Python, install Scapy to send TCP packets.
Custom SYN MSS option
There are three ways to set the MSS value of the TCP SYN packet.
1. iptables
Add rules
2. ip route
Show routing information
3. Send the packet with Scapy
Note: sending TCP packets with Scapy requires root privileges.
from scapy.all import *
The “S” in the flags option indicates “SYN”; “A” indicates “ACK” and “SA” indicates “SYN, ACK”.
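To make the flag letters concrete, the sketch below maps them onto the bits of the TCP flags byte using only the standard library. The bit values come from the TCP header definition; the helper names `flags_to_int` and `int_to_flags` are hypothetical and are not part of Scapy.

```python
# Bits of the TCP flags byte, keyed by the one-letter names used in
# Scapy-style flag strings.
TCP_FLAG_BITS = {
    "F": 0x01,  # FIN
    "S": 0x02,  # SYN
    "R": 0x04,  # RST
    "P": 0x08,  # PSH
    "A": 0x10,  # ACK
    "U": 0x20,  # URG
    "E": 0x40,  # ECE
    "C": 0x80,  # CWR
}


def flags_to_int(flags: str) -> int:
    """Combine a flag string such as "SA" into the numeric flags byte."""
    value = 0
    for letter in flags:
        value |= TCP_FLAG_BITS[letter]
    return value


def int_to_flags(value: int) -> str:
    """List the letters whose bit is set, lowest bit first."""
    return "".join(letter for letter, bit in TCP_FLAG_BITS.items() if value & bit)


print(hex(flags_to_int("S")))   # 0x2
print(hex(flags_to_int("SA")))  # 0x12
print(int_to_flags(0x14))       # RA
```

The last line decodes a reset that also acknowledges data, which is the kind of packet the attacker's own kernel emits for a handshake it does not know about.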
The TCP options that can be set via Scapy are listed in the following table:
TCPOptions = (
Sending a SYN packet from Python raises a problem, however: the sender's kernel automatically sends an RST packet. Some searching turned up this explanation:
Since you haven’t completed the full TCP handshake, your operating system might try to take control and start sending RST(reset) packets.
The solution is to use iptables to drop the outgoing RST packets:
sudo iptables -A OUTPUT -p tcp --tcp-flags RST RST -s 192.168.11.111 -j DROP
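Whichever of the three methods is used, what ends up in the SYN header is the same 4-byte MSS option. As a rough illustration, the following standard-library sketch encodes and decodes that option (kind 2, length 4, 16-bit value in network byte order). The function names are hypothetical, and this is not how Scapy or the kernel builds the option internally.

```python
import struct

TCP_OPT_MSS = 2  # option kind for Maximum Segment Size


def encode_mss_option(mss: int) -> bytes:
    """Encode the MSS option: kind=2, length=4, 16-bit big-endian value."""
    if not 0 <= mss <= 0xFFFF:
        raise ValueError("MSS must fit in 16 bits")
    return struct.pack("!BBH", TCP_OPT_MSS, 4, mss)


def decode_mss_option(raw: bytes) -> int:
    """Decode a 4-byte MSS option and return the advertised MSS."""
    kind, length, mss = struct.unpack("!BBH", raw[:4])
    if kind != TCP_OPT_MSS or length != 4:
        raise ValueError("not an MSS option")
    return mss


option = encode_mss_option(48)
print(option.hex())               # 02040030
print(decode_mss_option(option))  # 48
```

In a Wireshark capture of the SYN, these four bytes are what to look for when checking that the custom MSS was actually applied.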
In-depth research of the MSS mechanism
The details of the vulnerability have been analyzed in other articles. In brief, it is a uint16 integer overflow:
tcp_gso_segs uint16
hex(17*32*1024//8)
Therefore, an integer overflow will occur only when mss_now is less than or equal to 8.
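The threshold of 8 can be checked with a few lines of arithmetic. This sketch takes the figure from the expression above (17 fragments of 32 KiB each) as the largest amount of data in one skb and counts how many MSS-sized segments that needs. The rounding-up division is an assumption about how the segment count is derived, and `gso_segs` is a hypothetical helper name.

```python
MAX_SKB_BYTES = 17 * 32 * 1024  # figure taken from the article's hex() expression
U16_MAX = 0xFFFF                # tcp_gso_segs is a 16-bit counter


def gso_segs(mss_now: int) -> int:
    """Segments needed to carry MAX_SKB_BYTES, rounding up (assumed)."""
    return -(-MAX_SKB_BYTES // mss_now)


for mss in (8, 9, 36):
    segs = gso_segs(mss)
    print(mss, segs, hex(segs), "overflow" if segs > U16_MAX else "fits")
# 8 69632 0x11000 overflow
# 9 61896 0xf1c8 fits
# 36 15474 0x3c72 fits

# What a 16-bit field would actually hold for mss_now = 8.
print(gso_segs(8) & U16_MAX)  # 4096
```

With an MSS of 9 the count still fits in 16 bits, so 8 is the largest value that overflows under these assumptions.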
I then ran the following test and hit a problem.
With the MSS set to 48 via the iptables/ip route commands, the attacker machine used curl to request the HTTP service on the target while the host captured the traffic with Wireshark. The HTTP response from the server was indeed split into small segments, but the smallest was 36 bytes, whereas I expected 8.
At that point I turned to the Linux kernel source code and a debugger to find out why the MSS did not reach the expected value, and how the MSS value in the SYN packet is turned into mss_now in the code.
Backtracking from the overflowing function tcp_set_skb_tso_segs:
tcp_set_skb_tso_segs <- tcp_fragment <- tso_fragment <- tcp_write_xmit
Next, analyze the tcp_current_mss function. The key code is as follows:
# tcp_output.c
Reading this part of the source code gives a deeper understanding of what the MSS means. First, we need to recall the TCP protocol.
A TCP segment consists of a header and data. The header has a fixed 20-byte part followed by up to 40 bytes of options, so a TCP header is at least 20 bytes and at most 60 bytes long.
The mss_now in the __tcp_mtu_to_mss function is the MSS set by the SYN packet, and the code shows that its minimum value is 48. Combining the TCP protocol with the code explains this minimum: the 48 bytes cover the maximum 40 bytes of TCP header options plus a minimum of 8 bytes of data.
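The numbers in this paragraph follow from the layout of the TCP header. The sketch below derives them from the 4-bit data offset field, which counts the header length in 32-bit words, and then models the 48-byte floor as a simple clamp. `clamp_syn_mss` is a hypothetical name that mirrors the article's reading of the kernel check, not the kernel function itself.

```python
TCP_FIXED_HEADER = 20  # bytes in the fixed part of every TCP header
MAX_DATA_OFFSET = 15   # largest value of the 4-bit data offset field (32-bit words)
MIN_PAYLOAD = 8        # bytes of data the floor leaves room for, per the article

max_header = MAX_DATA_OFFSET * 4             # longest possible TCP header
max_options = max_header - TCP_FIXED_HEADER  # room left for options
mss_floor = max_options + MIN_PAYLOAD        # full option set plus minimal data


def clamp_syn_mss(advertised: int) -> int:
    """Model of the floor: an MSS below 48 is raised to 48 (assumed behavior)."""
    return max(advertised, mss_floor)


print(max_header, max_options, mss_floor)  # 60 40 48
print(clamp_syn_mss(8))                    # 48
print(clamp_syn_mss(1460))                 # 1460
```

Under this model an attacker cannot simply advertise an MSS of 8 in the SYN, which is why the remaining analysis concentrates on how the header length is subtracted afterwards.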
Elsewhere in the source code, however, mss_now represents the length of the data only, so let's look at how that value is calculated.
The tcphdr struct:
struct tcphdr {
This structure is the fixed 20-byte TCP header.
The variable tcp_sk(sk)->tcp_header_len is the length of the TCP header of the packets sent by the local machine.
Therefore, the formula for calculating mss_now is: mss_now = (MSS value set by the SYN packet) - (length of the TCP header sent by the local machine - 20 bytes of fixed TCP header)
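The formula is easy to try out. The sketch below is a direct transcription of it rather than kernel code; the function is hypothetical and simply shares its name with the kernel variable. The 32-byte header assumes the fixed 20 bytes plus a 12-byte timestamp option.

```python
TCP_FIXED_HEADER = 20  # size of the fixed TCP header in bytes


def mss_now(syn_mss: int, tcp_header_len: int) -> int:
    """Data bytes per segment according to the article's formula."""
    return syn_mss - (tcp_header_len - TCP_FIXED_HEADER)


print(mss_now(48, 20))  # 48: no options on outgoing packets
print(mss_now(48, 32))  # 36: fixed header plus a 12-byte timestamp option
print(mss_now(48, 60))  # 8: header filled to the 60-byte maximum
```

The middle case is consistent with the 36-byte segments seen in the capture, and the last case shows why a full 60-byte header is the condition needed to reach 8.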
So if tcp_header_len can reach its maximum of 60, mss_now can be brought down to 8. Is there any way in the kernel code to make tcp_header_len reach that maximum? Let's backtrack this variable:
# tcp_output.c
Therefore, in the Linux 4.15 kernel, the kernel does not send TCP packets with a 60-byte header without user intervention. As a result, the MSS cannot be driven down to the minimum of 8, which ultimately prevents the vulnerability from being exploited.
Summary
Let’s summarize the whole process:
- The attacker constructs a SYN packet in which the MSS option in the TCP header options is set to 48.
- After the target (the vulnerable device) receives the SYN request, it saves the data from the SYN packet in memory and replies with a SYN, ACK packet.
- The attacker replies with an ACK packet.
Then, depending on the service, the target either sends data to the attacker on its own or sends it after receiving a request from the attacker. Here we assume an Nginx HTTP service.
1. The attacker sends a request to the target: GET / HTTP/1.1.
2. After receiving the request, the target first calculates tcp_header_len, which is 20 bytes by default. When the kernel parameter sysctl_tcp_timestamps is enabled, 12 bytes are added. If CONFIG_TCP_MD5SIG was selected when compiling the kernel, another 18 bytes are added, so the maximum tcp_header_len is 50 bytes.
3. Then mss_now is calculated: mss_now = 48 - 50 + 20 = 18.
My assumption was that the vulnerability could be exploited under the following circumstances: if a TCP service fills the TCP options to the full 40 bytes, an attacker could mount a DoS attack on that service by crafting the MSS value in the SYN packet.
I audited the Linux kernel from 2.6.29 to the current version, and the formula for mss_now is the same throughout. tcp_header_len only ever grows by the 12 bytes of the timestamp and the 18 bytes of the MD5 option.
----- 2019/07/03 UPDATE -----
Thanks to @riatre for the correction: the above analysis of the tcp_current_mss function missed an important piece of code:
# tcp_output.c
The tcp_established_options function accounts not only for the 12-byte timestamp and the 20-byte MD5 option but also for the length of the SACK option. As long as the total does not exceed the 40-byte limit on TCP options, the SACK size is: size = 4 + 8 * opts->num_sack_blocks
eff_sacks = tp->rx_opt.num_sacks + tp->rx_opt.dsack;
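So the way to reach 40 bytes of TCP options is: 12-byte timestamp + 4 + 8 * 3 (with opts->num_sack_blocks = 3).
The variable opts->num_sack_blocks is the number of SACK blocks being reported, each describing a range of data received out of order because earlier packets from the peer are missing.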
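The option budget can be worked through in a few lines. The sketch below models only the timestamp and SACK parts of the size calculation, using the sizes given in this article (12 bytes for the timestamp, 4 + 8 per block for SACK, 40 bytes in total). It leaves out MD5 and is a simplified model with a hypothetical function name, not a copy of tcp_established_options.

```python
MAX_TCP_OPTION_SPACE = 40  # upper limit on TCP options in bytes
TIMESTAMP_SIZE = 12        # timestamp option including padding
SACK_BASE = 4              # SACK option header including padding
SACK_PER_BLOCK = 8         # two 32-bit sequence numbers per block


def established_options_size(timestamps: bool, eff_sacks: int) -> int:
    """Option bytes on an established connection (timestamp and SACK only)."""
    size = TIMESTAMP_SIZE if timestamps else 0
    if eff_sacks > 0:
        remaining = MAX_TCP_OPTION_SPACE - size
        # Only as many blocks as fit in the remaining option space are sent.
        blocks = min(eff_sacks, (remaining - SACK_BASE) // SACK_PER_BLOCK)
        size += SACK_BASE + blocks * SACK_PER_BLOCK
    return size


for sacks in range(5):
    options = established_options_size(True, sacks)
    print(sacks, options, 48 - options)
# 0 12 36
# 1 24 24
# 2 32 16
# 3 40 8
# 4 40 8
```

The third column is the resulting mss_now for a SYN MSS of 48. It bottoms out at 8 once three blocks are reported, and a fourth pending block changes nothing because it no longer fits.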
So the last three steps of the summary need to be revised:
- The attacker sends a normal HTTP request to the target.
- After receiving the request, the target sends an HTTP response. As shown in the screenshot above, the response is split into multiple segments of 36 bytes each.
- The attacker constructs a sequence of ACK packets with gaps in the sequence numbers (the ACK packets need to carry some data).
- After receiving the out-of-order ACK packets, the server concludes that packets have been lost. It therefore includes the SACK option in subsequent packets to tell the client which data is missing, until the TCP connection is closed or the packets that fill the gaps arrive.
Results as shown below:
Because the timestamp also takes up option space, the TCP SACK option can hold at most 3 blocks, so the MSS can be brought down to 8 by sending 4 ACK packets.
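To see why gaps in the attacker's sequence numbers produce SACK blocks, here is a toy model of a receiver. Given the next expected sequence number and the data segments that have arrived, it returns the cumulative ACK point and the ranges that would be reported as SACK blocks. It ignores sequence number wraparound and the order in which a real stack lists its blocks, and `sack_blocks` is a hypothetical helper.

```python
def sack_blocks(rcv_nxt, segments):
    """Return (new_rcv_nxt, blocks) for received (seq, length) segments.

    Blocks are half-open (start, end) ranges lying beyond a hole.
    """
    ranges = sorted((seq, seq + length) for seq, length in segments)

    # Merge overlapping or adjacent ranges.
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    blocks = []
    for start, end in merged:
        if start <= rcv_nxt:
            rcv_nxt = max(rcv_nxt, end)  # in-order data advances the cumulative ACK
        else:
            blocks.append((start, end))  # data past a hole is reported via SACK
    return rcv_nxt, blocks


# Three 1-byte segments, each leaving a 1-byte hole in front of it.
print(sack_blocks(100, [(101, 1), (103, 1), (105, 1)]))
# (100, [(101, 102), (103, 104), (105, 106)])

# Filling the first hole advances the ACK and removes one block.
print(sack_blocks(100, [(100, 1), (101, 1), (103, 1), (105, 1)]))
# (102, [(103, 104), (105, 106)])
```

Three separate holes are enough to keep three blocks outstanding, which is the state needed to hold the option space at 40 bytes.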
Part of the Scapy code is as follows:
data = "GET / HTTP/1.1\nHost: 192.168.11.112\r\n\r\n"
Now that the precondition mss_now = 8 can be met, the vulnerability itself can be analyzed further.
References