PPP over Ethernet: MTU, MSS and Packet Loss
Saturday, 6 April 2024
I read about the Virtual Router Redundancy Protocol (VRRP) and wanted to set up my home network so that a fibre outage would cause a transparent switch to an alternative connection via the mobile network. keepalived implements VRRP along with various other features.
This project required building a router. There are small PCs intended for this, typically with four or more Ethernet ports, but a cheaper option is available now that Raspberry Pi has gigabit Ethernet. I used a Raspberry Pi Compute Module 4 and a dual gigabit Ethernet base board. One Ethernet port is connected to the LAN and another is connected to the fibre terminal. I got the most basic CM4 and the Waveshare Ethernet board for about £60, much less than a small PC. The cheapest router PCs typically come without RAM, hard disk or PSU, and still cost £150 or more.
I wanted to be able to control almost everything about the router, so I started with Raspberry Pi OS Lite rather than anything specifically for routers, e.g. OpenWrt. This meant that I needed to solve a number of problems, and I hit a few pitfalls. I've documented my findings in three parts:
- PPP over Ethernet: MTU, MSS and Packet Loss
- IPv6 setup for a router on Linux
- VRRP for router redundancy with keepalived
Part 1
My ISP uses PPP over Ethernet. There is a fibre terminal on the wall, which converts the fibre optic connection to gigabit Ethernet. But you cannot use TCP/IP directly over this Ethernet connection - you must use PPP over Ethernet to tunnel IP traffic. This creates a restriction because PPPoE adds 8 bytes of overhead to every Ethernet frame, leaving less room for the IP packet inside.
Configuring Linux to use PPP over Ethernet was straightforward. I used a tool called pppoeconf which searched for the PPPoE service and then generated a configuration in /etc/ppp automatically. I followed the instructions, answered some questions, and then the connection came up. I configured IPv6, firewall and NAT (see parts 2 and 3) and everything seemed to be ok...
But there was a problem. It appeared on Windows computers only, and then only when accessing certain webpages. The page would begin to open: the domain name was resolved and the connection would start, but then either the page would not load, or there would be a long delay. I tried lots of well-known sites to try to understand the problem. Amazon and Google had no problems, but the BBC wouldn't load, and nor would Netflix. On Linux, these same sites loaded without issues. It was a real puzzle.
In the end it took debugging with Wireshark to diagnose the problem as packet loss. Some packets were just dropped by the ISP before they even reached the PPP connection. Lost packets can be seen indirectly in Wireshark, as subsequent packets have references to earlier packets which were never seen.
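Wireshark marks this kind of gap as "TCP Previous segment not captured". The idea behind it can be sketched in a few lines. This is only an illustration: the function name find_gaps and the (sequence number, payload length) pairs are made up rather than taken from my capture, and sequence number wraparound is ignored.

```python
def find_gaps(segments):
    """Return (first_missing_seq, next_seen_seq) pairs for holes in a TCP stream.

    segments is a list of (sequence_number, payload_length) tuples in
    arrival order, all from the same direction of one connection.
    """
    gaps = []
    next_expected = None
    for seq, length in segments:
        # A segment starting beyond the byte we expected next means that
        # something in between was never seen.
        if next_expected is not None and seq > next_expected:
            gaps.append((next_expected, seq))
        end = seq + length
        next_expected = end if next_expected is None else max(next_expected, end)
    return gaps


# Hypothetical capture: the segment that should start at 2905 never arrived.
capture = [(1, 1452), (1453, 1452), (4357, 1452)]
for start, end in find_gaps(capture):
    print(f"missing bytes {start}..{end - 1}")
```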
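The cause of the problem was the maximum segment size (MSS), a TCP option which each side announces in its SYN packet to say how large a segment it is willing to receive. Hosts choose it to suit Ethernet, which conventionally allows up to 1500 bytes per packet and therefore an MSS of 1460 bytes over IPv4. That is too large for PPP over Ethernet, which adds 8 bytes of overhead of its own and so reduces the maximum packet size to 1492 bytes and the MSS to 1452 bytes or less.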
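The sizes involved can be worked out with a little arithmetic. The sketch below assumes the standard 8 bytes of PPPoE overhead (a 6-byte PPPoE header plus a 2-byte PPP protocol field) and minimum-size IP and TCP headers with no options. The function name max_mss is just for illustration.

```python
ETHERNET_MTU = 1500   # conventional Ethernet payload size in bytes
PPPOE_OVERHEAD = 8    # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20      # IPv4 header without options
IPV6_HEADER = 40      # fixed IPv6 header
TCP_HEADER = 20       # TCP header without options


def max_mss(link_mtu, ip_header):
    """Largest TCP payload that fits in one packet on a link with this MTU."""
    return link_mtu - ip_header - TCP_HEADER


pppoe_mtu = ETHERNET_MTU - PPPOE_OVERHEAD

for name, mtu in [("Ethernet", ETHERNET_MTU), ("PPPoE", pppoe_mtu)]:
    print(f"{name}: MTU {mtu}, "
          f"IPv4 MSS {max_mss(mtu, IPV4_HEADER)}, "
          f"IPv6 MSS {max_mss(mtu, IPV6_HEADER)}")
```

For IPv4 this gives an MSS of 1452 over PPPoE, compared with 1460 over plain Ethernet.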
It's a common enough problem that pppoeconf installs a special workaround and shows the following warning about it:
Many providers have routers that do not support TCP packets with a MSS higher than 1460. Usually, outgoing packets have this MSS when they go through one real Ethernet link with the default MTU size (1500). Unfortunately, if you are forwarding packets from other hosts (i.e. doing masquerading) the MSS may be increased depending on the packet size and the route to the client hosts, so your client machines won't be able to connect to some sites. There is a solution: the maximum MSS can be limited by pppoe. You can find more details about this issue in the pppoe documentation.
What this helpful message doesn't say is that the workaround is removed if you reload your firewall settings because it is installed on top of any firewall configuration you might already have. "nft list ruleset" shows a sample configuration like this:
table ip mangle {
    chain FORWARD {
        type filter hook forward priority mangle; policy accept;
        oifname "ppp0" tcp flags syn / syn,rst tcp option maxseg size 1400-65495 counter packets 17186 bytes 947144 tcp option maxseg size set rt mtu
    }
}
This essential rule will be deleted if you change your firewall rules in /etc/nftables.conf and then reload them with "nft -f /etc/nftables.conf", because that file typically begins with "flush ruleset". If you do this, you need to re-run the script that installs the MSS workaround, e.g.:
PPP_IFACE=ppp0 /etc/ppp/ip-up.d/0clampmss
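A quick way to notice that the rule has gone missing is to look for it in the output of "nft list ruleset". Here is a minimal sketch of such a check. The function name has_mss_clamp is hypothetical, and matching on the text "tcp option maxseg size set" is an assumption based on the ruleset shown above, so it may need adjusting if your nft version prints the rule differently.

```python
def has_mss_clamp(ruleset_text, iface="ppp0"):
    """Return True if some rule for iface rewrites the TCP MSS option."""
    for line in ruleset_text.splitlines():
        if f'oifname "{iface}"' in line and "tcp option maxseg size set" in line:
            return True
    return False


# Sample input, copied from the "nft list ruleset" output above.
SAMPLE = """
table ip mangle {
    chain FORWARD {
        type filter hook forward priority mangle; policy accept;
        oifname "ppp0" tcp flags syn / syn,rst tcp option maxseg size 1400-65495 counter packets 17186 bytes 947144 tcp option maxseg size set rt mtu
    }
}
"""

print("clamp rule present:", has_mss_clamp(SAMPLE))
```

On the router itself, the text could come from subprocess.run(["nft", "list", "ruleset"], capture_output=True, text=True).stdout, run as root, for example straight after reloading the firewall.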
What made this problem difficult for me to understand is:
- Confusion on my part between MTU and MSS. MTU sets the maximum packet size that can physically be sent on a link, while MSS sets the maximum payload size that TCP will put in one packet. I thought that setting the MTU for PPPoE to 1492 would be enough to limit the payload size, but this is not true. PPPoE won't fragment packets by itself, and the MTU of the PPP interface on the router does not change the MSS that hosts on the LAN advertise. The limit must be imposed elsewhere, by something which can rewrite the MSS option in the TCP header of forwarded SYN packets.
- All websites loaded ok on my Linux workstation, even without the MSS setting. I haven't found out why this is, but I suppose it must be because the Linux default MSS is within the PPPoE limits.
- Some websites loaded ok on Windows, while others did not. I suppose this is because the working websites impose a deliberately limited MSS, while the non-working websites use the maximum possible.
- The problem was introduced by reloading firewall rules, which was misleading, causing me to think that I was somehow blocking the website or that I had messed up some aspect of IPv6 configuration.
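The difference between MTU and MSS, and what the clamping rule does about it, can be shown with a small model. This is an illustration of the principle rather than a reimplementation of nftables: it assumes IPv4 and TCP headers of 20 bytes each, ignores the 1400-65495 match range in the rule, and the names clamp_mss and largest_packet are my own.

```python
IP_HEADER = 20    # IPv4 header without options
TCP_HEADER = 20   # TCP header without options
PPPOE_MTU = 1492  # 1500-byte Ethernet minus 8 bytes of PPPoE overhead


def clamp_mss(advertised_mss, route_mtu):
    """MSS a clamping router forwards in a SYN: never more than the route allows."""
    return min(advertised_mss, route_mtu - IP_HEADER - TCP_HEADER)


def largest_packet(mss):
    """Size of the largest IP packet the peer may send for a given MSS."""
    return mss + IP_HEADER + TCP_HEADER


lan_mss = 1460  # what a host on a 1500-byte Ethernet LAN would advertise

for label, mss in [("unclamped", lan_mss),
                   ("clamped", clamp_mss(lan_mss, PPPOE_MTU))]:
    size = largest_packet(mss)
    verdict = "fits" if size <= PPPOE_MTU else "too big for the PPPoE link"
    print(f"{label}: MSS {mss} -> packets up to {size} bytes, {verdict}")
```

Without clamping, the remote server is told it may send 1500-byte packets, which cannot cross the PPPoE link. With clamping, it is told to stop at 1492 bytes.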
In conclusion, if you are setting up your own router with PPP over Ethernet, be aware that you will probably have to limit the MSS, and if you ever reload the firewall rules, you may have to limit the MSS again. Test your connection on Windows as well as Linux!