Linux Hacks to Fix Routing Problems



In working with Linux boxes in the Glued environment, we have needed a couple of boxes with multiple IP addresses, and the default Red Hat network configuration scripts do not seem to do what we need in some of these cases. This document lists the problems seen and some hacks to fix them.

Note: These fixes are called hacks for a reason. While I do use them on several production machines I am responsible for, they are hacks and there is no guarantee of any kind that they will work, or even that they will not make things worse, for you. Use at your own risk. While constructive comments are welcome, and I'll even entertain requests for assistance with these hacks, assistance to others comes after my own work obligations, and my free time is severely limited.

Problems encountered, and hacks to fix them

The following list enumerates the problems seen:

IP addresses on different subnets

The situation: A single box has multiple NICs in it, each connected to a different subnet (and therefore with distinct IP addresses). For specificity in the following, let us assume it has two NICs. One, NICA, has IP address IPaddrA on the subnetA subnet. The other, NICB, has IP address IPaddrB on the subnetB subnet.

The symptoms: All machines on subnetA can see the box using IPaddrA. Similarly, boxes on subnetB can see the box using IPaddrB. I believe you should also be able to see either address (IPaddrA or IPaddrB) if on the other subnet (subnetB or subnetA, respectively), but won't guarantee it. The problem is that outside hosts, not on either local subnet (neither subnetA nor subnetB), can only see the machine using one of the two addresses, and get no response from the other one.

Specifics: Observed with Glued Red Hat Enterprise Edition v3 for x86 based processors. Mainly seen on one box, a Dell PowerEdge 1650 with dual onboard Intel 82544EI NICs.

My analysis: Let us assume that it is IPaddrA which is visible from the outside world, and IPaddrB that is blocked. What appears to be happening is that both NICs function properly with respect to traffic on their own subnet. IPaddrA functions properly even for traffic not on subnetA; when a machine on some other net tries to contact IPaddrA, the subnetA gateway sends the packets to NICA, and the response goes out on NICA back to the gateway, with IPaddrA as the source address and the foreign machine's IP address as the destination.

When a machine not on subnetB tries to talk to IPaddrB, things start the same. The subnetB gateway sends the packets to NICB, the Linux box decides how to respond, and a response is sent out. However, the response goes out on NICA but with the IPaddrB source address. If the machine being replied to is on subnetA, the packets seem to get to the destination and no one complains. But if the packets are for another subnet, the router drops the packets because the source address is illegal for subnetA (as it is IPaddrB, which is a subnetB address).
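To make the "illegal source address" point concrete, here is a minimal sketch of the kind of membership test a filtering router might apply to the source address of a packet arriving from subnetA. This is my guess at the check, not something read off the router; the function names ip_to_int and in_subnet and all the addresses are made up for illustration.

```shell
#!/bin/sh
# Sketch only: ip_to_int and in_subnet are hypothetical helper names.

# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on "." into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_subnet ADDR NET/BITS: succeeds if ADDR lies inside NET/BITS.
in_subnet() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Illustrative addresses: the first plays IPaddrA, the second IPaddrB.
for src in 172.70.12.5 172.80.24.10
do
    if in_subnet "$src" 172.70.12.0/23
    then
        echo "$src: legal source address on 172.70.12.0/23"
    else
        echo "$src: not on 172.70.12.0/23, a filtering gateway would likely drop it"
    fi
done
```

A reply leaving NICA with the second address as its source fails this test, which matches the behaviour seen.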

Hack to fix it: In the rc.machine file, use the /sbin/ip command to set up a somewhat more complicated routing scenario with a separate routing table for each subnet. For each subnet, the routing table simply goes out through the NIC if local, or through the NIC to the appropriate gateway if non-local. Then hook these tables into the routing rule based on the source IP address.

For example, if the two subnets are 172.70.12.0/23 and 172.80.24.0/23 on eth0 and eth1, respectively, with 172.70.12.1 and 172.80.24.1 as the gateways, you can do something like

#Set up the first subnet's routing table (we'll name it 70)
ip route flush table 70
ip route add table 70 to 172.70.12.0/23 dev eth0
ip route add table 70 to default via 172.70.12.1 dev eth0

#Set up the second subnet's routing table (we'll call it 80)
ip route flush table 80
ip route add table 80 to 172.80.24.0/23 dev eth1
ip route add table 80 to default via 172.80.24.1 dev eth1

#Create the rules to choose what table to use. Choose based on source IP
#We need to give the rules different priorities; for convenience name priority
#after the table
ip rule add from 172.70.12.0/23 table 70 priority 70
ip rule add from 172.80.24.0/23 table 80 priority 80

#Flush the cache to make effective
ip route flush cache

Physics typically puts this into a file called rc.linux-dual-net-route-hack in the sysconfig tree and calls this script from rc.machine. This seems to work fine, as the primary interface works properly even without the hack, and that is the interface used to communicate with the AFS, KDC, etc. servers, so the machine seems to boot OK. The extra bit of network connectivity gained by the other NIC can wait until the rc.machine script gets run.
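Since the commands for each subnet follow the same pattern, they can be wrapped in a small function. The following is only a sketch under my own assumptions: the names run and setup_source_table and the RUN switch are hypothetical, and by default it just prints the commands so you can look them over before running anything as root.

```shell
#!/bin/sh
# Sketch only: "run", "setup_source_table" and the RUN variable are
# hypothetical names, not part of any Red Hat or Glued script.

# Print the command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]
    then
        "$@"
    else
        echo "$@"
    fi
}

# setup_source_table TABLE SUBNET GATEWAY DEV
# Builds one routing table for the subnet and a rule selecting it by
# source address; the rule priority is named after the table.
setup_source_table() {
    table=$1
    subnet=$2
    gateway=$3
    dev=$4
    run ip route flush table "$table"
    run ip route add table "$table" to "$subnet" dev "$dev"
    run ip route add table "$table" to default via "$gateway" dev "$dev"
    run ip rule add from "$subnet" table "$table" priority "$table"
}

setup_source_table 70 172.70.12.0/23 172.70.12.1 eth0
setup_source_table 80 172.80.24.0/23 172.80.24.1 eth1
run ip route flush cache
```

Run as is, it prints the nine ip commands from the example; with RUN=1 in the environment it would execute them instead.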

Multiple IP addresses, single subnet

The situation: A single box has multiple IP addresses on the same subnet (in observed cases, all on the same NIC; not sure if that matters). For specificity, assume it has two IP addresses, IPaddrA and IPaddrB, on the subnetA subnet.

The symptoms: The machine boots fully and appears to be up and happy. However, network-based logins get denied. It is possible to log in on the console, but even then there are some problems. Most notably, attempting to ksu to root yields an error message about wrong target hostname or IP address. Basically, pure Unix stuff works, but a lot of AFS/Kerberos-related stuff has problems.

Specifics: Observed on a number of Glued Red Hat Enterprise Edition v3 for x86 based processors. Systems observed on include a number of Dell PowerEdge 1650s and 1750s. The systems were all using one of the onboard NICs, which were Broadcom NetXtreme BCM5704 Gigabits and Intel 82544EI Gigabits. In all cases tried, two or three IP addresses were attached to the same NIC. Note: Tried it on a Sun V20 AMD64 box, and the problem was not seen. Not sure why the difference.

My analysis: The presence of multiple IP addresses appears to be causing the system to create a rather complicated route table, with what appears to be 2N-1 default routes, where N is the number of IP addresses. The basic route command does not help much, showing something like:

Destination  Gateway   GenMask  Flags  Metric  Ref  Use  Iface
subnetA      *         maskA    U      0       0    0    NICA
default      gatewayA  0.0.0.0  UG     0       0    0    NICA
default      gatewayA  0.0.0.0  UG     0       0    0    NICA
default      gatewayA  0.0.0.0  UG     1       0    0    NICA

for a system with two IP addresses. Note the multiple default routes; the route command shows little to tell them apart, other than that one has metric 1.

Using /sbin/ip route command, we can see a bit more, e.g. entries like:

default via gatewayA dev NICA src IPaddrA
default via gatewayA dev NICA
default via gatewayA dev NICA src IPaddrA metric 1

I am not an expert at reading route entries, but I would normally expect to see a single default route on a subnet, corresponding to the second line above (the one without the src specification).
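A quick way to spot this condition is to count the default routes and see how many carry a src specification. The sketch below does that with awk; the function name default_route_summary is made up, and the sample input is just the three entries shown above.

```shell
#!/bin/sh
# Sketch only: default_route_summary is a hypothetical helper name.
# Reads "ip route" style output on stdin and prints two numbers:
# the count of default routes, and how many of those have a src.
default_route_summary() {
    awk '$1 == "default" {
             n++
             for (i = 2; i <= NF; i++)
                 if ($i == "src") { s++; break }
         }
         END { print n + 0, s + 0 }'
}

# Fed with the entries listed above, this prints "3 2".
default_route_summary <<'EOF'
default via gatewayA dev NICA src IPaddrA
default via gatewayA dev NICA
default via gatewayA dev NICA src IPaddrA metric 1
EOF

# On a live system one would instead run:
#   /sbin/ip route | default_route_summary
```

By my reading, a healthy single-subnet box should report "1 0".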

What appears to be happening (based on interpretation of above and sniffing the network traffic), is that traffic originating from the host to hesiod or KDC or AFS servers appears to be using the second (or last) IP address as the source address. As the primary machine name is based on the first IP address, kerberos is not happy, and all the kerberos stuff appears to fail.
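One way to check the source address without sniffing is to ask the kernel which address it would pick for a given destination, using /sbin/ip route get, and compare that with the primary address. A rough sketch follows; route_src is a hypothetical name, and the sample line and all addresses in it are invented (the exact output format of ip route get may differ between iproute versions).

```shell
#!/bin/sh
# Sketch only: route_src is a hypothetical helper name.
# Reads "ip route get" output on stdin and prints the word that
# follows "src", i.e. the source address the kernel would use.
route_src() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == "src") { print $(i + 1); exit } }'
}

# Invented sample of what "ip route get 198.51.100.7" might print.
sample='198.51.100.7 via 192.0.2.1 dev eth0  src 192.0.2.11'
primary=192.0.2.10      # assumed primary address of the box

chosen=$(echo "$sample" | route_src)
if [ "$chosen" = "$primary" ]
then
    echo "source address $chosen matches the primary address"
else
    echo "source address $chosen differs from primary $primary"
fi

# On a live system one would instead run, with the KDC's real address:
#   /sbin/ip route get "$KDC_IP" | route_src
```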

Hack to fix it: The solution appears to be to delete all the existing default routes and add a proper default route. This can be done manually by booting the machine into single-user mode, starting up networking, e.g. with

/etc/init.d/network start
and then issuing the commands
route
route del default
until no more default routes are defined, and then issuing the command
route add default gw gatewayA

To fix the problem in a more automated fashion, we run the following in the machine's rc.machine, or more typically, create a script rc.linux-multi-ips-on-subnet-route-hack in the sysconfig tree and run that from the rc.machine file. The script consists of the lines:

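echo "Fixing default route..."
#Get $GATEWAY
. /etc/sysconfig/network
#Delete default routes one at a time until none are left
RES=`route | grep default`
while [ "x$RES" != "x" ]
do
    #Stop if the delete fails, so we cannot loop forever
    route del default || break
    RES=`route | grep default`
done
#Add back a single default route, without a src specification
route add default gw "$GATEWAY"
echo "default route should be fixed"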

Currently, Physics is running this from the rc.machine file (directly or indirectly), and this appears to be working. We need to look into it a bit more and ensure nothing requiring a Kerberos identity is breaking due to the lateness with which this hack is applied.
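For anyone who would rather see what will be changed before touching the route table, here is a dry-run sketch of the same idea using /sbin/ip syntax. It only prints commands. The name plan_default_fix and the sample addresses are made up, and I am assuming that ip route del accepts a route exactly as ip route lists it, which has been true for the simple entries shown here but is not something I have checked for every kind of route.

```shell
#!/bin/sh
# Sketch only: plan_default_fix is a hypothetical helper name.
# plan_default_fix GATEWAY
# Reads "ip route" output on stdin. Prints one "ip route del" command
# per default route found, then a single command adding a plain
# default route via GATEWAY. Nothing is executed.
plan_default_fix() {
    awk -v gw="$1" '
        $1 == "default" { print "ip route del " $0 }
        END { print "ip route add default via " gw }'
}

# Invented sample input; on a live system one would instead run:
#   . /etc/sysconfig/network
#   /sbin/ip route | plan_default_fix "$GATEWAY"
plan_default_fix 192.0.2.1 <<'EOF'
192.0.2.0/24 dev eth0 scope link
default via 192.0.2.1 dev eth0 src 192.0.2.10
default via 192.0.2.1 dev eth0
default via 192.0.2.1 dev eth0 src 192.0.2.10 metric 1
EOF
```

Once the printed plan looks right, the output can be piped to sh from a root shell.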

