More clients are having us implement Xen solutions, and for good reason – it’s perfect for development, QA, and some staging environments, letting you increase capacity without blowing your hardware budget. We’ve created deployments where our clients’ developers can spawn their own Xen images, freshly built using the same processes as production systems (AutoYaST with different profiles, cfengine, etc). They can also trigger LVM snapshots of their VMs before they try something odd, so if it blows up they can revert to the running image as it was before the explosion. I think that’s cool.
However, one thing I’d never gotten around to resolving fully was how to make a high availability (HA) Xen master (dom0 server). I usually set up most servers with NIC bonding for HA (in some cases we run OSPF on the hosts, advertising a path to a loopback interface to which the service is bound). However, out of the box, Xen does not play very nicely with bonded NICs (at least not with SLES 10, which most of our clients use).
Out of the box, as soon as xend starts, it assumes your main interface is eth0 (not bond0) and does various things to it that break your connectivity. (If you find yourself in this state, run /etc/xen/scripts/network-bridge stop; service network restart to get back to your initial network.)
The short answer you may be looking for to make Xen work with bonding: edit /etc/xen/xend-config.sxp and change the network-script line to (note the plain ASCII quotes):
(network-script 'network-bridge netdev=bond0')
and it will work. You will get errors in your syslog – “bond0: received packet with own address as source address” – but these are cosmetic. The rest of this post is about investigating them, but the above is all you need to know.
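For the record, the edit can also be scripted. This is just a sketch working on a scratch copy of the file, so it is safe to try anywhere; on a real dom0 the file is /etc/xen/xend-config.sxp, and you would restart xend afterwards.

```shell
# Sketch only: point xend's bridge script at bond0 instead of the default eth0.
# Working on a scratch copy here; on a real dom0 edit /etc/xen/xend-config.sxp.
conf=$(mktemp)
cat > "$conf" <<'EOF'
(network-script network-bridge)
(vif-script vif-bridge)
EOF

# Rewrite the network-script line. Use plain ASCII quotes in the file;
# curly quotes pasted from a web page will break xend's parsing.
sed -i "s|^(network-script .*)|(network-script 'network-bridge netdev=bond0')|" "$conf"

grep '^(network-script' "$conf"
# → (network-script 'network-bridge netdev=bond0')
# On the real host, follow up with: service xend restart
```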
OK, I thought, I should be able to resolve these messages – networking is my specialty. I was a CCIE many years ago (long enough that I can’t even remember when I stopped bothering to renew it, as I was not seeing the value), and I know even more now. However – you cannot resolve them. As soon as a bonded NIC is a member of a bridge interface, even with no other members, even on a non-Xen kernel without the netback kernel modules, they occur. (This is because a broadcast or multicast packet goes out NIC1 in the bond and is flooded to all ports in the same VLAN, including NIC2 in the bond. I’m guessing the kernel has code to filter/ignore such packets when they arrive on a bond, but when the bond hands them to the bridge first, the kernel sees the packets coming from a bridge, so it doesn’t apply the same logic.)
There are some workarounds, if you like:
- Don’t run bonding; just make your NICs members of the Linux bridge and use STP for fault tolerance. That resolves the messages, but I don’t like this option much because:
- it’s only old 802.1D spanning tree, so convergence is slow
- by default the Linux bridge will become the root of your STP tree. Not good.
- spanning tree should be avoided where possible – it’s just more error prone than layer 3.
- Don’t use active-backup (mode 1) bonding; instead use 802.3ad link aggregation. However, since we want high availability, only do this if you are connecting to switches that support link aggregation across multiple physical units (such as Cisco’s 3750 series). This is the best solution, except that you are now trusting increased complexity in the switches (virtualized stacking) to do the right thing.
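For reference, here is a sketch of what the bond interface config might look like on SLES. The file path and variable names follow SLES’s sysconfig bonding scheme, but the IP address and slave interface names are placeholders – verify the exact options against your release’s documentation.

```shell
# /etc/sysconfig/network/ifcfg-bond0 (sketch; values are placeholders)
# mode=802.3ad tells the bonding driver to negotiate LACP with the switch;
# miimon=100 enables link monitoring every 100 ms.
BOOTPROTO='static'
IPADDR='192.0.2.10'
NETMASK='255.255.255.0'
STARTMODE='onboot'
BONDING_MASTER='yes'
BONDING_MODULE_OPTS='mode=802.3ad miimon=100'
BONDING_SLAVE0='eth0'
BONDING_SLAVE1='eth1'
```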
UPDATE: I spoke too soon above – the DomU images installed on a system with bonding set up as above do not quite work. They work to some hosts (such as the host used to install the OS over the network, which is what led me to jump to conclusions above), but with SLES 10 SP1 they suffer from the bug described here
The Xen bridge incorrectly interprets the broadcast packets coming in through the other bonded NIC as meaning that the MAC address of the DomU hosts is reachable out that NIC, so it doesn’t pass on packets destined for the DomU.
So the only solutions until Novell patches/updates the kernel are:
- patch the kernel yourself. (But then if you were going to do that, you wouldn’t be running SLES.)
- don’t run bonding, and run a single NIC
- don’t run bonding, and use 802.1D spanning tree with both NICs in the bridge
- run 802.3ad link aggregation, with both NICs going to the same switch (or switch image).
The last of these – 802.3ad link aggregation – is the best option at this point.
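If you go that route, it is worth sanity-checking that the switch actually agreed to LACP by reading the bonding driver’s status file. The sketch below greps a stand-in sample so it can be tried anywhere; on the real host you would read /proc/net/bonding/bond0 directly. The sample text mirrors the driver’s output format but is illustrative, not captured from this setup.

```shell
# Sketch: what a healthy 802.3ad bond reports. On a real host, just run:
#   cat /proc/net/bonding/bond0
# Here we grep a stand-in sample so the check can be tried anywhere.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up
EOF

# Anything other than 802.3ad on the mode line means the switch and the
# bonding driver did not agree on LACP.
grep -q '802.3ad' "$sample" && echo "bond negotiated 802.3ad"
# prints: bond negotiated 802.3ad
```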