Hot migrating from vSwitch/dvSwitch to LACP

In a previous post I described the procedure to migrate from a vSwitch to a dvSwitch. Now let me explain how to migrate to a LACP configuration from either a vSwitch or a dvSwitch. LACP (Link Aggregation Control Protocol) is an IEEE standard (802.3ad, later moved to 802.1AX) used to aggregate Ethernet ports between network devices. Other aggregation methods are:

  • static: no negotiation occurs between devices (already available on vSphere standard switches using the “Route based on IP hash” load balancing policy);
  • PAgP (Cisco EtherChannel): a proprietary negotiation protocol, available on Cisco devices only.
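
For reference, on a Cisco switch the only difference between static aggregation and LACP is the channel-group mode (interface and group numbers below are placeholders, not taken from this lab):

! static aggregation: the channel is forced up, no negotiation with the peer
interface GigabitEthernet1/0/10
 channel-group 1 mode on
!
! LACP: the port actively negotiates the aggregation with the peer
interface GigabitEthernet1/0/11
 channel-group 2 mode active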

Before starting, be sure a dedicated dvSwitch already exists, because LACP is supported on distributed switches only. Mind that a dvSwitch with both standalone vmnics and LAGs as active uplinks is not supported: this is why a dvSwitch dedicated to LACP must be used.

During the migration process one adapter will be dedicated to the newly created distributed switch, so redundancy will be temporarily lost. Before starting, be sure that all port groups use the load balancing policy defined at the vSwitch level:

override_vswitch0

In the above example, the Override flag must be unchecked.
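
If you prefer the command line, the same teaming and failover settings can be checked from the ESXi shell (the port group name here is just an example):

# policy defined at the vSwitch level
esxcli network vswitch standard policy failover get -v vSwitch0
# policy of a single port group: it must not override the vSwitch one
esxcli network vswitch standard portgroup policy failover get -p "VM Network"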

Migrating a vSwitch to a LACP configuration

Let’s suppose a physical ESXi host configured with two active vmnics:

vswitch0_before_lacp

Unassign vmnic1 from vSwitch0:

unassign_vmnic0_from_vswitch
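
The same step can also be performed from the ESXi shell, if you prefer (a rough equivalent of the GUI operation above):

# remove vmnic1 from the standard switch, leaving vmnic0 as the only uplink
esxcli network vswitch standard uplink remove --uplink-name=vmnic1 --vswitch-name=vSwitch0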

On the physical switch, shut down the port facing vmnic1 and configure a Port-Channel:

interface GigabitEthernet1/0/4
 description esxi2:vmnic1
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 4094
 switchport mode trunk
 switchport nonegotiate
 shutdown
 channel-group 2 mode active
 spanning-tree portfast trunk
 spanning-tree bpduguard enable
!
interface Port-channel2
 description esxi2:lag2
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 4094
 switchport mode trunk
 switchport nonegotiate
 shutdown
 spanning-tree portfast trunk
 spanning-tree bpduguard enable

Also check which load-balancing algorithm is used on the physical switch:

Switch#show etherchannel load-balance
EtherChannel Load-Balancing Configuration:
src-dst-ip

EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
IPv4: Source XOR Destination IP address
IPv6: Source XOR Destination IP address
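
If the algorithm has to be changed to better match the hashing used on the dvSwitch side, it is a global setting on this kind of switch (check the keywords available on your platform first):

Switch(config)#port-channel load-balance src-dst-ip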

Back on the esxi2 host, add a LAG to the distributed switch (the same load balancing algorithm used by the physical switch should be selected):

add_lag_port
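
On recent ESXi releases (5.5 and later) the LAG configuration pushed to the host can also be verified from the ESXi shell:

# show the LAGs defined on the distributed switches seen by this host
esxcli network vswitch dvs vmware lacp config get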

Mind that a single LAG defined on the dvSwitch spans all the ESXi hosts attached to it:

lag_ports

Now follow the “Migrating network traffic to LAGs” procedure (Add and Manage Hosts -> Add hosts -> New hosts -> select esxi2 -> select “Manage physical adapters” only) and assign vmnic1 to a LAG port:

add_vmnic1_to_lag

Enable the port-channel on the physical switch (be sure the native VLAN exists on the switch). After a while the port-channel should come up:

Switch#show etherchannel summary
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0/4(P)
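
The negotiation can also be checked from the ESXi side, where the state of each member uplink is reported:

# LACP status of the LAGs and of their member uplinks
esxcli network vswitch dvs vmware lacp status get

On the physical switch, show lacp neighbor gives the partner (ESXi) system ID and the port state of each member link.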

Now set the LAG as the active uplink on the VLAN_1 port group of the dvSwitch (edit the port group settings -> Teaming and failover and move lag2 so that it is the only active uplink):

lag_active

Mind that Uplinks 1 to 4 are unused in our scenario.

Now migrate VMs and VMkernel ports to the new dvSwitch (open the “Add and Manage Hosts” wizard -> Manage host networking -> add esxi2 -> select both “Manage VMkernel adapters” and “Migrate virtual machine networking” and assign the port group on dvSwitch0 for all interfaces):

move_adapters

During the step described above a short network outage can happen, depending on the number of VMs migrated. In my lab test no packets were lost.
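
To double check that the VMkernel interfaces actually landed on the distributed switch, they can be listed from the ESXi shell (the Portset field should no longer point to vSwitch0):

esxcli network ip interface list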

The next steps will restore redundancy:

  1. unassign vmnic0 (which should be unused now) from vSwitch0;
  2. on the physical switch, shut down the port facing vmnic0 and add it to the port-channel (configuration shown below);
  3. assign vmnic0 to the LAG (open the “Migrating network traffic to LAGs” wizard -> Add and Manage Hosts -> Manage host networking -> add esxi2 -> Manage physical adapters).
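
For step 2, the configuration of the new member port is symmetrical to the one already used for vmnic1 (here I’m assuming vmnic0 faces Gi1/0/3, as shown by the summary later on):

interface GigabitEthernet1/0/3
 description esxi2:vmnic0
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 4094
 switchport mode trunk
 switchport nonegotiate
 shutdown
 channel-group 2 mode active
 spanning-tree portfast trunk
 spanning-tree bpduguard enable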

add_vmnic0

Enable the port facing vmnic0 and after a while the port-channel should use both interfaces:

Switch#show etherchannel summary
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0/3(P)  Gi1/0/4(P)

Migrating a dvSwitch to a LACP configuration

In this scenario we have two ESXi hosts attached to a distributed switch:

current_dvswitch

The first ESXi host is connected to the first two uplinks; the second host is connected to the last two uplinks and will be migrated to a single LACP channel.

On the physical switch, shut down the port facing vmnic1 and check which load-balancing algorithm is used:

Switch#show etherchannel load-balance
EtherChannel Load-Balancing Configuration:
src-dst-ip

EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
IPv4: Source XOR Destination IP address
IPv6: Source XOR Destination IP address

Add a LAG to the distributed switch (the same load balancing algorithm used by the physical switch should be selected):

add_lag_port

Follow the “Migrating network traffic to LAGs” procedure (Add and Manage Hosts -> Add hosts -> New hosts -> select esxi2 -> select “Manage physical adapters” only) and assign vmnic1 to a LAG port:

move_vmnic1_to_lag2

On the physical switch configure the port-channel:

interface GigabitEthernet1/0/4
 description esxi2:vmnic1
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 4094
 switchport mode trunk
 switchport nonegotiate
 shutdown
 channel-group 2 mode active
 spanning-tree portfast trunk
 spanning-tree bpduguard enable
!
interface Port-channel2
 description esxi2:lag2
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 4094
 switchport mode trunk
 switchport nonegotiate
 spanning-tree portfast trunk
 spanning-tree bpduguard enable

Enable the port facing vmnic1 and the port-channel should come up after a while:

Switch#show etherchannel summary
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0/4(P)

Now traffic must be forced through the LAG, but:

  • a dvSwitch cannot have standalone uplinks and a LAG active at the same time;
  • the esxi1 host is still running on standalone uplinks.

The only option is to move the LAG to standby and set the other esxi2 uplinks to unused. This configuration is supported during the migration process only.

activating_lag

This is the most critical step, so be careful.
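
Before moving on, it can be reassuring to verify from the esxi2 shell which physical NICs the distributed switch is actually claiming:

# distributed switches seen by this host, with their uplinks
esxcli network vswitch dvs vmware list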

Now the port-channel is the only active uplink on esxi2, but the configuration is not redundant yet. On the physical switch, shut down the port facing vmnic0 and add it to the port-channel. Then, using the “Migrating network traffic to LAGs” wizard (Add and Manage Hosts -> Add hosts -> New hosts -> select esxi2 -> select “Manage physical adapters” only), assign vmnic0 to the LAG port:

adding_second_lag

Activate the port facing vmnic0 and check the port-channel status:

Switch#show etherchannel summary
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
2      Po2(SU)         LACP      Gi1/0/3(P)  Gi1/0/4(P)
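
As a last check on the host side, the LACPDU counters of each member uplink can be displayed with:

# LACPDUs received and transmitted on each member vmnic
esxcli network vswitch dvs vmware lacp stats get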

Remember:

Using a combination of active standalone uplinks and a standby LAG is only supported as an intermediate configuration when migrating physical adapters between a LAG and standalone uplinks.

All ESXi hosts connected to this distributed switch should be migrated to a LACP configuration as soon as possible, and all standalone uplinks must be configured as “Unused”.

Andrea’s Take

During my tests I had many issues configuring LACP, and sometimes I couldn’t understand what exactly was happening. I wouldn’t suggest a LACP configuration in a production environment yet: the classic load-balancing and failover mechanisms work fine, and there is no compelling reason to venture into the weird universe of link aggregation.

Posted on 25 Jun 2014 by Andrea.