Keywords: ATN 950B, ARP entry dually-transmitting failure, mixed VPN solution
Summary: A timing sequence error occurs when software performs batch backup on an
ATN 950B NE functioning as a CSG running V200R002C00SPC300 or an earlier version.
After a master/slave MPU switchover is performed several times, there is a possibility that the
ARP entry dually-transmitting function fails. In this case, when network-to-user traffic transmitted
along an MPLS LDP LSP reaches the slave ASG in the mixed VPN solution, the ARP entry
dually-transmitting failure causes a service interruption.
[Problem Description]
Usage scenario:
In a mixed VPN solution, LDP LSPs are established on the access ring network. ATN 950B NEs
functioning as CSGs use the L2VPN function to transmit services and support the ARP entry
dually-transmitting function configured using the mpls l2vpn arp-dual-sending command
shown in the following table.
interface Ethernet0/3/0.1
 mtu 2000
 vlan-type dot1q 1
 mpls l2vc 100.0.0.3 100 control-word raw //Configure the primary PW destined for ASG3.
 mpls l2vc 100.0.0.4 101 control-word raw secondary //Configure the secondary PW destined for ASG4.
 mpls l2vpn redundancy master //Configure PW redundancy protection to work in master/slave mode.
 mpls l2vpn reroute delay 300 //Set the delay for a PW switchback to 300s.
 mpls l2vpn stream-dual-receiving //Configure the dually-receiving function for the primary and secondary PWs to prevent traffic loss during a traffic switchback.
 mpls l2vpn arp-dual-sending //Configure ARP entry dually-transmitting function on both PWs to minimize traffic loss if a fault occurs on the primary PW.
Trigger conditions:
There is a high probability that the problem occurs if the following conditions are met. In the
laboratory, the probability is approximately 70%.
1. An MPU is reseated or the slave switchover command is run to perform a master/slave MPU
switchover twice. Alternatively, the slave MPU is reseated or a command is run to reset the slave MPU.
2. Another master/slave MPU switchover is performed.
Note that there is no limit on the interval between master/slave MPU switchovers, and the ATN 950B
NE must not be reset between master/slave MPU switchovers.
Symptom:
The slave ASG fails to learn the ARP entry mapped to the NodeB or eNodeB. As a result,
network-to-user traffic on the slave ASG is interrupted.
Identification method:
1. Check the tunnel token value of an L2VPN tunnel on the ATN 950B NE.
Run the display mpls l2vc interface interface-type interface-number command in the user view.
<HUAWEI>display mpls l2vc interface Ethernet0/3/0.1 //In real-world situations, change Ethernet0/3/0.1 to the name of the actual interface that transmits L2VPN services.
2. Check the MAC address carried in dually transmitted ARP packets.
Run the display mplsada lsp token 5 command in the diagnostic view, where 5 is the token value
obtained in Step 1.
When the problem occurs, the destination MAC address carried in dually transmitted ARP
packets is all 0s.
<HUAWEI>system-view
[HUAWEI]diagnose
[HUAWEI-diagnose]display mplsada lsp token 5 //5 is the tunnel token value obtained in Step 1. This tunnel token value is a decimal number.
ulPdtHandle : 434410400
Nhi list begin
Main product info(TunnelId 5):
SrcMac :f4:a1:a2:cf:33:01
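When many NEs must be inspected, the check in Step 2 can be scripted against captured command output. The following Python sketch is one possible way to do this. It assumes that the destination MAC address appears on its own line in the same form as the SrcMac line, under a field named DstMac. That field name is an assumption, because the sample output in this notice does not show that line; adjust it to match the actual output. The function names are hypothetical.

```python
import re

ZERO_MAC = "00:00:00:00:00:00"

# Matches lines such as "SrcMac :f4:a1:a2:cf:33:01".
MAC_LINE = re.compile(
    r"^\s*(\w+Mac)\s*:\s*((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\s*$"
)


def mac_values(output, field):
    """Return every MAC address reported for the given field name."""
    values = []
    for line in output.splitlines():
        match = MAC_LINE.match(line)
        if match and match.group(1) == field:
            values.append(match.group(2).lower())
    return values


def has_all_zero_dst_mac(output, field="DstMac"):
    """True if any destination MAC is all 0s.

    "DstMac" is an assumed field name, not taken from the notice.
    """
    return any(mac == ZERO_MAC for mac in mac_values(output, field))
```

A result of True for an NE indicates the symptom described above; False means that no all-zero destination MAC was found in the captured text, which also covers the case where the assumed field name does not match.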
[Root Cause]
1. When an ATN 950B is running properly, its master MPU is in slot 7, and the slave MPU is in slot 8.
After a master/slave MPU switchover is performed by reseating an MPU or running commands, the
MPU in slot 7 becomes the slave one, and the MPU in slot 8 becomes the master one. The MPU in
slot 7 synchronizes data in a batch with the MPU in slot 8. Due to a timing sequence error, LDP
data is backed up earlier than ARP data. As a result, the destination MAC address on the MPU in
slot 7 is all 0s. Although the destination MAC address is incorrect, the MPU in slot 7 does not
transmit services, and therefore, services are not affected.
2. After another master/slave MPU switchover is performed, the MPU in slot 7 becomes the
master one. This MPU forwards dually transmitted ARP packets carrying the destination MAC
address of all 0s. Upon receipt, the next-hop device considers the packets incorrect and discards
them. As a result, the slave ASG fails to learn the ARP entry.
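The ordering dependency behind this root cause can be illustrated with a toy model. The following Python sketch is not the NE software: the class, function names, next-hop address, and MAC address are hypothetical. It assumes that the forwarding entry resolves its destination MAC address from the ARP table at the moment LDP data is restored, so restoring LDP data before ARP data leaves the MAC address at all 0s.

```python
ZERO_MAC = "00:00:00:00:00:00"


class SlaveMpu:
    """Toy model of the slave MPU receiving batch-backed-up data."""

    def __init__(self):
        self.arp = {}         # next-hop IP -> MAC
        self.forwarding = {}  # tunnel token -> destination MAC

    def restore_arp(self, arp_entries):
        self.arp.update(arp_entries)

    def restore_ldp(self, lsps):
        # The destination MAC is resolved once, when LDP data arrives.
        # If ARP data is not there yet, the entry keeps an all-zero MAC.
        for token, next_hop in lsps.items():
            self.forwarding[token] = self.arp.get(next_hop, ZERO_MAC)


def batch_backup(order, arp_entries, lsps):
    """Replay a batch backup in the given order; return forwarding state."""
    slave = SlaveMpu()
    for step in order:
        if step == "arp":
            slave.restore_arp(arp_entries)
        elif step == "ldp":
            slave.restore_ldp(lsps)
        else:
            raise ValueError("unknown backup step: %r" % step)
    return slave.forwarding


# Illustrative values only.
ARP = {"10.1.1.2": "f4:a1:a2:cf:33:02"}
LSPS = {5: "10.1.1.2"}

faulty = batch_backup(["ldp", "arp"], ARP, LSPS)   # LDP backed up first
correct = batch_backup(["arp", "ldp"], ARP, LSPS)  # ARP backed up first
```

In this model, the faulty order leaves token 5 with an all-zero destination MAC address, which has no effect until the MPU holding that state becomes the master one and starts forwarding.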
[Impact and Risk]
When network-to-user traffic arrives at the slave ASG, traffic is interrupted.
[Measures and Solutions]
Recovery measures:
Note that the recovery measures will adversely affect services transmitted on the ingress,
transit node, and egress along an existing tunnel for a few seconds.
Run the reset mpls ldp all command in the user view to reset MPLS LDP.
<HUAWEI> reset mpls ldp all
Workarounds:
Solutions:
Perform either of the following steps to resolve the problem:
- Install the patch V200R002SPH006 or later on the ATN 950B NEs running V200R002C00SPC300.
- Upgrade a version earlier than V200R002C00SPC300 to V200R002C00SPC300 and then install the patch V200R002SPH006 or later.
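When planning the fix across many NEs, the choice between the two steps can be derived from the running version string. The following Python sketch is a minimal illustration, not a vendor tool. It assumes that version strings follow the VxxxRxxxCxx[SPCxxx] pattern seen in this notice, that versions can be ordered by comparing those numeric fields, and that a version without an SPC suffix sorts before any SPC release of the same C version. The function names are hypothetical.

```python
import re

VERSION = re.compile(r"^V(\d+)R(\d+)C(\d+)(?:SPC(\d+))?$")
BASE = (200, 2, 0, 300)  # V200R002C00SPC300


def parse_version(text):
    """Split a version string into a comparable tuple (V, R, C, SPC)."""
    match = VERSION.match(text.strip().upper())
    if not match:
        raise ValueError("unrecognized version string: %r" % text)
    v, r, c, spc = match.groups()
    # Assumption: a missing SPC suffix is treated as SPC 0.
    return (int(v), int(r), int(c), int(spc or 0))


def recommended_action(version):
    """Map a running version to the matching step from this notice."""
    parsed = parse_version(version)
    if parsed == BASE:
        return "install patch V200R002SPH006 or later"
    if parsed < BASE:
        return ("upgrade to V200R002C00SPC300, "
                "then install patch V200R002SPH006 or later")
    return "not covered by this notice"
```

For example, recommended_action("V200R002C00SPC200") selects the upgrade-then-patch step, while a version later than V200R002C00SPC300 is reported as not covered, because this notice describes only V200R002C00SPC300 and earlier versions.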